Method and device for determining an environment map by a server using motion and orientation data

Abstract

Method for determining an environment map comprising, server-side receiving of motion data of a mobile device, server-side receiving of orientation data of a camera of the mobile device and server-side receiving of the respective image of the camera associated with the received motion data and orientation data, server-side evaluation of the received image together with the motion data and the orientation data for creating a server-side point cloud, the server-side point cloud forming at least in parts the environment map.

Claims

1. A method for determining an environment map comprising: server-side receiving of motion data from a mobile device; server-side receiving of orientation data of a camera of the mobile device; server-side receiving of an image of the camera associated with the motion data and the orientation data; and server-side evaluating the image together with the motion data and the orientation data to create a server-side point cloud, wherein the server-side point cloud forms the environment map at least in parts.

2. The method according to claim 1, wherein a mobile station-side point cloud created by the mobile device is received on the server-side, and the server-side point cloud is combined with the mobile station-side point cloud to form a new server-side point cloud.

3. The method according to claim 2, wherein during the combination of the mobile station-side point cloud with the server-side point cloud: points of the server-side point cloud are at least partially supplemented by points of the mobile station-side point cloud, and/or point coordinates of the points of the server-side point cloud are at least partially converted with aid of point coordinates of the points of the mobile-station-side point cloud and/or vice versa.

4. The method according to claim 1, wherein point coordinates represent a point cloud, points of the point cloud correspond respectively to a point coordinate.

5. The method according to claim 4, wherein the point coordinates are represented by vectors.

6. The method according to claim 2, wherein point coordinates of points of the mobile station-side point cloud are changed with aid of the motion data and/or the orientation data evaluated on the server-side, in particular in that a drift of the point coordinates of the points on the mobile station-side point cloud is corrected.

7. The method according to claim 1, wherein the orientation data is corrected on the server-side with aid of the motion data.

8. The method according to claim 1, wherein a feature recognition and/or a feature description is carried out on the server-side in the image, in particular by means of multi-scale oriented patches (MOPS) or scale-invariant feature transform (SIFT).

9. The method according to claim 1, wherein features are learned on the server-side during feature recognition and/or a feature description, in particular in that deep learning is performed.

10. The method according to claim 1, wherein a bundle adjustment is performed on the server-side in the image.

11. The method according to claim 2, wherein an outlier removal is carried out on the server-side in the mobile station-side point cloud, in particular by means of statistical evaluation.

12. The method according to claim 2, wherein the mobile station-side point cloud is generated by means of visual, motion-related odometry.

13. The method according to claim 2, wherein a loop closure is determined with aid of the server-side point cloud, and in that a drift of points of the mobile-station-side point cloud is corrected on a basis of the loop closure, in particular a linear adjustment of the drift being carried out.

14. The method according to claim 1, wherein depending on the server-side evaluation of the image, setting data for a mobile-station-side camera are determined and transmitted to the mobile device.

15. The method according to claim 2, wherein the new server-side point cloud is transmitted to the mobile device.

16. The method according to claim 1, wherein the server-side point cloud is enriched with anchors.

17. The method according to claim 1, wherein the server-side point cloud is received by the mobile device, and in that an augmented reality application is initialized in the mobile device using the server-side point cloud.

18. The method according to claim 1, wherein mobile station point clouds are received from a plurality of mobile devices at the server-side, in that the mobile station point clouds are used to adapt the server-side point cloud.

19. The method according to claim 1, wherein at least two instances are determined for the server-side point cloud, wherein in each instance at least one feature is learned by feature recognition and/or a feature description on the server-side and the respective instances together with features are transmitted to a mobile device.

20. A server arranged for determining an environment map comprising: a receiving device arranged for receiving motion data of a mobile device, orientation data of a camera of the mobile device and an image of the camera assigned to the motion data and the orientation data; and a computing device arranged for evaluating the image together with the motion data and the orientation data to generate a server-side point cloud, the server-side point cloud forming the environment map at least in parts.

Description

BRIEF DESCRIPTION OF THE FIGURES

(1) In the following, the subject matter is explained in more detail with reference to a drawing showing embodiments. In the drawing show:

(2) FIG. 1 a system for determining an environment map;

(3) FIG. 2 a representation of features and their recognition in images as well as an assignment to points/point coordinates;

(4) FIG. 3 an arrangement for acquiring motion data, orientation data and/or point clouds from a mobile device;

(5) FIG. 4 a sequence for determining a server-side point cloud;

(6) FIG. 5a an adjustment of mobile station side points coordinates;

(7) FIG. 5b an adjustment of mobile station side point coordinates depending on a drift;

(8) FIG. 6 a flow for optimizing a mobile station side point cloud;

(9) FIG. 7 a method for generating adjustment data for a mobile device;

(10) FIG. 8 a method for transmitting a point cloud to a mobile device;

(11) FIG. 9 a method for determining the position of a mobile device.

DETAILED DESCRIPTION

(12) With the aid of the present method, it is possible to create a central point cloud based on motion data, orientation data and/or point clouds acquired on the mobile device side, in order to thus solve positioning tasks in an optimized manner.

(13) On a mobile device, visual and regress-based odometry can be performed to create a mobile station-side point cloud. This is well known and is supported by programming interfaces such as ARKit from Apple® and ARCore from Google®. The disadvantage of local methods is that although they have a good relative positioning capability, they perform erroneous positioning in large-scale environments and are neither robust nor accurate enough to be fed to industrial applications. Moreover, the mobile station side point clouds are limited, especially due to limited memory as well as limited computational power.

(14) Therefore, the subject matter is based on the idea to process the information collected on the mobile station side on the server side in such a way that an environment map and a positioning therein is possible even in large environments over a large number of mobile devices with a high accuracy and great robustness.

(15) It is proposed that data available from an odometry performed on the mobile station side is used and enriched on the server side. For this purpose, mobile stations 2 are connected to a central server 6 via a wide area network 4, as shown in FIG. 1. Bidirectional communication is possible.

(16) According to the subject matter, at least movement data and orientation data are transmitted from the mobile stations 2 to the server 6 via the wide area network 4. In addition, a transmission of the point cloud, in particular the so-called “raw feature points” can also be transmitted. The motion data is also described as IMU (inertial measurement stater) and the orientation data is also described as pose transform data.

(17) On the server side, a server-side point cloud is calculated with the received data in a SLAM system (Simultaneous Localization and Mapping). To create the point cloud, it is necessary to assign point coordinates via vectors to points. For this purpose, an evaluation of the image information, the orientation information and the motion information is performed. This is shown schematically in FIG. 2.

(18) FIG. 2 shows a scene in which three salient points 8 are detectable. Such salient points 8 can be described by features that are detectable via suitable feature descriptors. In particular, the features of the points 8 are invariant to changes, such as motion, light change, panning, zooming or the like. It is also possible to define such feature descriptors that have higher-order invariance.

(19) In a first image 10, the points 8 are shown in a certain arrangement with respect to each other. Based on the orientation data 12, it is possible to assign to the points not only coordinates within the image 10, but also, if necessary, coordinates which can be described by vectors 14 having at least one common origin 16. This is possible, in particular, if motion data 16 are also acquired in addition to the orientation data 12 and, for example, the same points 8 are detected in a different assignment to one another in a second image 10 with different orientation data 12. The change in the points 8, in particular their relative assignment to one another in the images 10, together with the orientation information 12 and the motion information 16 makes it possible to calculate the vectors 14 of the point coordinates of the points 8 on the mobile station side and/or on the server side.

(20) In particular, a Cartesian coordinate system 18 is used for this purpose, with the y-axis parallel to gravity, the x-axis parallel to the horizontal, and the z-axis perpendicular to the plane spanned by the x- and y-axes. In particular, the vectors 14 are three-dimensional vectors. A transformation tensor, with which the globally valid vectors 14 can be calculated from the local positions of the points detected in the image, can be determined via the orientation information 12 and the movement information 16.

(21) According to the subject matter, a mobile station 2 first transmits to the server 6 in each case an image 10, the associated orientation information 12, and the motion information 16 captured between two images 10, as shown in FIG. 3. In the server 6, the image data stream thus obtained is evaluated and a server-side point cloud is calculated by means of a SLAM system.

(22) Since, on the one hand, higher computing power is available on the server side and also the storage capacity is basically unlimited, the number of points in a point cloud can likewise be virtually unlimited. This makes it possible to determine a point cloud for a large environment based on information from one or more mobile devices 2. In this regard, it is proposed, for example, that in a step 20, orientation data 12, motion data 16, and images 10 are received from at least one mobile device 2. After reception, for example, feature recognition is performed and points are detected based on feature descriptions. The points thus detected are extracted 22.

(23) Then, an evaluation of the feature descriptions can be performed in step 24. Finally, a point cloud 30 is determined from the extracted points in step 26. This point cloud 30 is stored in a memory on the server side. The point cloud 30 is formed from a plurality of points having a unique identifier and at least point coordinates.

(24) The point cloud 30 calculated on the server side is based in particular on a plurality of information from, for example, a plurality of different mobile devices. For position recognition, it is now possible to compare points recognized on the mobile station side with the points in the point cloud. If there is a match, the position of the mobile device can then be determined by using the orientation data 12 to determine the position of the mobile device 2 in the point cloud and thus in the real environment on the basis of the detected points.

(25) Since the server-side point cloud 30 is more robust and accurate, the position determination is also more accurate. Step 20 can be followed by a step 32 in the evaluation, as shown in FIG. 5a. In this step 32, a comparison of previously acquired images 10 and points 8 with those of the server-side point cloud is performed, for example, to perform a loop closure. With the aid of this loop closure, it is possible to determine whether a mobile device 2 is located at the same position after a certain time and/or whether the same features can be detected in an image. This is of particular interest to correct a drift that occurs after a while. The points detected after loop closure (32) is detected are added to the point cloud 30 in step 34.

(26) If a loop closure is detected, it is possible to use this server-side detected information to eliminate a mobile-station-side drift in the points of the mobile-station-side point cloud. For this purpose, in a step 36, as shown in FIG. 5b, it is first determined by what amount the mobile station-side calculated position of the mobile device 2, which was determined based on the motion data 16 and orientation data 12, deviates from the actual position of the mobile device 2. Since it was detected on the server side that the mobile device 2 has returned to its original position, the position relative to the beginning of an AR session is uniquely defined. If the position calculated on the mobile station side deviates from this, there is a drift whose amount can be determined (36). With the help of this amount of drift determined on the server side, it is possible to adjust the points of the point cloud calculated on the mobile station side (38).

(27) This involves a correction of the point coordinates or vectors 14 of individual points 8. It is also possible to supplement a point cloud calculated on the mobile station side with points of a point cloud calculated on the server side. For this purpose, the point cloud 30 created between steps 20 and 26 is acquired. Furthermore, in a step 40, a mobile station-side point cloud is received. The points of the point cloud received at the mobile station side may be supplemented, corrected, or otherwise modified by the point cloud 30 determined at the server side. The thus optimized point cloud 30 can then be transmitted from the server 6 to the mobile device 2 in a step 42.

(28) Since an evaluation of the image 10 is performed on the server side, a feature recognition is performed in step 24 after the image 10 (20) is received. In this feature detection, the invariance of the features is checked and, in particular, features can be defined using deep learning strategies.

(29) Subsequently, it can be checked whether the feature recognition in an image 10 was good or bad (44) and from this it can be derived whether the acquired image 10 was sufficiently good in its quality for the feature recognition. Depending on this evaluation, at least one setting parameter for the camera (exposure, contrast, color and the like) can be determined in a step 46 and transmitted to the mobile device 2. There, the camera can be adjusted accordingly.

(30) It is also possible to optimize the orientation detected at the mobile station. For this purpose, after receiving 20, a calculation of the orientation data 12 is performed on the server side in a step 48. The calculation of the orientation data 12 can, on the one hand, use the movement data 16 and orientation data 12 received from the mobile device 2 and, on the other hand, a comparison can be made between the points acquired on the mobile station side and the points of the point cloud 30 present on the server side. For example, it is possible that the point cloud on the mobile station side or the points of an image 10 are present in the point cloud 30 on the server side, but the point coordinates differ from each other. It may be determined that this discrepancy in point coordinates exists due to an erroneous conversion tensor, which may have been determined based on erroneous orientation information from the mobile device 2. If this is the case, an estimate of the actual orientation of the camera of the mobile device 2 may be calculated in a step 50. This information can be sent back to the mobile device 2, which can update its orientation information.

(31) Furthermore, it is possible to determine anchors in the point cloud 30 in step 52, which can also be transmitted to the mobile device 2 in step 54.

(32) Device-independent position determination is also possible. Here, points acquired on the mobile station side can be transmitted to the server 6 in a step 56. In a step 58, these points are compared with the points of the point cloud 30 on the mobile station side. On the basis of this comparison, a position determination is possible in which a correspondence in the points is determined. Based on this, a position determination (60) can be made.

(33) The determined position and, if necessary, a point cloud enriched by points of the server-side point cloud 30 can be sent back to the mobile device (62).

(34) The server-side point cloud is continuously checked and supplemented. As information is continuously received from one or more mobile devices on the server side, points are continuously added to the point cloud and/or points are corrected. As a result, the server-side point cloud becomes more accurate and detailed as the time of operation increases. On the server side, bundle adjustment, feature matching and outlier removal can be used to optimize the quality of the individual points in the point cloud. Features or feature descriptions (descriptors) can be continuously modified and added. Depending on the semantic environment, different feature descriptors can be used to optimize the image information and to convert it into points of the point cloud.

(35) In particular, it is possible to check how good and how frequent a position determination of mobile devices is based on the points captured there by comparing them with the point cloud and the respective descriptors can be changed.

(36) The present method thus creates a point cloud on the server side, which has a substantially global validity and can be acquired over a plurality of devices and made available to a plurality of devices.

Method and device for determining an environment map by a server using motion and orientation data

Assignee

Inventors

Cpc classification

Classification Explorer

G06V20/20

PHYSICS

Classification Explorer

G06V10/40

PHYSICS

Classification Explorer

G06T17/05

PHYSICS

International classification

Classification Explorer

G06T17/05

PHYSICS

Classification Explorer

G06V10/40

PHYSICS

Abstract

Claims

Description