Semantic map production system and method
11557137 · 2023-01-17
Assignee
Inventors
Cpc classification
G06F18/213
PHYSICS
G06F18/40
PHYSICS
International classification
G06V30/262
PHYSICS
Abstract
The system includes a metric map creation unit configured to create a metric map using first image data received from a 3D sensor, an image processing unit configured to recognize an object by creating and classifying a point cloud using second image data received from an RGB camera, a probability-based map production unit configured to create an object location map and a spatial semantic map in a probabilistic expression method using a processing result of the image processing unit, a question creation unit configured to extract a portion of high uncertainty about an object class from a produced map on the basis of entropy and ask a user about the portion, and a map update unit configured to receive a response from the user and update a probability distribution for spatial information according to a change in probability distribution for classification of the object.
Claims
1. A semantic map production system for creating a semantic map through a robot that acquires a nearby image using a three-dimensional (3D) sensor and an RGB camera, the semantic map production system comprising: a metric map creation unit configured to create a metric map using first image data received from the 3D sensor; an image processing unit configured to recognize an object by creating and classifying a point cloud using second image data received from the RGB camera; a probability-based map production unit configured to create an object location map and a spatial semantic map in a probabilistic expression method using a processing result of the image processing unit; and a map correction unit comprising: a question creation unit configured to extract a portion of high uncertainty about an object class from a produced map on the basis of entropy and ask a user about the portion; and a map update unit configured to receive a response from the user and update a probabilistic distribution for spatial information according to a change in probabilistic distribution for classification of the object.
2. The semantic map production system of claim 1, wherein the probability-based map production unit estimates a probabilistic distribution of the class and volume of the object recognized by the image processing unit to create the object location map.
3. The semantic map production system of claim 2, wherein, through maximum likelihood estimation, the probability-based map production unit determines that an object corresponding to a parameter that maximizes the likelihood of the object is a preregistered object and determines that an object is an unregistered object when the maximum likelihood is less than or equal to a predetermined threshold.
4. The semantic map production system of claim 3, wherein the probability-based map production unit is configured to: perform a Bayesian update on the probabilistic distribution of the class of the object when the object is determined as a preregistered object; and register the object on the object location map and extract a Gaussian distribution of the object through the mean and variance of a point cloud for the object when the object is determined as an object unregistered on the semantic map.
5. The semantic map production system of claim 1, wherein when no object is recognized by the image processing unit, the probability-based map production unit creates the object location map by extracting an uncertain portion through a 3D layer-wise difference to probabilistically extract a location where an object is likely to be.
6. The semantic map production system of claim 5, wherein the probability-based map production unit extracts data values of a top layer and a bottom layer of a space from image data received from the 3D sensor, expresses, in the top layer, a difference between the top layer and the bottom layer, classifies the point cloud through a clustering algorithm to consider each cluster as an object, compares the object to an object preregistered on the map, and registers the object as a new object when a probability for the comparison is less than or equal to a certain value.
7. The semantic map production system of claim 1, wherein the probability-based map production unit creates the spatial semantic map using a semantic network in which a relationship between object types and a connection relationship between an object type and a spatial type are defined, and extracts the meaning of a space to which the object belongs after reflecting a weight value determined by the semantic network in the object type and the spatial type and a distance between the object and a nearby object.
8. The semantic map production system of claim 1, wherein the question creation unit of the map correction unit applies a first weight value to the entropy of the object, applies a second weight value to the spatial node, and asks the user about the type of the object when an objective function value to which the first weight value and the second weight value are applied is higher than a predetermined value.
9. The semantic map production system of claim 1, wherein the map update unit of the map correction unit performs a Bayesian update on a probabilistic distribution of the class of the object after reflecting a response from the user.
10. A semantic map production method performed by a semantic map production system for creating a semantic map through a robot that acquires a nearby image using a three-dimensional (3D) sensor and an RGB camera, the semantic map production method comprising operations of: (a) creating a metric map using first image data received from the 3D sensor; (b) recognizing an object by creating and classifying a point cloud using second image data received from the RGB camera; (c) creating an object location map and a spatial semantic map in a probabilistic expression method using a result of the object recognition; (d) extracting a portion of high uncertainty about an object class from a produced map on the basis of entropy and asking a user about the portion; and (e) receiving a response from the user and updating a probabilistic distribution for spatial information according to a change in probabilistic distribution for classification of the object.
11. The semantic map production method of claim 10, wherein operation (c) comprises estimating a probabilistic distribution of the class and volume of the recognized object to create the object location map.
12. The semantic map production method of claim 11, wherein through maximum likelihood estimation, it is determined that an object corresponding to a parameter that maximizes the likelihood of the object is a preregistered object, and it is determined that an object is an unregistered object when the maximum likelihood is less than or equal to a predetermined threshold.
13. The semantic map production method of claim 12, wherein a Bayesian update is performed on the probabilistic distribution of the class of the object when the object is determined as an object preregistered on the semantic map, and the object is registered on the semantic map and a Gaussian distribution of the object is extracted through the mean and variance of a point cloud for the object when the object is determined as an object unregistered on the semantic map.
14. The semantic map production method of claim 10, wherein operation (c) comprises, when no object is recognized, creating the object location map by extracting an uncertain portion through a 3D layer-wise difference to probabilistically extract a location where an object is likely to be.
15. The semantic map production method of claim 14, further comprising: extracting data values of a top layer and a bottom layer of a space from image data received from the 3D sensor; expressing, in the top layer, a difference between the top layer and the bottom layer; classifying the point cloud through a clustering algorithm to consider each cluster as an object; and comparing the object to an object preregistered on the map and registering the object as a new object when a probability for the comparison is less than or equal to a certain value.
16. The semantic map production method of claim 10, wherein operation (c) comprises creating the spatial semantic map using a semantic network in which a relationship between object types and a connection relationship between an object type and a spatial type are defined and extracting the meaning of a space to which the object belongs after reflecting a weight value determined by the semantic network in the object type and the spatial type and a distance between the object and a nearby object.
17. The semantic map production method of claim 10, wherein operation (d) comprises applying a first weight value to the entropy of the object, applying a second weight value to the spatial node, and asking the user about the type of the object when an objective function value to which the first weight value and the second weight value are applied is higher than a predetermined value.
18. The semantic map production method of claim 10, wherein operation (e) comprises performing a Bayesian update on a probabilistic distribution of the class of the object after reflecting a response from the user.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) The above and other objects, features and advantages of the present invention will become more apparent to those of ordinary skill in the art by describing exemplary embodiments thereof in detail with reference to the accompanying drawings.
DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS
(13) Hereinafter, detailed contents for practicing the present invention will be described with reference to the accompanying drawings. In the following description of the present invention, detailed descriptions of related well-known functions will be omitted when it is determined that the functions are obvious to those skilled in the art and also may unnecessarily obscure the subject matter of the present invention.
(14)
(15) Referring to
(16) The 3D sensor 1 acquires first image data for creating a metric map, and the RGB camera 2 acquires second image data for creating object location information and spatial semantic information.
(17) The semantic map production system according to the present invention includes a metric map creation unit 10, an image processing unit 20, a probability-based map production unit 30, a semantic map database 40, and a map correction unit 50.
(18) The metric map creation unit 10 creates a metric map using the first image data received from the 3D sensor 1. The 3D sensor 1 may include a 3D laser scanner. The 3D sensor 1 may be provided in conjunction with the metric map creation unit 10. A map production algorithm capable of handling data from 3D Light Detection and Ranging (LiDAR) sensors, such as GMapping or Berkeley Localization and Mapping (BLAM), may be applied to the metric map creation unit 10. An indoor metric map created by the 3D sensor 1 may be shown in
(19) The image processing unit 20 recognizes an object by creating and classifying a point cloud using the second image data received from the RGB camera 2. The RGB camera 2 is provided in conjunction with the image processing unit 20. The image processing unit 20 receives RGB-D image data including depth information from the RGB camera 2 and performs segmentation on the received RGB-D image data as shown in
(20) The image processing unit 20 may output a class c.sub.t of an object, a confidence value p(c.sub.t) of an object, and a point cloud Q.sub.t for a segmented object which are included in the second image data at time t.
(21) An object location map in which recognized objects are arranged may be stored in the semantic map database 40.
(22) The probability-based map production unit 30 estimates the probabilistic distribution of the volume and class c.sub.t of the object. As shown in
(23) The probability-based map production unit 30 may approximate the volume of object i by a parameter-based probabilistic distribution (Gaussian distribution, etc.) p(ϕ.sub.i), which should satisfy ∫p(ϕ.sub.i)dϕ.sub.i=1. In addition, the probabilistic distribution for class θ.sub.i of each object may be expressed as a categorical distribution p(θ.sub.i), which should satisfy Σp(θ.sub.i)=1 when summed over all class types. In this case, the number of types of classes is n.sub.c.
(24) The object location map production method of the probability-based map production unit 30 depends on the processing result of the image processing unit 20 and falls into two cases: a case in which the image processing unit 20 recognizes an object and a case in which no object is recognized by the image processing unit 20. The case in which an object is recognized is further divided into a case in which a new object that has not yet been registered on the map is recognized and a case in which an object that is preregistered on the map is recognized.
(25) The probability-based map production unit 30 may use an object searcher to determine whether a recognized object is an object registered on the map. The object searcher returns the index of the registered object when the object is registered, and returns a new index, obtained by adding one to the number of previously registered objects, when the object is a new object.
(26) The probability-based map production unit 30 may determine that an object corresponding to a parameter that maximizes the likelihood of an object is a preregistered object using maximum likelihood estimation. Meanwhile, the probability-based map production unit 30 may determine that an object having a maximum likelihood equal to or less than a predetermined threshold is an unregistered object.
(27) The probability-based map production unit 30 determines that the object i that maximizes the likelihood p(z.sub.t|θ.sub.i, ϕ.sub.i) with respect to an observation parameter z.sub.t is the preregistered object being observed, and determines that the recognized object is a new object when the maximum of p(z.sub.t|θ.sub.i, ϕ.sub.i) does not exceed a certain threshold. The observation parameter z.sub.t is defined as {c.sub.t, Q.sub.t}. c.sub.t denotes the class of an object obtained by the image processing unit 20. Q.sub.t denotes a point cloud, and Q.sub.0:t denotes a point cloud observed from 0 to t.
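The object search described above can be sketched as follows. This is a minimal illustration and not the patented implementation: the likelihood is assumed to factor into the stored class probability of c.sub.t and an axis-aligned Gaussian density evaluated at the centroid of Q.sub.t, and the names `likelihood`, `search_object`, `registered`, and the threshold value are hypothetical.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def likelihood(obs_class, obs_centroid, obj):
    """Assumed form of p(z_t | theta_i, phi_i): stored class probability times
    an axis-aligned Gaussian density at the observed centroid."""
    p = obj["class_probs"].get(obs_class, 0.0)
    for x, m, v in zip(obs_centroid, obj["mean"], obj["var"]):
        p *= gaussian_pdf(x, m, v)
    return p

def search_object(obs_class, obs_centroid, objects, threshold):
    """Return (index, is_new) using 1-based indices, as in the text.

    A new object receives the index len(objects) + 1."""
    best_index, best_like = None, 0.0
    for index, obj in enumerate(objects, start=1):
        like = likelihood(obs_class, obs_centroid, obj)
        if like > best_like:
            best_index, best_like = index, like
    # Maximum likelihood at or below the threshold means "unregistered".
    if best_index is None or best_like <= threshold:
        return len(objects) + 1, True
    return best_index, False

# Hypothetical map content: two registered objects.
registered = [
    {"class_probs": {"chair": 0.8, "table": 0.2},
     "mean": (1.0, 2.0, 0.5), "var": (0.04, 0.04, 0.04)},
    {"class_probs": {"table": 0.9, "chair": 0.1},
     "mean": (4.0, 1.0, 0.4), "var": (0.09, 0.09, 0.04)},
]

match = search_object("chair", (1.05, 1.95, 0.5), registered, threshold=1e-3)
unknown = search_object("lamp", (10.0, 10.0, 1.0), registered, threshold=1e-3)
```

In this sketch the chair observation near the first object is matched to index 1, while the lamp observation has zero likelihood under every registered object and is assigned the new index 3.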
(28) When it is determined that an object is preregistered on a semantic map, the probability-based map production unit 30 may perform a Bayesian update on the probabilistic distribution of the class of the object. When an object registered on the map is recognized, the probability-based map production unit 30 updates parameters p(θ.sub.i) and ϕ.sub.i. The probability-based map production unit 30 performs the Bayesian update on p(θ.sub.i) and p(ϕ.sub.i).
(29) Information regarding the class of the object may be updated in the Bayesian fashion through the following Equation.
(30)
(31) An object volume parameter ϕ.sub.i may be updated to the mean and variance of P.sub.i and Q.sub.i, which are point clouds that are stored.
(32) When it is determined that an object is not registered on the semantic map, the probability-based map production unit 30 registers the object on the semantic map and extracts the probabilistic distribution of a point cloud for the object. The registered object information may be stored in the semantic map database 40.
(33) That is, when a new object is recognized and the actual class of the object is θ.sub.i, the probability that the observed class of the object is c.sub.i is as follows.
(34)
(35) The probability-based map production unit 30 may extract ϕ.sub.i from the point cloud Q.sub.t.
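As an illustration of extracting a Gaussian volume parameter from the mean and variance of a point cloud, the sketch below fits one mean and one variance per axis. The diagonal (axis-aligned) covariance is an assumption made for brevity, and the function name `fit_axis_aligned_gaussian` and the sample points are hypothetical.

```python
from statistics import fmean, pvariance

def fit_axis_aligned_gaussian(points):
    """Return (mean, variance) tuples, one entry per axis, for (x, y, z) points."""
    if not points:
        raise ValueError("point cloud is empty")
    axes = list(zip(*points))  # regroup the points into per-axis coordinate tuples
    mean = tuple(fmean(axis) for axis in axes)
    var = tuple(pvariance(axis) for axis in axes)  # population variance
    return mean, var

# Hypothetical point cloud for one segmented object.
cloud = [(1.0, 2.0, 0.0), (3.0, 2.0, 1.0), (1.0, 4.0, 0.0), (3.0, 4.0, 1.0)]
mean, var = fit_axis_aligned_gaussian(cloud)
```

A full covariance matrix would additionally capture the orientation of the object, at the cost of more parameters per object.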
(36) When there is no object recognized by the image processing unit 20, the probability-based map production unit 30 probabilistically extracts a location where the object is likely to be.
(37)
(38) Referring to
(39) Referring to
(40) The probability-based map production unit 30 classifies the point cloud through a clustering algorithm (DBSCAN, Ward, spectral clustering, etc.) and considers each cluster as an object. In this case, there may be several objects, and each 2D point cloud is shown as an observed point cloud.
(41) Subsequently, the probability-based map production unit 30 compares each cluster considered as an object to the objects preregistered on the map and may register the cluster as a new object on the map when the probability of a match is less than or equal to a predetermined value.
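The clustering step can be illustrated with a compact pure-Python DBSCAN, one of the algorithms named above. This is a sketch for small 2D point sets, not a production implementation. The parameters `eps` and `min_pts` and the sample points are hypothetical, and the neighbor search is a brute-force scan.

```python
import math

def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters and -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # j is a core point: keep expanding
                queue.extend(j_neighbors)
    return labels

# Hypothetical 2D points: two compact groups and one isolated point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
          (10.0, 10.0)]
labels = dbscan(points, eps=0.3, min_pts=3)

# Group the points by cluster label; each group is one candidate object.
clusters = {}
for point, label in zip(points, labels):
    if label != -1:
        clusters.setdefault(label, []).append(point)
```

Each entry of `clusters` would then be compared with the preregistered objects, and the isolated point is discarded as noise.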
(42) Specifically, the probability-based map production unit 30 may determine whether a cluster is a preregistered object on the basis of an extracted cluster (similar to the above-described object searcher). Assuming that only 2D points being extracted among probability parameters of a preregistered object is
(43) p(θ.sub.i), which is a parameter of an object model to be newly registered, follows a uniform distribution, and may be computed from the observed point cloud
(44) Next, the probability-based map production unit producing a spatial semantic map using common-sense information will be described in detail below.
(45)
(46) Referring to
(47) The detailed description of spatial semantic map production using common-sense information is as follows.
(48) The probability-based map production unit 30 may extract the meaning of a space from a nearby object by using information about universal human knowledge, such as ConceptNet. For example, when a refrigerator, a sink, and a dining table are close to each other in one space, the probability-based map production unit 30 may infer that the space is a kitchen. Classes for spaces include a kitchen, a living room, a bedroom, a utility room, and a balcony.
(49) The spatial semantic map is a topological graph composed of a set of nodes V and a set of edges E and is expressed as G=(V, E). Each node v∈V stores its location p.sub.v and its class c.sub.v.
(50) A process of creating a spatial semantic map will be described in detail below.
(51) The location of each node is created through an algorithm based on sampling (Voronoi, etc.) as shown in
(52)
(53) Here, p(c.sub.v|o.sub.i) denotes the probability that an object o.sub.i is in a space c.sub.v. p(c.sub.v|o.sub.i) is extracted from a semantic network such as ConceptNet, in which common-sense information is encoded as knowledge. The semantic network holds semantic relationships between words together with a confidence score for each relationship. As shown in FIG. 9B, the semantic network links, for example, common-sense information indicating that objects such as food, knives, and forks are located in the kitchen, common-sense information indicating that the kitchen is located in a restaurant or apartment, and common-sense information indicating that the kitchen is used to cook or store food.
(54) p(c.sub.v|o.sub.i) is defined as follows.
(55)
(56) Here, “distance” is a distance between an object and node v, “const” is defined as a constant, and S is an error function, which is modeled so that a probability value increases as the “confidence score” increases and decreases as the distance increases.
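To make the idea concrete, the following sketch infers a space class for one node from nearby objects. It is only an illustration under stated assumptions: the weighting `score * (1 - erf(distance / const))` is one hypothetical function that grows with the confidence score and shrinks with distance, the evidence from several objects is assumed to be summed and normalized, and the scores in `RELATION_SCORE` are invented values, not data taken from ConceptNet.

```python
import math

# Hypothetical confidence scores for "object is found in space" relations.
# Real values would come from a semantic network such as ConceptNet.
RELATION_SCORE = {
    ("refrigerator", "kitchen"): 0.9,
    ("sink", "kitchen"): 0.7,
    ("sink", "utility room"): 0.3,
    ("dining table", "kitchen"): 0.6,
    ("dining table", "living room"): 0.4,
    ("bed", "bedroom"): 0.9,
}
SPACE_CLASSES = ["kitchen", "living room", "bedroom", "utility room", "balcony"]

def evidence(score, distance, const=2.0):
    """Assumed weighting: grows with the confidence score, shrinks with distance."""
    return score * (1.0 - math.erf(distance / const))

def space_distribution(node_xy, objects, const=2.0):
    """Return a normalized distribution over space classes for one node."""
    totals = {space: 1e-6 for space in SPACE_CLASSES}  # small prior avoids 0/0
    for obj_class, obj_xy in objects:
        distance = math.dist(node_xy, obj_xy)
        for space in SPACE_CLASSES:
            score = RELATION_SCORE.get((obj_class, space), 0.0)
            totals[space] += evidence(score, distance, const)
    norm = sum(totals.values())
    return {space: value / norm for space, value in totals.items()}

# Hypothetical objects close to a node located at (1.0, 1.0).
nearby = [("refrigerator", (1.0, 0.5)), ("sink", (0.5, 1.0)),
          ("dining table", (1.5, 1.5))]
dist = space_distribution((1.0, 1.0), nearby)
best_space = max(dist, key=dist.get)
```

With a refrigerator, a sink, and a dining table close to the node, the kitchen class receives the largest share of the probability mass in this sketch.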
(57)
(58) Referring to
(59) The question creation unit 51 extracts a portion of high uncertainty about the class of an object from the produced map on the basis of entropy and asks the user about the portion. The question creation unit 51 may apply a first weight value to the entropy of the object, apply a second weight value to the spatial node, and ask the user about the type of the object when an objective function value to which the first weight value and the second weight value are applied is higher than a predetermined value.
(60) The question creation unit 51 defines the objective function for asking the user using the weighted sum of the entropy of the object and the entropy of the spatial node near the object as follows.
f(θ.sub.i)=αH(θ.sub.i)+βE.sub.e.sub.
(61) The entropy may be computed as follows.
(62) $H(\theta_i) = -\sum_{k=1}^{n_c} p(\theta_i = k)\,\log p(\theta_i = k)$
(63) The question creation unit 51 uses the weights α and β to adjust the relative importance of the entropy of the object itself and the entropy of the space near the object. The question creation unit 51 selects the object that maximizes the objective function through i*=ArgMax.sub.i f(θ.sub.i) and asks a question such as “What is it?” about that object using an image stored when the map is produced.
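A minimal sketch of this question selection is given below. It assumes that the spatial term is the mean entropy of the spatial nodes near the object, which is one possible reading of "the entropy of the spatial node near the object". The names `objective`, `select_question`, `min_value`, and the sample distributions are hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy of a categorical distribution (natural logarithm)."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def objective(obj_probs, nearby_node_probs, alpha, beta):
    """alpha * H(object) + beta * mean entropy of nearby nodes (assumed aggregation)."""
    if nearby_node_probs:
        node_term = sum(entropy(p) for p in nearby_node_probs) / len(nearby_node_probs)
    else:
        node_term = 0.0
    return alpha * entropy(obj_probs) + beta * node_term

def select_question(objects, alpha=1.0, beta=0.5, min_value=0.0):
    """Return the index of the object to ask about, or None if no score exceeds min_value."""
    if not objects:
        return None
    scores = [objective(o["class_probs"], o["node_probs"], alpha, beta) for o in objects]
    best = max(range(len(objects)), key=scores.__getitem__)
    return best if scores[best] > min_value else None

# Hypothetical map content: object 0 is nearly certain, object 1 is ambiguous.
objects = [
    {"class_probs": [0.97, 0.02, 0.01], "node_probs": [[0.9, 0.1]]},
    {"class_probs": [0.40, 0.35, 0.25], "node_probs": [[0.5, 0.5]]},
]
ask_index = select_question(objects)
```

Raising `beta` relative to `alpha` shifts the questions toward objects whose answers would also disambiguate the surrounding space.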
(64) The map update unit 52 of the map correction unit 50 may perform the Bayesian update on the probabilistic distribution of the classes of the object after reflecting the response from the user. The map update unit 52 may obtain a human answer to the object and update the categorical distribution of classification information of the object according to the Bayesian rule.
(65) Referring to
(66) The answer to the question is denoted by c and is treated as the observed class of the object. According to the Bayesian rule, the probabilistic distribution for the class of the object may be updated as follows.
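(67) $p(\theta_i \mid c) = \dfrac{p(c \mid \theta_i)\,p(\theta_i)}{\sum_{\theta_i} p(c \mid \theta_i)\,p(\theta_i)}$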
(68) p(θ.sub.i) is a probabilistic distribution before asking the question, and p(θ.sub.i|c) is a probabilistic distribution after finding the answer. p(c|θ.sub.i) can be obtained through the answer to the question and follows the equation below.
(69)
(70) ε is a value close to one, which reflects the assumption that the user gives a correct answer.
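The update from a user answer can be sketched as below. The answer model is an assumption, not the equation of the source: the answer is taken to match the true class with probability ε, and the remaining probability 1−ε is spread uniformly over the other classes. The names `answer_likelihood` and `bayes_update`, the value 0.95 for ε, and the sample prior are hypothetical.

```python
def answer_likelihood(answer, true_class, n_classes, eps=0.95):
    """Assumed p(c | theta): eps if the answer matches the true class,
    otherwise (1 - eps) spread uniformly over the other classes."""
    if answer == true_class:
        return eps
    return (1.0 - eps) / (n_classes - 1)

def bayes_update(prior, answer, eps=0.95):
    """Return the posterior p(theta | c) for a prior given as {class: probability}.

    Requires at least two classes."""
    n = len(prior)
    if n < 2:
        raise ValueError("at least two classes are required")
    unnormalized = {k: answer_likelihood(answer, k, n, eps) * p for k, p in prior.items()}
    z = sum(unnormalized.values())  # evidence term p(c)
    return {k: v / z for k, v in unnormalized.items()}

# Hypothetical prior for an ambiguous object, and the user's answer "bowl".
prior = {"cup": 0.40, "bowl": 0.35, "vase": 0.25}
posterior = bayes_update(prior, "bowl")
```

Because ε is close to one, a single answer moves most of the probability mass to the answered class while still leaving a small probability for a mistaken answer.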
(71) According to the present invention, it is possible to create a semantic map by combining cognitive information obtained through a robot and common-sense information obtained through a user.
(72) Also, it is possible to express a semantic map with a probability-based hierarchical structure, create a question that can effectively reduce the uncertainty of a semantic map, and update the semantic map after reflecting a human answer to the question.
(73) The scope of the present invention is not limited to the description and expression of the embodiments explicitly described above. In addition, the scope of the present invention cannot be limited due to obvious changes or substitutions in the technical field to which the present invention pertains.