Patent classifications
G06V10/426
Learning apparatus, learning method, and learning program, graph structure extraction apparatus, graph structure extraction method, and graph structure extraction program, and learned extraction model
In a case where an image for learning and ground-truth data of a graph structure included in the image for learning are input to an extraction model, which extracts feature vectors of a plurality of nodes constituting a graph structure of a tubular structure from a target image including at least one tubular structure, a learning unit derives a loss between nodes on the graph structure included in the image for learning. The loss is based on an error between the feature vector distance between nodes belonging to the same graph structure and the topological distance, which is the distance along a route of the graph structure between the nodes. The learning unit performs learning of the extraction model on the basis of the loss.
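The loss described in this abstract can be illustrated with a small sketch. The abstract does not fix how the topological distance or the error is measured, so the hop count found by breadth-first search and the mean squared error used here are assumptions, and the names `topological_distances` and `topology_loss` are hypothetical.

```python
from collections import deque
import math


def topological_distances(adjacency, source):
    """Breadth-first search: hop count from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def topology_loss(features, adjacency):
    """Mean squared error between feature-space distance and on-graph distance.

    Only node pairs that are connected (same graph structure) contribute.
    Hop count and squared error are illustrative assumptions.
    """
    nodes = sorted(adjacency)
    total, count = 0.0, 0
    for i, a in enumerate(nodes):
        hops = topological_distances(adjacency, a)
        for b in nodes[i + 1:]:
            if b not in hops:
                continue  # different graph structure: pair is skipped
            feature_distance = math.dist(features[a], features[b])
            total += (feature_distance - hops[b]) ** 2
            count += 1
    return total / count if count else 0.0


# A three-node chain 0 - 1 - 2 whose features already match the hop counts.
chain = {0: [1], 1: [0, 2], 2: [1]}
print(topology_loss({0: (0.0,), 1: (1.0,), 2: (2.0,)}, chain))
```

In this sketch the loss is zero when feature distances equal route distances and grows as the embedding collapses nodes that are far apart on the graph.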
Conflation of geospatial points of interest and ground-level imagery
Techniques are described that include accessing information about points of interest and images of scenes within an area of interest; encoding the information about each scene image as a respective scene-image vector; encoding the information about each point of interest as a respective point-of-interest vector; and constructing a joint semantic graph having nodes and edges by (i) attributing to each node a respective point-of-interest vector or a respective scene-image vector, (ii) determining semantic distances between pairs of point-of-interest vectors, pairs of scene-image vectors, and pairs formed from a point-of-interest vector and a scene-image vector, and (iii) connecting each node with respective edges to a predetermined number of nearest-neighbor nodes whose vectors have the lowest semantic distances to that node's vector. The constructed joint semantic graph can be used to enrich and/or clean the information about the points of interest and/or the images of scenes within the area of interest.
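Steps (i) to (iii) of the graph construction can be sketched as a k-nearest-neighbor graph. The abstract does not say which semantic distance is used, so cosine distance is an assumption here, as are the function names, the node identifiers, and the requirement that all vectors are non-zero.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity; assumes both vectors are non-zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def build_knn_graph(vectors, k):
    """Connect every node to its k nearest neighbors.

    `vectors` maps a node id to its vector; points of interest and scene
    images share one dictionary, so edges can join either kind of node.
    Returns a set of undirected edges, each a frozenset of two node ids.
    """
    edges = set()
    for node, vec in vectors.items():
        others = sorted(
            (other for other in vectors if other != node),
            key=lambda other: cosine_distance(vec, vectors[other]),
        )
        for neighbor in others[:k]:
            edges.add(frozenset((node, neighbor)))
    return edges


# Hypothetical two-dimensional vectors for two points of interest and two images.
vectors = {
    "poi:cafe": (1.0, 0.0),
    "img:1": (0.9, 0.1),
    "poi:bank": (0.0, 1.0),
    "img:2": (0.1, 0.9),
}
for edge in sorted(sorted(e) for e in build_knn_graph(vectors, k=1)):
    print(edge)
```

With `k=1` each point of interest in this toy data is linked to the image whose vector is closest to it, which is the kind of cross-source link the abstract describes using for enrichment or cleaning.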
Multi-granularity alignment for visual question answering
In one embodiment, a method includes accessing an image and a natural-language question regarding the image and extracting, from the image, a first set of image features at a first level of granularity and a second set of image features at a second level of granularity. The method further includes extracting, from the question, a first set of text features at the first level of granularity and a second set of text features at the second level of granularity; generating a first output representing an alignment between the first set of image features and the first set of text features; generating a second output representing an alignment between the second set of image features and the second set of text features; and determining an answer to the question based on the first output and the second output.
System and method for improving communication productivity
A method, computer readable storage medium, and system are disclosed for improving communication productivity in a conference between two or more subjects, wherein at least one of the two or more subjects participates in the conference from a first location and one or more of the two or more subjects participate in the conference from a second location. The method includes capturing at least one first three-dimensional (3D) stream of data and at least one second three-dimensional (3D) stream of data on each of the two or more subjects participating in the conference; generating a synchrony score for the two or more subjects, wherein the synchrony score is calculated by comparing time series of skeletal data of each of the two or more subjects to one another for a defined period of time; and using the synchrony score to generate an engagement index between the two or more subjects.
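One way to compare two skeletal time series over a defined period is sketched below. The abstract does not specify the comparison, so Pearson correlation averaged over shared joints is purely an assumption, and the names `pearson`, `synchrony_score`, and the joint key `head_y` are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length series; 0.0 if either is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def synchrony_score(skeleton_a, skeleton_b):
    """Average correlation over the joints both subjects have in common.

    Each skeleton maps a joint coordinate name to its samples over the
    same time window (an assumed data layout).
    """
    joints = sorted(set(skeleton_a) & set(skeleton_b))
    if not joints:
        return 0.0
    return sum(pearson(skeleton_a[j], skeleton_b[j]) for j in joints) / len(joints)


# Two subjects whose head height moves in step over four frames.
subject_a = {"head_y": [0.0, 1.0, 2.0, 3.0]}
subject_b = {"head_y": [1.0, 3.0, 5.0, 7.0]}
print(synchrony_score(subject_a, subject_b))
```

A score near 1 in this sketch means the subjects move together, a score near -1 means they move in opposition, and how such a score maps to an engagement index is left open by the abstract.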
Depth mapping with enhanced resolution
A method for depth mapping includes receiving an image of a pattern of spots that has been projected onto a scene, which includes a hand having fingers. The image is processed in order to segment and find a three-dimensional (3D) location of the hand. Based on the spots appearing on the hand in the 3D location, a first depth value that is characteristic of the hand and a second depth value that is characteristic of a background of the scene behind the hand are computed. The spots in a vicinity of the hand in the image between the first and second depth values are sorted in order to extract separate, respective contours of each of the fingers. The respective contours are processed in order to identify a posture of the hand and fingers.
Methods and devices for labeling and/or matching
Devices, such as computer readable media, and methods, such as automated methods, for labeling and/or matching. Some of the devices and methods are particularly useful for anatomical labeling of human airway trees. Some of the devices and methods are particularly useful for matching branch-points of human airway trees represented in two or more graphs.
Method and system for recognizing faces
A method and a system for recognizing faces are disclosed. The method may comprise: retrieving a pair of face images; segmenting each of the retrieved face images into a plurality of image patches, wherein each patch in one image and a corresponding patch in the other image form a pair of patches; determining a first similarity of each pair of patches; determining, from all pairs of patches, a second similarity of the pair of face images; and fusing the first similarity determined for each pair of patches and the second similarity determined for the pair of face images.
Information processing apparatus, information processing method, and non-transitory computer readable medium
An information processing apparatus includes a memory, an accepting unit, a determining unit, and a selecting unit. The memory stores a template collection. The memory associatively stores, for each template, the template and a degree of first impression similarity indicating an impression of the template. The accepting unit accepts an image. The determining unit determines an impression of the accepted image. The selecting unit selects, from the template collection, a template that is in harmony with the image by using a degree of second impression similarity indicating the impression of the image, and the degree of first impression similarity.
Measurement apparatus, method and non-transitory computer-readable recording medium
A measurement apparatus which includes a processor is provided. The processor is configured to calculate, based on a distance image of a measurement target object with at least a joint, a position of a first portion of the measurement target object which corresponds to a non-joint portion or a terminal portion, and a position of a second portion of the measurement target object different from the first portion, and calculate, based on a first line connecting the calculated positions, a joint angle related to a joint of a first measurement target of the measurement target object.
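The geometric step of turning a line between two calculated positions into an angle can be sketched as follows. The abstract only states that the joint angle is based on the first line, so measuring it against a reference direction is an assumption, and the function names and coordinates are hypothetical.

```python
import math


def line_direction(start, end):
    """Direction vector of the line from `start` to `end` (3D points)."""
    return tuple(e - s for s, e in zip(start, end))


def angle_between_deg(u, v):
    """Angle in degrees between two non-zero 3D direction vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.hypot(*u) * math.hypot(*v)
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))


# First portion (e.g. a terminal portion) and second portion positions,
# as they might be estimated from a distance image.
first_line = line_direction((0.0, 0.0, 0.0), (1.0, 0.0, 0.0))
reference = (0.0, 1.0, 0.0)  # assumed reference direction
print(angle_between_deg(first_line, reference))
```

In practice the reference could be a second line through another body portion; that choice is not specified in the abstract.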