G06V20/647

SYSTEMS AND METHODS FOR SIX-DEGREE OF FREEDOM POSE ESTIMATION OF DEFORMABLE OBJECTS

A method for estimating a pose of a deformable object includes: receiving, by a processor, a plurality of images depicting the deformable object from multiple viewpoints; computing, by the processor, one or more object-level correspondences and a class of the deformable object depicted in the images; loading, by the processor, a 3-D model corresponding to the class of the deformable object; aligning, by the processor, the 3-D model to the deformable object depicted in the plurality of images to compute a six-degree of freedom (6-DoF) pose of the object; and outputting, by the processor, the 3-D model and the 6-DoF pose of the object.
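
For illustration only, the 6-DoF pose named in this abstract can be read as three rotational plus three translational parameters applied to the vertices of the loaded 3-D model. The sketch below assumes a roll/pitch/yaw (Z-Y-X) parameterization; the function names `rotation_from_euler` and `apply_pose` are hypothetical and are not taken from the patent, which does not specify how the pose is represented or how the alignment is computed.

```python
import math
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def rotation_from_euler(roll: float, pitch: float, yaw: float) -> List[List[float]]:
    """Build R = Rz(yaw) * Ry(pitch) * Rx(roll); all angles in radians."""
    cr, sr = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]


def apply_pose(vertices: Sequence[Vec3],
               rotation: List[List[float]],
               translation: Vec3) -> List[Vec3]:
    """Rotate each model vertex, then translate it (p' = R p + t)."""
    posed = []
    for x, y, z in vertices:
        posed.append(tuple(
            row[0] * x + row[1] * y + row[2] * z + t
            for row, t in zip(rotation, translation)
        ))
    return posed
```

Three angles and three offsets together give the six degrees of freedom. A deformable object would also need shape parameters beyond this rigid transform, which this sketch does not model.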

Systems and methods for computer-based labeling of sensor data captured by a vehicle

Examples disclosed herein may involve (i) based on an analysis of 2D data captured by a vehicle while operating in a real-world environment during a window of time, generating a 2D track for at least one object detected in the environment comprising one or more 2D labels representative of the object, (ii) for the object detected in the environment: (a) using the 2D track to identify, within a 3D point cloud representative of the environment, 3D data points associated with the object, and (b) based on the 3D data points, generating a 3D track for the object that comprises one or more 3D labels representative of the object, and (iii) based on the 3D point cloud and the 3D track, generating a time-aggregated, 3D visualization of the environment in which the vehicle was operating during the window of time that includes at least one 3D label for the object.
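
As a rough sketch of step (ii)(a), one way to use a 2D label to pick out 3D data points is to project each point into the image and keep those that land inside the label's bounding box. This assumes a pinhole camera with points already expressed in the camera frame; the intrinsics `fx`, `fy`, `cx`, `cy` and both function names are hypothetical, and the abstract does not state that this is the association method actually used.

```python
from typing import List, Tuple

Point3D = Tuple[float, float, float]
Box2D = Tuple[float, float, float, float]  # (u_min, v_min, u_max, v_max) in pixels


def points_in_2d_label(points: List[Point3D], box: Box2D,
                       fx: float, fy: float, cx: float, cy: float) -> List[Point3D]:
    """Keep the points whose pinhole projection falls inside the 2D box."""
    u_min, v_min, u_max, v_max = box
    selected = []
    for x, y, z in points:
        if z <= 0:  # point is behind the camera, so it cannot be in the image
            continue
        u = fx * x / z + cx
        v = fy * y / z + cy
        if u_min <= u <= u_max and v_min <= v <= v_max:
            selected.append((x, y, z))
    return selected


def axis_aligned_3d_label(points: List[Point3D]) -> Tuple[Point3D, Point3D]:
    """Return (min corner, max corner) of a box enclosing the selected points."""
    xs, ys, zs = zip(*points)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))
```

Repeating this per frame over the window of time would give one 3D label per 2D label, which is the shape of a 3D track. A practical pipeline would likely also need to reject background points that fall inside the same frustum.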

WEAK MULTI-VIEW SUPERVISION FOR SURFACE MAPPING ESTIMATION

One or more two-dimensional images of a three-dimensional object may be analyzed to estimate a three-dimensional mesh representing the object and a mapping of the two-dimensional images to the three-dimensional mesh. Initially, a correspondence may be determined between the images and a UV representation of a three-dimensional template mesh by training a neural network. Then, the three-dimensional template mesh may be deformed to determine the representation of the object. The process may involve a reprojection loss cycle in which points from the images are mapped onto the UV representation, then onto the three-dimensional template mesh, and then back onto the two-dimensional images.
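
The reprojection cycle described above can be sketched as a loss that is zero when a pixel returns to its starting position after the round trip image → UV → template mesh → image. The three mapping functions are passed in as plain callables and are hypothetical stand-ins: in the abstract the image-to-UV mapping comes from a trained neural network, and the use of a mean squared pixel distance here is an assumption.

```python
from typing import Callable, Sequence, Tuple

Pixel = Tuple[float, float]
UV = Tuple[float, float]
Point3D = Tuple[float, float, float]


def cycle_reprojection_loss(pixels: Sequence[Pixel],
                            pixel_to_uv: Callable[[Pixel], UV],
                            uv_to_surface: Callable[[UV], Point3D],
                            project: Callable[[Point3D], Pixel]) -> float:
    """Mean squared distance between each pixel and its round-trip reprojection."""
    if not pixels:
        raise ValueError("at least one pixel is required")
    total = 0.0
    for u, v in pixels:
        uv = pixel_to_uv((u, v))      # image -> UV representation
        xyz = uv_to_surface(uv)       # UV -> point on the 3D template mesh
        u2, v2 = project(xyz)         # 3D point -> back onto the image
        total += (u2 - u) ** 2 + (v2 - v) ** 2
    return total / len(pixels)
```

Because the loss only compares image points with their own reprojections, it needs no ground-truth 3D annotation, which is consistent with the weak supervision named in the title.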

METHOD FOR DETERMINING MITOCHONDRIAL EVENTS

A method of determining the location and quantity of mitochondrial fission, fusion and depolarisation events that occur in a cell is provided. Using a three-dimensional time lapse image sequence of a cell, the method identifies which of the mitochondria in a cell had depolarised or undergone fission or fusion in the interval between the acquisition of the earlier and later images, indicates the locations of the fission, fusion and depolarisation events, and generates a count of the number of mitochondrial fission, fusion and/or depolarisation events. The method can be used to diagnose a disease or condition associated with mitochondrial dysfunction, such as neurodegenerative disease, cancer or ischaemic heart disease. The method can further be used to screen a compound or composition for use in preventing or treating a disease or condition associated with mitochondrial dysfunction. The method can be computer-implemented, and a computer program product is provided.

NEURAL NETWORK BASED FACIAL ANALYSIS USING FACIAL LANDMARKS AND ASSOCIATED CONFIDENCE VALUES

Systems and methods are provided for more accurate and robust determination of subject characteristics from an image of the subject. One or more machine learning models receive as input an image of a subject, and output both facial landmarks and associated confidence values. Confidence values represent the degrees to which portions of the subject's face corresponding to those landmarks are occluded, i.e., the amount of uncertainty in the position of each landmark. These landmark points and their associated confidence values, and/or associated information, may then be input to another set of one or more machine learning models which may output any facial analysis quantity or quantities, such as the subject's gaze direction, head pose, drowsiness state, cognitive load, or distraction state.

Adaptive sensing based on depth

A microscope for adaptive sensing may comprise an illumination assembly, an image capture device configured to collect light from a sample illuminated by the assembly, and a processor. The processor may be configured to execute instructions which cause the microscope to capture, using the image capture device, an initial image set of the sample, identify, in response to the initial image set, an attribute of the sample, determine, in response to identifying the attribute, a three-dimensional (3D) process for sensing the sample, and generate, using the determined 3D process, an output image set comprising more than one focal plane. Various other methods, systems, and computer-readable media are also disclosed.

METHOD OF TRAINING A MACHINE LEARNING ALGORITHM TO IDENTIFY OBJECTS OR ACTIVITIES IN VIDEO SURVEILLANCE DATA
20230081908 · 2023-03-16

A method of training a machine learning algorithm to identify objects or activities in video surveillance data comprises generating a 3D simulation of a real environment from video surveillance data captured by at least one video surveillance camera installed in the real environment. Objects or activities are synthesized within the simulated 3D environment and the synthesized objects or activities within the simulated 3D environment are used as training data to train the machine learning algorithm to identify objects or activities, wherein the synthesized objects or activities within the simulated 3D environment used as training data are all viewed from the same viewpoint in the simulated 3D environment.

User authentication apparatus, user authentication method and training method for user authentication

A user authentication method and a user authentication apparatus acquire an input image including a frontalized face of a user, calculate a confidence map including confidence values, for authenticating the user, corresponding to pixels with values maintained in a depth image of the frontalized face of the user among pixels included in the input image, extract a second feature vector from a second image generated based on the input image and the confidence map, acquire a first feature vector corresponding to an enrolled image, and perform authentication of the user based on a correlation between the first feature vector and the second feature vector.
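
The final step, authentication based on a correlation between the enrolled (first) and probe (second) feature vectors, might look like the sketch below. It assumes cosine similarity as the correlation measure and a fixed acceptance threshold of 0.8; both choices, and the function names, are illustrative assumptions that the abstract does not specify.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector carries no direction, so treat it as no match
    return dot / (norm_a * norm_b)


def is_authenticated(enrolled: Sequence[float], probe: Sequence[float],
                     threshold: float = 0.8) -> bool:
    """Accept the user when the similarity reaches the (assumed) threshold."""
    return cosine_similarity(enrolled, probe) >= threshold
```

In the method described, the probe vector would be extracted from an image already weighted by the confidence map, so unreliable pixels contribute less before this comparison takes place.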

Method and system for generating depth information of street view image using 2D map
11482009 · 2022-10-25

A method for generating depth information of a street view image using a two-dimensional (2D) map includes calculating distance information of an object on a 2D map using the 2D map corresponding to a street view image; extracting semantic information on the object from the street view image; and generating depth information of the street view image based on the distance information and the semantic information.

Methods and apparatus for human pose estimation from images using dynamic multi-headed convolutional attention

An apparatus for 3D human pose estimation using a dynamic multi-headed convolutional attention mechanism is presented. The apparatus contains two dynamic multi-headed convolutional attention mechanisms, one with spatial attention and another with temporal attention. The spatial attention mechanism extracts frame-wise inter-joint dependencies by analyzing sections of limbs that are related. The temporal attention mechanism extracts global inter-frame relationships by analyzing correlations between the temporal profiles of joints. The temporal profile mechanism leads to a more diverse temporal attention map while achieving substantial parameter reduction.