Patent classifications
G06V20/647
GRASP GENERATION FOR MACHINE TENDING
A robotic grasp generation technique for machine tending applications. Part and gripper geometry are provided as inputs, typically from CAD files. Gripper kinematics are also defined as an input. Preferred and prohibited grasp locations on the part may also be defined as inputs, to ensure that the computed grasp candidates enable the robot to load the part into a machining station such that the machining station can grasp a particular location on the part. An optimization solver is used to compute a quality grasp with stable surface contact between the part and the gripper, with no interference between the gripper and the part, and allowing for the preferred and prohibited grasp locations which were defined as inputs. All surfaces of the gripper fingers are considered for grasping and collision avoidance. A loop with random initialization is used to automatically compute many hundreds of diverse grasps for the part.
THREE-DIMENSIONAL SENSING SYSTEM
A three-dimensional sensing system includes a plurality of scanners, each emitting a light signal to a scene to be sensed and receiving a reflected light signal, from which depth information is obtained. Only one scanner at a time transmits its light signal and receives the corresponding reflected light signal.
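The one-scanner-at-a-time behavior described above can be sketched as a simple round-robin schedule. This is a minimal illustration, not the patented implementation: the `Scanner` class, its `measure_depth` callback, and the round-robin ordering are all assumptions introduced here.

```python
from typing import Callable, List, Tuple


class Scanner:
    """Hypothetical scanner: measure_depth() emits light and returns a depth."""

    def __init__(self, name: str, measure_depth: Callable[[], float]):
        self.name = name
        self.measure_depth = measure_depth


def scan_one_at_a_time(scanners: List[Scanner], cycles: int = 1) -> List[Tuple[str, float]]:
    """Give each scanner an exclusive time slot, in round-robin order (assumed)."""
    readings = []
    for _ in range(cycles):
        for scanner in scanners:
            # Only this scanner transmits and receives during its slot,
            # so its reflected signal cannot be confused with another's.
            readings.append((scanner.name, scanner.measure_depth()))
    return readings


scanners = [Scanner("A", lambda: 1.2), Scanner("B", lambda: 1.5)]
readings = scan_one_at_a_time(scanners, cycles=2)
```

Serializing the slots trades frame rate for freedom from cross-talk between scanners that view the same scene.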
EMERGENCY VEHICLE DETECTION SYSTEM AND METHOD
In an embodiment, a method includes: receiving ambient sound; determining if the ambient sound includes a siren; in accordance with determining that the ambient sound includes a siren, determining a first location associated with the siren; receiving a camera image; determining if the camera image includes a flashing light; in accordance with determining that the camera image includes a flashing light, determining a second location associated with the flashing light; receiving 3D data; determining if the 3D data includes an object; in accordance with determining that the 3D data includes an object, determining a third location associated with the object; determining a presence of an emergency vehicle based on the siren, detected flashing light and detected object; determining an estimated location of the emergency vehicle based on the first, second and third locations; and initiating an action related to the vehicle based on the determined presence and location.
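The fusion step at the end of this method can be sketched as follows. The abstract does not state how presence or the estimated location is computed, so the voting threshold (`min_modalities`) and the simple averaging of locations are assumptions made for illustration only.

```python
from typing import Optional, Tuple

Point = Tuple[float, float]


def fuse_detections(
    siren_loc: Optional[Point],
    light_loc: Optional[Point],
    object_loc: Optional[Point],
    min_modalities: int = 2,  # assumed voting threshold, not from the source
) -> Tuple[bool, Optional[Point]]:
    """Combine per-sensor locations (None means 'not detected')."""
    locations = [p for p in (siren_loc, light_loc, object_loc) if p is not None]
    if len(locations) < min_modalities:
        return False, None
    # Assumed estimator: unweighted mean of the available locations.
    x = sum(p[0] for p in locations) / len(locations)
    y = sum(p[1] for p in locations) / len(locations)
    return True, (x, y)


present, location = fuse_detections((0.0, 0.0), (2.0, 2.0), None)
```

A real system would more likely weight each modality by its uncertainty; the unweighted mean is used here only to keep the sketch short.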
People counting and tracking systems and methods
Various techniques are provided for counting and/or tracking objects within a field of view of an imaging system, while excluding certain objects from the results. A monitoring system may count or track people identified in captured images while utilizing an employee identification system including a wireless signal receiver to identify and remove the employees from the result. The system includes algorithms for separating employee counts from customer counts, thereby offering enhanced tracking analytics.
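One way to picture the employee-exclusion step is to match each wireless tag position to the nearest detected person and drop that person from the customer count. The greedy nearest-neighbor matching, the shared 2D coordinate frame, and the `match_radius` parameter are assumptions for this sketch; the abstract does not specify the matching algorithm.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def count_customers(
    people: List[Point],
    employee_tags: List[Point],
    match_radius: float = 1.0,  # assumed tolerance, same units as the positions
) -> Tuple[int, int]:
    """Return (customer_count, employee_count) after removing tagged people."""
    unmatched = list(people)
    employees = 0
    for tag in employee_tags:
        if not unmatched:
            break
        # Greedy match: the detected person closest to this tag.
        nearest = min(unmatched, key=lambda p: math.dist(p, tag))
        if math.dist(nearest, tag) <= match_radius:
            unmatched.remove(nearest)
            employees += 1
    return len(unmatched), employees


customers, employees = count_customers(
    people=[(0.0, 0.0), (5.0, 5.0), (9.0, 1.0)],
    employee_tags=[(5.2, 5.1)],
)
```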
FOOD INVENTORY METHOD AND SYSTEM
A food inventory tracking system within a refrigerated food storage compartment includes cameras configured for capturing images of food, load cell sensors for taking weight measurements of food and a computing system for reading images from the cameras and weight measurements from the load cell sensors, generating a three-dimensional location map of an interior of the refrigerated food storage compartment, mapping the images and weight measurements to segments of the three-dimensional location map, such that each segment of the three-dimensional location map is associated with images and weight measurements, identifying a type of food item within said segments based on the images and weight measurements associated with said segment, calculating a current amount of said food item based on the weight measurements, and reporting to a user the type of said food item and the amount of said food item.
Assembly body change detection method, device and medium based on attention mechanism
An assembly change detection method based on an attention mechanism, including: establishing a three-dimensional model of an assembly body, adding a tag to each part in the three-dimensional model, setting several assembly nodes, obtaining depth images of the three-dimensional model under each assembly node in different viewing angles, and obtaining a change tag image of an added part at each assembly node; selecting two depth images at front and back moments in different viewing angles as training samples; performing semantic fusion, feature extraction, attention mechanism processing and metric learning sequentially on the training samples, training a detection model, continuously selecting training samples to train the detection model, saving model parameters with optimal similarity during training, completing training; and obtaining depth images of successive assembly nodes during assembling the assembly body, inputting the depth images into the trained detection model, and outputting a change image of the added part of the assembly body during assembly.
Accurate video event inference using 3D information
Techniques for inferring whether an event is occurring in 3D space based on 2D image data and for maintaining a camera's calibration are disclosed. An image of an environment is accessed. Input is received, where the input includes a 2D rule imposed against a ground plane. The 2D rule includes conditions indicative of an event. A bounding box is generated and encompasses a detected object. A point within the bounding box is projected from a 2D-space image plane of the image into 3D space to generate a 3D-space point. Based on the 3D-space point, a 3D-space ground contact point is generated. That 3D-space ground contact point is reprojected onto the ground plane of the image to generate a synthesized 2D ground contact point. A location of the synthesized 2D ground contact point is determined to satisfy the conditions.
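The project, drop-to-ground, and reproject sequence can be sketched with a pinhole camera. The setup below is an assumption chosen for simplicity: the camera's optical axis is horizontal, image y points down, the camera sits `cam_height_m` above a flat ground plane, the chosen point is the top of the bounding box, and the object has a known height `object_height_m`. None of these specifics come from the abstract.

```python
from typing import Tuple


def synthesize_ground_contact(
    top_px: Tuple[float, float],
    focal_px: float,
    principal_pt: Tuple[float, float],
    cam_height_m: float,
    object_height_m: float,
) -> Tuple[float, float]:
    """Project the box-top pixel to 3D, drop it to the ground, reproject to 2D."""
    u, v = top_px
    cx, cy = principal_pt
    # Back-project the pixel to a ray in camera coordinates (y points down).
    ray = ((u - cx) / focal_px, (v - cy) / focal_px, 1.0)
    # The object's top lies on the horizontal plane y = cam_height - object_height.
    head_plane_y = cam_height_m - object_height_m
    if ray[1] == 0 or head_plane_y / ray[1] <= 0:
        raise ValueError("ray does not hit the assumed head-height plane")
    t = head_plane_y / ray[1]
    x3d, z3d = t * ray[0], t * ray[2]  # 3D-space point (y equals head_plane_y)
    # 3D ground contact point: directly below, at (x3d, cam_height_m, z3d).
    u_ground = focal_px * x3d / z3d + cx
    v_ground = focal_px * cam_height_m / z3d + cy
    return (u_ground, v_ground)


def in_zone(pt: Tuple[float, float], zone: Tuple[float, float, float, float]) -> bool:
    """Hypothetical 2D rule: is the point inside an axis-aligned image zone?"""
    u_min, v_min, u_max, v_max = zone
    return u_min <= pt[0] <= u_max and v_min <= pt[1] <= v_max


contact = synthesize_ground_contact((640, 490), 1000.0, (640, 360), 3.0, 1.7)
event = in_zone(contact, (600, 600, 700, 700))
```

Evaluating the rule at the synthesized contact point rather than at the bottom edge of the bounding box is useful when the object's feet are occluded or the box is loose.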
DEVICE AND METHOD FOR TRAINING A MACHINE LEARNING MODEL FOR RECOGNIZING AN OBJECT TOPOLOGY OF AN OBJECT FROM AN IMAGE OF THE OBJECT
A method for training a machine learning model for recognizing an object topology of an object from an image of the object. The method includes obtaining a 3D model of the object, wherein the 3D model comprises a mesh of vertices connected by edges, wherein each edge has a weight which specifies proximity of two vertices connected by the edge in the object; determining a descriptor for each vertex of the mesh by searching descriptors for the vertices which minimize the sum, over pairs of connected vertices, of distances between the descriptors of the pair of vertices weighted by the weight of the edge between the pair of vertices; generating training data image pairs, wherein each training data image pair comprises a training input image showing the object and a target image; and training the machine learning model by supervised learning using the training data image pairs as training data.
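The descriptor search described above can be written as an optimization problem. The formulation below is a sketch under assumptions the abstract does not state: the distance is taken to be squared Euclidean, and an orthonormality constraint is added to rule out the trivial solution in which all descriptors coincide. The symbols $d_i$, $w_{ij}$, $E$, $D$, $W$ and $L$ are introduced here for illustration.

```latex
% d_i \in \mathbb{R}^k : descriptor of vertex i;  w_{ij} : weight of edge (i,j) \in E
\min_{d_1,\dots,d_n} \; \sum_{(i,j)\in E} w_{ij}\,\lVert d_i - d_j \rVert^2
\;=\; \min_{D \in \mathbb{R}^{n\times k}} \operatorname{tr}\!\left(D^\top L\, D\right),
\qquad L = \operatorname{diag}(W\mathbf{1}) - W,
\quad \text{subject to } D^\top D = I_k,\; D^\top \mathbf{1} = 0 .
```

Here row $i$ of $D$ is $d_i^\top$ and $W$ is the symmetric matrix of edge weights. Under these assumed constraints, and for a connected mesh, the minimizer is given by the eigenvectors of the graph Laplacian $L$ with the $k$ smallest nonzero eigenvalues, so vertices joined by heavily weighted edges receive nearby descriptors.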
METHOD, SYSTEM AND APPARATUS FOR MONOCULAR DEPTH ESTIMATION
This disclosure relates to methods, systems and apparatuses for performing monocular depth estimation, i.e. depth estimation using a single camera. In particular, this disclosure relates to a method for generating a training dataset for training a machine learning (ML) model using federated learning to perform depth estimation. Advantageously, the method to generate a training dataset enables a diverse training dataset to be generated while maintaining user data privacy. This disclosure also provides methods for training the ML model using the generated training dataset. Advantageously, the methods determine whether a community ML model that is trained by client devices needs to be retrained, and/or whether a global ML model, which is used to generate the community ML model, needs to be retrained.
Method and device for image processing
A method for image processing includes: receiving a third two-dimensional image and a depth image corresponding to the third two-dimensional image, wherein the third two-dimensional image and the depth image include a face; establishing a three-dimensional model of the face according to the depth image; rotating the three-dimensional model of the face by a first angle; projecting the three-dimensional model of the face rotated by the first angle to an image coordinate system of the third two-dimensional image; and building a three-dimensional model of a background region of the third two-dimensional image, processing a background region of an image projected to the image coordinate system of the third two-dimensional image to obtain a fourth image.
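The rotate-then-project steps can be sketched with a point-based face model and a pinhole camera. The rotation axis (vertical, through a chosen center such as the face centroid), the point-list representation, and the intrinsics `focal_px`, `cx`, `cy` are assumptions for illustration; the abstract does not specify them.

```python
import math
from typing import List, Tuple

Point3 = Tuple[float, float, float]


def rotate_about_y(points: List[Point3], angle_rad: float, center: Point3) -> List[Point3]:
    """Rotate 3D points by angle_rad about a vertical axis through center."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    cx0, cy0, cz0 = center
    rotated = []
    for x, y, z in points:
        dx, dz = x - cx0, z - cz0  # offset from the rotation axis
        rotated.append((c * dx + s * dz + cx0, y, -s * dx + c * dz + cz0))
    return rotated


def project(points: List[Point3], focal_px: float, cx: float, cy: float) -> List[Tuple[float, float]]:
    """Pinhole projection of camera-frame points (z > 0) to pixel coordinates."""
    return [(focal_px * x / z + cx, focal_px * y / z + cy) for x, y, z in points]


# Two hypothetical face points one metre from the camera.
face = [(0.0, 0.0, 1.0), (0.1, 0.0, 1.0)]
turned = rotate_about_y(face, math.pi / 2, center=(0.0, 0.0, 1.0))
pixels = project(turned, focal_px=500.0, cx=320.0, cy=240.0)
```

After projection, pixels no longer covered by the rotated face expose background that was not observed in the input image, which is why the method goes on to model and process the background region.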