G06V10/809

Expression recognition method, apparatus, electronic device, and storage medium

Embodiments of the present disclosure provide an expression recognition method, apparatus, electronic device, and storage medium. An expression recognition model includes a convolutional neural network model, a fully connected network model, and a bilinear network model. During expression recognition, an image to be recognized is pre-processed to obtain a facial image and a key point coordinate vector; the facial image is processed by the convolutional neural network model to output a first feature vector; the key point coordinate vector is processed by the fully connected network model to output a second feature vector; the first feature vector and the second feature vector are combined by the bilinear network model to obtain second-order information; and an expression recognition result is then obtained from the second-order information. This process offers better robustness to pose and illumination variations and improves the accuracy of expression recognition.
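The abstract's bilinear step combines the two branch outputs into second-order (pairwise product) features. A minimal Python sketch of that fusion, with toy vectors standing in for the CNN and fully connected branch outputs (the real model would of course learn these features):

```python
def bilinear_fusion(first, second):
    # Outer product of the two branch outputs: every pairwise product
    # (second-order interaction) between CNN and key-point features.
    return [a * b for a in first for b in second]

# toy stand-ins for the first (CNN) and second (fully connected) feature vectors
first_feature = [0.5, -1.0]
second_feature = [2.0, 0.0, 1.0]
second_order = bilinear_fusion(first_feature, second_feature)
```

The fused vector has length `len(first) * len(second)`, which is why bilinear models are usually followed by pooling or a compact classifier head.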

FUSION MODEL TRAINING USING DISTANCE METRICS
20210319270 · 2021-10-14 ·

A method and a system are presented for controlling the performance of a fusion model. The method includes obtaining a first set and a second set of candidate models for a first and a second neural network, respectively. The first and second sets of candidate models are pre-trained with a first source and a second source, respectively. For each possible pairing of one candidate model from the first neural network and one candidate model from the second neural network, a model distance D.sub.m is determined. A subset of the possible pairings of one first candidate model and one second candidate model is selected based on the model distance D.sub.m between them. Using the subset of possible pairings, the first neural network and the second neural network are combined to generate two branches of a fusion model neural network.
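The abstract leaves the metric D.sub.m and the selection rule abstract. A minimal sketch under two stated assumptions: the distance is Euclidean distance between flattened weight vectors, and the k closest pairs are kept:

```python
from itertools import product

def model_distance(weights_a, weights_b):
    # Euclidean distance between flattened weight vectors -- one possible
    # choice of D_m; the patent does not fix the metric.
    return sum((a - b) ** 2 for a, b in zip(weights_a, weights_b)) ** 0.5

def select_pairings(first_set, second_set, k):
    # Score every cross pairing of candidates and keep the k closest pairs.
    scored = [((i, j), model_distance(a, b))
              for (i, a), (j, b) in product(enumerate(first_set), enumerate(second_set))]
    scored.sort(key=lambda item: item[1])
    return [pair for pair, _ in scored[:k]]
```

Each selected index pair `(i, j)` would then seed one two-branch fusion network built from the i-th first-source model and the j-th second-source model.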

Device and Method of Handling Video Content Analysis
20210319225 · 2021-10-14 ·

A computing device for handling video content analysis comprises: a preprocessing module for receiving a first plurality of frames and determining whether to delete at least one of the first plurality of frames according to an event detection, to generate a second plurality of frames according to the determination; a first deep learning module for receiving the second plurality of frames and determining whether to delete at least one of the second plurality of frames according to a plurality of features of those frames, to generate a third plurality of frames according to the determination; and a second deep learning module for receiving the third plurality of frames, to generate a plurality of prediction outputs for the third plurality of frames.
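The three modules form a filtering cascade: each stage drops frames so the next, more expensive stage sees less data. A minimal sketch with stand-in lambdas for the event detector, feature filter, and predictor (all illustrative, not the patented models):

```python
def cascade_filter(frames, event_detected, keep_by_features, predict):
    # Stage 1 (preprocessing): drop frames without a detected event.
    second = [f for f in frames if event_detected(f)]
    # Stage 2 (first deep learning module): drop frames whose features fail a test.
    third = [f for f in second if keep_by_features(f)]
    # Stage 3 (second deep learning module): predictions on the surviving frames.
    return [predict(f) for f in third]

# toy frames: (frame_id, motion_level) tuples
frames = [(0, 0.0), (1, 0.9), (2, 0.4), (3, 0.8)]
outputs = cascade_filter(
    frames,
    event_detected=lambda f: f[1] > 0.1,    # stand-in event detector
    keep_by_features=lambda f: f[1] > 0.5,  # stand-in feature filter
    predict=lambda f: ("frame", f[0]),      # stand-in predictor
)
```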

User interface configured to facilitate user annotation for instance segmentation within biological samples

Novel tools and techniques are provided for implementing digital microscopy imaging using deep learning-based segmentation via multiple regression layers, implementing instance segmentation based on partial annotations, and/or implementing a user interface configured to facilitate user annotation for instance segmentation. In various embodiments, a computing system might generate a user interface configured to collect training data for predicting instance segmentation within biological samples, and might display, within a display portion of the user interface, a first image comprising a field of view of a biological sample. The computing system might receive, from a user via the user interface, first user input indicating a centroid for each of a first plurality of objects of interest and second user input indicating a border around each of the first plurality of objects of interest. The computing system might then train an AI system to predict instance segmentation of objects of interest in images of biological samples.
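The collected training data pairs each centroid click with a border polygon. A minimal sketch of one plausible pairing step, matching each centroid to the first border whose axis-aligned bounding box contains it (a simple heuristic for illustration, not the patented pairing logic):

```python
def pair_annotations(centroids, borders):
    # Match each user-clicked centroid (x, y) to the first border polygon
    # whose axis-aligned bounding box contains it.
    paired = []
    for cx, cy in centroids:
        for border in borders:
            xs = [x for x, _ in border]
            ys = [y for _, y in border]
            if min(xs) <= cx <= max(xs) and min(ys) <= cy <= max(ys):
                paired.append({"centroid": (cx, cy), "border": border})
                break
    return paired
```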

Network architecture for ego-motion estimation

Systems, methods, and other embodiments described herein relate to estimating ego-motion. In one embodiment, a method for estimating ego-motion based on a plurality of input images in a self-supervised system includes receiving a source image and a target image, determining a depth estimation D.sub.t based on the target image, determining a depth estimation D.sub.s based on the source image, and determining an ego-motion estimation in the form of a six degrees-of-freedom (6 DOF) transformation between the target image and the source image by inputting the depth estimations (D.sub.t, D.sub.s), the target image, and the source image into a two-stream network architecture trained to output the 6 DOF transformation based at least in part on the depth estimations (D.sub.t, D.sub.s), the target image, and the source image.
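The network's output is a 6 DOF transformation: three translations and three rotations. A minimal sketch of how such a vector maps to a 4x4 homogeneous transform, assuming a ZYX (yaw-pitch-roll) Euler convention, which the abstract does not specify:

```python
import math

def six_dof_matrix(tx, ty, tz, roll, pitch, yaw):
    # Build a 4x4 homogeneous transform from a 6 DOF vector
    # (ZYX Euler convention assumed for illustration).
    cr, sr = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr, tx],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr, ty],
        [-sp,     cp * sr,                cp * cr,                tz],
        [0.0,     0.0,                    0.0,                    1.0],
    ]
```

With zero rotation the matrix reduces to pure translation, which is a convenient sanity check for the convention chosen.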

SYSTEM AND METHOD FOR GENERATING REALISTIC SIMULATION DATA FOR TRAINING AN AUTONOMOUS DRIVER
20210312244 · 2021-10-07 ·

A method for training a model for generating simulation data for training an autonomous driving agent, comprising: analyzing real data, collected from a driving environment, to identify a plurality of environment classes, a plurality of moving agent classes, and a plurality of movement pattern classes; generating a training environment according to one environment class; and in at least one training iteration: generating, by a simulation generation model, a simulated driving environment according to the training environment and according to a plurality of generated training agents, each associated with one of the plurality of moving agent classes and one of the plurality of movement pattern classes; collecting simulated driving data from the simulated driving environment; and modifying at least one model parameter of the simulation generation model to minimize a difference between a simulation statistical fingerprint, computed using the simulated driving data, and a real statistical fingerprint, computed using the real data.
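The abstract does not define the "statistical fingerprint" or the difference measure. A minimal sketch under two stated assumptions: the fingerprint is a normalised class histogram over observed agents, and the difference is the L1 distance between the real and simulated fingerprints:

```python
from collections import Counter

def fingerprint(observations, classes):
    # Normalised class histogram: one simple form of "statistical fingerprint".
    counts = Counter(observations)
    total = len(observations)
    return [counts.get(c, 0) / total for c in classes]

def fingerprint_distance(fp_real, fp_sim):
    # L1 distance between fingerprints -- the quantity the training loop
    # would drive toward zero by adjusting the simulation model parameters.
    return sum(abs(a - b) for a, b in zip(fp_real, fp_sim))
```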

MAP UPDATE SYSTEM, DATA TRANSMISSION DEVICE, AND DATA TRANSMISSION METHOD
20210312228 · 2021-10-07 ·

A map update system includes a data transmission device mounted on a vehicle and a map server which stores map data. The data transmission device generates sensor data representing the road environment surrounding the vehicle at a predetermined position, calculates a matching degree between that road environment and the road environment at the predetermined position represented by the map data, and causes a communication circuit mounted on the vehicle to transmit the sensor data and information representing the matching degree to the map server. The map server updates the map data by utilizing, among the sensor data received via a communication device, sensor data having a matching degree less than a matching degree threshold with a higher priority than sensor data having a matching degree equal to or greater than the matching degree threshold.
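The server-side rule is a priority ordering: low-match records (where the stored map likely diverges from the road) come first. A minimal sketch with an illustrative record shape (`{"id": ..., "match": ...}` dictionaries are an assumption, not from the patent):

```python
def prioritize_sensor_data(records, threshold):
    # Serve records whose matching degree is below the threshold first:
    # a low match suggests the stored map no longer reflects the road.
    low = [r for r in records if r["match"] < threshold]
    high = [r for r in records if r["match"] >= threshold]
    return low + high

records = [{"id": 1, "match": 0.9}, {"id": 2, "match": 0.3}, {"id": 3, "match": 0.6}]
ordered = prioritize_sensor_data(records, 0.5)
```

Within each tier the original arrival order is preserved; a production system might further sort each tier, but the abstract only requires the two-tier priority.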

Vision-based frictionless self-checkouts for small baskets

A vision-based self-checkout terminal is provided. Purchased items are placed on a base, and multiple cameras take multiple images of each item placed on the base. A location for each item is determined, along with the depth and dimensions of each item at its location on the base. Each item's images are then cropped, and item recognition is performed on each item's cropped images using that item's corresponding depth and dimension attributes. An item identifier for each item is obtained along with a corresponding price, and a transaction associated with the items is completed.
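The cropping step cuts each item's region out of the camera frame before recognition. A minimal sketch, treating an image as a row-major 2D grid of pixels (the real system would crop camera frames, not lists):

```python
def crop(image, x, y, width, height):
    # image is a row-major 2D grid of pixels; return the width x height
    # window whose top-left corner is (x, y).
    return [row[x:x + width] for row in image[y:y + height]]

# toy 3x3 "image"
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
window = crop(image, 1, 0, 2, 2)
```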

Emotion recognition-based artwork recommendation method and device, medium, and electronic apparatus
11132547 · 2021-09-28 ·

The present disclosure provides an emotion recognition-based artwork recommendation method and device. The method includes: obtaining a current biometric parameter of a user; determining a current emotion type of the user according to the current biometric parameter; selecting an image of an artwork corresponding to the current emotion type; and recommending the image of the artwork to the user by displaying it on a display screen.
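The pipeline is biometric reading, then emotion classification, then catalog lookup. A minimal sketch; the heart-rate threshold, emotion labels, and image names are all illustrative stand-ins, not from the patent:

```python
def recommend_artwork(biometric, classify_emotion, catalog):
    # Classify the biometric reading into an emotion type, then look up
    # an artwork image registered for that emotion.
    emotion = classify_emotion(biometric)
    return catalog.get(emotion)

# toy emotion classifier and artwork catalog (illustrative only)
classify = lambda b: "calm" if b["heart_rate"] < 80 else "excited"
catalog = {"calm": "water_lilies.jpg", "excited": "starry_night.jpg"}
```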

IMAGE PROCESSING METHOD AND DEVICE, AND NETWORK TRAINING METHOD AND DEVICE
20210279892 · 2021-09-09 ·

An image processing method and a device, and a network training method and a device, are provided. The image processing method includes: determining a guide group arranged on an image to be processed and directed at a target object, the guide group comprising at least one guide point, each guide point indicating the position of a sampling pixel and the magnitude and direction of the motion speed of that sampling pixel; and performing optical flow prediction on the basis of the guide point in the guide group and the image to be processed, to obtain the motion of the target object in the image to be processed.
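Each guide point carries a sampling pixel position and a velocity vector. A minimal sketch of seeding a dense flow field from such sparse guide points (the `(x, y, vx, vy)` tuple layout is an assumption for illustration; the patented method feeds the points to a prediction network rather than filling a grid directly):

```python
def seed_flow_field(width, height, guide_points):
    # Initialise a dense optical flow field from sparse guide points; each
    # guide point carries a sampling pixel (x, y) and its velocity (vx, vy).
    flow = [[(0.0, 0.0)] * width for _ in range(height)]
    for x, y, vx, vy in guide_points:
        flow[y][x] = (vx, vy)
    return flow
```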