G06V10/806

ELECTRONIC DEVICE, SPEECH RECOGNITION METHOD THEREFOR, AND MEDIUM
20240038238 · 2024-02-01 ·

Embodiments of this application provide a speech recognition method. The speech recognition method includes: obtaining a facial depth image and a to-be-recognized voice of a user, where the facial depth image is an image collected by using a depth camera; recognizing a mouth shape feature from the facial depth image, and recognizing a voice feature from the to-be-recognized voice; and fusing the voice feature and the mouth shape feature into an audio-video feature, and recognizing, based on the audio-video feature, the voice uttered by the user.
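The fusion step can be illustrated with a minimal sketch. The abstract does not say how the two features are fused; the per-frame concatenation below, the function name `fuse_audio_video`, and the toy feature values are all assumptions made for illustration.

```python
from typing import List

Frame = List[float]


def fuse_audio_video(voice_feats: List[Frame], mouth_feats: List[Frame]) -> List[Frame]:
    """Concatenate voice and mouth-shape features frame by frame.

    Assumes both streams were already resampled to the same frame rate,
    so frame i of one stream corresponds to frame i of the other.
    """
    if len(voice_feats) != len(mouth_feats):
        raise ValueError("streams must be aligned to the same number of frames")
    return [v + m for v, m in zip(voice_feats, mouth_feats)]


# Toy input: two frames, 2-D voice features and 1-D mouth-shape features.
voice = [[0.1, 0.2], [0.3, 0.4]]
mouth = [[1.0], [2.0]]
fused = fuse_audio_video(voice, mouth)
```

Each fused frame would then be passed to whatever recognizer consumes the audio-video feature.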

METHOD FOR EMOTION RECOGNITION BASED ON HUMAN-OBJECT TIME-SPACE INTERACTION BEHAVIOR
20240037992 · 2024-02-01 ·

An emotion recognition method includes the following steps: acquiring video data of a human-object interaction behavior process; performing data labeling on the positions of a person and an object and the interaction behaviors and emotions expressed by the person; constructing a feature extraction model based on deep learning, extracting features of interaction between the person and the object in a time-space dimension, and detecting the position and category of the human-object interaction behavior; mapping the detected interaction behavior category into a vector form through a word vector model; and finally, constructing a fusion model based on deep learning, fusing the interaction behavior vector and the time-space interaction behavior features, and identifying the emotion expressed by the interacting person.

METHOD, SYSTEM AND NON-TRANSITORY COMPUTER-READABLE RECORDING MEDIUM FOR SEARCHING SIMILAR PRODUCTS USING A MULTI TASK LEARNING MODEL

A method of searching for similar products using a multi-task learning (MTL) model is provided. The method includes: converting, by using a multi-task learning model utilizing a unified backbone network, each original image that includes a fashion item into a single vector including at least one multi-task attribute; and generating a visual search database by storing each original image together with its single vector.
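A minimal sketch of the database step follows. The class `VisualSearchDB`, the cosine ranking, and the toy vectors are hypothetical; the multi-task model that produces the vectors is outside the sketch, and the abstract does not specify how similarity is measured.

```python
import math
from typing import List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("zero-length vector")
    return dot / norm


class VisualSearchDB:
    """Stores each image identifier together with its single vector."""

    def __init__(self) -> None:
        self._rows: List[Tuple[str, List[float]]] = []

    def add(self, image_id: str, vector: List[float]) -> None:
        self._rows.append((image_id, list(vector)))

    def search(self, query: List[float], k: int = 1) -> List[str]:
        # Rank stored images by similarity to the query vector (assumed metric).
        ranked = sorted(self._rows, key=lambda row: cosine(query, row[1]), reverse=True)
        return [image_id for image_id, _ in ranked[:k]]


db = VisualSearchDB()
db.add("shirt.jpg", [1.0, 0.0])
db.add("shoe.jpg", [0.0, 1.0])
nearest = db.search([0.9, 0.1], k=1)
```

A linear scan is enough to show the idea; a production system would more likely use an approximate nearest-neighbor index.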

Method of matching image and apparatus thereof, device, medium and program product
11886492 · 2024-01-30 ·

Embodiments of the present disclosure provide a method of matching an image, an apparatus for matching an image, a device, and a computer-readable storage medium. The method includes: acquiring an image to be matched; determining, for any image in an image library, a key point feature similarity between that image and the image to be matched, and a color feature similarity between that image and the image to be matched; determining a fusion similarity between that image and the image to be matched according to the key point feature similarity and the color feature similarity; and determining whether an image matching the image to be matched exists in the image library according to the fusion similarity between each of at least one image in the image library and the image to be matched.
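One way to read the fusion similarity is as a weighted combination of the two scores. The sketch below assumes that reading; the weight of 0.7, the threshold of 0.8, and the names `fusion_similarity` and `find_match` are illustrative choices, not values taken from the disclosure.

```python
from typing import Dict, Optional, Tuple


def fusion_similarity(keypoint_sim: float, color_sim: float,
                      keypoint_weight: float = 0.7) -> float:
    """Weighted sum of key point and color similarities (assumed fusion rule)."""
    if not 0.0 <= keypoint_weight <= 1.0:
        raise ValueError("keypoint_weight must be in [0, 1]")
    return keypoint_weight * keypoint_sim + (1.0 - keypoint_weight) * color_sim


def find_match(library: Dict[str, Tuple[float, float]],
               threshold: float = 0.8) -> Optional[str]:
    """Return the best library image whose fused score reaches the threshold.

    `library` maps an image name to its precomputed
    (key point similarity, color similarity) against the query image.
    """
    best_name, best_score = None, threshold
    for name, (kp_sim, color_sim) in library.items():
        score = fusion_similarity(kp_sim, color_sim)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


library = {"a.png": (0.9, 0.9), "b.png": (0.5, 0.95)}
match = find_match(library)
```

Here `b.png` has a high color similarity but a low key point similarity, so its fused score stays below the threshold and only `a.png` qualifies.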

System and method for camera radar fusion

A method for camera radar fusion includes receiving, by a processor, radar object detection data for an object and modeling, by the processor, a three-dimensional (3D) physical space kinematic model, including updating 3D coordinates of the object to generate updated 3D coordinates of the object, in response to receiving the radar object detection data for the object. The method also includes transforming, by the processor, the updated 3D coordinates of the object to updated two-dimensional (2D) coordinates of the object, based on a 2D-3D calibrated mapping table, and modeling, by the processor, a 2D image plane kinematic model while modeling the 3D physical space kinematic model, where modeling the 2D image plane kinematic model includes updating coordinates of the object based on the updated 2D coordinates of the object.
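The two coupled models can be sketched as follows. This is an illustration under stated assumptions: a constant-velocity prediction blended with the radar measurement by a fixed gain stands in for the 3D kinematic model, and a pinhole projection stands in for the 2D-3D calibrated mapping table. The focal length, principal point, gains, and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Track3D:
    x: float
    y: float
    z: float
    vx: float = 0.0
    vy: float = 0.0
    vz: float = 0.0


def update_3d(track: Track3D, radar_xyz: Tuple[float, float, float],
              dt: float, gain: float = 0.5) -> Track3D:
    """Predict with constant velocity, then blend in the radar measurement."""
    px = track.x + track.vx * dt
    py = track.y + track.vy * dt
    pz = track.z + track.vz * dt
    mx, my, mz = radar_xyz
    return Track3D(px + gain * (mx - px), py + gain * (my - py),
                   pz + gain * (mz - pz), track.vx, track.vy, track.vz)


def project_to_image(track: Track3D, focal_px: float = 1000.0,
                     cx: float = 640.0, cy: float = 360.0) -> Tuple[float, float]:
    """Pinhole projection, used here in place of the calibrated mapping table."""
    if track.z <= 0.0:
        raise ValueError("object must be in front of the camera")
    return (focal_px * track.x / track.z + cx, focal_px * track.y / track.z + cy)


def update_2d(prev_uv: Tuple[float, float], projected_uv: Tuple[float, float],
              gain: float = 0.5) -> Tuple[float, float]:
    """Move the image-plane estimate toward the projected 3D position."""
    return tuple(p + gain * (q - p) for p, q in zip(prev_uv, projected_uv))


track = update_3d(Track3D(0.0, 0.0, 10.0, vx=1.0), (1.0, 0.0, 10.0), dt=1.0)
uv = project_to_image(track)
uv_smoothed = update_2d((700.0, 360.0), uv)
```

The point of the sketch is the data flow: each radar detection updates the 3D state, the updated 3D position is mapped into the image, and that mapped position drives the 2D state.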

Recognition of objects in images with equivariance or invariance in relation to the object size

A method for recognizing at least one object in at least one input image. In the method, a template image of the object is processed by a first convolutional neural network (CNN) to form at least one template feature map; the input image is processed by a second CNN to form at least one input feature map; the at least one template feature map is compared to the at least one input feature map; it is evaluated from the result of the comparison whether and possibly at which position the object is contained in the input image, the convolutional neural networks each containing multiple convolutional layers, and at least one of the convolutional layers being at least partially formed from at least two filters, which are convertible into one another by a scaling operation.

OPTOELECTRONIC SENSOR AND METHOD FOR A SAFE EVALUATION OF MEASUREMENT DATA
20190391294 · 2019-12-26 ·

An optoelectronic sensor for detecting objects in a monitored zone is provided, having at least one light receiver for generating measurement data from received light from the monitored zone, a safe evaluation unit that has at least two processing channels for a redundant processing of the measurement data, and a comparison unit for comparing processing results of the processing channels to uncover errors in a processing channel. The processing channels are each configured to determine a signature of their processing results, and the comparison unit is configured to compare the signatures.
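The signature comparison can be sketched in a few lines. The abstract does not name a signature function; SHA-256 over the packed results is an assumption, and `channel_a`, `channel_b`, and `safe_evaluate` are hypothetical names with a placeholder computation standing in for the real evaluation.

```python
import hashlib
import struct
from typing import List


def signature(values: List[float]) -> str:
    """Compact fingerprint of a channel's processing results (assumed: SHA-256)."""
    payload = struct.pack(f"<{len(values)}d", *values)
    return hashlib.sha256(payload).hexdigest()


def channel_a(measurements: List[float]) -> List[float]:
    # Placeholder processing: halve each measurement.
    return [round(m * 0.5, 6) for m in measurements]


def channel_b(measurements: List[float]) -> List[float]:
    # Redundant channel computing the same result by a different expression.
    return [round(m / 2.0, 6) for m in measurements]


def safe_evaluate(measurements: List[float]) -> List[float]:
    """Run both channels and accept the result only if the signatures agree."""
    result_a = channel_a(measurements)
    result_b = channel_b(measurements)
    if signature(result_a) != signature(result_b):
        raise RuntimeError("processing channels disagree")
    return result_a


result = safe_evaluate([1.0, 2.5])
```

Comparing signatures rather than full results keeps the data exchanged between the channels and the comparison unit small, at the cost of requiring bit-identical results from both channels.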

EXPRESSION RECOGNITION METHOD, APPARATUS, ELECTRONIC DEVICE, AND STORAGE MEDIUM
20190392202 · 2019-12-26 ·

Embodiments of the present disclosure provide an expression recognition method, apparatus, electronic device and storage medium. An expression recognition model includes a convolutional neural network model, a fully connected network model and a bilinear network model. During an expression recognition process, an image to be recognized is pre-processed to obtain a facial image and a key point coordinate vector; the convolutional neural network model computes a first feature vector from the facial image; the fully connected network model computes a second feature vector from the key point coordinate vector; the bilinear network model computes second-order information from the first feature vector and the second feature vector; and an expression recognition result is obtained according to the second-order information. This process is more robust to gestures and illumination, and accuracy of expression recognition is improved.
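Second-order information from two feature vectors is commonly formed as their outer product, which contains every pairwise product of one element from each vector. The sketch below assumes that form; the disclosure does not confirm it, and `bilinear_fuse` and the toy vectors are hypothetical.

```python
from typing import List


def bilinear_fuse(first: List[float], second: List[float]) -> List[float]:
    """Flattened outer product of two feature vectors.

    Entry (i, j) is first[i] * second[j], so the output has
    len(first) * len(second) elements in row-major order.
    """
    return [a * b for a in first for b in second]


image_feature = [1.0, 2.0]          # stands in for the CNN output
keypoint_feature = [3.0, 4.0, 5.0]  # stands in for the fully connected output
second_order = bilinear_fuse(image_feature, keypoint_feature)
```

A classifier applied to `second_order` can then weight interactions between appearance and key point geometry, which a plain concatenation would not expose directly.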

Biometric authentication

A method comprising using at least one hardware processor for: providing a set of development supervectors representing features of biometric samples of multiple subjects, the biometric samples being of at least a first and a second different biometric modalities; providing at least a first and a second enrollment supervectors representing features of at least a first and a second enrollment biometric samples of a target subject correspondingly, wherein the at least first and second enrollment samples are of the at least first and the second different biometric modalities correspondingly; providing at least a first and a second verification supervectors representing features of at least a first and a second verification biometric samples of the target subject correspondingly, wherein the at least first and second verification samples are of the at least first and second different biometric modalities correspondingly; concatenating the development supervectors to a set of development generic supervectors, the at least first and second enrollment supervectors to a single enrollment generic supervector and the at least first and second verification supervectors to a single verification generic supervector; and verifying an identity of the target subject based on a fused score calculated for the verification generic supervector, wherein the fused score is calculated based on the enrollment generic supervector and the set of development generic supervectors.
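The concatenation and scoring steps can be sketched as follows. The abstract does not give the scoring formula; the version below assumes a cosine score against the enrollment generic supervector, normalized by the mean and standard deviation of the scores against the development set. All names and values are hypothetical.

```python
import math
from typing import List

Vector = List[float]


def concat(supervectors: List[Vector]) -> Vector:
    """Concatenate per-modality supervectors into one generic supervector."""
    return [x for sv in supervectors for x in sv]


def cosine(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("zero-length vector")
    return dot / norm


def fused_score(verification: Vector, enrollment: Vector,
                development: List[Vector]) -> float:
    """Enrollment score normalized against the development cohort (assumed rule)."""
    raw = cosine(verification, enrollment)
    cohort = [cosine(verification, d) for d in development]
    mean = sum(cohort) / len(cohort)
    std = math.sqrt(sum((c - mean) ** 2 for c in cohort) / len(cohort)) or 1.0
    return (raw - mean) / std


# Two modalities (for example voice and face), two dimensions each.
enrollment = concat([[1.0, 0.0], [0.0, 1.0]])
verification = concat([[1.0, 0.0], [0.0, 1.0]])
development = [concat([[0.0, 1.0], [1.0, 0.0]]), concat([[1.0, 0.0], [1.0, 0.0]])]
score = fused_score(verification, enrollment, development)
```

Because the modalities are concatenated before scoring, a single score reflects all of them, and the development set supplies the reference distribution against which that score is judged.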

FISH SCHOOL DETECTION METHOD AND SYSTEM THEREOF, ELECTRONIC DEVICE AND STORAGE MEDIUM
20240104900 · 2024-03-28 ·

A fish school detection method and a system thereof, an electronic device and a storage medium are provided. The method includes: inputting a to-be-detected fish school image into a fish school detection model, the fish school detection model including a feature extraction layer, a feature fusion layer and a feature recognition layer; extracting feature information of the to-be-detected fish school image based on the feature extraction layer, and determining a fish school feature map and an attention feature map based on an attention mechanism; fusing the fish school feature map and the attention feature map based on the feature fusion layer to determine a target fusion feature map; and determining a target fish school detection result based on the feature recognition layer and the target fusion feature map. Interference from environmental factors on detection results is eliminated, so as to effectively improve accuracy of fish school detection.
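The feature fusion layer can be illustrated with a minimal sketch. The abstract does not state the fusion rule; the residual form below, where each feature is scaled by one plus its attention weight, is an assumption, as are the name `fuse_maps` and the toy maps.

```python
from typing import List

Map2D = List[List[float]]


def fuse_maps(feature_map: Map2D, attention_map: Map2D) -> Map2D:
    """Element-wise residual attention: fused = feature * (1 + attention).

    Positions with zero attention pass through unchanged; positions with
    high attention are amplified.
    """
    if len(feature_map) != len(attention_map) or any(
        len(f_row) != len(a_row) for f_row, a_row in zip(feature_map, attention_map)
    ):
        raise ValueError("feature and attention maps must have the same shape")
    return [
        [f * (1.0 + a) for f, a in zip(f_row, a_row)]
        for f_row, a_row in zip(feature_map, attention_map)
    ]


features = [[1.0, 2.0], [3.0, 4.0]]
attention = [[0.0, 1.0], [0.5, 0.0]]
target_fusion = fuse_maps(features, attention)
```

The fused map would then be handed to the feature recognition layer to produce the detection result.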