Patent classifications
G06V10/806
Image recognition method, apparatus, device, and computer storage medium
The present application discloses an image recognition method, apparatus, device, and computer storage medium, related to the technical field of artificial intelligence and, in particular, to the technical field of image processing. The method includes: performing organ recognition on a human face image and marking the positions of the facial features (the "five sense organs") in the human face image to obtain a marked human face image; inputting the marked human face image into a backbone network model for feature extraction to obtain defect features of the marked human face image output by different convolutional neural network levels of the backbone network model; and fusing the defect features of different levels that are located in the same area of the human face image to obtain a defect recognition result for the human face image.
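The multi-level fusion step described above can be illustrated with a minimal numpy sketch; the shapes, the nearest-neighbor upsampling, and channel-wise concatenation as the fusion operator are all assumptions for illustration, not the patent's actual implementation:

```python
import numpy as np

def upsample_nearest(fmap, out_h, out_w):
    """Nearest-neighbor upsampling of a (C, H, W) feature map."""
    c, h, w = fmap.shape
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return fmap[:, rows][:, :, cols]

def fuse_region_features(level_maps, region, out_size=(32, 32)):
    """Fuse defect features from several backbone levels that cover the
    same facial region (y0, y1, x0, x1 in a common grid)."""
    out_h, out_w = out_size
    up = [upsample_nearest(f, out_h, out_w) for f in level_maps]
    fused = np.concatenate(up, axis=0)               # channel-wise fusion
    y0, y1, x0, x1 = region
    return fused[:, y0:y1, x0:x1].mean(axis=(1, 2))  # region descriptor

# Feature maps from three hypothetical backbone levels at decreasing
# resolution and increasing channel count.
levels = [np.random.rand(8, 32, 32),
          np.random.rand(16, 16, 16),
          np.random.rand(32, 8, 8)]
vec = fuse_region_features(levels, region=(4, 12, 4, 12))
print(vec.shape)  # (56,)
```

The region descriptor would then feed a defect classifier; the patent leaves the fusion operator unspecified, so concatenation is used here as the simplest choice.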
Systems and methods for utilizing models to identify a vehicle accident based on vehicle sensor data and video data captured by a vehicle device
A device may receive sensor data and video data associated with a vehicle, and may process the sensor data, with a rule-based detector model, to determine whether a probability of a vehicle accident satisfies a first threshold. The device may preprocess acceleration data of the sensor data to generate calibrated acceleration data, and may process the calibrated acceleration data, with an anomaly detector model, to determine whether the calibrated acceleration data includes anomalies. The device may filter the sensor data to generate filtered sensor data, and may process the filtered sensor data and anomaly data, with a decision model, to determine whether the probability of the vehicle accident satisfies a second threshold. The device may process the filtered sensor data, the anomaly data, and the video data, with a machine learning model, to determine whether the vehicle accident has occurred, and may perform one or more actions.
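The cascaded gating described above (rule-based detector, then anomaly detection on calibrated acceleration, then a second-stage decision) can be sketched with toy stand-ins; the specific rule, the z-score anomaly test, and all thresholds are illustrative assumptions:

```python
import numpy as np

def rule_based_probability(accel, speed_drop):
    """Toy rule-based detector: a high acceleration peak combined with a
    sudden speed drop raises the accident probability (assumed rule)."""
    peak_g = np.max(np.abs(accel)) / 9.81
    return min(1.0, 0.5 * peak_g + 0.05 * speed_drop)

def calibrate(accel, bias):
    """Remove a sensor bias (simplified stand-in for calibration)."""
    return accel - bias

def detect_anomalies(accel, z_thresh=3.0):
    """Flag samples whose z-score exceeds the threshold."""
    z = (accel - accel.mean()) / (accel.std() + 1e-9)
    return np.abs(z) > z_thresh

accel = np.zeros(100)
accel[50] = 40.0                          # spike: hard impact (m/s^2)
cal = calibrate(accel, bias=0.0)
p1 = rule_based_probability(cal, speed_drop=12.0)
anomalies = detect_anomalies(cal)
stage2 = p1 > 0.7 and anomalies.any()     # gate into the decision model
print(p1 > 0.7, int(anomalies.sum()))     # True 1
```

In the patent's pipeline, only events passing such gates would reach the heavier machine learning model that also consumes the video data.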
Facial expression recognition
Systems and techniques are provided for facial expression recognition. In some examples, a system receives an image frame corresponding to a face of a person. The system also determines, based on a three-dimensional model of the face, landmark feature information associated with landmark features of the face. The system then inputs, to at least one layer of a neural network trained for facial expression recognition, the image frame and the landmark feature information. The system further determines, using the neural network, a facial expression associated with the face.
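One way to read "inputs, to at least one layer of a neural network, the image frame and the landmark feature information" is that landmark features are concatenated with image features inside the network. The sketch below assumes pooled CNN features, 68 landmarks from a 3D face model, and a single fully connected layer; all dimensions are hypothetical:

```python
import numpy as np

def landmark_features(landmarks_3d):
    """Flatten 3D landmark coordinates (from a fitted face model) into a
    feature vector, centered so it is translation-invariant."""
    centered = landmarks_3d - landmarks_3d.mean(axis=0)
    return centered.ravel()

def fuse_into_layer(image_feats, lm_feats, w, b):
    """Feed image and landmark features jointly into one fully connected
    layer (a stand-in for 'at least one layer of a neural network')."""
    x = np.concatenate([image_feats, lm_feats])
    return np.maximum(0.0, w @ x + b)        # ReLU activation

rng = np.random.default_rng(0)
img = rng.standard_normal(128)               # pooled CNN features (assumed)
lms = rng.standard_normal((68, 3))           # 68 3D landmarks (assumed)
lf = landmark_features(lms)                  # 204-dim landmark vector
w = rng.standard_normal((7, 128 + 204))      # 7 expression classes (assumed)
out = fuse_into_layer(img, lf, w, np.zeros(7))
print(out.shape)  # (7,)
```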
Method, system, device and medium for landslide identification based on full polarimetric SAR
A method, a system, a device and a medium for landslide identification based on full Polarimetric Synthetic Aperture Radar (full PolSAR) are provided. The method mainly includes: registering target full PolSAR data with target optical remote sensing data and target digital elevation model data to obtain a first registration result and a second registration result; determining a polarization feature, a decomposition feature, and a terrain feature of a target area according to the registration results; determining a texture feature and a hue feature of the target area according to the target full PolSAR data; determining a spectrum feature of the target area according to the target optical remote sensing data; fusing the abovementioned multi-dimensional features to obtain a target fusion feature; and inputting the target fusion feature into a landslide mass identification model for identifying a landslide mass, so as to determine a landslide area in the target area.
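The fusion of co-registered multi-source features reduces to stacking per-pixel feature maps along a channel axis. The sketch below assumes arbitrary channel counts per source; the patent does not specify them:

```python
import numpy as np

def fuse_features(*feature_maps):
    """Stack co-registered per-pixel feature maps (polarization,
    decomposition, terrain, texture, hue, spectrum) along channels."""
    h, w = feature_maps[0].shape[:2]
    stacked = [f.reshape(h, w, -1) for f in feature_maps]
    return np.concatenate(stacked, axis=2)

h, w = 64, 64
polar  = np.random.rand(h, w, 3)   # polarization features (assumed 3)
decomp = np.random.rand(h, w, 3)   # decomposition features (assumed 3)
terr   = np.random.rand(h, w)      # terrain feature, e.g. slope from DEM
tex    = np.random.rand(h, w, 4)   # texture features (assumed 4)
hue    = np.random.rand(h, w)      # hue feature
spec   = np.random.rand(h, w, 4)   # optical spectrum bands (assumed 4)

fused = fuse_features(polar, decomp, terr, tex, hue, spec)
print(fused.shape)  # (64, 64, 16)
```

Each pixel's 16-dimensional fusion vector would then be classified by the landslide mass identification model.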
IMAGE PROCESSING APPARATUS AND IMAGE PROCESSING METHOD THEREOF
An image processing apparatus used in a vehicle includes an image acquisition device that obtains an input image through a device in the vehicle or an external server, a feature extraction device that extracts feature data from the input image, and image recognition logic that recognizes an object from the feature data. The feature extraction device generates a transform image from the input image by means of a generative adversarial network (GAN), extracts first feature data associated with content from the input image, extracts second feature data associated with a style from the transform image, and trains an image recognition model based on the first feature data and the second feature data.
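The content/style split described above can be sketched with the usual stand-ins from style-transfer work: pooled activations for content and a Gram matrix for style. The patent does not name these operators, so they are assumptions for illustration:

```python
import numpy as np

def gram_matrix(fmap):
    """Style descriptor: channel-correlation (Gram) matrix of a
    (C, H, W) feature map, a common stand-in for 'style' features."""
    c, h, w = fmap.shape
    flat = fmap.reshape(c, h * w)
    return flat @ flat.T / (h * w)

def training_pair(input_feats, transform_feats):
    """First feature data: content from the input image; second feature
    data: style from the GAN-transformed image."""
    content = input_feats.mean(axis=(1, 2))       # pooled content vector
    style = gram_matrix(transform_feats).ravel()  # flattened style vector
    return content, style

f_in = np.random.rand(8, 16, 16)    # features of the input image
f_tr = np.random.rand(8, 16, 16)    # features of the GAN transform image
content, style = training_pair(f_in, f_tr)
print(content.shape, style.shape)   # (8,) (64,)
```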
SINGLE STREAM MULTI-LEVEL ALIGNMENT FOR VISION-LANGUAGE PRETRAINING
A method is provided for pretraining vision and language models that includes receiving image-text pairs, each including an image and a text describing the image. The method encodes an image into a set of feature vectors corresponding to input image patches and a CLS token which represents a global image feature. The method parses, by a text tokenizer, the text into a set of feature vectors as tokens for each word in the text. The method encodes the CLS token from the NN based visual encoder and the tokens from the text tokenizer into a set of features by a NN based text and multimodal encoder that shares weights for encoding both the CLS token and the tokens. The method accumulates the weights from multiple iterations as an exponential moving average of the weights during the pretraining until a predetermined error metric falls below a threshold amount.
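The exponential-moving-average accumulation of weights is a standard update; a minimal sketch, with the decay value and the constant stand-in weights chosen only for illustration:

```python
import numpy as np

def ema_update(ema_weights, weights, decay=0.99):
    """Accumulate model weights over training iterations as an
    exponential moving average, as described for the pretraining loop."""
    return decay * ema_weights + (1.0 - decay) * weights

w_ema = np.zeros(4)
for step in range(1000):
    w = np.ones(4)                    # stand-in for the current weights
    w_ema = ema_update(w_ema, w, decay=0.99)
print(np.round(w_ema, 3))             # converges toward the weights: ~1.0
```

In practice the EMA copy changes much more slowly than the live weights, which is what makes it useful as a stable "momentum" encoder during pretraining.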
SYSTEM AND METHOD FOR 3D OBJECT DETECTION USING MULTI-RESOLUTION FEATURES RECOVERY USING PANOPTIC SEGMENTATION INFORMATION
A system and method for 3D object detection using multi-resolution features recovery using panoptic segmentation information. Panoptic segmentation predictions from a panoptic segmentation network and intermediate feature maps from one or more early layers of an object detection network are received. Feature vectors are retrieved from the intermediate feature maps using the panoptic segmentation predictions. The retrieved feature vectors are combined with feature maps from one or more late layers of the object detection network for generating object detection predictions.
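The retrieval-and-injection step can be sketched as mask-indexed gathering from an early feature map followed by combining a projected descriptor into a late, lower-resolution map. Shapes, the 1/4 downscaling factor, the centroid placement, and additive combination are all assumptions:

```python
import numpy as np

def recover_features(intermediate_map, panoptic_ids, instance_id):
    """Retrieve feature vectors from an early-layer feature map at the
    pixels the panoptic segmentation assigns to one instance."""
    ys, xs = np.where(panoptic_ids == instance_id)
    return intermediate_map[:, ys, xs].T            # (n_pixels, C_early)

def inject(late_map, recovered, panoptic_ids, instance_id, proj):
    """Combine recovered high-resolution features with a late-layer map
    by adding a projected instance descriptor at the instance centroid."""
    desc = proj @ recovered.mean(axis=0)            # C_early -> C_late
    ys, xs = np.where(panoptic_ids == instance_id)
    cy, cx = int(ys.mean()) // 4, int(xs.mean()) // 4   # late map 1/4 res
    out = late_map.copy()
    out[:, cy, cx] += desc
    return out

early = np.random.rand(16, 32, 32)                  # early-layer features
late = np.random.rand(64, 8, 8)                     # late-layer features
seg = np.zeros((32, 32), dtype=int)
seg[10:14, 10:14] = 5                               # one predicted instance
feats = recover_features(early, seg, instance_id=5)
combined = inject(late, feats, seg, 5, proj=np.random.rand(64, 16))
print(feats.shape, combined.shape)  # (16, 16) (64, 8, 8)
```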
Gesture recognition using multiple antenna
Various embodiments wirelessly detect micro-gestures using multiple antennas of a gesture sensor device. At times, the gesture sensor device transmits multiple outgoing radio frequency (RF) signals, each outgoing RF signal transmitted via a respective antenna of the gesture sensor device. The outgoing RF signals are configured to help capture information that can be used to identify micro-gestures performed by a hand. The gesture sensor device captures incoming RF signals generated by the outgoing RF signals reflecting off of the hand, and then analyzes the incoming RF signals to identify the micro-gesture.
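One common way such analysis works is extracting the Doppler signature a moving hand imprints on the reflected signal. The toy model below, with made-up pulse rate, Doppler shift, and per-antenna phase offsets, shows the spectral peak that a gesture classifier would consume; it is not the patent's actual processing chain:

```python
import numpy as np

# Simulate incoming RF samples at two antennas: a hand moving toward the
# sensor shifts the reflected carrier by a Doppler frequency (toy model).
fs = 1000.0                        # pulse rate in Hz (assumed)
t = np.arange(250) / fs
doppler = 80.0                     # Hz shift from hand motion (assumed)
rx = [np.exp(2j * np.pi * doppler * t + 1j * phase)
      for phase in (0.0, 0.4)]     # per-antenna phase offsets (assumed)

# Analyze: the peak of the Doppler spectrum is part of the motion
# signature that would be mapped to a micro-gesture.
spectrum = np.abs(np.fft.fft(rx[0]))
freqs = np.fft.fftfreq(len(t), 1 / fs)
peak = freqs[np.argmax(spectrum)]
print(peak)  # 80.0
```

With multiple antennas, the phase differences between the received copies additionally encode the direction of the reflecting hand.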
METHOD OF ROAD DETECTION BASED ON INTERNET OF VEHICLES
A method of road detection based on the Internet of Vehicles is provided. The method is applied to vehicle terminals and includes: obtaining a target road image captured by an image collection terminal and inputting it into an improved YOLOv3 network; performing feature extraction using a densely connected backbone network to obtain feature images at different scales; performing top-down, densely connected feature fusion on the feature images using an improved feature pyramid network (FPN) to obtain prediction results; and obtaining attribute information of the target road image according to the prediction results, where the attribute information includes the positions and categories of objects in the target road image. The improved YOLOv3 is formed from the YOLOv3 network by replacing the residual modules of the backbone network with dense connection modules, increasing the feature extraction scale, optimizing the feature fusion mode of the FPN, performing pruning, and performing network recovery processing guided by knowledge distillation.
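The dense connection modules that replace YOLOv3's residual modules concatenate every earlier feature map into each layer's input. A minimal sketch with 1x1 "convolutions" as plain matrix products over channels; the growth rate and layer count are assumptions:

```python
import numpy as np

def dense_block(x, layers):
    """Dense connection: each layer's input is the concatenation of all
    previous feature maps (replacing YOLOv3's residual modules)."""
    feats = [x]
    for w in layers:
        inp = np.concatenate(feats, axis=0)              # (C_total, H, W)
        out = np.maximum(0.0, np.einsum('oc,chw->ohw', w, inp))  # 1x1 conv
        feats.append(out)
    return np.concatenate(feats, axis=0)

x = np.random.rand(8, 16, 16)                 # input feature map
growth = 4                                    # channels added per layer
layers = [np.random.rand(growth, 8),          # toy 1x1 conv weights
          np.random.rand(growth, 12),
          np.random.rand(growth, 16)]
y = dense_block(x, layers)
print(y.shape)  # (20, 16, 16)
```

Relative to residual addition, concatenation preserves every intermediate map, which is what makes later pruning of redundant channels attractive.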
6D POSE MEASUREMENT METHOD FOR MECHANICAL PARTS BASED ON VIRTUAL CONTOUR FEATURE POINTS
A 6D (six degrees of freedom) pose measurement method for mechanical parts based on virtual contour feature points, wherein multiple lights are used to illuminate the part at different times for photographing, so that the success rate of recognizing geometric features in the images is increased. After lines in a real image are successfully matched to lines in a template image, whether an intersection point exists between the spatial lines corresponding to the matched lines is calculated; when an intersection point exists between the lines, the coordinates of the intersection point of the planar lines in the real image and the template image are resolved and saved as a matching point pair. Then, ellipse features in the real image and the template image are detected, and two centers are matched according to the distances between the centers and the matched lines and, if successfully matched, are saved as a center pair. Finally, a 2D-3D relationship for the real image is established according to the 2D-3D coordinate relationship of the template image, and the pose of the part is resolved by a PnP algorithm.
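The "virtual contour feature point" construction rests on intersecting matched planar lines. A minimal sketch of that 2D intersection step, with example endpoints chosen only for illustration:

```python
import numpy as np

def line_intersection(p1, p2, p3, p4):
    """Intersection of the 2D lines through (p1, p2) and (p3, p4);
    returns None when the lines are (near-)parallel."""
    d1, d2 = p2 - p1, p4 - p3
    denom = d1[0] * d2[1] - d1[1] * d2[0]       # 2D cross product d1 x d2
    if abs(denom) < 1e-9:
        return None                             # parallel: no virtual point
    t = ((p3[0] - p1[0]) * d2[1] - (p3[1] - p1[1]) * d2[0]) / denom
    return p1 + t * d1

# Two matched contour lines in the real image: their intersection is a
# virtual contour feature point, the 2D half of a 2D-3D correspondence.
a = line_intersection(np.array([0.0, 0.0]), np.array([4.0, 4.0]),
                      np.array([0.0, 4.0]), np.array([4.0, 0.0]))
print(a)  # [2. 2.]
```

Computing the same intersection for the template image, whose 3D coordinates are known, yields the 2D-3D pairs that a PnP solver (e.g. OpenCV's `solvePnP`) turns into the part's pose.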