Patent classifications
G06V10/806
Control method and device for mobile platform, and computer readable storage medium
A control method for a mobile platform includes obtaining a captured image, identifying one or more candidate first characteristic parts from the captured image, determining a second characteristic part of a target object in the captured image, determining one or more matching parameters each corresponding to one of the one or more candidate first characteristic parts based on the one or more candidate first characteristic parts and the second characteristic part, determining a target first characteristic part of the target object from the one or more candidate first characteristic parts based on the one or more matching parameters, and switching from tracking the second characteristic part to tracking the target first characteristic part in response to a tracking parameter of the target object meeting a preset tracking condition.
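The matching-and-switching logic in this abstract can be sketched roughly as follows. The abstract does not say what the matching parameter or the tracking condition is, so this sketch makes two assumptions: the characteristic parts are axis-aligned bounding boxes, with the matching parameter being the fraction of a candidate box that lies inside the second part's box, and the tracking condition is met when the second part fills more than a chosen fraction of the frame. All names (`overlap_fraction`, `select_target`, `choose_tracked_box`, `max_fraction`) are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def area(box: Box) -> float:
    """Area of a box; degenerate or inverted boxes count as zero."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def overlap_fraction(candidate: Box, reference: Box) -> float:
    """Assumed matching parameter: share of the candidate inside the reference."""
    intersection = (
        max(candidate[0], reference[0]),
        max(candidate[1], reference[1]),
        min(candidate[2], reference[2]),
        min(candidate[3], reference[3]),
    )
    candidate_area = area(candidate)
    return area(intersection) / candidate_area if candidate_area > 0 else 0.0


def select_target(candidates: List[Box], second_part: Box) -> Tuple[Box, List[float]]:
    """Score every candidate first part and return the best match with all scores."""
    scores = [overlap_fraction(c, second_part) for c in candidates]
    best = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best], scores


def choose_tracked_box(candidates: List[Box], second_part: Box,
                       frame_area: float, max_fraction: float = 0.5) -> Box:
    """Switch to the target first part once the assumed tracking condition holds."""
    target, _ = select_target(candidates, second_part)
    if area(second_part) / frame_area > max_fraction:
        return target
    return second_part


body = (10.0, 10.0, 50.0, 110.0)          # second characteristic part
heads = [(20.0, 10.0, 40.0, 30.0),        # candidate inside the body box
         (70.0, 10.0, 90.0, 30.0)]        # candidate belonging to someone else
target, scores = select_target(heads, body)
```

With a 100 x 120 frame, `choose_tracked_box(heads, body, 12000.0)` keeps tracking `body`, while lowering `max_fraction` to 0.3 switches to the first head box.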
Multi-modal detection engine of sentiment and demographic characteristics for social media videos
A system and method determine a sentiment, a gender, and an age group of a subject in a video while the video is being played back. The video is separated into visual data and audio data; the visual data is passed to a video processing pipeline, and the audio data is passed to both an acoustic processing pipeline and a textual processing pipeline. The system and method perform, in parallel, a video feature extraction process in the video processing pipeline, an acoustic feature extraction process in the acoustic processing pipeline, and a textual feature extraction process in the textual processing pipeline. The system and method combine the resulting visual feature vector, acoustic feature vector, and textual feature vector into a single feature vector, and determine the sentiment, the gender, and the age group of the subject by applying the single feature vector to a machine learning model.
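As an illustration of the parallel extraction and fusion step, here is a minimal sketch. It assumes the three vectors are combined by simple concatenation, which the abstract does not specify, and the three extractor functions are hypothetical placeholders rather than the feature extractors of the patent.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def extract_visual(frames: List[List[float]]) -> List[float]:
    # Placeholder feature: mean pixel value of each frame.
    return [sum(frame) / len(frame) for frame in frames]


def extract_acoustic(samples: List[float]) -> List[float]:
    # Placeholder features: mean amplitude and peak absolute amplitude.
    return [sum(samples) / len(samples), max(abs(s) for s in samples)]


def extract_textual(transcript: str) -> List[float]:
    # Placeholder feature: number of whitespace-separated words.
    return [float(len(transcript.split()))]


def fused_feature_vector(frames: List[List[float]], samples: List[float],
                         transcript: str) -> List[float]:
    """Run the three extractors in parallel, then concatenate their outputs."""
    with ThreadPoolExecutor(max_workers=3) as pool:
        visual = pool.submit(extract_visual, frames)
        acoustic = pool.submit(extract_acoustic, samples)
        textual = pool.submit(extract_textual, transcript)
        return visual.result() + acoustic.result() + textual.result()


fused = fused_feature_vector(
    frames=[[0.0, 1.0], [1.0, 1.0]],
    samples=[0.5, -1.0, 0.5],
    transcript="hello there world",
)
```

The single `fused` vector is what would then be passed to the machine learning model. Concatenation keeps each modality's features at fixed positions, so the model can learn separate weights for each.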
IMAGE PROCESSING SYSTEM USING RECURRENT NEURAL NETWORKS
A method and system are described that address the technical problems involved in analyzing images using advanced computer systems and in making decisions about the future of a damaged automobile based on the images.
SURFACE SENSING PROBE AND METHODS OF USE
Disclosed is a surface sensing apparatus, one embodiment having a source of coherent radiation capable of outputting wavelength emissions to create a first illumination state to illuminate a surface and create a first speckle pattern, an emission deviation facility capable of influencing the emission to illuminate the surface and create a second illumination state and a second speckle pattern, and a sensor capable of sensing a representation of the first and a second speckle intensity from the first and second speckle pattern. Also disclosed are methods of sensing properties of the surface, one embodiment comprising the steps of illuminating the surface having a first surface state with the source of coherent radiation emission, sensing a first speckle intensity from the surface, influencing a relationship of the surface to the emission to create a second surface state and sensing a second speckle intensity from the surface at the second surface state.
Gesture recognition using multiple antenna
Various embodiments wirelessly detect micro-gestures using multiple antennas of a gesture sensor device. At times, the gesture sensor device transmits multiple outgoing radio frequency (RF) signals, each transmitted via a respective antenna of the gesture sensor device. The outgoing RF signals are configured to help capture information that can be used to identify micro-gestures performed by a hand. The gesture sensor device captures incoming RF signals generated by the outgoing RF signals reflecting off the hand, and then analyzes the incoming RF signals to identify the micro-gesture.
POINT CLOUD SEGMENTATION METHOD, COMPUTER-READABLE STORAGE MEDIUM, AND COMPUTER DEVICE
This application relates to a point cloud segmentation method, a computer-readable storage medium, and a computer device. The method includes encoding a to-be-processed point cloud to obtain a shared feature, the shared feature referring to a feature shared at a semantic level and at an instance level; decoding the shared feature to obtain a semantic feature and an instance feature respectively; adapting the semantic feature to an instance feature space and fusing the semantic feature with the instance feature, to obtain a semantic-fused instance feature of the point cloud, the semantic-fused instance feature representing an instance feature fused with the semantic feature; dividing the semantic-fused instance feature of the point cloud, to obtain a semantic-fused instance feature of each point in the point cloud; and determining an instance category to which each point belongs according to the semantic-fused instance feature of each point.
3D IMAGE CLASSIFICATION METHOD AND APPARATUS, DEVICE, AND STORAGE MEDIUM
The disclosure provides a three-dimensional (3D) image classification method and apparatus, a device, and a storage medium. The method includes: obtaining a 3D image, the 3D image including first-dimensional image information, second-dimensional image information, and third-dimensional image information; extracting a first image feature corresponding to planar image information from the 3D image; extracting a second image feature corresponding to the third-dimensional image information from the 3D image; fusing the first image feature and the second image feature, to obtain a fused image feature corresponding to the 3D image; and determining a classification result corresponding to the 3D image according to the fused image feature corresponding to the 3D image.
METHOD AND APPARATUS FOR DETERMINING FOOTPRINT IDENTITY USING DIMENSION REDUCTION ALGORITHM
A method of determining footprint identity using a dimension reduction algorithm according to an embodiment includes: pre-processing to process three-dimensional (3D) image data about footprints of a first person and a second person and convert the 3D image data into one-dimensional (1D) data about the footprints of the first person and the second person; calculating a distribution of cross-correlation coefficients between two pieces of 1D data about footprints of the first person (SC: same footwear correlation) and a distribution of cross-correlation coefficients between the 1D data about the footprints of the first person and the second person (DC: difference footwear correlation); and calculating a likelihood ratio based on the SC and the DC to determine the degree of correspondence between the footprints of the first person and the second person.
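The scoring step of this abstract can be illustrated with a small sketch. It assumes the cross-correlation coefficient is the zero-lag Pearson coefficient between two 1D profiles, and that the SC and DC distributions are each approximated by a fitted normal distribution. Neither choice is stated in the abstract, and the function names and sample scores are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Zero-lag correlation coefficient between two equal-length 1D profiles."""
    mx, my = mean(x), mean(y)
    numerator = sum((a - mx) * (b - my) for a, b in zip(x, y))
    denominator = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return numerator / denominator


def likelihood_ratio(score: float, sc_scores: Sequence[float],
                     dc_scores: Sequence[float]) -> float:
    """Density of the score under SC divided by its density under DC."""
    sc = NormalDist.from_samples(sc_scores)  # same-footwear correlations
    dc = NormalDist.from_samples(dc_scores)  # different-footwear correlations
    return sc.pdf(score) / dc.pdf(score)


# Made-up correlation scores, for illustration only.
sc_scores = [0.90, 0.95, 0.85, 0.92]
dc_scores = [0.10, 0.30, 0.20, 0.25]

lr_high = likelihood_ratio(0.90, sc_scores, dc_scores)
lr_low = likelihood_ratio(0.20, sc_scores, dc_scores)
```

A ratio above 1 (`lr_high`) supports the hypothesis that the two footprints come from the same footwear, and a ratio below 1 (`lr_low`) supports the opposite.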
REPRESENTATION LEARNING FROM VIDEO WITH SPATIAL AUDIO
A computer system is trained to understand audio-visual spatial correspondence using audio-visual clips having multi-channel audio. The computer system includes an audio subnetwork, video subnetwork, and pretext subnetwork. The audio subnetwork receives the two channels of audio from the audio-visual clips, and the video subnetwork receives the video frames from the audio-visual clips. In a subset of the audio-visual clips the audio-visual spatial relationship is misaligned, causing the audio-visual spatial cues for the audio and video to be incorrect. The audio subnetwork outputs an audio feature vector for each audio-visual clip, and the video subnetwork outputs a video feature vector for each audio-visual clip. The audio and video feature vectors for each audio-visual clip are merged and provided to the pretext subnetwork, which is configured to classify the merged vector as either having a misaligned audio-visual spatial relationship or not. The subnetworks are trained based on the loss calculated from the classification.
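A minimal sketch of how training examples and the loss for such a pretext task might be built follows. It assumes that misalignment is produced by swapping the left and right audio channels and that the classifier is trained with binary cross-entropy. The abstract states neither, and all names here are hypothetical.

```python
import math
import random
from typing import List, Tuple

Clip = Tuple[List[float], List[float]]  # (left channel, right channel)


def make_pretext_example(clip: Clip, rng: random.Random,
                         p_misalign: float = 0.5) -> Tuple[Clip, int]:
    """Return the clip and label 0, or a channel-swapped clip and label 1."""
    left, right = clip
    if rng.random() < p_misalign:
        return (right, left), 1  # assumed form of misalignment
    return (left, right), 0


def merge(audio_vec: List[float], video_vec: List[float]) -> List[float]:
    """Merge the two subnetwork outputs by concatenation (an assumption)."""
    return audio_vec + video_vec


def binary_cross_entropy(p: float, label: int) -> float:
    """Loss for a predicted misalignment probability p against a 0/1 label."""
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)  # keep log() away from 0
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


rng = random.Random(0)
clip = ([0.1, 0.2], [0.3, 0.4])
swapped, label = make_pretext_example(clip, rng, p_misalign=1.0)
```

In a full system the loss would be backpropagated through the pretext, audio, and video subnetworks together, so that the feature extractors learn spatial cues without manual labels.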
METHOD AND APPARATUS FOR DATA EFFICIENT SEMANTIC SEGMENTATION
A method and system for training a neural network are provided. The method includes receiving an input image, selecting at least one data augmentation method from a pool of data augmentation methods, generating an augmented image by applying the selected at least one data augmentation method to the input image, and generating a mixed image from the input image and the augmented image.
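The augmentation and mixing steps can be sketched as below. The abstract does not say how the mixed image is formed, so this sketch assumes a pixel-wise weighted blend of the input and augmented images. The pool contents (`hflip`, `invert`) and the `alpha` weight are hypothetical choices for illustration.

```python
import random
from typing import Callable, List

Image = List[List[float]]  # grayscale image, pixel values in [0, 1]


def hflip(img: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def invert(img: Image) -> Image:
    """Invert pixel intensities."""
    return [[1.0 - p for p in row] for row in img]


AUGMENTATION_POOL: List[Callable[[Image], Image]] = [hflip, invert]


def mix(a: Image, b: Image, alpha: float = 0.5) -> Image:
    """Pixel-wise blend: alpha * a + (1 - alpha) * b (assumed mixing rule)."""
    return [[alpha * pa + (1.0 - alpha) * pb for pa, pb in zip(ra, rb)]
            for ra, rb in zip(a, b)]


def make_mixed_image(img: Image, rng: random.Random, alpha: float = 0.5) -> Image:
    """Select one augmentation from the pool, apply it, and mix with the input."""
    augment = rng.choice(AUGMENTATION_POOL)
    return mix(img, augment(img), alpha)


image = [[0.0, 1.0], [0.5, 0.25]]
mixed = make_mixed_image(image, random.Random(0))
```

The mixed image has the same shape as the input, so it can be fed to the network alongside the original during training.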