Patent classifications
G06V10/806
BINOCULAR MATCHING METHOD AND APPARATUS, DEVICE AND STORAGE MEDIUM
Embodiments of the present application disclose a binocular matching method, including: obtaining an image to be processed, where the image is a two-dimensional (2D) image including a left image and a right image; constructing a three-dimensional (3D) matching cost feature of the image by using extracted features of the left image and extracted features of the right image, where the 3D matching cost feature includes a group-wise cross-correlation feature, or includes a feature obtained by concatenating the group-wise cross-correlation feature and a connection feature; and determining the depth of the image by using the 3D matching cost feature. The embodiments of the present application also provide a binocular matching apparatus, a computer device, and a storage medium.
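As a rough illustration of the group-wise cross-correlation feature named in the abstract above, the sketch below splits the feature channels into groups and, for each candidate disparity, averages the per-channel products of left and right features within each group. The function name, the [channel][row][column] list layout, the zero fill for pixels with no counterpart, and the averaging step are assumptions made for illustration; the abstract does not specify them.

```python
def groupwise_correlation_volume(left, right, num_groups, max_disp):
    """Build a cost volume indexed [group][disparity][row][col].

    left, right: feature maps as nested lists indexed [channel][row][col].
    num_groups and max_disp are hypothetical parameters for this sketch.
    """
    channels = len(left)
    height = len(left[0])
    width = len(left[0][0])
    if channels % num_groups != 0:
        raise ValueError("channel count must be divisible by num_groups")
    per_group = channels // num_groups

    # Positions where the shifted right pixel falls outside the image stay 0.
    volume = [[[[0.0] * width for _ in range(height)]
               for _ in range(max_disp)]
              for _ in range(num_groups)]

    for g in range(num_groups):
        group_channels = range(g * per_group, (g + 1) * per_group)
        for d in range(max_disp):
            for y in range(height):
                for x in range(d, width):
                    # Correlate the left pixel with the right pixel shifted by d.
                    total = sum(left[c][y][x] * right[c][y][x - d]
                                for c in group_channels)
                    volume[g][d][y][x] = total / per_group
    return volume
```

The result has one correlation map per group and per disparity, which is what makes the matching cost feature three-dimensional (disparity, row, column) for each group.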
System and method for audio-visual speech recognition
Disclosed herein is a method of performing speech recognition using audio and visual information, where the visual information provides data related to a person's face. Image preprocessing identifies regions of interest, which are then combined with the audio data before being processed by a speech recognition engine.
Method and apparatus for detecting human face
The present disclosure discloses a method and apparatus for detecting a human face. A specific embodiment of the method comprises: acquiring a to-be-detected image; inputting the to-be-detected image into a pre-trained first convolutional neural network to obtain facial feature information, the first convolutional neural network being used to extract a facial feature; inputting the to-be-detected image into a pre-trained second convolutional neural network to obtain semantic feature information, the second convolutional neural network being used to extract a semantic feature of the image; and analyzing the facial feature information and the semantic feature information to generate a face detection result. This embodiment improves accuracy of a detection result of a blurred image.
SHIP IDENTITY RECOGNITION METHOD BASED ON FUSION OF AIS DATA AND VIDEO DATA
Disclosed is a ship identity recognition method based on the fusion of AIS data and video data, comprising: collecting a ship sample to train a ship target classifier; performing, using the ship target classifier, ship target detection on a video frame collected by a gimbal camera; performing a comparison with a recognized ship library to filter a recognized ship; acquiring AIS data and filtering the same across time and spatial scales; predicting the current position of an AIS target using a linear extrapolation method and converting the current position to an image coordinate system; performing position matching between a target to be matched and the converted AIS target; and performing feature extraction on the successfully matched target and storing the extracted feature, together with ship identity information, into the recognized ship library. Experimental results show that the present invention can quickly and accurately extract a surveillance video and perform identity recognition on the ship target, effectively reduces labor costs, and has a broad application prospect in fields such as ship transportation and port management.
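The linear extrapolation step in the abstract above can be sketched as follows. This is a minimal example, assuming the two most recent AIS reports have already been projected into a local planar frame in metres and carry timestamps in seconds; the function name and the tuple layout are hypothetical, and the later conversion to image coordinates is not shown because it depends on camera calibration that the abstract does not describe.

```python
def extrapolate_position(prev_report, last_report, t_now):
    """Predict a position at t_now from two timestamped reports.

    Each report is a tuple (t, x, y): time in seconds and planar
    coordinates in metres (an assumed layout for this sketch).
    """
    t0, x0, y0 = prev_report
    t1, x1, y1 = last_report
    if t1 <= t0:
        raise ValueError("reports must be in increasing time order")

    # Constant-velocity assumption: speed estimated from the last two fixes.
    vx = (x1 - x0) / (t1 - t0)
    vy = (y1 - y0) / (t1 - t0)

    dt = t_now - t1
    return (x1 + vx * dt, y1 + vy * dt)
```

For example, reports `(0, 0, 0)` and `(10, 20, 10)` give a velocity of 2 m/s east and 1 m/s north, so the predicted position at `t_now = 15` is `(30.0, 15.0)`.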
Target detection method and device
Embodiments of the present application disclose a target detection method and device, and relate to the technical field of video processing. The method comprises: obtaining an image sequence to be detected from a video to be detected according to an image sequence determining algorithm based on video timing (S101); extracting a first CNN feature of the image sequence to be detected based on a pre-trained CNN model, and performing feature fusion on the first CNN feature based on a second CNN feature to obtain a first fused CNN feature of the image sequence to be detected (S102); inputting the first fused CNN feature into a first-level classifier, and obtaining first candidate target regions of the image sequence to be detected from an output of the first-level classifier (S103); determining a first input region of a second-level classifier based on the first candidate target regions (S104); obtaining a third CNN feature of the first input region based on the first fused CNN feature (S105); and inputting the third CNN feature into the second-level classifier, and obtaining a target detection result for the image sequence to be detected based on an output of the second-level classifier (S106).
Method and System for Determining an Activity of an Occupant of a Vehicle
A computer implemented method for determining an activity of an occupant of a vehicle comprises the following steps carried out by computer hardware components: capturing sensor data of the occupant using at least one sensor; determining respective two-dimensional or three-dimensional coordinates for a plurality of pre-determined portions of the body of the occupant based on the sensor data; determining at least one portion of the sensor data showing a pre-determined body part of the occupant based on the sensor data and the two-dimensional or three-dimensional coordinates; and determining the activity of the occupant based on the two-dimensional or three-dimensional coordinates and the at least one portion of the sensor data.
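One way to picture the step of selecting a portion of the sensor data that shows a given body part is a crop around an estimated keypoint. The sketch below assumes the sensor data is a single 2D image stored as a list of rows and that the body-part coordinate is a (row, column) pair; the function name and the square window are illustrative choices, not details taken from the abstract.

```python
def crop_around_keypoint(frame, keypoint, half_size):
    """Return the square patch of frame centred on keypoint.

    frame: 2D list indexed [row][col].
    keypoint: (row, col) of the body part, e.g. a hand (assumed format).
    half_size: patch extends this many pixels in each direction.
    The patch is clipped at the image border rather than padded.
    """
    rows, cols = len(frame), len(frame[0])
    r, c = keypoint
    top = max(0, r - half_size)
    bottom = min(rows, r + half_size + 1)
    left = max(0, c - half_size)
    right = min(cols, c + half_size + 1)
    return [row[left:right] for row in frame[top:bottom]]
```

The cropped patch and the body coordinates could then both be passed to whatever classifier determines the activity, matching the two inputs named in the last step of the abstract.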
METHOD AND APPARATUS FOR PROCESSING IMAGE
Embodiments of the present disclosure disclose a method and apparatus for processing an image. A specific embodiment of the method includes: acquiring a feature map of a target image, where the target image contains a target object; determining a local feature map of a target size in the feature map; combining features of different channels in the local feature map to obtain a local texture feature map; and obtaining location information of the target object based on the local texture feature map.
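The abstract does not say how features of different channels are combined. One plausible reading, shown here purely as an assumption, is to multiply every pair of channels at each spatial position, so that a local feature map with C channels yields C × C combined channels. The function name and the [channel][row][column] layout are hypothetical.

```python
def pairwise_channel_products(local_map):
    """Combine channels by pairwise products at each pixel.

    local_map: nested lists indexed [channel][row][col].
    Returns a list of C*C maps, where entry i*C + j holds the
    element-wise product of channel i and channel j.
    """
    channels = len(local_map)
    height = len(local_map[0])
    width = len(local_map[0][0])
    combined = []
    for i in range(channels):
        for j in range(channels):
            combined.append([
                [local_map[i][y][x] * local_map[j][y][x] for x in range(width)]
                for y in range(height)
            ])
    return combined
```

Products of this kind capture how strongly two channels respond together at a location, which is one common way to describe texture; other combinations would fit the abstract's wording equally well.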
MULTI-PERSON POSE RECOGNITION METHOD AND APPARATUS, ELECTRONIC DEVICE, AND STORAGE MEDIUM
In a multi-person pose recognition method, a to-be-recognized image is obtained, and a circuitous pyramid network is constructed. The circuitous pyramid network includes parallel phases, and each phase includes downsampling network layers, upsampling network layers, and a first residual connection layer to connect the downsampling and upsampling network layers. The phases are interconnected by a second residual connection layer. The circuitous pyramid network is traversed by extracting a feature map for each phase, and the feature map of the last phase is determined to be the feature map of the to-be-recognized image. Multi-pose recognition is then performed on the to-be-recognized image according to the feature map to obtain a pose recognition result for the to-be-recognized image.
Method and System for Prediction and Mitigation of Spontaneous Combustion in Coal Stock Piles
A method for predicting conditions associated with a coal stock pile is described. The method includes collecting aerial data for a site including one or more coal stock piles. Using the aerial data, the method includes performing localization of the site to identify boundaries of the coal stock piles and extracting multi-spectral features. The method also includes obtaining additional data associated with the coal stock piles from at least one data source and merging the aerial data with the additional data. Using the merged data and the extracted multi-spectral features, the method includes analyzing a status of the coal stock piles by a prediction module to predict at least one of an impending combustion event or a severe condition associated with the coal stock piles. In response to the predicted at least one impending combustion event or severe condition, the method includes implementing a response.
MOBILE DEVICE NAVIGATION SYSTEM
Location mapping and navigation user interfaces may be generated and presented via mobile computing devices. A mobile device may detect its location and orientation using internal systems, and may capture image data using a device camera. The mobile device also may retrieve map information from a map server corresponding to the current location of the device. Using the image data captured at the device, the current location data, and the corresponding local map information, the mobile device may determine or update a current orientation reading for the device. Location errors and updated location data also may be determined for the device, and a map user interface may be generated and displayed on the mobile device using the updated device orientation and/or location data.