Patent classifications
G06F18/253
CAMERA-RADAR SENSOR FUSION USING LOCAL ATTENTION MECHANISM
Methods, computer systems, and apparatus, including computer programs encoded on computer storage media, for processing sensor data. In one aspect, a method includes obtaining image data representing a camera sensor measurement of a scene, the image data comprising a plurality of pixels; obtaining radar data representing a radar sensor measurement of the scene, the radar data comprising a plurality of radar reflection points; generating a feature representation of the image data; generating a respective initial depth estimate for each of a subset of the plurality of pixels; generating a feature representation of the radar data that includes a respective radar feature vector for each of the plurality of radar reflection points; for each pixel in the subset, generating a respective adjusted depth estimate using the initial depth estimate for the pixel and the radar feature vectors for a corresponding subset of the plurality of radar reflection points; generating a fused point cloud that includes a plurality of three-dimensional data points; and processing the fused point cloud to generate an output that characterizes the scene.
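The per-pixel depth-adjustment step reads as a local attention over nearby radar returns. A minimal sketch in plain Python, where the dot-product scoring, the 50/50 blend with the initial estimate, and the data layout are all assumptions for illustration, not the claimed embodiment:

```python
import math

def adjust_depth(pixel_feat, init_depth, radar_points):
    """Refine a pixel's initial depth estimate with attention over local
    radar reflection points. Each radar point is (feature_vec, depth)."""
    # Attention logits: dot product between the pixel feature and each
    # radar feature vector (assumed similarity measure).
    logits = [sum(p * r for p, r in zip(pixel_feat, feat))
              for feat, _ in radar_points]
    m = max(logits)
    weights = [math.exp(l - m) for l in logits]  # stable softmax
    z = sum(weights)
    weights = [w / z for w in weights]
    # Attended radar depth, then blend with the camera-only estimate.
    radar_depth = sum(w * d for w, (_, d) in zip(weights, radar_points))
    return 0.5 * init_depth + 0.5 * radar_depth
```

With a pixel feature aligned to the first radar point, the adjusted depth is pulled toward that point's measured depth while retaining the camera prior.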
Activity classification based on multi-sensor input
A method for classifying activity based on multi-sensor input includes receiving, from two or more sensors, sensor data indicating activity within a building, determining, for each of the two or more sensors and based on the received sensor data, (i) an extracted feature vector for activity within the building and (ii) location data, labelling each of the extracted feature vectors with the location data, generating, using the extracted feature vectors, an integrated feature vector, detecting a particular activity based on the integrated feature vector, and in response to detecting the particular activity, performing a monitoring action.
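The labelling-and-integration steps above can be sketched as follows; the concatenation order, the location encoding, and the toy energy-threshold detector are assumptions, since the abstract does not fix them:

```python
def integrate(features_by_sensor):
    """Concatenate location-labelled per-sensor feature vectors into one
    integrated feature vector. features_by_sensor maps a sensor name to
    (feature_vector, location_tuple)."""
    vec = []
    for sensor in sorted(features_by_sensor):  # deterministic ordering
        feats, location = features_by_sensor[sensor]
        vec.extend(list(feats) + list(location))  # label with location data
    return vec

def detect_activity(integrated, threshold=1.0):
    # Toy detector: report activity when feature energy exceeds a threshold.
    return sum(x * x for x in integrated) > threshold
```

A monitoring action would then be triggered only when `detect_activity` returns `True`.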
Gesture recognition using multiple antennas
Various embodiments wirelessly detect micro-gestures using multiple antennas of a gesture sensor device. At times, the gesture sensor device transmits multiple outgoing radio frequency (RF) signals, each outgoing RF signal transmitted via a respective antenna of the gesture sensor device. The outgoing RF signals are configured to help capture information that can be used to identify micro-gestures performed by a hand. The gesture sensor device captures incoming RF signals generated by the outgoing RF signals reflecting off of the hand, and then analyzes the incoming RF signals to identify the micro-gesture.
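One simple way to picture the analysis step is template matching on per-antenna signal features. The sketch below matches a vector of per-antenna phase shifts against gesture templates by nearest Euclidean distance; the feature choice and the matching rule are illustrative assumptions, as real micro-gesture pipelines typically use learned classifiers:

```python
import math

def gesture_from_phases(phases, templates):
    """Return the template gesture whose per-antenna phase-shift profile
    is closest (Euclidean distance) to the measured incoming RF phases."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(templates, key=lambda name: dist(phases, templates[name]))
```

Each template would be calibrated per device, since antenna geometry determines the expected phase profile of a given hand motion.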
Spoofing detection apparatus, spoofing detection method, and computer-readable recording medium
A spoofing detection apparatus comprises obtaining, from an image capture apparatus, a first image frame that includes the face of a subject person obtained when a light-emitting apparatus is emitting light and a second image frame that includes the face of the subject person obtained when the light-emitting apparatus is turned off; extracting, from the first image frame, information specifying a face portion of the subject person, and extracting, from the second image frame, information specifying a face portion of the subject person; extracting, from the first image frame, a portion that includes a bright point formed by reflection in an iris region of an eye of the subject person, extracting a portion corresponding to the portion that includes the bright point from the second image frame, and calculating a feature that is independent of the position of the bright point; and determining the authenticity of the subject person based on the feature.
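The intuition behind the bright-point feature is that a live eye produces a specular reflection only while the light source is on, whereas a printed photo or screen does not. A toy sketch on flattened grayscale iris patches (the max-difference feature and the threshold are assumptions; the patent's feature is position-independent but otherwise unspecified here):

```python
def liveness_feature(lit_patch, dark_patch):
    """Position-independent liveness cue: the largest per-pixel brightness
    gain between the lit and unlit iris patches. A real eye shows a strong
    specular bright point only in the lit frame."""
    return max(l - d for l, d in zip(lit_patch, dark_patch))

def is_genuine(lit_patch, dark_patch, threshold=50):
    # Large brightness gain somewhere in the iris => likely a live eye.
    return liveness_feature(lit_patch, dark_patch) > threshold
```

Taking the maximum over the patch, rather than comparing fixed pixel positions, is what makes the feature independent of where the bright point lands.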
Method for detecting <i>Ophiocephalus argus</i> cantor under intra-class occlusion based on cross-scale layered feature fusion
Disclosed is a method for detecting Ophiocephalus argus cantor under intra-class occlusion based on cross-scale layered feature fusion, including image collection, image processing, and a network model, where collected images are labeled, image sizes are adjusted to obtain input images, and the input images are input into an object detection network, integrated by convolution, and fed into cross-scale layered feature fusion modules; characterized by dividing all features input into the cross-scale layered feature fusion modules into n layers, each composed of s feature-mapping subsets, fusing the features of each feature-mapping subset with those of the other feature-mapping subsets, and concatenating the results; carrying out a convolution operation and outputting a training result; adjusting network parameters by a loss function to obtain parameters for a network model; and inputting the final output candidate boxes into a non-maximum suppression module to screen correct prediction boxes, so that a prediction result is obtained.
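One plausible reading of the layered fusion is Res2Net-style: each feature-mapping subset is fused with the processed output of the preceding subset before its own convolution, and the subset outputs are concatenated. The sketch below assumes element-wise addition for fusion and a stand-in scaling function for the convolution:

```python
def cross_scale_fusion(subsets, conv=lambda xs: [x * 0.5 for x in xs]):
    """Layered fusion sketch: subset i is added element-wise to the
    processed output of subset i-1, passed through `conv` (a placeholder
    for a real convolution), and all outputs are concatenated."""
    outputs = []
    prev = None
    for sub in subsets:
        fused = sub if prev is None else [a + b for a, b in zip(sub, prev)]
        prev = conv(fused)
        outputs.append(prev)
    return [x for out in outputs for x in out]  # concatenation
```

Because each subset sees the previous subset's output, later subsets carry an effectively larger receptive field, which is the cross-scale effect the module name suggests.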
RADIO FREQUENCY ENVIRONMENT AWARENESS WITH EXPLAINABLE RESULTS
A Deep-Learning (DL) explainable AI system for Radio Frequency (RF) machine learning applications with expert-driven neural explainability of input signals combines three algorithms (A1, A2, and A3). A1 is a neural network that learns to classify spectrograms. During training, A1 learns to map a spectrogram to its paired label. It outputs a label estimate from a spectrogram. Labels account for device number and spectrum utilization. The neural network is built on two-dimensional dilated causal convolutions to account for the frequency and time dimensions of spectrogram data. A2 is a user-defined function that converts an input spectrogram into a vector that quantifies human-identifiable elements of the spectrogram. A3 is a random forest feature extraction algorithm. It takes as input the outputs of A2 and A1. From these, A3 learns which elements in the vector output by A2 were most important for choosing the labels output from A1.
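The A1/A2/A3 composition can be pictured as glue code like the following, where A1 and A2 are passed in as callables and A3's learned random-forest importances are stood in by a precomputed map (all names and the importance map are illustrative assumptions):

```python
def explain(spectrogram, a1, a2, a3_importance):
    """Run the three-algorithm pipeline on one spectrogram:
    A1 produces the label, A2 produces human-readable element names,
    and A3's importances rank which elements drove A1's decision."""
    label = a1(spectrogram)
    elements = a2(spectrogram)
    ranked = sorted(elements, key=lambda name: -a3_importance[name])
    return label, ranked
```

In the actual system A3 would be fit on many (A2-vector, A1-label) pairs; here it is frozen into `a3_importance` to keep the sketch self-contained.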
Answering questions during video playback
In implementations of answering questions during video playback, a video system can receive a question related to a video at a timepoint of the video during playback of the video, and determine audio sentences of the video that occur within a segment of the video that includes the timepoint. The video system can generate a classification vector from words of the question and the audio sentences, and determine an answer to the question utilizing the classification vector. The video system can obtain answer candidates, and the answer to the question can be selected as one of the answer candidates based on matching the classification vector to one of answer vectors derived from the answer candidates.
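Two of the steps above lend themselves to a short sketch: selecting the audio sentences whose spans overlap a segment around the timepoint, and picking the candidate whose answer vector best matches the classification vector. The fixed window size and dot-product matching are assumptions:

```python
def sentences_in_segment(sentences, timepoint, window=5.0):
    """Return the text of sentences whose (start, end) span overlaps a
    window of +/- `window` seconds around the queried timepoint."""
    lo, hi = timepoint - window, timepoint + window
    return [text for start, end, text in sentences if start < hi and end > lo]

def best_candidate(q_vec, candidates):
    """Pick the answer candidate whose vector has the highest dot product
    with the question/classification vector."""
    return max(candidates,
               key=lambda name: sum(a * b for a, b in zip(q_vec, candidates[name])))
```

The classification vector itself would come from a learned encoder over the question words and the selected sentences, which the sketch leaves abstract.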
OBJECT IDENTIFICATION
Object identification may be provided herein. A feature extractor may extract a first set of visual features, extract a second set of visual features, concatenate the first set of visual features, the second set of visual features, and a set of bounding box information, determine a number of object features and a global feature for a scene, and receive ego-vehicle feature information associated with an ego-vehicle. An object classifier may receive the number of object features, the global feature, and the ego-vehicle feature information, generate relational features with respect to relationships between each of the number of objects from the scene, and classify each of the number of objects from the scene based on the number of object features, the relational features, the global feature, the ego-vehicle feature information, and an intention of the ego-vehicle.
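The relational features generated between each pair of objects can be sketched as one vector per ordered object pair. Element-wise difference is used here as a common simple choice; the abstract does not fix the exact form:

```python
def relational_features(object_feats):
    """Compute a relational feature for every ordered pair of objects
    (i, j), i != j, as the element-wise difference of their feature
    vectors. Returns a dict keyed by the (i, j) index pair."""
    rels = {}
    for i, fi in enumerate(object_feats):
        for j, fj in enumerate(object_feats):
            if i != j:
                rels[(i, j)] = [a - b for a, b in zip(fi, fj)]
    return rels
```

The classifier would then consume these pairwise vectors alongside the per-object features, the global scene feature, and the ego-vehicle feature information.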
DEEP LEARNING SYSTEM FOR DETERMINING AUDIO RECOMMENDATIONS BASED ON VIDEO CONTENT
Embodiments are disclosed for processing an audio sequence using parameters determined by a deep encoder. In particular, in one or more embodiments, the disclosed systems and methods comprise receiving an input including an unprocessed audio sequence and a request to perform an audio signal processing effect on the unprocessed audio sequence. The one or more embodiments further include analyzing, by a deep encoder, the unprocessed audio sequence to determine parameters for processing the unprocessed audio sequence. The one or more embodiments further include sending the unprocessed audio sequence and the parameters to one or more audio signal processing effects plugins to perform the requested audio signal processing effect using the parameters, and outputting a processed audio sequence after processing of the unprocessed audio sequence using the parameters of the one or more audio signal processing effects plugins.
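The encoder-to-plugins flow can be sketched as a small pipeline where the encoder maps raw audio to a parameter dict that each effect plugin consumes. The function names, the dict-based parameter format, and the gain example are illustrative assumptions, not the embodiment's API:

```python
def process(audio, encoder, plugins):
    """Deep-encoder-driven effects chain sketch: `encoder` analyzes the
    unprocessed audio and returns processing parameters; each plugin is a
    callable (audio, params) -> audio, applied in sequence."""
    params = encoder(audio)
    for plugin in plugins:
        audio = plugin(audio, params)
    return audio
```

Keeping the parameters out of the plugins themselves is what lets one learned encoder drive several independent effects.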
Method and apparatus for vehicle damage assessment, electronic device, and computer storage medium
A method and apparatus for vehicle damage assessment, an electronic device, and a computer-readable storage medium are provided. The method may include: extracting, from an input image, a first feature characterizing a part of a vehicle and a second feature characterizing a damage type of the vehicle; integrating the first feature and the second feature to generate a third feature characterizing a corresponding relation between the part and the damage type; converting the third feature into a characteristic vector; and determining a damage recognition result based on the characteristic vector. According to the technical solution of the disclosure, users can rapidly and accurately learn about the damage condition of the vehicle by providing pictures or videos of the damaged vehicle, thus providing an objective basis for subsequent damage assessment, claim settlement, and repair.
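One plausible reading of "integrating" the part feature and the damage-type feature into a relation feature is an outer product, flattened into the characteristic vector; the abstract leaves the operation open, so this is an assumption:

```python
def relation_vector(part_feat, damage_feat):
    """Integrate a part feature and a damage-type feature into a flattened
    outer product: every pairwise product of a part component with a
    damage component, capturing part-damage correspondences."""
    return [p * d for p in part_feat for d in damage_feat]
```

A downstream classifier over this vector would then produce the damage recognition result (e.g. "front bumper: scratch").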