Patent classifications
G06V20/46
Machine Learning Architecture for Imaging Protocol Detector
Systems and methods disclosed herein use a first machine learning architecture and a second machine learning architecture where the first machine learning architecture executes on a first processor and receives a first image representing a mouth of a user, determines user feedback for outputting to the user based on a first machine learning model, and outputs the user feedback for capturing a second image representing the mouth of the user. The second machine learning architecture executes on a second processor and receives the first image and the second image, and generates a 3D model of at least a portion of a dental arch of the user based on the first image and the second image where the 3D model is generated based on a second machine learning model of the second machine learning architecture.
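The two-architecture split described above can be sketched as follows. This is a minimal illustrative sketch, not the patented networks: `feedback_model`, `reconstruction_model`, and `capture_pipeline` are hypothetical names, the quality heuristic and the dummy point cloud are placeholders, and the two "processors" are indicated only by comments.

```python
import numpy as np

def feedback_model(image: np.ndarray) -> str:
    """First ML architecture (first processor): assess the capture and
    produce user feedback. Stand-in heuristic: flag dark images."""
    return "move to brighter light" if image.mean() < 0.3 else "ok"

def reconstruction_model(images: list) -> np.ndarray:
    """Second ML architecture (second processor): produce a 3D model of
    the dental arch from the captured images. Stand-in: a dummy point
    cloud whose size scales with the number of views."""
    return np.zeros((100 * len(images), 3))

def capture_pipeline(first_image: np.ndarray, recapture) -> np.ndarray:
    """First image -> feedback -> second image -> 3D reconstruction."""
    feedback = feedback_model(first_image)    # runs on the first processor
    second_image = recapture(feedback)        # user acts on the feedback
    return reconstruction_model([first_image, second_image])
```

The point of the split is that the lightweight feedback model can run on-device while the heavier reconstruction model runs elsewhere.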
METHOD AND SYSTEM FOR AUTOMATIC PRE-RECORDATION VIDEO REDACTION OF OBJECTS
A system and a method for automatic video redaction are provided herein. The method may include: receiving an input video comprising a sequence of frames captured by a camera, wherein the input video includes live video obtained directly from the camera, wherein recordation of the video directly from the camera is disabled; performing visual analysis of the input video, to detect portions of the frames of the input video in which one of a plurality of predefined objects or a descriptor thereof is detected; generating a redacted input video by replacing the portions of the frames with new portions of another visual content; and recording the redacted input video on a data storage device, wherein the generating of the redacted input video is carried out by a computer processor, after the input video is captured by the camera and before the recording of the redacted input video on the data storage device.
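The detect-redact-record ordering can be sketched as below. This is a simplified stand-in: the detector is passed in as a callable (a real system would run a trained object detector for the predefined object classes), boxes are hypothetical `(y0, y1, x0, x1)` tuples, and the replacement content is a flat fill rather than blur or inpainting.

```python
import numpy as np

def redact_frame(frame: np.ndarray, boxes, fill=0) -> np.ndarray:
    """Replace detected portions of the frame with other visual content
    (here a flat fill; a real system might blur or inpaint)."""
    out = frame.copy()
    for y0, y1, x0, x1 in boxes:
        out[y0:y1, x0:x1] = fill
    return out

def pre_recordation_pipeline(live_frames, detector, storage):
    """Frames flow camera -> detector -> redaction -> storage; the raw
    frame is never written to the data storage device."""
    for frame in live_frames:
        storage.append(redact_frame(frame, detector(frame)))
    return storage
```

The key property of the claim is the ordering: redaction happens after capture but before any frame reaches the storage device, so an unredacted recording never exists.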
HUMAN-OBJECT INTERACTION DETECTION
A human-object interaction detection method, a neural network and a training method therefor are provided. The human-object interaction detection method includes: extracting a plurality of first target features and one or more first motion features from an image feature of an image to be detected; fusing each first target feature and some of the first motion features to obtain enhanced first target features; fusing each first motion feature and some of the first target features to obtain enhanced first motion features; processing the enhanced first target features to obtain target information of a plurality of targets including human targets and object targets; processing the enhanced first motion features to obtain motion information of one or more motions, where each motion is associated with one human target and one object target; and matching the plurality of targets with the one or more motions to obtain a human-object interaction detection result.
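The bidirectional fusion and matching steps can be sketched as follows. This is a toy stand-in, not the claimed network: fusion is approximated by single-head attention between the two feature sets, and matching by nearest-neighbour feature distance; the feature dimensions and random features are illustrative only.

```python
import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - x.max())
    return e / e.sum()

def fuse(queries: np.ndarray, keys: np.ndarray) -> np.ndarray:
    """Enhance each query feature with an attention-weighted sum of the
    other feature set (simplified stand-in for the fusion step)."""
    return np.stack([q + softmax(keys @ q) @ keys for q in queries])

def match(targets: np.ndarray, motions: np.ndarray):
    """Match each motion to the nearest target by feature distance."""
    return [int(np.argmin(np.linalg.norm(targets - m, axis=1)))
            for m in motions]

rng = np.random.default_rng(0)
target_feats = rng.normal(size=(3, 4))   # first target features
motion_feats = rng.normal(size=(2, 4))   # first motion features
enhanced_targets = fuse(target_feats, motion_feats)  # targets <- motions
enhanced_motions = fuse(motion_feats, target_feats)  # motions <- targets
```

Note the symmetry: each set of features is enhanced using the other, so target detection can exploit motion cues and vice versa before the final matching.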
METHOD AND APPARATUS FOR VIDEO RECOGNITION
Broadly speaking, the present techniques generally relate to a method and apparatus for video recognition, and in particular relate to a computer-implemented method for performing video recognition using a transformer-based machine learning (ML) model. Put another way, the present techniques provide new methods of image processing in order to automatically extract feature information from a video.
VIDEO PROCESSING METHOD, APPARATUS AND SYSTEM
The present disclosure provides video processing methods, apparatuses and systems. The method includes: obtaining a to-be-processed video, where the to-be-processed video is obtained by performing feature removal processing for one or more objects in an original video; obtaining a feature restoration processing request for one or more to-be-processed objects; according to the feature restoration processing request for the one or more to-be-processed objects, obtaining feature image information corresponding to the one or more to-be-processed objects, where the feature image information for one of the one or more to-be-processed objects includes pixel position information of all or part of features for the one of the one or more to-be-processed objects in the original video; according to the feature image information for the one or more to-be-processed objects, performing feature restoration processing for the one or more to-be-processed objects in the to-be-processed video.
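The removal/restoration round trip can be sketched as follows. This is a minimal sketch under simplifying assumptions: "feature image information" is modelled as a mapping from pixel positions to original pixel values, removal is modelled as zeroing those pixels, and the request-handling and object-selection machinery is omitted.

```python
import numpy as np

def remove_features(frame: np.ndarray, positions):
    """Feature removal: blank the listed pixel positions (e.g. a face)
    and keep their original values as feature image information."""
    processed = frame.copy()
    info = {}
    for y, x in positions:
        info[(y, x)] = frame[y, x]
        processed[y, x] = 0
    return processed, info

def restore_features(frame: np.ndarray, feature_info) -> np.ndarray:
    """Feature restoration: write the saved pixel values back at their
    recorded positions in the to-be-processed video frame."""
    out = frame.copy()
    for (y, x), value in feature_info.items():
        out[y, x] = value
    return out
```

Because the feature image information records pixel positions from the original video, restoration can be partial (a subset of positions) or full, matching the "all or part of features" language in the abstract.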
HUMAN-OBJECT INTERACTION DETECTION
A human-object interaction detection method, a neural network and a training method therefor are provided. The human-object interaction detection method includes: performing first target feature extraction on an image feature of an image; performing first interaction feature extraction on the image feature; processing a plurality of first target features to obtain target information of a plurality of detected targets; processing one or more first interaction features to obtain motion information of a motion, human information of a human target corresponding to each motion, and object information of an object target corresponding to each motion; matching the plurality of detected targets with one or more motions; and updating human information of a corresponding human target based on target information of a detected target matching the corresponding human target, and updating object information of a corresponding object target based on target information of a detected target matching the corresponding object target.
HUMAN-OBJECT INTERACTION DETECTION
A human-object interaction detection method, a neural network and a training method therefor are provided. The human-object interaction detection method includes: performing first target feature extraction on image features of an image to obtain first target features; performing first interaction feature extraction on image features to obtain first interaction features and scores thereof; determining at least some first interaction features in the first interaction features based on the score of each of the first interaction features; determining first motion features based on the at least some first interaction features and the image features; processing the first target features to obtain target information of targets in the image; processing the first motion features to obtain motion information of one or more motions in the image; and matching the targets with the motions to obtain a human-object interaction detection result.
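The score-based selection step that distinguishes this variant can be sketched as a top-k filter, with the subsequent motion-feature derivation reduced to a placeholder. Both function names and the elementwise modulation are illustrative assumptions, not the claimed network operations.

```python
import numpy as np

def select_interactions(features: np.ndarray, scores: np.ndarray,
                        k: int) -> np.ndarray:
    """Keep the k first interaction features with the highest scores."""
    idx = np.argsort(scores)[::-1][:k]
    return features[idx]

def motion_features(selected: np.ndarray,
                    image_feature: np.ndarray) -> np.ndarray:
    """Derive first motion features from the selected interaction
    features and the image features (stand-in: elementwise modulation;
    the patent does not specify the operation)."""
    return selected * image_feature
```

Filtering by score before deriving motion features means the more expensive downstream processing only runs on the interaction candidates most likely to correspond to real human-object motions.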
SYSTEM AND METHOD FOR CALIBRATING A TIME DIFFERENCE BETWEEN AN IMAGE PROCESSOR AND AN INERTIAL MEASUREMENT UNIT BASED ON INTER-FRAME POINT CORRESPONDENCE
Systems and methods are used for calibrating a time difference between an image signal processor (ISP) and an inertial measurement unit (IMU) of an image capture device. An image capture device includes a lens, an image sensor, an IMU, and an ISP. The image sensor detects images as frames and the IMU captures motion data. The ISP detects one or more key points on the frames and matches the one or more key points between the frames. The ISP computes one or more calibration parameters. The one or more calibration parameters are based on the matched key points and a time difference between the ISP and the IMU. The ISP performs a calibration using the calibration parameters.
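One common way to realize such a calibration (a sketch under stated assumptions, not necessarily the patented method) is to compare a camera-derived motion signal, obtained from key points matched between frames, with the IMU's motion signal, and search for the time shift that best aligns them. Here both signals are assumed to be 1-D rotation rates resampled to a common rate `1/dt`.

```python
import numpy as np

def estimate_time_offset(cam_motion: np.ndarray,
                         imu_motion: np.ndarray, dt: float) -> float:
    """Estimate the ISP-IMU time difference as the integer-sample lag
    that minimizes the mean squared error between the camera-derived
    motion signal and the shifted IMU signal (grid search; a real
    implementation would refine to sub-sample precision)."""
    n = len(cam_motion)
    best_lag, best_err = 0, np.inf
    for lag in range(-n // 2, n // 2):
        err = np.mean((cam_motion - np.roll(imu_motion, lag)) ** 2)
        if err < best_err:
            best_lag, best_err = lag, err
    return best_lag * dt
```

Once the offset is known, it becomes one of the calibration parameters: IMU samples can be re-timestamped so that motion data lines up with the frames it was measured during.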
SIMULATION OF LIKENESSES AND MANNERISMS IN EXTENDED REALITY ENVIRONMENTS
In one example, a method performed by a processing system including at least one processor includes obtaining video footage of a first subject, creating a profile for the first subject, based on features extracted from the video footage, obtaining video footage of a second subject different from the first subject, adjusting movements of the second subject in the video footage of the second subject to mimic movements of the first subject as embodied in the profile for the first subject, to create video footage of a modified second subject, verifying that the video footage of the modified second subject is consistent with a policy specified in the profile for the first subject, and rendering a media including the video footage of the modified second subject when the video footage of the modified second subject is consistent with the policy specified in the profile for the first subject.
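The profile-and-policy gate at the end of that method can be sketched as follows. The field names (`allowed_uses`, `use`) are hypothetical; the abstract does not specify what a policy contains, only that rendering is conditional on consistency with it.

```python
def build_profile(subject_id: str, features: dict, policy: dict) -> dict:
    """Profile for the first subject: features extracted from the video
    footage plus a policy governing use of the likeness."""
    return {"subject": subject_id, "features": features, "policy": policy}

def verify_policy(profile: dict, rendering_context: dict) -> bool:
    """Render the modified footage only if its context satisfies the
    first subject's policy (stand-in check: an allow-list of uses)."""
    return rendering_context["use"] in profile["policy"]["allowed_uses"]
```

The verification step is what gives the first subject control: footage of the modified second subject is rendered only when the check passes.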
APPARATUS OF SELECTING VIDEO CONTENT FOR AUGMENTED REALITY, USER TERMINAL AND METHOD OF PROVIDING VIDEO CONTENT FOR AUGMENTED REALITY
A video content selecting apparatus for augmented reality is provided. The apparatus includes a communication interface; and an operation processor configured to: (a) collect a plurality of video contents through the Internet; (b) extract feature information and metadata for each of the plurality of video contents, and generate a hash value corresponding to the feature information by using a predetermined hashing function; (c) manage a database to include at least the hash value and the metadata of each of the plurality of video contents; (d) receive object information corresponding to an object in a real-world environment from a user terminal through the communication interface; (e) search the database based on the object information and select a video content corresponding to the object information from among the plurality of video contents; and (f) transmit the metadata of the selected video content to the user terminal through the communication interface.
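Steps (b), (c) and (e) can be sketched as a hash-keyed lookup. The abstract leaves the hashing function and feature representation unspecified, so this sketch assumes string-valued features hashed with SHA-256, and assumes the object information from the user terminal can be mapped into the same feature space before lookup; class and function names are illustrative.

```python
import hashlib

def feature_hash(feature: str) -> str:
    """Predetermined hashing function (stand-in: SHA-256 of the
    feature string)."""
    return hashlib.sha256(feature.encode()).hexdigest()

class VideoContentDB:
    """Database keyed by the hash of each video's feature information,
    storing its metadata (steps (b)-(c))."""
    def __init__(self):
        self.by_hash = {}

    def add(self, feature: str, metadata: dict) -> None:
        self.by_hash[feature_hash(feature)] = metadata

    def select(self, object_info: str):
        """Step (e): hash the object information received from the user
        terminal and look up the matching content's metadata."""
        return self.by_hash.get(feature_hash(object_info))
```

Storing hashes rather than raw feature vectors makes the lookup an exact-match dictionary operation; a real system would likely need an approximate (similarity-preserving) hash so that imperfect real-world observations still match.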