Patent classifications
G06V10/7753
Infrared image sequence-based sleep quality evaluation system and method in which a classifier evaluates each sequence and the evaluation result with the largest count determines whether sleep quality is good or poor
An infrared image sequence-based sleep quality evaluation system and method. The method comprises: obtaining a plurality of respiratory infrared image sequences to be evaluated, one respiratory infrared image sequence comprising a plurality of respiratory infrared image frames to be evaluated; performing sleep quality evaluation on each respiratory infrared image sequence in the plurality of respiratory infrared image sequences by means of a classifier to obtain a sleep quality evaluation result corresponding to each respiratory infrared image sequence; and counting the number of occurrences of each different sleep quality evaluation result across the plurality of respiratory infrared image sequences, and determining the sleep quality evaluation result with the largest count as the sleep quality evaluation result of the user. The system enables contactless sleep monitoring of a user while reducing monitoring costs and improving the accuracy of sleep quality evaluation.
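The counting-and-majority step above can be sketched in Python; the function name and the "good"/"poor" labels are illustrative, not from the patent:

```python
from collections import Counter

def evaluate_sleep_quality(sequence_results):
    """Majority-vote aggregation: count each per-sequence evaluation
    result and return the one with the largest count."""
    counts = Counter(sequence_results)
    # most_common(1) returns [(result, count)] for the top result
    return counts.most_common(1)[0][0]

# e.g. a classifier labelled seven respiratory infrared image sequences:
per_sequence = ["good", "poor", "good", "good", "poor", "good", "good"]
print(evaluate_sleep_quality(per_sequence))  # -> good
```

Aggregating over many short sequences rather than one long recording is what lets a single misclassified sequence be outvoted.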
Techniques for unsupervised anomaly classification using an artificial intelligence model
A method for operating a computing system on at least one processor includes performing search space reduction on input data using a first trained artificial intelligence model to generate relevant regions in the input data. The method also includes generating region proposals in the relevant regions using a second trained artificial intelligence model. The method further includes performing unsupervised anomaly classification on the region proposals using a third trained artificial intelligence model to classify each of the region proposals as normal or as an anomaly. The method further includes performing contextual filtering on the region proposals classified as anomalies to determine if any of the region proposals classified as anomalies are contextually normal using a fourth trained artificial intelligence model.
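As a hedged sketch, the four-stage chain might look like the following, where each callable stands in for one of the four trained models (all names and the data shapes are assumptions for illustration):

```python
def detect_anomalies(input_data, reduce_search_space, propose_regions,
                     classify_region, is_contextually_normal):
    """Chains the four stages of the pipeline; each function argument
    stands in for one trained artificial intelligence model."""
    relevant = reduce_search_space(input_data)           # stage 1: search space reduction
    proposals = []
    for region in relevant:                              # stage 2: region proposals
        proposals.extend(propose_regions(region))
    anomalies = [p for p in proposals
                 if classify_region(p) == "anomaly"]     # stage 3: unsupervised classification
    # stage 4: contextual filtering drops anomalies that are normal in context
    return [p for p in anomalies if not is_contextually_normal(p)]

# Toy run with trivial stand-in "models":
result = detect_anomalies(
    [-1, 3, 5],
    lambda data: [x for x in data if x > 0],       # keep "relevant" items
    lambda r: [r, r * 10],                         # two proposals per region
    lambda p: "anomaly" if p % 2 else "normal",    # odd values are anomalies
    lambda p: p > 50)                              # large values are contextually normal
print(result)  # -> [3, 5]
```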
Systems, methods, and apparatuses for implementing advancements towards annotation efficient deep learning in computer-aided diagnosis
Embodiments described herein include systems for implementing annotation-efficient deep learning in computer-aided diagnosis. Exemplary embodiments include systems having a processor and a memory specially configured with instructions for learning annotation-efficient deep learning from non-labeled medical images to generate a trained deep-learning model by applying a multi-phase model training process via specially configured instructions for: pre-training a model by executing a one-time learning procedure using an initial annotated image dataset; iteratively re-training the model by executing a fine-tuning learning procedure using newly available annotated images without re-using any images from the initial annotated image dataset; selecting a plurality of most representative samples related to images of the initial annotated image dataset and the newly available annotated images by executing an active selection procedure based on which of a collection of un-annotated images exhibit either the greatest uncertainty or the greatest entropy; extracting generic image features; updating the model using the extracted generic image features; and outputting the model as the trained deep-learning model for use in analyzing a patient medical image. Other related embodiments are disclosed.
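The active selection step (picking un-annotated images that exhibit the greatest uncertainty or entropy) can be illustrated with a minimal entropy ranking; the helper name and the toy predictions are hypothetical:

```python
import numpy as np

def select_most_informative(probs, k):
    """Rank un-annotated samples by prediction entropy (higher entropy
    = more uncertain) and return indices of the k most informative."""
    probs = np.asarray(probs, dtype=float)
    # Shannon entropy per sample; the small epsilon guards log(0)
    entropy = -np.sum(probs * np.log(probs + 1e-12), axis=1)
    return np.argsort(entropy)[::-1][:k].tolist()

# Three samples: the near-uniform prediction carries the most entropy.
probs = [[0.5, 0.5], [0.9, 0.1], [0.99, 0.01]]
print(select_most_informative(probs, 1))  # -> [0]
```

Only the top-k samples are sent for annotation, which is the source of the annotation efficiency.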
TARGET DETECTION MODEL TRAINING METHOD AND APPARATUS, AND ELECTRONIC DEVICE
A method and an apparatus for training target detection models and an electronic device are provided. The method includes: predicting unlabeled training data through a teacher model and a student model to obtain a first prediction result output by the teacher model and a second prediction result output by the student model; determining a target pseudo-label category to which the first prediction result belongs according to the confidence in the first prediction result; calculating a current pseudo-label loss based on the first prediction result, the second prediction result, and the pseudo-label loss function corresponding to the target pseudo-label category; and updating the student model according to the current pseudo-label loss and updating the teacher model based on the updated student model, and returning to predicting the unlabeled training data through the teacher model and the student model until a preset training end condition is met.
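A minimal numerical sketch of the category-dependent pseudo-label loss and the teacher update, assuming a confidence threshold splits pseudo-labels into a "reliable" and an "uncertain" category and the teacher tracks the student via an exponential moving average (both are common realisations, not confirmed by the abstract):

```python
import numpy as np

CONF_THRESHOLD = 0.8  # illustrative cut-off between pseudo-label categories

def pseudo_label_loss(teacher_probs, student_probs):
    """Loss choice driven by teacher confidence: confident predictions
    get a hard cross-entropy, uncertain ones a soft L2 target."""
    t = np.asarray(teacher_probs, float)
    s = np.asarray(student_probs, float)
    losses = []
    for tp, sp in zip(t, s):
        if tp.max() >= CONF_THRESHOLD:      # "reliable" pseudo-label category
            hard = tp.argmax()
            losses.append(-np.log(sp[hard] + 1e-12))
        else:                               # "uncertain" category: soft target
            losses.append(np.sum((tp - sp) ** 2))
    return float(np.mean(losses))

def ema_update(teacher_w, student_w, momentum=0.99):
    """Teacher weights track the updated student weights, one common
    way of 'updating the teacher based on the updated student'."""
    return momentum * np.asarray(teacher_w) + (1 - momentum) * np.asarray(student_w)
```

The training loop would alternate: predict with both models, compute this loss, step the student, then call `ema_update` until the end condition is met.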
System and method for classifying images with a combination of nearest-neighbor-based label propagation and kernel principal component analysis
Described is a system for detecting and classifying new patterns of objects and images for applications where labeled data is scarce. In operation, the system trains a neural network with unlabeled images and extracts features with the neural network from both the unlabeled images and a set of labeled images to generate a feature space. Labels are propagated in the feature space using nearest neighbors, allowing for modeling of a per-class simplified distribution. An object in a new test image can then be classified using reconstruction error based on the per-class simplified distributions.
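The two stages, nearest-neighbour label propagation and classification by per-class reconstruction error, can be sketched as follows; for brevity the per-class "simplified distribution" is modelled by the class mean rather than a kernel-PCA subspace, so this is a stand-in, not the patented method:

```python
import numpy as np

def propagate_labels(labeled_feats, labels, unlabeled_feats):
    """Nearest-neighbour label propagation in feature space: each
    unlabeled feature inherits the label of its closest labeled one."""
    labeled = np.asarray(labeled_feats, float)
    out = []
    for f in np.asarray(unlabeled_feats, float):
        dists = np.linalg.norm(labeled - f, axis=1)
        out.append(labels[int(np.argmin(dists))])
    return out

def classify_by_reconstruction(feats, labels, test_feat):
    """Classify a test feature by the smallest per-class reconstruction
    error (here: distance to the class mean)."""
    feats = np.asarray(feats, float)
    best, best_err = None, float("inf")
    for c in set(labels):
        members = feats[[i for i, l in enumerate(labels) if l == c]]
        err = np.linalg.norm(np.asarray(test_feat, float) - members.mean(axis=0))
        if err < best_err:
            best, best_err = c, err
    return best
```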
Training of 3D lane detection models for automotive applications
The present invention relates to a method for training an artificial neural network configured for 3D lane detection based on unlabelled image data from a camera. The method includes generating a first set of 3D lane boundaries in a first coordinate system based on a first image, generating a second set of 3D lane boundaries in a second coordinate system based on a second image, transforming at least one of the first set and the second set of 3D lane boundaries based on positional data associated with the first image and the second image, evaluating the first set of 3D lane boundaries against the second set of 3D lane boundaries in a common coordinate system in order to find matching lane pairs between the two sets, and updating one or more model parameters of the artificial neural network based on a spatio-temporal consistency loss.
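A toy version of the transform, matching, and loss steps, assuming the positional data is given as a rotation matrix and translation vector and each lane boundary is sampled as a list of 3D points (all simplifications for illustration):

```python
import numpy as np

def to_common_frame(points, rotation, translation):
    """Transform 3D lane points into the common coordinate system
    using positional data (rotation matrix + translation vector)."""
    return np.asarray(points, float) @ np.asarray(rotation, float).T + translation

def consistency_loss(lanes_a, lanes_b):
    """Spatio-temporal consistency: mean squared distance between
    greedily matched lane pairs from the two sets."""
    total, n, used = 0.0, 0, set()
    for a in lanes_a:
        best_j, best_d = None, float("inf")
        for j, b in enumerate(lanes_b):
            if j in used:
                continue
            d = float(np.mean(np.sum((np.asarray(a) - np.asarray(b)) ** 2, axis=1)))
            if d < best_d:
                best_j, best_d = j, d
        if best_j is not None:
            used.add(best_j)
            total += best_d
            n += 1
    return total / max(n, 1)
```

When the network's detections from two views agree after the transform, the loss is zero; disagreement produces a gradient signal without any labels.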
Image to world space transformation for ground-truth generation in autonomous systems and applications
In various examples, image space coordinates of an image from a video may be labeled, projected to determine 3D vehicle space coordinates, then transformed to 3D world space coordinates using known 3D world space coordinates and relative positioning between the coordinate spaces. For example, 3D vehicle space coordinates may be temporally correlated with known 3D world space coordinates measured while capturing the video. The known 3D world space coordinates and known relative positioning between the coordinate spaces may be used to offset or otherwise define a transform for the 3D vehicle space coordinates to world space. Resultant 3D world space coordinates may be used for one or more labeled frames to generate ground truth data. For example, 3D world space coordinates for left and right lane lines from multiple frames may be used to define lane lines for any given frame.
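The offset/rotation step can be sketched for a planar ego pose (a yaw angle plus a world-space position), a simplification of the full 3D relative positioning described above; the function name is illustrative:

```python
import numpy as np

def vehicle_to_world(points_vehicle, ego_position, ego_yaw):
    """Rotate 3D vehicle-space label coordinates by the ego yaw and
    offset by the ego's known world-space position, yielding 3D world
    space coordinates temporally correlated with the frame."""
    c, s = np.cos(ego_yaw), np.sin(ego_yaw)
    R = np.array([[c, -s, 0.0],
                  [s,  c, 0.0],
                  [0.0, 0.0, 1.0]])
    return np.asarray(points_vehicle, float) @ R.T + np.asarray(ego_position, float)
```

Running this per labeled frame and accumulating the results is what allows lane lines from multiple frames to be combined into ground truth for any single frame.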
Training machine learning models based on unlabeled data
A method of labeling data and training a model is provided. The method includes obtaining a set of images. The set of images includes a first subset and a second subset. The first subset is associated with a first set of labels. The method also includes generating a set of pseudo labels for the set of images and a second set of labels for the second subset based on the first subset, the second subset, a first machine learning model, and a domain adaptation model. The method further includes generating a second machine learning model. The second machine learning model is generated based on the set of images, the set of pseudo labels, the first set of labels, and the second set of labels. The second set of labels is updated based on one or more inferences generated by the second machine learning model.
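A minimal sketch of combining the first subset's ground-truth labels with confidence-filtered pseudo labels for the second subset (the threshold, names, and data layout are illustrative, and the domain adaptation model is omitted):

```python
def build_training_labels(first_subset_labels, second_subset_preds, threshold=0.9):
    """Keep a model prediction as a pseudo label only when its
    confidence clears the threshold, then merge with trusted labels.

    first_subset_labels: {image_id: label} ground truth
    second_subset_preds: {image_id: (label, confidence)} model output
    """
    pseudo = {img: label
              for img, (label, conf) in second_subset_preds.items()
              if conf >= threshold}
    combined = dict(pseudo)
    combined.update(first_subset_labels)  # trusted labels take precedence
    return combined, pseudo

combined, pseudo = build_training_labels(
    {"img_a": 0},
    {"img_b": (1, 0.95), "img_c": (0, 0.50)})
print(pseudo)  # -> {'img_b': 1}
```

In the method above this merge feeds the second model, whose own inferences then update the second set of labels for the next round.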
Method and system for semi-supervised state transition detection for object tracking
A system determines an input video and a first annotated image from the input video which identifies an object of interest. The system initiates a tracker based on the first annotated image and the input video. The tracker generates, based on the first annotated image and the input video, information including: a sliding window for false positives; a first set of unlabeled images from the input video; and at least two images with corresponding labeled states. A semi-supervised classifier classifies, based on the information, the first set of unlabeled images from the input video. If a first unlabeled image is classified as a false positive, the system reinitiates the tracker based on a second annotated image occurring in a frame prior to a frame with the false positive. The system generates an output video comprising the input video displayed with tracking on the object of interest.
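The re-initialisation logic can be sketched as a loop in which the semi-supervised classifier and the re-annotation step are stand-in callables (names hypothetical, frame handling simplified):

```python
def track_with_reinit(frames, classify, annotate):
    """Process frames in order; when the classifier flags a frame as a
    false positive, request a fresh annotation on the frame before the
    failure and resume tracking from there."""
    outputs = []
    for i, frame in enumerate(frames):
        state = classify(frame)
        if state == "false_positive" and i > 0:
            annotate(i - 1)          # second annotated image, prior frame
            state = "reinitialised"  # tracker restarts from that annotation
        outputs.append(state)
    return outputs
```

The output video would then overlay the per-frame tracking states on the input video.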
Systems and methods for video models with procedure understanding
Embodiments described herein provide systems and methods for training video models to perform a task from an input instructional video. A procedure knowledge graph (PKG) may be generated with nodes representing procedure steps, and edges representing relationships between the steps. The PKG may be generated based on text and/or video training data which includes procedures (e.g., instructional videos). The PKG may then be used to provide supervisory training signals for a number of tasks when training a video model. Once the model is trained, it may be fine-tuned for a specific task, which benefits from the procedural information the model has learned to embed when encoding videos.
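A PKG in miniature: nodes are procedure steps and directed edges link consecutive steps observed in the training procedures (a deliberate simplification of the graph described above; the recipe data is invented for illustration):

```python
def build_pkg(procedures):
    """Build a minimal procedure knowledge graph from step sequences:
    nodes = distinct steps, edges = observed step-to-step transitions."""
    nodes, edges = set(), set()
    for steps in procedures:
        nodes.update(steps)
        for a, b in zip(steps, steps[1:]):
            edges.add((a, b))
    return nodes, edges

# Two instructional "videos" sharing a common prefix of steps:
recipes = [["crack eggs", "whisk", "fry"],
           ["crack eggs", "whisk", "bake"]]
nodes, edges = build_pkg(recipes)
```

Edges shared across procedures (here "crack eggs" → "whisk") are exactly the relationships a training signal can reward a video model for recovering.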