Patent classifications
G06V10/7753
UNSUPERVISED PROMPT LEARNING FOR DATA PRE-SELECTION WITH VISION-LANGUAGE MODELS
A method of performing data pre-selection for an object detection system includes receiving a first dataset that includes unlabeled data corresponding to one or more images, and providing the first dataset and a plurality of learnable prompt vectors to a pre-training model. The learnable prompt vectors include text inputs. The method further includes generating, using the pre-training model, an unsupervised learning prompt based on the first dataset and the plurality of learnable prompt vectors. The unsupervised learning prompt corresponds to a multi-modal feature of the one or more images of the first dataset. The method further includes extracting features from either the first dataset or a second dataset based on the unsupervised learning prompt, selecting and labeling a subset of instances of the extracted features, and generating and outputting a labeled dataset based on the labeled subset of instances.
Generative adversarial network medical image generation for training of a classifier
Mechanisms are provided to implement a machine learning training model. The machine learning training model trains an image generator of a generative adversarial network (GAN) to generate medical images approximating actual medical images. The machine learning training model augments a set of training medical images to include one or more generated medical images generated by the image generator of the GAN. The machine learning training model trains a machine learning model based on the augmented set of training medical images to identify anomalies in medical images. The trained machine learning model is applied to new medical image inputs to classify the medical images as having an anomaly or not.
SYSTEM AND METHOD FOR ONE-SHOT ANATOMY LOCALIZATION WITH UNSUPERVISED VISION TRANSFORMERS FOR THREE-DIMENSIONAL (3D) MEDICAL IMAGES
A method for performing one-shot anatomy localization includes obtaining a medical image of a subject. The method includes receiving a selection of both a template image and a region of interest within the template image, wherein the template image includes one or more anatomical landmarks assigned a respective anatomical label. The method includes inputting both the medical image and the template image into a trained vision transformer model. The method includes outputting from the trained vision transformer model both patch level features and image level features for both the medical image and the template image. The method still further includes interpolating pixel level features from the patch level features for both the medical image and the template image. The method includes utilizing the pixel level features within the region of interest of the template image to locate and label corresponding pixel level features in the medical image.
Systems and methods for contrastive learning of visual representations
Systems, methods, and computer program products for performing semi-supervised contrastive learning of visual representations are provided. For example, the present disclosure provides systems and methods that leverage particular data augmentation schemes and a learnable nonlinear transformation between the representation and the contrastive loss to provide improved visual representations. Further, the present disclosure also provides improvements for semi-supervised contrastive learning. For example, a computer-implemented method may include performing semi-supervised contrastive learning based on a set of one or more unlabeled training data, generating an image classification model based on a portion of a plurality of layers in a projection head neural network used in performing the contrastive learning, performing fine-tuning of the image classification model based on a set of one or more labeled training data, and, after performing the fine-tuning, distilling the image classification model to a student model comprising a relatively smaller number of parameters than the image classification model.
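The abstract does not name the contrastive loss it uses. As a minimal sketch, assuming a normalized temperature-scaled cross-entropy (NT-Xent) loss over projection-head outputs, the idea of pulling two augmented views of the same image together can be illustrated as follows. The function names, the pairing convention (views 2k and 2k+1 belong together), and the temperature value are illustrative assumptions, not details from the disclosure.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent_loss(projections, temperature=0.5):
    """Assumed NT-Xent loss over projection-head outputs.

    `projections` holds 2N vectors; entries 2k and 2k+1 are assumed to be
    two augmented views of the same image (an illustrative convention).
    """
    n = len(projections)
    if n < 2 or n % 2 != 0:
        raise ValueError("expected an even number of projected views")
    total = 0.0
    for i in range(n):
        partner = i ^ 1  # index of the other view of the same image
        positive = cosine(projections[i], projections[partner]) / temperature
        # Denominator covers every other view, positive included.
        logits = [
            cosine(projections[i], projections[k]) / temperature
            for k in range(n)
            if k != i
        ]
        total += math.log(sum(math.exp(z) for z in logits)) - positive
    return total / n
```

With this loss, a batch whose paired views agree scores lower than one whose paired views point in unrelated directions, which is the training signal the projection head is optimized against.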
Training of machine learning systems for image processing
A computer-implemented method for training a machine learning system includes initializing parameters of the machine learning system and a metaparameter, and then repeatedly carrying out the following steps as a loop: providing a batch of training data points and manipulating, based on the metaparameter, the provided training data points, a training method for optimizing the parameters of the machine learning system, or a structure of the machine learning system; ascertaining a cost function as a function of the instantaneous parameters of the machine learning system and of the instantaneous metaparameter; adapting the instantaneous parameters as a function of a first gradient, which has been ascertained with respect to the instantaneous parameters via the ascertained cost function for the training data points; and adapting the metaparameter as a function of a second gradient, which has been ascertained with respect to the metaparameter used in the preceding step via the ascertained cost function.
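One possible reading of this loop, offered only as a sketch, treats the learning rate as the metaparameter that manipulates the training method. The gradient of the current cost with respect to the learning rate used in the preceding step then follows from the chain rule, as in hypergradient descent. The one-weight model, the squared-error cost, and all names below are illustrative assumptions rather than details from the abstract.

```python
def train(data, steps=200, w=0.0, lr=0.01, meta_lr=1e-4):
    """Fit y = w * x while adapting the learning rate (assumed metaparameter).

    `data` is a list of (x, y) pairs; each step uses a batch of size one.
    """
    prev_grad = 0.0
    for step in range(steps):
        x, y = data[step % len(data)]
        # First gradient: d cost / d w for cost = (w * x - y) ** 2.
        grad = 2.0 * (w * x - y) * x
        # Second gradient: d cost / d lr, where lr is the value used in the
        # preceding step. Since w = w_prev - lr * prev_grad, the chain rule
        # gives grad * (-prev_grad).
        hyper_grad = -grad * prev_grad
        lr -= meta_lr * hyper_grad  # adapt the metaparameter
        w -= lr * grad              # adapt the parameter
        prev_grad = grad
    return w, lr
```

Calling `train([(1.0, 3.0), (2.0, 6.0)])` drives `w` toward 3 while the learning rate grows from its initial value, because consecutive gradients keep pointing in the same direction.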
Systems and Methods for Contrastive Learning of Visual Representations
Systems, methods, and computer program products for performing semi-supervised contrastive learning of visual representations are provided. For example, the present disclosure provides systems and methods that leverage particular data augmentation schemes and a learnable nonlinear transformation between the representation and the contrastive loss to provide improved visual representations. Further, the present disclosure also provides improvements for semi-supervised contrastive learning. For example, a computer-implemented method may include performing semi-supervised contrastive learning based on a set of one or more unlabeled training data, generating an image classification model based on a portion of a plurality of layers in a projection head neural network used in performing the contrastive learning, performing fine-tuning of the image classification model based on a set of one or more labeled training data, and, after performing the fine-tuning, distilling the image classification model to a student model comprising a relatively smaller number of parameters than the image classification model.
ADAPTIVE HUMAN INSTANCE SEGMENTATION WITH STEREO VIEW CONSISTENCY
A system stores first and second images generated by first and second cameras; applies a segmentation model to the first image to generate a first segmentation mask identifying object instances; applies the segmentation model to the second image to generate a second segmentation mask identifying the object instances; projects the first segmentation mask to a viewpoint of the second camera to generate a first projected segmentation mask; converts the first projected segmentation mask and the second segmentation mask to first and second semantic masks, respectively; and computes a first similarity value based on the first and second semantic masks. This may be repeated exchanging the first and second images to compute a second similarity value. The system determines a loss value based on the first similarity value and the second similarity value and trains the segmentation model based on the loss value.
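The two-way consistency check described above can be sketched as follows. This is a toy illustration under stated assumptions: the projection between viewpoints is replaced by a constant horizontal shift (`disparity`), the similarity value is intersection-over-union, and the loss is one minus the mean of the two similarities. None of these choices is specified in the abstract, and all function names are hypothetical.

```python
def shift_mask(mask, disparity):
    """Stand-in for viewpoint projection: shift each row by `disparity` pixels."""
    width = len(mask[0])
    shifted = []
    for row in mask:
        new_row = [0] * width
        for x, value in enumerate(row):
            if 0 <= x + disparity < width:
                new_row[x + disparity] = value
        shifted.append(new_row)
    return shifted


def to_semantic(mask):
    """Collapse instance ids into a binary foreground mask."""
    return [[1 if value > 0 else 0 for value in row] for row in mask]


def iou(a, b):
    """Intersection-over-union of two binary masks (assumed similarity value)."""
    pairs = [(p, q) for row_a, row_b in zip(a, b) for p, q in zip(row_a, row_b)]
    intersection = sum(p & q for p, q in pairs)
    union = sum(p | q for p, q in pairs)
    return intersection / union if union else 1.0


def consistency_loss(first_mask, second_mask, disparity):
    """Compare each view's projected mask with the other view's mask."""
    first_similarity = iou(
        to_semantic(shift_mask(first_mask, disparity)), to_semantic(second_mask)
    )
    second_similarity = iou(
        to_semantic(shift_mask(second_mask, -disparity)), to_semantic(first_mask)
    )
    return 1.0 - 0.5 * (first_similarity + second_similarity)
```

Converting to semantic masks before comparison matters because the instance ids assigned in one view need not match those assigned in the other, even when both masks cover the same pixels.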
System and method for rare object localization and search in overhead imagery
A feature extractor and novel training objective are provided for content-based image retrieval. For example, a computer-implemented method includes applying a query image and a search image to a neural network of a feature extraction network of a computing device, the query image indicating an object to be searched for in the search image. The feature extraction network includes the neural network, a spatial feature neural network receiving a first output of the neural network pertaining to the search image, and an embedding network receiving a second output of the neural network pertaining to the query image. The method includes generating spatial search features from the spatial feature neural network, generating a query feature from the embedding network, applying the query feature to an artificial neural network (ANN) index, and determining an optimal matching result of an object in the search image based on an operation using the ANN index.
SEMI-AUTOMATIC LABELLING OF DATASETS
An unlabelled or partially labelled target dataset is modelled with a machine learning model for classification (or regression). The target dataset is processed by the machine learning model; a subgroup of the target dataset is prepared for presentation to a user for labelling or label verification; label verification or user re-labelling or user labelling of the subgroup is received; and the updated target dataset is re-processed by the machine learning model. User labelling or label verification combined with modelling an unclassified or partially classified target dataset with a machine learning model aims to provide efficient labelling of an unlabelled component of the target dataset.
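One round of such a labelling loop might look like the sketch below. It assumes a one-dimensional nearest-centroid classifier as the machine learning model and picks the least confident items as the subgroup shown to the user. The model, the selection rule, and the `ask_user` callback are hypothetical stand-ins, not details taken from the abstract.

```python
def fit_centroids(points, labels):
    """Mean of the labelled points per class (needs at least one label)."""
    sums = {}
    for x, label in zip(points, labels):
        if label is not None:
            total, count = sums.get(label, (0.0, 0))
            sums[label] = (total + x, count + 1)
    return {label: total / count for label, (total, count) in sums.items()}


def predict(centroids, x):
    """Return (label, margin); a small margin means an uncertain prediction."""
    ranked = sorted(centroids, key=lambda label: abs(x - centroids[label]))
    best = ranked[0]
    if len(ranked) == 1:
        return best, float("inf")
    runner_up = ranked[1]
    margin = abs(x - centroids[runner_up]) - abs(x - centroids[best])
    return best, margin


def labelling_round(points, labels, ask_user, batch_size=2):
    """Model the data, present the most uncertain items, record the answers."""
    centroids = fit_centroids(points, labels)
    unlabelled = [i for i, label in enumerate(labels) if label is None]
    unlabelled.sort(key=lambda i: predict(centroids, points[i])[1])
    for i in unlabelled[:batch_size]:
        suggestion, _ = predict(centroids, points[i])
        # The user verifies the suggested label or supplies a different one.
        labels[i] = ask_user(points[i], suggestion)
    return labels
```

Calling `labelling_round` repeatedly re-fits the centroids on the growing labelled set, so each round asks about the items the updated model is least sure of.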
Image enhancement using self-examples and external examples
Systems and methods are provided for image enhancement using self-examples in combination with external examples. In one embodiment, an image manipulation application receives an input image patch of an input image. The image manipulation application determines a first weight for an enhancement operation using self-examples and a second weight for an enhancement operation using external examples. The image manipulation application generates a first interim output image patch by applying the enhancement operation using self-examples to the input image patch and a second interim output image patch by applying the enhancement operation using external examples to the input image patch. The image manipulation application generates an output image patch by combining the first and second interim output image patches as modified using the first and second weights.
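The final combination step can be sketched as a weighted average of the two interim patches. Normalizing by the sum of the weights is an assumption here; the abstract only states that the interim patches are combined as modified by the two weights. The function and parameter names are hypothetical.

```python
def blend_patches(self_patch, external_patch, self_weight, external_weight):
    """Weighted average of two equally sized patches (lists of pixel rows)."""
    total = self_weight + external_weight
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return [
        [
            (self_weight * a + external_weight * b) / total
            for a, b in zip(row_self, row_external)
        ]
        for row_self, row_external in zip(self_patch, external_patch)
    ]
```

For instance, `blend_patches([[0.0, 4.0]], [[8.0, 4.0]], 1.0, 3.0)` leans toward the external-example result where the two patches disagree and leaves agreeing pixels unchanged.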