G06V10/7753

Applying self-confidence in multi-label classification to model training

A computer model is trained to classify regions of a space (e.g., a pixel of an image or a voxel of a point cloud) according to a multi-label classification. To improve the model's accuracy, the model's self-confidence is determined with respect to its own predictions for regions in a training space. The self-confidence is determined based on the class predictions, for example as the difference between the highest-predicted class and the second-highest-predicted class. When the two are similar, the model's confidence in that region is low, and the region is a candidate for focused training. Additional training may be performed by including, in subsequent training iterations, modified training data that focuses on low-confidence areas. As another example, additional training may use the self-confidence to modify a classification loss used to refine parameters of the model.
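The margin-based self-confidence described above can be sketched as follows. This is a minimal illustration, not the patented method: the function names, the softmax step, and the `alpha` weighting formula are assumptions introduced here to show one way a small margin could increase a region's contribution to the classification loss.

```python
import math

def softmax(logits):
    """Convert raw class scores into probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def self_confidence(probs):
    """Margin between the highest and second-highest class probabilities."""
    top, second = sorted(probs, reverse=True)[:2]
    return top - second

def weighted_cross_entropy(probs, target, alpha=1.0):
    """Cross-entropy scaled up where the margin is small (assumed weighting)."""
    weight = 1.0 + alpha * (1.0 - self_confidence(probs))
    return -weight * math.log(probs[target])

# Two regions (e.g. pixels) with three classes each.
confident = softmax([4.0, 0.5, 0.1])   # one class clearly dominates
ambiguous = softmax([1.1, 1.0, 0.2])   # top two classes are nearly tied

print(self_confidence(confident), self_confidence(ambiguous))
print(weighted_cross_entropy(ambiguous, target=0))
```

The ambiguous region has the smaller margin, so under this assumed weighting it contributes more to the loss than it would under plain cross-entropy.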

SYSTEMS AND METHODS FOR DETERMINING REGIONS OF INTEREST IN HISTOLOGY IMAGES

A method and apparatus are provided for determining one or more regions of interest in an input histology image. Such methods can include receiving an input histology image and tiling the input histology image into a set of tiles. In various embodiments, the method can also include, for each tile, extracting a feature of that tile by applying a trained feature extractor. The trained feature extractor can be trained with an unsupervised machine learning algorithm using a training set of images. The method can also include clustering the extracted features to assign each tile of the set of tiles to one of a plurality of regions of interest, and outputting the plurality of regions of interest.
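The tile, extract, and cluster sequence can be sketched as below. The sketch makes two loud assumptions: `mean_intensity` is only a placeholder for the trained feature extractor, and `kmeans_1d` is a plain k-means chosen for illustration, since the abstract does not name a clustering algorithm. All function names are hypothetical.

```python
def tile_image(image, size):
    """Split a 2D grid (list of rows) into non-overlapping size x size tiles."""
    tiles = []
    for r in range(0, len(image) - size + 1, size):
        for c in range(0, len(image[0]) - size + 1, size):
            tiles.append([row[c:c + size] for row in image[r:r + size]])
    return tiles

def mean_intensity(tile):
    """Placeholder feature; the source uses a trained feature extractor."""
    values = [v for row in tile for v in row]
    return sum(values) / len(values)

def kmeans_1d(features, k, iterations=10):
    """Plain k-means on scalar features; returns a cluster index per feature."""
    ordered = sorted(features)
    # Spread the initial centers across the sorted feature range.
    centers = [ordered[i * (len(ordered) - 1) // max(k - 1, 1)] for i in range(k)]
    labels = [0] * len(features)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda j: abs(f - centers[j])) for f in features]
        for j in range(k):
            members = [f for f, lab in zip(features, labels) if lab == j]
            if members:
                centers[j] = sum(members) / len(members)
    return labels

# Toy 4x4 "image": dark on the left, bright on the right.
image = [
    [0, 0, 9, 9],
    [0, 1, 9, 8],
    [0, 0, 9, 9],
    [1, 0, 8, 9],
]
tiles = tile_image(image, 2)
features = [mean_intensity(t) for t in tiles]
regions = kmeans_1d(features, k=2)
print(regions)  # one region index per tile, in row-major tile order
```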

METHOD FOR MANAGING ANNOTATION JOB, APPARATUS AND SYSTEM SUPPORTING THE SAME
20200152316 · 2020-05-14

A computing device obtains information about a medical slide image, and determines a dataset type of the medical slide image and a panel of the medical slide image. The computing device assigns, to an annotator account, an annotation job defined by at least the medical slide image, the determined dataset type, an annotation task, and a patch that is a partial area of the medical slide image. The annotation task includes the determined panel, and the panel is designated as one of a plurality of panels including a cell panel, a tissue panel, and a structure panel. The dataset type indicates a use of the medical slide image and is designated as one of a plurality of uses including a training use of a machine learning model and a validation use of the machine learning model.

REGULARIZED MULTI-METRIC ACTIVE LEARNING SYSTEM FOR IMAGE CLASSIFICATION
20200151518 · 2020-05-14

A regularized multi-metric active learning (AL) image classification system includes three main parts. First, a regularized multi-metric learning process is utilized to jointly learn distinct metrics for different types of image features from remotely sensed image data. The regularizer incorporates the unlabeled data based on the neighborhood relationship, which helps avoid overfitting at early stages of AL, when the quantity of training data is particularly small. Then, as AL proceeds, the regularizer is also updated through similarity propagation, thus taking advantage of informative labeled samples. Finally, multiple features are projected into a common feature space, in which a batch-mode AL strategy combining uncertainty and diversity is utilized in conjunction with k-nearest neighbor (kNN) classification to enrich the set of labeled samples.
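The final step, batch-mode selection that combines uncertainty and diversity with kNN classification, can be sketched as below. This covers only that step and simplifies it heavily: it uses plain Euclidean distance rather than the learned metrics, measures uncertainty as one minus the winning kNN vote share, and enforces diversity with a minimum distance between selected samples. The function names and the `min_gap` parameter are assumptions made for illustration.

```python
import math

def knn_class_fractions(x, labeled, k):
    """Fraction of the k nearest labeled neighbours belonging to each class."""
    nearest = sorted(labeled, key=lambda item: math.dist(x, item[0]))[:k]
    counts = {}
    for _, label in nearest:
        counts[label] = counts.get(label, 0) + 1
    return {label: n / len(nearest) for label, n in counts.items()}

def uncertainty(x, labeled, k):
    """One minus the winning vote share: 0 when all neighbours agree."""
    return 1.0 - max(knn_class_fractions(x, labeled, k).values())

def select_batch(unlabeled, labeled, k, batch_size, min_gap):
    """Greedy pick: most uncertain first, skipping near-duplicates of earlier picks."""
    ranked = sorted(unlabeled, key=lambda x: uncertainty(x, labeled, k), reverse=True)
    batch = []
    for x in ranked:
        if all(math.dist(x, chosen) >= min_gap for chosen in batch):
            batch.append(x)
        if len(batch) == batch_size:
            break
    return batch

labeled = [
    ((0.0, 0.0), "a"), ((0.0, 1.0), "a"), ((1.0, 0.5), "a"),
    ((4.0, 0.0), "b"), ((4.0, 1.0), "b"), ((3.0, 0.5), "b"),
]
unlabeled = [(2.0, 0.5), (2.1, 0.5), (0.1, 0.5), (2.0, 3.0)]
batch = select_batch(unlabeled, labeled, k=3, batch_size=2, min_gap=0.5)
print(batch)  # samples to send for labeling
```

The point deep inside class "a" has zero uncertainty and is not selected, and the two nearly identical boundary points are never selected together.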

Training variational autoencoders to generate disentangled latent factors

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a variational auto-encoder (VAE) to generate disentangled latent factors on unlabeled training images. In one aspect, a method includes receiving a plurality of unlabeled training images, and, for each unlabeled training image, processing the unlabeled training image using the VAE to determine a latent representation of the unlabeled training image and to generate a reconstruction of the unlabeled training image in accordance with current values of the parameters of the VAE, and adjusting the current values of the parameters of the VAE by optimizing a loss function that depends on a quality of the reconstruction and also on a degree of independence between the latent factors in the latent representation of the unlabeled training image.
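A loss with these two parts can be sketched as below. The abstract does not specify the independence term, so this sketch assumes one common choice: the KL divergence from a diagonal Gaussian posterior to a standard normal prior, scaled by a weight `beta`. The squared-error reconstruction term, the function names, and the default `beta` are likewise assumptions.

```python
import math

def reconstruction_error(image, reconstruction):
    """Sum of squared differences between flattened image and reconstruction."""
    return sum((x - y) ** 2 for x, y in zip(image, reconstruction))

def kl_to_standard_normal(mean, log_var):
    """KL(N(mean, exp(log_var)) || N(0, I)) for a diagonal Gaussian."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mean, log_var))

def vae_loss(image, reconstruction, mean, log_var, beta=4.0):
    """Reconstruction quality plus a weighted latent-independence penalty."""
    return (reconstruction_error(image, reconstruction)
            + beta * kl_to_standard_normal(mean, log_var))

image = [0.2, 0.8, 0.5]
reconstruction = [0.25, 0.7, 0.5]
mean, log_var = [0.3, -0.1], [0.0, -0.2]
print(vae_loss(image, reconstruction, mean, log_var))
```

With `beta` greater than 1, the penalty pushes the posterior toward the factorized prior more strongly than in a standard VAE, which is one way to encourage independent latent factors.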

Data augmentation for image classification tasks

A computer-implemented method and systems are provided for performing machine learning for an image classification task. The method includes selecting, by a processor operatively coupled to one or more databases, a first and a second image from one or more training sets in the one or more databases. The method further includes overlaying, by the processor, the second image on the first image to form a mixed image, by averaging an intensity of each of a plurality of co-located pixel pairs in the first and the second image. The method also includes training, by the processor, a machine learning process configured for the image classification task using the mixed image to augment data used by the machine learning process for the image classification task.
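The overlay step, averaging the intensity of each co-located pixel pair, can be sketched as below. Images are represented as nested lists of grayscale intensities, which is an assumption for illustration, and `overlay_by_averaging` is a hypothetical name. The abstract does not say which label the mixed image carries, so the sketch leaves labels out.

```python
def overlay_by_averaging(first, second):
    """Average co-located pixel intensities of two equally sized images."""
    if len(first) != len(second) or any(len(a) != len(b) for a, b in zip(first, second)):
        raise ValueError("images must have the same shape")
    return [[(p + q) / 2 for p, q in zip(row_a, row_b)]
            for row_a, row_b in zip(first, second)]

first = [[0, 100], [200, 50]]
second = [[100, 100], [0, 150]]
mixed = overlay_by_averaging(first, second)
print(mixed)  # [[50.0, 100.0], [100.0, 100.0]]
```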

SYSTEM FOR SIMPLIFIED GENERATION OF SYSTEMS FOR BROAD AREA GEOSPATIAL OBJECT DETECTION
20200089930 · 2020-03-19

A system is provided for simplified generation of systems for analysis of satellite images to geolocate one or more objects of interest. Loading a plurality of training images, labeled for a study object or objects with irrelevant features, into a preexisting feature identification subsystem causes automated generation of models for the study object. These models are used to parameterize pre-engineered machine learning elements that are running a preprogrammed machine learning protocol. Training images with the study object are used to train object recognition filters. These filters are used to identify the study object in unanalyzed images. The system reports results in a requestor's preferred format.

Semi-supervised learning with group constraints

A computer-implemented method for classification of data by a machine learning system using a logic constraint for reducing a data labeling requirement. The computer-implemented method includes: generating a first embedding space from a first partially labeled training data set, wherein in the first embedding space, content-wise related training data of the first partially labeled training data are clustered together, determining at least two clusters in the first embedding space formed from the first partially labeled training data, and training a machine learning model based, at least in part, on a second partially labeled training data set and the at least two clusters, wherein the at least two clusters are used as training constraints.

Tissue nodule detection and tissue nodule detection model training method, apparatus, device, and system

This application relates to a tissue nodule detection and tissue nodule detection model training method, apparatus, device, storage medium and system. The method for training a tissue nodule detection model includes: obtaining source domain data and target domain data, the source domain data comprising a source domain image and an image annotation, the target domain data comprising a target image, and the image annotation being used for indicating location information of a tissue nodule in the source domain image; performing feature extraction on the source domain image using a neural network model to obtain a source domain sampling feature, performing feature extraction on the target image using the neural network model to obtain a target sampling feature, and determining a model result according to the source domain sampling feature using the neural network model; determining a distance parameter between the source domain data and the target domain data according to the source domain sampling feature and the target sampling feature, the distance parameter being a parameter describing a magnitude of a data difference between the source domain data and the target domain data; determining, according to the model result and the image annotation, a loss function value corresponding to the source domain image; and training the neural network model to obtain a tissue nodule detection model by iteratively reducing a combination of the loss function value and the distance parameter. In this way, the detection accuracy can be improved.
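The training objective, a combination of the source-domain loss value and the distance parameter, can be sketched as below. The abstract does not define the distance measure, so this sketch assumes the squared Euclidean distance between the mean source and mean target features. The function names and the `weight` factor are also assumptions.

```python
def mean_vector(features):
    """Element-wise mean of a list of equally long feature vectors."""
    n = len(features)
    return [sum(f[i] for f in features) / n for i in range(len(features[0]))]

def domain_distance(source_features, target_features):
    """Squared Euclidean distance between the mean features (assumed measure)."""
    s = mean_vector(source_features)
    t = mean_vector(target_features)
    return sum((a - b) ** 2 for a, b in zip(s, t))

def combined_objective(task_loss, source_features, target_features, weight=1.0):
    """Quantity to reduce iteratively: source loss plus weighted domain distance."""
    return task_loss + weight * domain_distance(source_features, target_features)

source_features = [[1.0, 2.0], [3.0, 4.0]]   # sampled from source domain images
target_features = [[2.0, 3.0], [4.0, 5.0]]   # sampled from target images
print(domain_distance(source_features, target_features))        # 2.0
print(combined_objective(0.5, source_features, target_features, weight=0.1))
```

Reducing this sum drives the network both to fit the annotated source images and to produce features whose statistics match across the two domains.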

Generative adversarial network medical image generation for training of a classifier

Mechanisms are provided to implement a machine learning training model. The machine learning training model trains an image generator of a generative adversarial network (GAN) to generate medical images approximating actual medical images. The machine learning training model augments a set of training medical images to include one or more generated medical images generated by the image generator of the GAN. The machine learning training model trains a machine learning model based on the augmented set of training medical images to identify anomalies in medical images. The trained machine learning model is applied to new medical image inputs to classify the medical images as having an anomaly or not.