G06V10/7753

Methods for training auto-labeling device and performing auto-labeling by using hybrid classification and devices using the same
11023776 · 2021-06-01

A method for training an auto-labeling device is provided. The method includes: (a) inputting a training image to a pre-trained feature extraction module to generate a feature; (b) inputting the feature to a pre-trained first classification module to output a first class score and a first uncertainty score, inputting the feature to a pre-trained second classification module to output a second class score and a second uncertainty score, generating a scaled second uncertainty score by applying a scale parameter to the second uncertainty score, and then inputting the feature to a fitness estimation module to output a fitness value; and (c) (i) updating the scale parameter by using an uncertainty loss generated based on the first uncertainty score and the scaled second uncertainty score, and (ii) training the fitness estimation module by using a cross-entropy loss generated based on the uncertainty loss and the fitness value.
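Step (c)(i) can be sketched as a gradient update of the scalar scale parameter. The abstract does not fix the form of the uncertainty loss, so the mean squared difference between the first uncertainty score and the scaled second uncertainty score used below is an assumption:

```python
import numpy as np

def update_scale(u1, u2, scale, lr=0.1):
    """One gradient step on the scale parameter. The uncertainty loss is
    taken here as the mean squared difference between the first
    uncertainty score and the scaled second score (an assumption; the
    abstract leaves the loss form unspecified)."""
    scaled_u2 = scale * u2
    loss = np.mean((u1 - scaled_u2) ** 2)
    grad = np.mean(2.0 * (scaled_u2 - u1) * u2)  # d(loss)/d(scale)
    return scale - lr * grad, loss

# toy uncertainty scores from the two pre-trained classification modules
u1 = np.array([0.8, 0.6, 0.9])
u2 = np.array([0.4, 0.3, 0.5])
scale = 1.0
for _ in range(500):
    scale, loss = update_scale(u1, u2, scale)
# scale converges to the least-squares fit sum(u1*u2) / sum(u2**2)
```

Under this loss the scale parameter simply learns the best linear calibration between the two modules' uncertainty estimates.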

IMAGE LEARNING DEVICE, IMAGE LEARNING METHOD, NEURAL NETWORK, AND IMAGE CLASSIFICATION DEVICE
20210158100 · 2021-05-27

An object of the invention is to provide an image learning device, an image learning method, a neural network, and an image classification device which can support appropriate classification of an image.

In the image learning device according to an aspect of the invention, the neural network performs a first task of classifying a recognition target in a medical image and outputting a classification score as an evaluation result, and a second task different from the first task. The neural network updates a weight coefficient on the basis of a comparison result between the classification score output for the medical image of a first image group and a ground truth classification label, and does not reflect the classification score output for the medical image of a second image group in an update of the weight coefficient, for the first task. The neural network updates the weight coefficient on the basis of the evaluation result output for the medical image of the first image group and the evaluation result output for the medical image of the second image group, for the second task.
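The group-dependent weight update above amounts to masking the first-task loss to the first image group while letting the second task see both groups. A minimal sketch, assuming a softmax cross-entropy for the first task; the second task is left open by the abstract, so the particular form below (matching the two groups' mean evaluation results) is an illustrative assumption:

```python
import numpy as np

def two_task_loss(cls_scores, labels, is_group1, eval_scores):
    """Combined loss for the two-task scheme. Task 1: softmax
    cross-entropy on classification scores, computed only over group-1
    images (group-2 scores do not affect the update). Task 2: an
    illustrative loss over the evaluation results of BOTH groups."""
    z = cls_scores - cls_scores.max(axis=1, keepdims=True)
    log_p = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    ce = -log_p[np.arange(len(labels)), labels]
    task1 = ce[is_group1].mean()                 # group 1 only
    task2 = (eval_scores[is_group1].mean()
             - eval_scores[~is_group1].mean()) ** 2
    return task1 + task2

scores = np.array([[2.0, 0.0], [0.0, 2.0], [1.0, 1.0], [1.0, 1.0]])
labels = np.array([0, 1, 0, 1])
group1 = np.array([True, True, False, False])
evals = np.array([1.0, 1.0, 2.0, 2.0])
loss = two_task_loss(scores, labels, group1, evals)
```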

Data Object Classification Using an Optimized Neural Network

A system includes a computing platform having a hardware processor and a memory storing a software code and a neural network (NN) having multiple layers including a last activation layer and a loss layer. The hardware processor executes the software code to identify different combinations of layers for testing the NN, each combination including candidate function(s) for the last activation layer and candidate function(s) for the loss layer. For each different combination, the software code configures the NN based on the combination, inputs, into the configured NN, a training dataset including multiple data objects, receives, from the configured NN, a classification of the data objects, and generates a performance assessment for the combination based on the classification. The software code determines a preferred combination of layers for the NN including selected candidate functions for the last activation layer and the loss layer, based on a comparison of the performance assessments.
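The combination search described above is essentially a grid search over candidate functions for the two layers. A hedged sketch, where `evaluate` stands in for configuring the NN, running the training dataset through it, and scoring the resulting classification (the assessment table is purely hypothetical):

```python
import itertools

def pick_layer_combo(activation_candidates, loss_candidates, evaluate):
    """Test every (last-activation, loss-layer) combination and keep
    the one with the best performance assessment."""
    best_combo, best_score = None, float("-inf")
    for combo in itertools.product(activation_candidates, loss_candidates):
        score = evaluate(*combo)
        if score > best_score:
            best_combo, best_score = combo, score
    return best_combo, best_score

# hypothetical assessment table standing in for real training runs
assessments = {
    ("softmax", "cross_entropy"): 0.92,
    ("softmax", "hinge"): 0.85,
    ("sigmoid", "cross_entropy"): 0.88,
    ("sigmoid", "hinge"): 0.80,
}
best, score = pick_layer_combo(
    ["softmax", "sigmoid"], ["cross_entropy", "hinge"],
    lambda act, loss: assessments[(act, loss)])
```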

AUTOMATIC LABELING OF OBJECTS IN SENSOR DATA

Aspects of the disclosure provide for automatically generating labels for sensor data. For instance, first sensor data for a first vehicle is identified. The first sensor data is defined in both a global coordinate system and a local coordinate system for the first vehicle. A second vehicle is identified based on a second location of the second vehicle being within a threshold distance of the first vehicle during a first timeframe. The second vehicle is associated with second sensor data that is further associated with a label identifying a location of an object, and the location of the object is defined in a local coordinate system of the second vehicle. A conversion from the local coordinate system of the second vehicle to the local coordinate system of the first vehicle may be determined and used to transfer the label from the second sensor data to the first sensor data.
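The conversion between local frames composes two pose transforms through the global frame. A minimal 2-D sketch (real systems are 3-D; the pose representation as a translation plus heading is an assumption):

```python
import numpy as np

def rot(theta):
    """2-D rotation matrix for heading angle theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def transfer_label(p_local2, pose1, pose2):
    """Map an object location labeled in vehicle 2's local frame into
    vehicle 1's local frame by composing local2 -> global -> local1.
    Each pose is (translation, heading) defining local -> global."""
    t1, th1 = pose1
    t2, th2 = pose2
    p_global = rot(th2) @ p_local2 + t2          # local2 -> global
    return rot(th1).T @ (p_global - t1)          # global -> local1

# object at (1, 0) in vehicle 2's frame; vehicle 2 sits at (10, 0)
# heading 90 degrees, vehicle 1 at the origin heading 0
p1 = transfer_label(np.array([1.0, 0.0]),
                    (np.array([0.0, 0.0]), 0.0),
                    (np.array([10.0, 0.0]), np.pi / 2))
```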

DEPTH DATA MODEL TRAINING WITH UPSAMPLING, LOSSES, AND LOSS BALANCING

Techniques for training a machine learned (ML) model to determine depth data based on image data are discussed herein. Training can use stereo image data and depth data (e.g., lidar data). A first (e.g., left) image can be input to a ML model, which can output predicted disparity and/or depth data. The predicted disparity data can be used with second image data (e.g., a right image) to reconstruct the first image. Differences between the first and reconstructed images can be used to determine a loss. Losses may include pixel, smoothing, structural similarity, and/or consistency losses. Further, differences between the depth data and the predicted depth data and/or differences between the predicted disparity data and the predicted depth data can be determined, and the ML model can be trained based on the various losses. Thus, the techniques can use self-supervised training and supervised training to train a ML model.
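The loss construction above can be sketched on a single image row. Nearest-neighbour warping and the disparity sign convention are simplifying assumptions, and the fixed balancing weights are purely illustrative (the abstract discusses loss balancing without fixing weights):

```python
import numpy as np

def reconstruct_left(right_row, disparity):
    """Warp a 1-D image row: left[x] is sampled from right[x - d(x)].
    Nearest-neighbour sampling for brevity; real systems interpolate."""
    x = np.arange(len(right_row))
    src = np.clip(np.round(x - disparity).astype(int), 0, len(right_row) - 1)
    return right_row[src]

def total_loss(left_row, right_row, disparity, pred_depth, lidar_depth):
    recon = reconstruct_left(right_row, disparity)
    pixel_loss = np.abs(left_row - recon).mean()           # self-supervised
    smooth_loss = np.abs(np.diff(disparity)).mean()        # smoothness
    depth_loss = np.abs(pred_depth - lidar_depth).mean()   # supervised (lidar)
    # fixed balancing weights, purely illustrative
    return pixel_loss + 0.1 * smooth_loss + depth_loss
```

With a constant unit disparity the right row shifted by one pixel reconstructs the left row exactly, so all three terms vanish.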

Simulation Architecture for On-Vehicle Testing and Validation

In one embodiment, a computing system of a vehicle generates perception data based on sensor data captured by one or more sensors of the vehicle. The perception data includes one or more representations of physical objects in an environment associated with the vehicle. The computing system further determines simulated perception data that includes one or more representations of virtual objects within the environment and generates modified perception data based on the perception data and the simulated perception data. The modified perception data includes at least one of the one or more representations of physical objects and the one or more representations of virtual objects. The computing system further determines a path of travel for the vehicle based on the modified perception data, which includes the one or more representations of the virtual objects.
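The merge-then-plan flow can be sketched with toy data structures; all field names and the stop/continue planner are hypothetical:

```python
def merge_perception(physical, virtual):
    """Build modified perception data: representations of physical
    objects plus representations of virtual objects, each tagged with
    its origin (field names are hypothetical)."""
    merged = [dict(obj, source="sensor") for obj in physical]
    merged += [dict(obj, source="simulation") for obj in virtual]
    return merged

def plan_path(modified_perception):
    """Toy planner: stop when any object, physical or virtual, is
    within 5 m. Virtual objects influence the path exactly like real
    ones, which is what enables on-vehicle testing."""
    if any(obj["distance_m"] < 5.0 for obj in modified_perception):
        return "stop"
    return "continue"
```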

Systems and methods for contrastive learning of visual representations

Systems, methods, and computer program products for performing semi-supervised contrastive learning of visual representations are provided. For example, the present disclosure provides systems and methods that leverage particular data augmentation schemes and a learnable nonlinear transformation between the representation and the contrastive loss to provide improved visual representations. Further, the present disclosure also provides improvements for semi-supervised contrastive learning. For example, a computer-implemented method may include performing semi-supervised contrastive learning based on a set of one or more unlabeled training data, generating an image classification model based on a portion of a plurality of layers in a projection head neural network used in performing the contrastive learning, performing fine-tuning of the image classification model based on a set of one or more labeled training data, and after performing the fine-tuning, distilling the image classification model to a student model comprising a smaller number of parameters than the image classification model.
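The contrastive objective this family of methods builds on can be sketched as a normalized-temperature cross-entropy (NT-Xent) loss over paired augmented views; the temperature value is an assumption:

```python
import numpy as np

def nt_xent(z1, z2, temperature=0.5):
    """NT-Xent contrastive loss. z1[i] and z2[i] are projection-head
    outputs for two augmentations of the same image; every other sample
    in the batch serves as a negative."""
    z = np.concatenate([z1, z2]).astype(float)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # unit-normalize
    sim = z @ z.T / temperature
    np.fill_diagonal(sim, -np.inf)                     # exclude self-similarity
    n = len(z1)
    # each view's positive is its paired augmentation
    targets = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    log_p = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -log_p[np.arange(2 * n), targets].mean()

loss = nt_xent(np.array([[1.0, 0.0], [0.0, 1.0]]),
               np.array([[1.0, 0.0], [0.0, 1.0]]))
```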

Transforming source domain images into target domain images

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing images using an image processing neural network system. One of the systems includes a domain transformation neural network implemented by one or more computers, wherein the domain transformation neural network is configured to: receive an input image from a source domain; and process a network input comprising the input image from the source domain to generate a transformed image that is a transformation of the input image from the source domain to a target domain that is different from the source domain.

ANNOTATION METHOD AND DEVICE, AND STORAGE MEDIUM

An annotation method and device and a storage medium are provided. The annotation method includes operations as follows. A first probability value that a first sample image is annotated with an Nth tag when the first sample image is annotated with an Mth tag is determined based on first tag information of a first image set. M and N are unequal and are positive integers. The first probability value is added to second tag information of a second sample image annotated with the Mth tag in a second image set.
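The first probability value is a conditional co-occurrence probability estimated from the first image set. A minimal sketch, assuming a binary image-by-tag matrix for the first set and a hypothetical dict structure for the second set:

```python
import numpy as np

def conditional_tag_prob(tag_matrix, m, n):
    """P(tag n | tag m) estimated from a binary image-by-tag matrix of
    the first (annotated) image set."""
    has_m = tag_matrix[:, m].astype(bool)
    return tag_matrix[has_m, n].mean()

def propagate(tag_matrix, second_set, m, n):
    """Add the first probability value as soft tag information to each
    second-set image annotated with the m-th tag."""
    p = conditional_tag_prob(tag_matrix, m, n)
    for img in second_set:
        if m in img["tags"]:
            img["soft_tags"][n] = p
    return second_set

first_tags = np.array([[1, 1], [1, 0], [0, 1], [1, 1]])  # 4 images, 2 tags
second = [{"tags": {0}, "soft_tags": {}}, {"tags": {1}, "soft_tags": {}}]
second = propagate(first_tags, second, m=0, n=1)
```

Here three first-set images carry tag 0 and two of them also carry tag 1, so the propagated probability is 2/3.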

SYSTEMS AND METHODS FOR GENERATING ANNOTATIONS OF STRUCTURED, STATIC OBJECTS IN AERIAL IMAGERY USING GEOMETRIC TRANSFER LEARNING AND PROBABILISTIC LOCALIZATION
20210133997 · 2021-05-06

In some embodiments, aerial images of a geographic area are captured by an autonomous vehicle. In some embodiments, the locations of structures within a subset of the aerial images are manually annotated, and geographical locations of the manual annotations are determined based on pose information of the camera. In some embodiments, a machine learning model is trained using the manually annotated aerial images. The machine learning model is used to automatically generate annotations of other images of the geographic area, and the geographical locations determined from the manual annotations are used to determine an accuracy probability of the automatic annotations. The automatic annotations determined to be accurate may be used to re-train the machine learning model to increase its precision and recall.
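The accuracy check can be sketched as a distance test between an automatic annotation's geographic location and the nearest manually derived location. The Gaussian score, the sigma value, and the keep threshold below are all assumptions, since the abstract leaves the probability unspecified:

```python
import math

def accuracy_probability(auto_loc, manual_locs, sigma=2.0):
    """Score an automatic annotation by its distance to the nearest
    geographic location derived from the manual annotations, with a
    Gaussian falloff (sigma in metres, assumed)."""
    d = min(math.dist(auto_loc, m) for m in manual_locs)
    return math.exp(-d ** 2 / (2 * sigma ** 2))

def keep_for_retraining(auto_locs, manual_locs, threshold=0.5):
    """Keep only the automatic annotations deemed accurate enough to
    feed back into re-training the model."""
    return [a for a in auto_locs
            if accuracy_probability(a, manual_locs) >= threshold]

kept = keep_for_retraining([(0.0, 1.0), (5.0, 5.0)],
                           [(0.0, 0.0), (10.0, 0.0)])
```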