G06V10/7753

MODEL TRAINING METHOD AND APPARATUS FOR IMAGE RECOGNITION, NETWORK DEVICE, AND STORAGE MEDIUM
20210042580 · 2021-02-11 ·

A model training method and apparatus for image recognition, and a non-transitory storage medium are provided. The model training method includes: obtaining a multi-label image training set including a plurality of training images each annotated with a plurality of sample labels; selecting target training images from the multi-label image training set for training a current model; performing label prediction on each target training image using the current model, to obtain a plurality of predicted labels of each target training image; obtaining a cross-entropy loss function corresponding to the plurality of sample labels of each target training image, a positive label loss being greater than a negative label loss and having a weight greater than 1; and converging the predicted labels and the sample labels of each target training image according to the cross-entropy loss function, and updating parameters of the current model, to obtain a trained model.
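The weighted loss described above can be illustrated with a minimal sketch. The abstract does not give the exact formula, so this assumes a per-label binary cross-entropy over sigmoid probabilities in which only the positive-label term is multiplied by a weight greater than 1. The function name `weighted_multilabel_bce` and the parameter `pos_weight` are hypothetical.

```python
import math


def weighted_multilabel_bce(probs, labels, pos_weight=2.0, eps=1e-12):
    """Mean binary cross-entropy over labels, with positive terms up-weighted.

    probs  : predicted probabilities, one per label (assumed sigmoid outputs)
    labels : 0/1 sample labels, one per label
    pos_weight : hypothetical weight on the positive-label term; must be > 1
    """
    if pos_weight <= 1.0:
        raise ValueError("pos_weight is expected to be greater than 1")
    total = 0.0
    for p, y in zip(probs, labels):
        # Clamp to avoid log(0).
        p = min(max(p, eps), 1.0 - eps)
        positive_term = pos_weight * y * math.log(p)
        negative_term = (1 - y) * math.log(1.0 - p)
        total += -(positive_term + negative_term)
    return total / len(probs)


# One positive label and one negative label, both predicted at 0.5.
loss = weighted_multilabel_bce([0.5, 0.5], [1, 0], pos_weight=2.0)
```

With this weighting, missing a true label costs more than predicting a label that is absent, which is one way to counter the scarcity of positive labels in multi-label data.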

CREATIVE GAN GENERATING MUSIC DEVIATING FROM STYLE NORMS

A method and system for generating music uses artificial intelligence to analyze existing musical compositions and then creates a musical composition that deviates from the learned styles. Known musical compositions created by humans are presented in digitized form along with a style designator to a computer for analysis, including recognition of musical elements and association of particular styles. A music generator generates a draft musical composition for similar analysis by the computer. The computer ranks such draft musical composition for correlation with known musical elements and known styles. The music generator modifies the draft musical composition using an iterative process until the resulting musical composition is recognizable as music but is distinctive in style.

METHODS AND SYSTEMS FOR IDENTIFYING INTERNAL CONDITIONS IN JUVENILE FISH THROUGH NON-INVASIVE MEANS
20210056689 · 2021-02-25 ·

Methods and systems are disclosed for improvements in aquaculture that allow for increasing the number and harvesting efficiency of fish in an aquaculture setting by identifying and predicting internal conditions of the juvenile fish based on external characteristics that are imaged through non-invasive means.

Medical image classification based on a generative adversarial network trained discriminator

Mechanisms are provided to implement a generative adversarial network (GAN). A discriminator of the GAN is configured to discriminate input medical images into a plurality of classes including a first class indicating a medical image representing a normal medical condition, a second class indicating an abnormal medical condition, and a third class indicating a generated medical image. A generator of the GAN generates medical images, and a training medical image set that includes labeled medical images, unlabeled medical images, and generated medical images is input to the discriminator. The discriminator is trained to classify training medical images in the training medical image set into corresponding ones of the first, second, and third classes. The trained discriminator is applied to a new medical image to classify the new medical image into a corresponding one of the first class or second class. The new medical image is either labeled or unlabeled.

Domain adaptation for instance detection and segmentation

Systems and methods for domain adaptation are provided. The system aligns image-level features between a source domain and a target domain based on an adversarial learning process while training a domain discriminator. The system selects, using the domain discriminator, unlabeled samples from the target domain that are far away from existing annotated samples from the target domain. The system selects, based on a prediction score of each of the unlabeled samples, samples with lower prediction scores. The system annotates the samples with the lower prediction scores.
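The two-stage selection above can be sketched as follows. The abstract does not say how the domain discriminator measures "far away", so this assumes each unlabeled sample already carries a scalar `distance` derived from the discriminator and a scalar prediction `score`. The function name, dictionary keys, and both cut-off counts are hypothetical.

```python
def select_for_annotation(samples, num_far, num_to_annotate):
    """Pick unlabeled target samples to send for annotation.

    samples : list of dicts with keys "id", "distance", "score"
              (hypothetical layout; "distance" is assumed to come from
              the domain discriminator, "score" from the detector)
    """
    # Stage 1: keep the samples farthest from the annotated target samples.
    far = sorted(samples, key=lambda s: s["distance"], reverse=True)[:num_far]
    # Stage 2: among those, keep the ones with the lowest prediction scores.
    uncertain = sorted(far, key=lambda s: s["score"])[:num_to_annotate]
    return [s["id"] for s in uncertain]


samples = [
    {"id": "a", "distance": 0.9, "score": 0.8},
    {"id": "b", "distance": 0.8, "score": 0.2},
    {"id": "c", "distance": 0.1, "score": 0.1},
    {"id": "d", "distance": 0.7, "score": 0.5},
]
chosen = select_for_annotation(samples, num_far=3, num_to_annotate=2)
```

Sample `c` has the lowest score but is dropped in the first stage because it lies close to data that is already annotated.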

SHUFFLE, ATTEND, AND ADAPT: VIDEO DOMAIN ADAPTATION BY CLIP ORDER PREDICTION AND CLIP ATTENTION ALIGNMENT
20210064883 · 2021-03-04 ·

A method for performing video domain adaptation for human action recognition is presented. The method includes using annotated source data from a source video and unannotated target data from a target video in an unsupervised domain adaptation setting, identifying and aligning discriminative clips in the source and target videos via an attention mechanism, and learning spatial-background invariant human action representations by employing a self-supervised clip order prediction loss for both the annotated source data and the unannotated target data.
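A self-supervised clip order prediction task needs no action labels, which is why it can be applied to both the source and the target video. The sketch below assumes one common formulation, in which the clips are shuffled and the training label is the index of the permutation used. The abstract does not specify this detail, and the function name is hypothetical.

```python
import itertools
import random


def make_clip_order_sample(clips, rng):
    """Shuffle clips and return (shuffled_clips, permutation_index).

    The permutation index serves as the classification target, so the
    label comes from the data itself rather than from annotation.
    """
    perms = list(itertools.permutations(range(len(clips))))
    label = rng.randrange(len(perms))
    shuffled = [clips[i] for i in perms[label]]
    return shuffled, label


rng = random.Random(0)
clips = ["clip_0", "clip_1", "clip_2"]
shuffled, label = make_clip_order_sample(clips, rng)
```

For three clips there are six possible orders, so the order predictor in this sketch would be a six-way classifier.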

Unsupervised Learning-Based Reference Selection for Enhanced Defect Inspection Sensitivity
20210090229 · 2021-03-25 ·

An optical characterization system and a method of using the same are disclosed. The system comprises a controller configured to be communicatively coupled with one or more detectors configured to receive illumination from a sample and generate image data. One or more processors may be configured to receive images of dies on the sample, calculate dissimilarity values for all combinations of the images, perform a cluster analysis to partition the combinations of the images into two or more clusters, generate a reference image for a cluster of the two or more clusters using two or more of the combinations of the images in the cluster, and detect one or more defects on the sample by comparing a test image in the cluster to the reference image for the cluster.
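The pipeline above can be sketched end to end on toy data. Several choices here are assumptions not stated in the abstract: dissimilarity is the mean absolute pixel difference, clustering is single-linkage with a fixed threshold, the reference is a per-pixel median, and the sketch groups the images themselves rather than the combinations of images that the abstract clusters. All function names and thresholds are hypothetical.

```python
from itertools import combinations
from statistics import median


def dissimilarity(a, b):
    """Mean absolute difference between two flattened images."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cluster_images(images, link_threshold):
    """Group image indices whose pairwise dissimilarity is small (union-find)."""
    parent = list(range(len(images)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(images)), 2):
        if dissimilarity(images[i], images[j]) <= link_threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(images)):
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())


def reference_image(images, members):
    """Per-pixel median over the images in one cluster."""
    return [median(images[m][k] for m in members) for k in range(len(images[0]))]


def defect_pixels(test, reference, defect_threshold):
    """Indices of pixels that deviate strongly from the cluster reference."""
    return [k for k, (t, r) in enumerate(zip(test, reference))
            if abs(t - r) > defect_threshold]


images = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 50, 10, 10],   # same die type, one anomalous pixel
    [90, 90, 90, 90],
    [90, 90, 90, 91],
]
clusters = cluster_images(images, link_threshold=12)
ref = reference_image(images, [0, 1, 2])
defects = defect_pixels(images[2], ref, defect_threshold=20)
```

Comparing each test image only against a reference built from similar dies keeps systematic die-to-die variation out of the difference image, which is the stated route to higher defect sensitivity.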

IMAGE PROCESSING METHOD, IMAGE PROCESSING DEVICE, AND STORAGE MEDIUM
20210089824 · 2021-03-25 ·

The present disclosure provides an image processing method and a related device. The method includes: acquiring an image to be processed; and performing a feature extraction process on the image to be processed using a target neural network so as to obtain target feature data of the image to be processed, wherein parameters of the target neural network are time average values of parameters of a first neural network which is obtained from training under supervision by a training image set and an average network, and parameters of the average network are time average values of parameters of a second neural network which is obtained from training under supervision by the training image set and the target neural network. A corresponding device is also disclosed. Feature data of the image to be processed are obtained via the feature extraction process performed on the image to be processed.
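The abstract does not say how the "time average values" are computed. One common choice is an exponential moving average, sketched below under that assumption. The function name `ema_update`, the `decay` parameter, and the dictionary-of-floats parameter layout are hypothetical.

```python
def ema_update(average, current, decay=0.999):
    """Return the updated time average of a network's parameters.

    average : dict mapping parameter name -> running average value
    current : dict mapping parameter name -> value after the latest step
    decay   : hypothetical smoothing factor in [0, 1)
    """
    if not 0.0 <= decay < 1.0:
        raise ValueError("decay must be in [0, 1)")
    return {
        name: decay * average[name] + (1.0 - decay) * current[name]
        for name in average
    }


# Target network tracks the first network; average network tracks the second.
target_params = ema_update({"w": 1.0}, {"w": 3.0}, decay=0.5)
average_params = ema_update({"w": 0.0}, {"w": 4.0}, decay=0.5)
```

In this arrangement each trained network is supervised by the averaged copy of the other one, and only the averaged target network is used for feature extraction at inference time.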

OCCUPANCY PREDICTION NEURAL NETWORKS
20210064890 · 2021-03-04 ·

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a future occupancy prediction for a region of an environment. In one aspect, a method comprises: receiving sensor data generated by a sensor system of a vehicle that characterizes an environment in a vicinity of the vehicle as of a current time point, wherein the sensor data comprises a plurality of sensor samples characterizing the environment that were each captured at different time points; processing a network input comprising the sensor data using a neural network to generate an occupancy prediction output for a region of the environment, wherein: the occupancy prediction output characterizes, for one or more future intervals of time after the current time point, a respective likelihood that the region of the environment will be occupied by an agent in the environment during the future interval of time.

Detecting Backdoor Attacks Using Exclusionary Reclassification

Embodiments relate to a system, program product, and method for processing an untrusted data set to automatically determine which data points therein are poisonous. A neural network is trained using potentially poisoned training data. Each of the training data points is classified using the network to retain the activations of at least one hidden layer, and those activations are segmented by the label of the corresponding training data. Clustering is applied to the retained activations of each segment, and a clustering assessment is conducted to remove an identified cluster from the data set, form a new training set, and train a second neural model with the new training set. The removed cluster and corresponding data are applied to the trained second neural model to analyze and classify data in the removed cluster as either legitimate or poisonous.
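Two steps of this procedure are simple enough to sketch: segmenting retained activations by label, and assessing a removed cluster with the second model. The abstract does not state the decision rule for the assessment, so the agreement test below is an assumption: if the second model, trained without the cluster, rarely reproduces the cluster's original labels, the cluster is treated as poisonous. Function names and the threshold are hypothetical.

```python
from collections import defaultdict


def segment_by_label(activations, labels):
    """Group retained hidden-layer activations by the label of each data point."""
    segments = defaultdict(list)
    for activation, label in zip(activations, labels):
        segments[label].append(activation)
    return dict(segments)


def assess_removed_cluster(original_labels, second_model_predictions,
                           agreement_threshold=0.5):
    """Classify a removed cluster as 'legitimate' or 'poisonous'.

    Assumed rule (not from the source): the cluster is legitimate when the
    second model agrees with the original labels often enough.
    """
    if not original_labels:
        raise ValueError("removed cluster is empty")
    agree = sum(1 for o, p in zip(original_labels, second_model_predictions)
                if o == p)
    rate = agree / len(original_labels)
    return "legitimate" if rate >= agreement_threshold else "poisonous"


segments = segment_by_label([[0.1], [0.2], [0.9]], ["cat", "dog", "cat"])
verdict = assess_removed_cluster(["cat", "cat", "cat"], ["dog", "dog", "cat"])
```

The intuition behind the assumed rule is that a model never exposed to a backdoor trigger has no reason to assign the attacker's label to the triggered samples.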