Patent classifications
G06V10/7753
Target domain characterization for data augmentation
Methods, systems, and processor-readable media for training data augmentation. A source domain and a target domain are provided, and thereafter an operation is performed to augment data in the source domain with transformations utilizing characteristics learned from the target domain. The augmented data is then used to improve image classification accuracy in a new domain.
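The abstract above describes augmenting source-domain data using characteristics learned from the target domain. One common characterization of a domain is its per-channel color statistics; the sketch below (an illustration, not the patent's actual method — `match_target_statistics` is a hypothetical name) shifts a source image's channel mean and standard deviation toward values learned from target-domain images.

```python
import numpy as np

def match_target_statistics(source, target_mean, target_std):
    """Shift a source image's per-channel statistics toward those
    learned from the target domain (one simple domain characterization)."""
    src_mean = source.mean(axis=(0, 1))
    src_std = source.std(axis=(0, 1)) + 1e-8
    normalized = (source - src_mean) / src_std   # zero mean, unit std
    return normalized * target_std + target_mean

# Toy (H, W, C) source image and target-domain channel statistics.
rng = np.random.default_rng(0)
source = rng.uniform(0.0, 1.0, size=(8, 8, 3))
augmented = match_target_statistics(
    source,
    target_mean=np.array([0.5, 0.4, 0.3]),
    target_std=np.array([0.1, 0.1, 0.1]))
print(np.allclose(augmented.mean(axis=(0, 1)), [0.5, 0.4, 0.3]))  # True
```

Augmented copies like this can be mixed into the source training set so a classifier sees source content rendered with target-domain appearance.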
Training a Classifier Model Relating to a Concept and Image Ratings Provided by a User
- Ariel Fuxman,
- Alexander Kenji Hata,
- Edward Benjamin Vendrow,
- Otilia Stretcu,
- Wenlei Zhou,
- Krishnamurthy Viswanathan,
- Aditya Avinash,
- Gabriel Berger,
- Andrew Ames Bunner,
- Javier Alejandro Rey,
- Wei Qiao,
- Yintao Liu,
- Guanzhong Wang,
- Thomas Nathan Denby,
- Mehmet Nejat Tek,
- Neil Gordon Alldrin,
- Enming Luo,
- Chun-Ta Lu
A computer-implemented method includes receiving an input from a user relating to a concept, automatically obtaining a first set of images from an unlabeled dataset of images based on the input, and obtaining a first rating from the user for each image in the first set of images. The method further includes training a classifier model relating to the concept based on the first set of images rated by the user, automatically obtaining a second set of images from the unlabeled dataset of images based on the classifier model trained on the first set of images, and obtaining a second rating from the user for each image in the second set of images. The classifier model relating to the concept is retrained based on the first and second sets of images rated by the user to obtain an updated classifier model.
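The flow above is an iterative rating loop: rate a batch, fit a model, let the model pick the next batch, refit on all ratings. As a minimal sketch (not the patent's implementation — the nearest-centroid "classifier" and `rating_loop` are assumptions standing in for a real model), this might look like:

```python
import numpy as np

def rating_loop(unlabeled, rate_fn, rounds=2, batch=4):
    """Sketch of the iterative flow: rate a batch of images, fit a
    simple concept model, use it to select the next batch, refit."""
    rated_idx, ratings = [], []
    order = list(range(len(unlabeled)))           # round 1: arbitrary batch
    centroid = None
    for _ in range(rounds):
        pick = [i for i in order if i not in rated_idx][:batch]
        rated_idx += pick
        ratings += [rate_fn(unlabeled[i]) for i in pick]
        pos_idx = [i for i, r in zip(rated_idx, ratings) if r == 1]
        centroid = unlabeled[pos_idx].mean(axis=0)  # "classifier" = centroid
        scores = -np.linalg.norm(unlabeled - centroid, axis=1)
        order = list(np.argsort(-scores))         # next round: model-chosen
    return centroid, rated_idx

# Toy "unlabeled dataset": 2-D features, with a concept cluster near (3, 3).
rng = np.random.default_rng(1)
data = rng.normal(size=(20, 2))
data[:5] += 3.0
model, seen = rating_loop(data, rate_fn=lambda x: int(x.sum() > 3))
print(len(seen))  # 8 images rated across the two rounds
```

In the patented method the ratings come from the user and the model is a trained classifier; the loop structure, however, is the same.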
REALISTIC NEURAL NETWORK BASED IMAGE STYLE TRANSFER
A mobile device can implement a neural network-based style transfer scheme to modify an image in a first style into a second style. The style transfer scheme can be configured to detect an object in the image, apply an effect to the image, and blend the image using color space adjustments and blending schemes to generate a realistic result image. The style transfer scheme can further be configured to execute efficiently on the resource-constrained mobile device by removing operational layers based on the resources available on the device.
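The blending step described above combines a stylized region back into the original image after a color-space adjustment. A minimal sketch (assuming a mean-color shift as the adjustment and alpha blending as the scheme — both illustrative choices, not taken from the patent):

```python
import numpy as np

def blend_with_color_adjust(original, stylized, mask, alpha=0.7):
    """Blend a stylized image into the original inside a detected object
    mask, after matching the stylized region's mean color to the
    original's (a simple color-space adjustment)."""
    region = mask.astype(bool)
    # Shift stylized colors so the masked region's mean matches the original.
    shift = original[region].mean(axis=0) - stylized[region].mean(axis=0)
    adjusted = np.clip(stylized + shift, 0.0, 1.0)
    out = original.copy()
    out[region] = alpha * adjusted[region] + (1 - alpha) * original[region]
    return out

original = np.full((4, 4, 3), 0.5)
stylized = np.full((4, 4, 3), 0.9)       # effect brightened the image
mask = np.zeros((4, 4)); mask[1:3, 1:3] = 1
result = blend_with_color_adjust(original, stylized, mask)
```

Pixels outside the object mask are left untouched, which is what keeps the result looking realistic rather than globally restyled.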
SYSTEM FOR LEARNING NEW VISUAL INSPECTION TASKS USING A FEW-SHOT META-LEARNING METHOD
Systems and methods described herein can involve: for a first input of a first plurality of labeled images of a new domain task, processing the first plurality of labeled images through a plurality of backbone snapshots, each backbone snapshot representative of a model trained across a plurality of other domain tasks and configured to output a first plurality of features responsive to the first input; processing a second input of a second plurality of unlabeled images through the plurality of backbone snapshots to output a second plurality of features responsive to the second input; and generating a representative model for the new domain task from the clustering and transformation of the first plurality of features and the associated clustered and transformed second plurality of features.
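One way to read the abstract above: features from several backbone snapshots are combined and reduced to a compact model of the new task. The sketch below is a simplification (prototype averaging stands in for the patent's clustering-and-transformation step, and the snapshot functions are toy linear maps, both assumptions):

```python
import numpy as np

def snapshot_prototypes(snapshots, labeled_imgs, labels):
    """Concatenate features from several backbone snapshots and form one
    prototype per class — a minimal stand-in for clustering/transformation."""
    feats = np.concatenate([f(labeled_imgs) for f in snapshots], axis=1)
    classes = sorted(set(labels))
    labels = np.array(labels)
    protos = np.stack([feats[labels == c].mean(axis=0) for c in classes])
    return protos, classes

def classify(snapshots, imgs, protos, classes):
    """Assign each image to the nearest class prototype."""
    feats = np.concatenate([f(imgs) for f in snapshots], axis=1)
    dists = np.linalg.norm(feats[:, None, :] - protos[None], axis=2)
    return [classes[i] for i in dists.argmin(axis=1)]

# Two toy "snapshots": random linear feature extractors.
rng = np.random.default_rng(2)
W1, W2 = rng.normal(size=(4, 3)), rng.normal(size=(4, 3))
snaps = [lambda x, W=W1: x @ W, lambda x, W=W2: x @ W]

# Few-shot labeled set for the new domain task: two trivial classes.
X = np.vstack([np.zeros((5, 4)), np.ones((5, 4))])
y = [0] * 5 + [1] * 5
protos, classes = snapshot_prototypes(snaps, X, y)
print(classify(snaps, np.ones((1, 4)), protos, classes))  # [1]
```

Using multiple snapshots gives the new task several views of each image, which is the core of the few-shot transfer idea.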
ACCELERATED DATA COLLECTION USING TRANSFORMER ENCODING LAYERS FOR DATA SEPARATION
Techniques are disclosed herein that are directed towards using satellite image data to narrow down the search space for statistically significant and/or meaningful ground truth data. Various implementations include techniques for labeling agricultural image data using unsupervised clustering and/or active learning techniques. Additional or alternative implementations include collecting more detailed crop information from ground locations that yield higher-quality ground truth (e.g., data especially representative of a particular crop).
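Unsupervised clustering can narrow the search space by grouping pixels or fields and proposing one representative per cluster for ground-truth collection. A small sketch under that reading (a tiny Lloyd's k-means, not the patent's actual pipeline; `cluster_representatives` is a hypothetical name):

```python
import numpy as np

def cluster_representatives(pixels, k=2, iters=10, seed=0):
    """Tiny Lloyd's k-means over pixel features; returns the index of the
    sample nearest each centroid, i.e. candidate locations at which to
    collect ground truth."""
    rng = np.random.default_rng(seed)
    centers = pixels[rng.choice(len(pixels), k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(pixels[:, None] - centers[None], axis=2)
        assign = dists.argmin(axis=1)
        new_centers = []
        for j in range(k):
            members = pixels[assign == j]
            new_centers.append(members.mean(axis=0) if len(members) else centers[j])
        centers = np.stack(new_centers)
    dists = np.linalg.norm(pixels[:, None] - centers[None], axis=2)
    return [int(dists[:, j].argmin()) for j in range(k)]

# Two well-separated toy "crop signatures" in satellite feature space.
rng = np.random.default_rng(3)
field_a = rng.normal(0.0, 0.1, size=(10, 2))
field_b = rng.normal(5.0, 0.1, size=(10, 2))
reps = cluster_representatives(np.vstack([field_a, field_b]))
print(sorted(r // 10 for r in reps))  # [0, 1]: one representative per field
```

Active learning would then prioritize labeling these representatives over randomly chosen locations.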
Optical and other sensory processing of complex objects
Systems and methods for optical and other sensory analysis of nutritional and other complex objects are disclosed. For example, techniques may include capturing an RGB-D image of food using an integrated camera; inputting the RGB-D image into an instance detection network configured to detect food items; segmenting a plurality of food items from the RGB-D image into a plurality of masks, the plurality of masks representing individual food items; classifying a particular food item among the individual food items using a multimodal large language model; estimating a volume of the particular food item by overlaying an RGB image associated with the RGB-D image with a depth map to create a point cloud; and estimating the calories of the particular food item using the estimated volume and a nutritional database.
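The volume-estimation step combines a segmentation mask with the depth map. A simplified sketch (integrating heights above a table plane rather than building a full point cloud — an assumption for brevity, as are the `mask_volume` name and the toy geometry):

```python
import numpy as np

def mask_volume(depth, mask, table_depth, pixel_area):
    """Approximate food volume by integrating the height of the food
    surface above the table plane over the masked pixels — a simple
    stand-in for the point-cloud computation."""
    heights = np.clip(table_depth - depth, 0.0, None)  # camera looks down
    return float((heights * mask).sum() * pixel_area)

depth = np.full((4, 4), 1.0)   # table plane 1 m from the camera
depth[1:3, 1:3] = 0.9          # a 10 cm tall item under a 2x2-pixel mask
mask = np.zeros((4, 4)); mask[1:3, 1:3] = 1
# Each pixel covers 1 cm x 1 cm of the table.
v = mask_volume(depth, mask, table_depth=1.0, pixel_area=0.01 ** 2)
print(round(v * 1e6, 1))  # 40.0 cm^3
```

With a volume in hand, the calorie estimate reduces to a lookup of the classified item's caloric density in a nutritional database and a multiplication.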
Method for Updating Artificial Intelligence Model Data for Smart Vending Machines
The present invention discloses a method for updating artificial intelligence model data for smart vending machines, the method including: acquiring an actual purchase video that includes an untrained new product, as determined by a product recognition algorithm or manual review, the actual purchase video being annotated with target product SKU information; extracting new product sub-images from the actual purchase video; storing the product sub-images in a database of new product sub-images to be processed; initiating verification of all the product sub-images in the database when their number exceeds a given threshold; obtaining a new product replacement image by screening the product sub-images in the database and verifying the recognition accuracy of the screened replacement image; and, after the verification passes, introducing the screened replacement image into the original product sample image gallery to replace the sample image of the untrained product in the product recognition algorithm.
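The workflow above is essentially a threshold-triggered buffer: sub-images accumulate per SKU, and once the count passes a threshold they are screened and the survivors replace the gallery entry. A minimal sketch (class and attribute names are illustrative, not from the patent; the sharpness screen stands in for the real verification step):

```python
class NewProductBuffer:
    """Accumulate sub-images per SKU; when a SKU's pending count reaches
    the threshold, screen them and promote survivors to the gallery."""

    def __init__(self, threshold, screen_fn):
        self.threshold = threshold
        self.screen_fn = screen_fn      # stands in for accuracy verification
        self.pending = {}               # SKU -> sub-images awaiting processing
        self.gallery = {}               # SKU -> accepted replacement images

    def add(self, sku, sub_image):
        """Store a sub-image; trigger screening once the threshold is hit.
        Returns True if the SKU now has gallery images."""
        self.pending.setdefault(sku, []).append(sub_image)
        if len(self.pending[sku]) >= self.threshold:
            kept = [s for s in self.pending.pop(sku) if self.screen_fn(s)]
            if kept:                    # verification passed
                self.gallery[sku] = kept
        return sku in self.gallery

buf = NewProductBuffer(threshold=3, screen_fn=lambda s: s["sharpness"] > 0.5)
done = False
for sharp in (0.2, 0.9, 0.8):
    done = buf.add("SKU-001", {"sharpness": sharp})
print(done, len(buf.gallery["SKU-001"]))  # True 2
```

Batching the verification behind a threshold keeps the expensive screening step off the per-purchase path.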
Model training method and apparatus for image recognition, network device, and storage medium
A model training method and apparatus for image recognition, and a non-transitory storage medium, are provided. The model training method includes: obtaining a multi-label image training set including a plurality of training images, each annotated with a plurality of sample labels; selecting target training images from the multi-label image training set for training a current model; performing label prediction on each target training image using the current model to obtain a plurality of predicted labels for each target training image; obtaining a cross-entropy loss function corresponding to the plurality of sample labels of each target training image, in which a positive label loss is greater than a negative label loss and carries a weight greater than 1; and converging the predicted labels toward the sample labels of each target training image according to the cross-entropy loss function and updating parameters of the current model to obtain a trained model.
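The key detail is the asymmetric cross-entropy: the positive-label term carries a weight greater than 1, so missed positives cost more than false positives. A sketch of such a loss (the function name and weight value are illustrative, not taken from the patent):

```python
import numpy as np

def weighted_multilabel_bce(pred, target, pos_weight=2.0):
    """Per-label binary cross-entropy in which the positive-label term
    carries a weight > 1, so a missed positive label is penalized more
    heavily than a false positive."""
    eps = 1e-12
    pos = -pos_weight * target * np.log(pred + eps)
    neg = -(1.0 - target) * np.log(1.0 - pred + eps)
    return float((pos + neg).mean())

target = np.array([1.0, 0.0, 1.0])          # sample labels for one image
miss_pos = weighted_multilabel_bce(np.array([0.1, 0.1, 0.9]), target)
miss_neg = weighted_multilabel_bce(np.array([0.9, 0.9, 0.9]), target)
print(miss_pos > miss_neg)  # True: missing a positive costs more
```

This kind of weighting addresses the label imbalance typical of multi-label sets, where most labels are negative for any given image.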
Data object classification using an optimized neural network
A system includes a computing platform having a hardware processor and a memory storing a software code and a neural network (NN) having multiple layers including a last activation layer and a loss layer. The hardware processor executes the software code to identify different combinations of layers for testing the NN, each combination including candidate function(s) for the last activation layer and candidate function(s) for the loss layer. For each different combination, the software code configures the NN based on the combination, inputs, into the configured NN, a training dataset including multiple data objects, receives, from the configured NN, a classification of the data objects, and generates a performance assessment for the combination based on the classification. The software code determines a preferred combination of layers for the NN including selected candidate functions for the last activation layer and the loss layer, based on a comparison of the performance assessments.
Object detection device, learning method, and recording medium
In an object detection device, a plurality of object detection units output a score indicating the probability that a predetermined object exists for each partial region set in the inputted image data. A weight computation unit computes weights for merging the scores outputted by the plurality of object detection units, using weight calculation parameters, based on the image data. A merging unit merges the scores outputted by the plurality of object detection units, for each partial region, with the weights computed by the weight computation unit. A target model object detection unit is configured to output a score indicating the probability that the predetermined object exists, for each partial region set in the image data. A first loss computation unit computes a first loss indicating the difference between the score of the target model object detection unit and both the ground truth label of the image data and the score merged by the merging unit. A first parameter correction unit corrects parameters of the target model object detection unit to reduce the first loss.
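This is a weighted-ensemble distillation setup: merged teacher scores plus ground truth both supervise the target model. A numeric sketch of the merging and first-loss computation (function names, squared-error distance, and the `alpha` mixing factor are illustrative assumptions, not the patent's exact formulation):

```python
import numpy as np

def merge_scores(scores, weights):
    """Weighted merge of per-region scores from several detection units."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return np.tensordot(w, np.asarray(scores, dtype=float), axes=1)

def first_loss(target_scores, gt_labels, merged, alpha=0.5):
    """Distillation-style loss: distance of the target model's scores to
    the ground truth labels plus distance to the merged teacher scores."""
    t = np.asarray(target_scores, dtype=float)
    return float(alpha * ((t - gt_labels) ** 2).mean()
                 + (1.0 - alpha) * ((t - merged) ** 2).mean())

# Two detection units score three partial regions; weights favor unit 0.
scores = [[0.9, 0.2, 0.8],
          [0.7, 0.4, 0.6]]
merged = merge_scores(scores, weights=[0.75, 0.25])
print(np.round(merged, 3).tolist())  # [0.85, 0.25, 0.75]

gt = np.array([1.0, 0.0, 1.0])
loss = first_loss([0.8, 0.3, 0.7], gt, merged)
```

The first parameter correction unit would then step the target model's parameters down the gradient of this loss, so a single compact detector learns to reproduce the weighted ensemble.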