Patent classifications
G06V10/7753
Unsupervised representation learning with contrastive prototypes
The system and method are directed to prototypical contrastive learning (PCL). PCL explicitly encodes the hierarchical semantic structure of the dataset into the learned embedding space and prevents the network from exploiting low-level cues for solving the unsupervised learning task. PCL introduces prototypes as latent variables to help find the maximum-likelihood estimate of the network parameters in an expectation-maximization framework. PCL iteratively performs an E-step, which finds prototypes by clustering, and an M-step, which optimizes the network on a contrastive loss.
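The alternation between the E-step and the M-step described above can be illustrated with a minimal sketch. It assumes L2-normalized embeddings held as plain Python lists, uses a toy k-means for the E-step, and evaluates a softmax cross-entropy over prototype similarities as the contrastive loss. The function names, the `temperature` parameter, and the exact form of the loss are assumptions for illustration and are not taken from the patent; a real M-step would also update the network by gradient descent, which is omitted here.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    norm = math.sqrt(dot(v, v)) or 1.0
    return [x / norm for x in v]

def e_step(embeddings, prototypes, iters=10):
    """Toy k-means (assumed clustering method): returns (prototypes, assignment)."""
    prototypes = [list(p) for p in prototypes]  # work on a copy
    assign = [0] * len(embeddings)
    for _ in range(iters):
        # Assign each embedding to its most similar prototype.
        assign = [max(range(len(prototypes)), key=lambda k: dot(e, prototypes[k]))
                  for e in embeddings]
        # Recompute each prototype as the normalized mean of its members.
        for k in range(len(prototypes)):
            members = [e for e, a in zip(embeddings, assign) if a == k]
            if members:
                mean = [sum(col) / len(members) for col in zip(*members)]
                prototypes[k] = normalize(mean)
    return prototypes, assign

def m_step_loss(embeddings, prototypes, assign, temperature=0.1):
    """Mean cross-entropy of each embedding against its assigned prototype."""
    total = 0.0
    for e, a in zip(embeddings, assign):
        logits = [dot(e, p) / temperature for p in prototypes]
        m = max(logits)
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_z - logits[a]
    return total / len(embeddings)
```

In a training loop, `e_step` would be rerun on fresh embeddings every epoch, and the value returned by `m_step_loss` would be minimized with respect to the encoder parameters.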
METHOD FOR TRAINING A NEURAL NETWORK TO DELIVER THE VIEWPOINTS OF OBJECTS USING UNLABELED PAIRS OF IMAGES, AND THE CORRESPONDING SYSTEM
A system and a method for training a neural network to deliver the viewpoint of objects. The method comprises minimizing distances between: for each training image of a first set of training images, the output of the neural network and the viewpoint of this training image; and, for each pair of a second set of training image pairs, the second image of the pair and the output of a decoder neural network obtained when the first image of the pair is inputted to an encoder neural network, the second image of the pair is inputted to the neural network to obtain a viewpoint, the obtained encoded image is rotated according to the viewpoint, and the rotated encoded image is decoded.
METHODS AND SYSTEMS FOR MONITORING OBJECTS FOR LABELLING
A graphical user interface (GUI) for forming hierarchically arranged clusters of items and operating thereupon through an electronic device equipped with an input-device and a display-screen is provided. The GUI comprises a first area configured to display a graphical-tree representation having a plurality of hierarchical levels, each of said levels corresponding to at least one cluster of content-items formed by execution of a machine-learning classifier over a plurality of input content items. A second area is configured to display a dataset corresponding to the content-items classified within the clusters. A third area is configured to display a plurality of types of content representations with respect to each selected cluster, said representations corresponding to content-items classified within the cluster.
UNSUPERVISED DATA AUGMENTATION FOR MULTIMEDIA DETECTORS
Training data associated with detection of objects within a content asset may be generated in an automated manner. A content asset, such as video content, may be associated with metadata. A relevance score indicating a likelihood of the content asset comprising at least one object may be determined based on the metadata. A portion of the content asset may be identified as containing an instance of the object. The identified portion of the content asset may be a false identification if the relevance score for the content asset fails to satisfy a threshold value, or a positive identification if it satisfies the threshold value. The results may be used as training data for a multimedia detector that is based on a machine learning model; e.g., a portion that is a false identification may be used as negative training data.
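The thresholding rule in this abstract can be sketched as follows. The keyword-overlap score in `relevance_from_metadata` is a hypothetical stand-in, since the abstract does not say how the relevance score is computed from metadata; the function names and the default threshold of 0.5 are likewise assumptions.

```python
def relevance_from_metadata(metadata_text, keywords):
    """Hypothetical score: fraction of object keywords found in the metadata."""
    words = set(metadata_text.lower().split())
    if not keywords:
        return 0.0
    return sum(1 for k in keywords if k.lower() in words) / len(keywords)

def label_detections(detected_portions, relevance_score, threshold=0.5):
    """Label every detected portion of one asset using the asset-level score.

    Detections in an asset whose score fails the threshold are treated as
    false identifications and become negative training examples.
    """
    label = "positive" if relevance_score >= threshold else "negative"
    return [(portion, label) for portion in detected_portions]
```

For example, detections of a horse in a cooking show whose metadata never mentions horses would all be labeled `"negative"` under this sketch.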
RELATIONSHIP MODELING AND KEY FEATURE DETECTION BASED ON VIDEO DATA
A method includes acquiring digital video data that portrays an interacting event, extracting image data, audio data, and semantic text data from the video data, analyzing the extracted data to identify a plurality of video features, and analyzing the plurality of video features to create a relationship graph. The interacting event comprises a plurality of interactions between a plurality of individuals, and the relationship graph comprises a plurality of nodes and a plurality of edges. Each node of the plurality of nodes represents an individual of the plurality of individuals, each edge of the plurality of edges extends between two nodes of the plurality of nodes, and the plurality of edges represents the plurality of interactions. The method further comprises determining whether a first key feature is present in the relationship graph, wherein presence of the first key feature is predictive of a positive outcome of the interacting event.
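The relationship graph described here might be represented as a node set plus a counter of undirected edges, as in the sketch below. It assumes the interactions have already been extracted from the video as pairs of names. The key feature checked by `everyone_participates` (every listed individual takes part in at least one interaction) is purely hypothetical; the abstract does not say which features are predictive.

```python
from collections import Counter

def build_relationship_graph(interactions):
    """Build (nodes, edges) from an iterable of (person_a, person_b) pairs."""
    nodes = set()
    edges = Counter()
    for a, b in interactions:
        nodes.update((a, b))
        # Undirected edge, weighted by the number of interactions.
        edges[frozenset((a, b))] += 1
    return nodes, edges

def everyone_participates(nodes, individuals):
    """Hypothetical key feature: every individual appears in the graph."""
    return all(person in nodes for person in individuals)
```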
RANDOM SAMPLING CONSENSUS FEDERATED SEMI-SUPERVISED LEARNING
A method and systems for random sampling consensus federated (RSCFed) learning in non-IID settings are provided. The method includes randomly sampling local clients, assigning a current global model to the randomly sampled local clients for initialization at the beginning of a synchronization round, conducting local training on the randomly sampled local clients, collecting local models from the randomly sampled local clients and executing distance-reweighted model aggregation (DMA) on the collected local models to obtain a sub-consensus model, repeating the above steps multiple times to obtain a set of sub-consensus models, and aggregating a new model based on the sub-consensus models to be the next global model.
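One synchronization round of the procedure above could look like the following sketch, with models held as flat lists of floats. Two details are assumptions made for illustration: the DMA weights are taken to be `exp(-beta * distance)` from the plain average of the collected models, and the sub-consensus models are combined by a simple mean. `local_train`, `num_subsets`, `subset_size`, and `beta` are hypothetical names.

```python
import math
import random

def average(models):
    """Element-wise mean of a list of equally sized weight vectors."""
    return [sum(ws) / len(models) for ws in zip(*models)]

def distance_reweighted_aggregate(models, beta=1.0):
    """Down-weight local models that lie far from the plain average (assumed form)."""
    mean = average(models)
    weights = [math.exp(-beta * math.dist(m, mean)) for m in models]
    total = sum(weights)
    return [sum(w * m[i] for w, m in zip(weights, models)) / total
            for i in range(len(mean))]

def rscfed_round(global_model, clients, local_train,
                 num_subsets=3, subset_size=2, rng=random):
    """Run one round: sample, train locally, aggregate, repeat, then average."""
    sub_consensus = []
    for _ in range(num_subsets):
        sampled = rng.sample(clients, subset_size)
        # Each sampled client starts from a copy of the current global model.
        local_models = [local_train(list(global_model), c) for c in sampled]
        sub_consensus.append(distance_reweighted_aggregate(local_models))
    return average(sub_consensus)
```

With this weighting, a single outlying client pulls the sub-consensus model far less than it would under a plain average.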
TARGET DOMAIN CHARACTERIZATION FOR DATA AUGMENTATION
Methods, systems, and processor-readable media for training data augmentation. A source domain and a target domain are provided, and thereafter an operation is performed to augment data in the source domain with transformations utilizing characteristics learned from the target domain. The augmented data is then used to improve image classification accuracy in a new domain.
Data object classification using an optimized neural network
A system includes a computing platform having a hardware processor and a memory storing a software code and a neural network (NN) having multiple layers including a last activation layer and a loss layer. The hardware processor executes the software code to identify different combinations of layers for testing the NN, each combination including candidate function(s) for the last activation layer and candidate function(s) for the loss layer. For each different combination, the software code configures the NN based on the combination, inputs, into the configured NN, a training dataset including multiple data objects, receives, from the configured NN, a classification of the data objects, and generates a performance assessment for the combination based on the classification. The software code determines a preferred combination of layers for the NN including selected candidate functions for the last activation layer and the loss layer, based on a comparison of the performance assessments.
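The search over layer combinations can be sketched as an exhaustive loop over candidate pairs. Here `evaluate` is a hypothetical callable standing in for the steps of configuring the NN, feeding it the training dataset, and producing a performance assessment; the sketch assumes a higher assessment is better.

```python
from itertools import product

def select_best_combination(activations, losses, evaluate):
    """Assess every (activation, loss) pair and return the best one.

    evaluate(activation, loss) is assumed to return a numeric assessment,
    e.g. classification accuracy on a validation split.
    """
    assessments = {}
    for activation, loss in product(activations, losses):
        assessments[(activation, loss)] = evaluate(activation, loss)
    best = max(assessments, key=assessments.get)
    return best, assessments
```

A caller would pass names or function objects for the candidates, for instance `["softmax", "sigmoid"]` and `["cross_entropy", "hinge"]`, together with an evaluation routine of its own.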
Systems and computer-implemented methods for identifying anomalies in an object and training methods therefor
A system identifies anomalies in an image of an object. An input image of the object containing zero or more anomalies is supplied to an image encoder. The image encoder generates an image model. The image model is applied to an image decoder that forms a substitute non-anomalous image of the object. Differences between the input image and the substitute non-anomalous image identify zero or more areas of the input image that contain the zero or more anomalies. The system implements a flow-based model and has been trained using (a) a set of augmented anomaly-free images of the object applied at the image encoder and (b) a reconstruction loss calculated based on a norm of differences between each augmented anomaly-free image of the object and a corresponding output image from the image decoder.
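The comparison between the input image and its substitute can be sketched on grayscale images stored as nested lists. The encoder, decoder, and flow-based model are out of scope here; `reconstruction` stands for whatever the decoder produced. The pixel threshold and the choice of the L1 norm are assumptions, since the abstract only refers to "a norm of differences".

```python
def anomaly_mask(image, reconstruction, threshold=0.2):
    """Flag pixels whose absolute reconstruction error exceeds the threshold."""
    return [[abs(p - q) > threshold for p, q in zip(row, rec_row)]
            for row, rec_row in zip(image, reconstruction)]

def reconstruction_loss(image, reconstruction):
    """L1 norm of the pixel-wise differences (one possible choice of norm)."""
    return sum(abs(p - q)
               for row, rec_row in zip(image, reconstruction)
               for p, q in zip(row, rec_row))
```

During training on anomaly-free images the loss would be driven toward zero, so that at inference time large per-pixel differences point to regions the model could not reproduce.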
METHOD AND APPARATUS FOR TRAINING IMAGE RECOGNITION MODEL, AND IMAGE RECOGNITION METHOD AND APPARATUS
A method for training an image recognition model includes: obtaining training image sets; obtaining a first predicted probability, a second predicted probability, a third predicted probability, and a fourth predicted probability based on the training image sets by using an initial image recognition model; determining a target loss function according to the first predicted probability, the second predicted probability, the third predicted probability, and the fourth predicted probability; and training the initial image recognition model based on the target loss function, to obtain an image recognition model.