Patent classifications
G06N3/096
METHOD AND DEVICE FOR STUDENT TRAINING NETWORKS WITH TEACHER NETWORKS
A method and device for training a neural network are disclosed. The method comprises: selecting, by a training device, a teacher network performing the same functions as a student network; and iteratively training the student network to obtain a target network by aligning distributions of features between a first middle layer and a second middle layer corresponding to the same training sample data, so as to transfer knowledge of the features of a middle layer of the teacher network to the student network.
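The abstract above does not fix how the two middle-layer feature distributions are aligned; the following is a minimal sketch assuming per-channel moment matching (matching means and variances) as one simple alignment loss. All names and shapes are illustrative.

```python
import numpy as np

def feature_alignment_loss(teacher_feat, student_feat):
    """Align the feature distributions of a teacher middle layer and a
    student middle layer for the same training samples by matching their
    per-channel mean and variance (one possible alignment criterion)."""
    t_mean, s_mean = teacher_feat.mean(axis=0), student_feat.mean(axis=0)
    t_var, s_var = teacher_feat.var(axis=0), student_feat.var(axis=0)
    return np.mean((t_mean - s_mean) ** 2) + np.mean((t_var - s_var) ** 2)

rng = np.random.default_rng(0)
teacher = rng.normal(size=(64, 128))                        # teacher middle-layer features
student = teacher + rng.normal(scale=0.1, size=(64, 128))   # nearly aligned student features
print(feature_alignment_loss(teacher, student))
```

In a full training loop this loss would be added to the student's task loss and minimized by gradient descent, so the student's middle layer gradually absorbs the teacher's feature knowledge.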
LEARNING APPARATUS, LEARNING METHOD AND PROGRAM
A learning device relating to one embodiment includes: an input unit configured to input a plurality of datasets of different feature spaces; a first generation unit configured to generate a feature latent vector indicating a property of an individual feature of the dataset for each of the datasets; a second generation unit configured to generate an instance latent vector indicating the property of observation data for each of the observation vectors included in the datasets; a prediction unit configured to predict a solution by a model for solving a machine learning problem of interest by using the feature latent vector and the instance latent vector; and a learning unit configured to learn a parameter of the model by optimizing a predetermined objective function by using the feature latent vector, the instance latent vector and the solution for each of the datasets.
SYSTEMS AND METHODS OF CONTRASTIVE POINT COMPLETION WITH FINE-TO-COARSE REFINEMENT
An electronic apparatus performs a method of recovering a complete and dense point cloud from a partial point cloud. The method includes: constructing a sparse but complete point cloud from the partial point cloud through a contrastive teacher-student neural network; and transforming the sparse but complete point cloud to the complete and dense point cloud. In some embodiments, the contrastive teacher-student neural network has a dual network structure comprising a teacher network and a student network both sharing the same architecture. The teacher network is a point cloud self-reconstruction network, and the student network is a point cloud completion network.
AUTO-CREATION OF CUSTOM MODELS FOR TEXT SUMMARIZATION
A text summarization system auto-generates text summarization models using a combination of neural architecture search and knowledge distillation. Given an input dataset for generating/training a text summarization model, neural architecture search is used to sample a search space to select a network architecture for the text summarization model. Knowledge distillation includes fine-tuning a language model for a given text summarization task using the input dataset, and using the fine-tuned language model as a teacher model to inform the selection of the network architecture and the training of the text summarization model. Once a text summarization model has been generated, the text summarization model can be used to generate summaries for given text.
QUERY VALIDATION WITH AUTOMATED QUERY MODIFICATION
Disclosed herein are embodiments providing query validation with automated query modification. In particular, the embodiments provide a computing system that receives a query and determines the query is sensitive. The computing system iteratively modifies the query until the query is not sensitive by modifying the query to increase a scope of the query, updating estimated query results based on the query as modified, and determining whether the query as modified is sensitive based on the estimated query results as updated. Upon determining that the query as modified is not sensitive, the computing system proceeds with the query as modified. Accordingly, the computing system improves query efficiency by automatically modifying a sensitive query.
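The iterative modification loop described above can be sketched as follows. The abstract does not define "sensitive"; this sketch assumes a k-anonymity-style rule (a query is sensitive if its estimated result count falls below a threshold) and a hypothetical age-range query, with the estimator stubbed in for illustration.

```python
SENSITIVITY_THRESHOLD = 5  # assumed minimum result count for a non-sensitive query

def estimate_result_count(min_age, max_age, ages):
    """Stand-in for the system's estimated-query-results step."""
    return sum(min_age <= a <= max_age for a in ages)

def broaden_until_not_sensitive(min_age, max_age, ages):
    """Iteratively widen the query's scope (here, its age range) and
    re-estimate results until the query is no longer sensitive."""
    while estimate_result_count(min_age, max_age, ages) < SENSITIVITY_THRESHOLD:
        min_age, max_age = min_age - 1, max_age + 1
    return min_age, max_age

ages = [30, 31, 32, 33, 34, 35, 60]
print(broaden_until_not_sensitive(33, 33, ages))  # → (31, 35)
```

The narrow query for exactly age 33 would return a single record, so the loop widens it until at least five records match, at which point the system proceeds with the modified query.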
METHOD AND SYSTEM FOR TRAINING A NEURAL NETWORK MODEL USING GRADUAL KNOWLEDGE DISTILLATION
Method and system of training a student neural network (SNN) model. A first training phase is performed over a plurality of epochs during which a smoothing factor is applied to teacher neural network (TNN) model outputs to generate smoothed TNN model outputs, a first loss is computed based on the SNN model outputs and the smoothed TNN model outputs, and an updated set of the SNN model parameters is computed with an objective of reducing the first loss in a following epoch of the first training phase. The smoothing factor is adjusted over the plurality of epochs of the first training phase to reduce a smoothing effect on the generated smoothed TNN model outputs. A second training phase is performed based on the SNN model outputs and a set of predefined expected outputs for the plurality of input data samples.
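The gradual smoothing schedule can be sketched as below. The abstract does not specify the smoothing operation, so this sketch assumes blending the teacher's output probabilities toward a uniform distribution, with the smoothing factor annealed from 1 to 0 across the first phase.

```python
import numpy as np

def smooth_teacher_outputs(teacher_probs, alpha):
    """Blend the teacher's probabilities toward a uniform distribution;
    alpha == 1 gives fully smoothed outputs, alpha == 0 the raw teacher."""
    n_classes = teacher_probs.shape[-1]
    uniform = np.full_like(teacher_probs, 1.0 / n_classes)
    return (1.0 - alpha) * teacher_probs + alpha * uniform

teacher_probs = np.array([0.7, 0.2, 0.1])
n_epochs = 5
for epoch in range(n_epochs):
    # Reduce the smoothing effect over the epochs of the first phase,
    # so the student sees progressively sharper teacher targets.
    alpha = 1.0 - epoch / (n_epochs - 1)   # 1.0 -> 0.0 across the phase
    smoothed = smooth_teacher_outputs(teacher_probs, alpha)
    # ...compute the first loss between student outputs and `smoothed`,
    #    then update the SNN parameters to reduce it next epoch...
```

The second phase then drops the teacher targets and trains against the predefined expected outputs (the ground-truth labels) directly.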
Techniques for generating data for an intelligent gesture detector
A method and system for generating training data for training a gesture detection machine-learning (ML) model includes receiving a request to generate training data for the gesture detection model, the training data being associated with a target gesture, retrieving data associated with an original gesture, the original gesture being a gesture made using a body part, retrieving skeleton data associated with the target gesture, the skeleton data displaying a skeleton representative of the body part and the skeleton displaying the target gesture, aligning a location of the body part in the data with a location of the skeleton in the skeleton data, providing the aligned data and the skeleton data to an ML model for generating target data that displays the target gesture, receiving the target data as an output from the ML model, the target data preserving a visual feature of the data and displaying the target gesture, and providing the target data to the gesture detection ML model.
Financial risk management based on transactions portrait
An approach is provided in which the approach constructs a 3-dimensional (3D) matrix based on a plurality of historical transactions performed by a user. The 3D matrix includes a set of features, a set of rows, and a set of channels. The approach trains a convolutional neural network using the 3D matrix, and then uses the trained convolutional neural network to predict a risk level of a new transaction initiated by the user. The approach transmits an alert message based on the predicted risk level.
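The 3D transaction matrix can be sketched as follows. The abstract names the three axes (features, rows, channels) but not their sizes or contents, so the dimensions and the example feature vector here are assumptions, e.g. transaction channels such as card, online and ATM, each holding the user's most recent transactions.

```python
import numpy as np

# Assumed dimensions: 4 features per transaction, 10 most recent
# transactions (rows), 3 transaction channels.
n_features, n_rows, n_channels = 4, 10, 3

def build_transaction_portrait(transactions):
    """Pack (channel, row, feature-vector) records into the 3D matrix
    consumed by the convolutional neural network."""
    portrait = np.zeros((n_features, n_rows, n_channels))
    for channel, row, features in transactions:
        portrait[:, row, channel] = features
    return portrait

history = [(0, 0, [120.0, 1.0, 14.0, 0.0]),  # e.g. amount, merchant type, hour, flag
           (1, 0, [35.5, 2.0, 9.0, 0.0])]
portrait = build_transaction_portrait(history)
print(portrait.shape)  # → (4, 10, 3)
```

A standard 2D convolutional network can then treat the channel axis as image channels, so convolutions run over the feature and row axes of the portrait when predicting the risk level of a new transaction.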
SERVER DEVICE AND A METHOD FOR SPATIAL MAPPING OF A MODEL
A server obtains a first model for a first area and first operational data associated with the first area. The server determines a second model, based on the first operational data and the first model, and obtains a first performance parameter indicative of a performance of the first model. The server obtains a second performance parameter indicative of a performance of the second model and determines a model performance parameter based on the first and second performance parameters. The server determines whether the model performance parameter satisfies a first criterion. The server, when the model performance parameter does not satisfy the first criterion, determines whether the second performance parameter satisfies a second criterion. The server, when the second performance parameter does not satisfy the second criterion, determines a second area that is smaller than the first area; and obtains a third model for the second area.
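The server's decision flow can be sketched as below. The abstract says only that the model performance parameter is "based on" the first and second performance parameters and that two criteria are checked; the averaging, the thresholds and the outcome labels here are assumptions for illustration.

```python
def select_next_step(p1, p2, combined_threshold=0.8, second_threshold=0.7):
    """Decide how to proceed given the first model's performance p1 and
    the second model's performance p2 (assumed aggregation: the mean)."""
    model_perf = (p1 + p2) / 2
    if model_perf >= combined_threshold:          # first criterion satisfied
        return "keep second model for first area"
    if p2 >= second_threshold:                    # second criterion satisfied
        return "keep second model despite combined score"
    # Neither criterion satisfied: narrow the spatial scope.
    return "shrink to smaller second area and fetch third model"

print(select_next_step(0.5, 0.5))  # → shrink to smaller second area and fetch third model
```

The last branch corresponds to the abstract's fallback: when neither performance criterion is met, the server determines a second, smaller area and obtains a third model for it.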