G06N3/08

METHOD OF GENERATING PRE-TRAINING MODEL, ELECTRONIC DEVICE, AND STORAGE MEDIUM

A method of generating a pre-training model, an electronic device, and a storage medium, which relate to the field of artificial intelligence technology, and in particular to computer vision and deep learning technology. The method includes: determining a performance index set corresponding to a candidate model structure set, where the candidate model structure set is determined from a plurality of model structures included in a search space, and the search space is a super-network-based search space; determining, from the candidate model structure set, a target model structure corresponding to each chip according to the performance index set, where each target model structure is a model structure meeting a performance index condition; and determining, for each chip, the target model structure corresponding to the chip as a pre-training model corresponding to the chip, where the chip is configured to run the pre-training model corresponding to the chip.
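
As a rough illustration of the selection step described above, the following sketch picks, for each chip, the candidate structure that meets a performance-index condition (here: a latency bound) while maximizing accuracy. All identifiers (CandidateStructure, latency_ms, accuracy, max_latency_ms, the chip names) are illustrative assumptions, not terms from the publication.

```python
# Minimal sketch: per-chip selection of a target model structure from a
# candidate set sampled from a super-network search space.
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class CandidateStructure:
    name: str                      # identifier of a sub-network sampled from the super-network
    accuracy: float                # proxy accuracy shared across chips
    latency_ms: Dict[str, float]   # measured latency per chip (the performance index)

def select_pretraining_models(
    candidates: List[CandidateStructure],
    chips: List[str],
    max_latency_ms: float,
) -> Dict[str, Optional[CandidateStructure]]:
    """For each chip, return the most accurate candidate whose latency
    on that chip meets the performance-index condition."""
    selected: Dict[str, Optional[CandidateStructure]] = {}
    for chip in chips:
        feasible = [c for c in candidates if c.latency_ms[chip] <= max_latency_ms]
        selected[chip] = max(feasible, key=lambda c: c.accuracy) if feasible else None
    return selected

# Example usage with two hypothetical chips:
candidates = [
    CandidateStructure("subnet_a", 0.81, {"chip_x": 12.0, "chip_y": 20.0}),
    CandidateStructure("subnet_b", 0.79, {"chip_x": 8.0, "chip_y": 9.0}),
]
print(select_pretraining_models(candidates, ["chip_x", "chip_y"], max_latency_ms=10.0))
```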

EXPLAINING A MODEL OUTPUT OF A TRAINED MODEL

The invention relates to a computer-implemented method (500) of generating explainability information for explaining a model output of a trained model. The method uses one or more aspect recognition models configured to indicate a presence of respective characteristics in an input instance. A saliency method is applied to obtain a masked source representation of the input instance at a source layer of the trained model (e.g., the input layer or an internal layer), comprising those elements at the source layer that are relevant to the model output. The masked source representation is mapped to a target layer (e.g., the input layer or an internal layer) of an aspect recognition model, and the aspect recognition model is then applied to obtain a model output indicating a presence of the respective characteristic relevant to the model output of the trained model. As explainability information, the characteristics indicated by the aspect recognition models are output.
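
The pipeline can be sketched under simplifying assumptions: the source layer is the trained model's input layer, the target layer is each aspect recognition model's input layer, the mapping between them is the identity, and the saliency method is approximated by gradient-times-input. The tiny stand-in models and the 0.5 thresholds below are illustrative only, not details from the publication.

```python
# Minimal sketch: saliency-masked input routed through aspect recognition models.
import torch
import torch.nn as nn

def masked_source_representation(model: nn.Module, x: torch.Tensor) -> torch.Tensor:
    """Keep only the input elements most relevant to the model output."""
    x = x.clone().requires_grad_(True)
    score = model(x).max()                 # model output being explained
    score.backward()
    saliency = (x.grad * x).abs()          # gradient-times-input relevance
    mask = saliency >= saliency.quantile(0.5)
    return (x * mask).detach()

def explain(model: nn.Module, aspect_models: dict, x: torch.Tensor) -> list:
    """Return the characteristics whose aspect models fire on the masked representation."""
    masked = masked_source_representation(model, x)
    return [name for name, m in aspect_models.items()
            if torch.sigmoid(m(masked)).item() > 0.5]

# Hypothetical usage with tiny stand-in models:
trained = nn.Sequential(nn.Linear(8, 3))
aspects = {"stripes": nn.Linear(8, 1), "round_shape": nn.Linear(8, 1)}
print(explain(trained, aspects, torch.randn(8)))
```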

MOVEMENT OF TENSOR DATA DURING RESHAPE OPERATION

A method of performing a reshape operation specified in a reshape layer of a neural network model is described. The reshape operation reshapes an input tensor with an input tensor shape to an output tensor with an output tensor shape. The tensor data that has to be reshaped is directly routed between tile memories of the hardware accelerator in an efficient manner. This advantageously optimizes usage of memory space and allows any number and type of neural network models to be run on the hardware accelerator.
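
A rough way to see why a reshape reduces to pure data movement: with row-major storage, a reshape preserves the flat element order, so each element only needs a (source tile, source offset) to (destination tile, destination offset) route. The sketch below assumes 2-D tensors split across tile memories in contiguous blocks of rows; the block-size parameters are made-up assumptions about the accelerator, not details from the publication.

```python
# Minimal sketch: routing table for a reshape between two tile-memory layouts.
from math import prod
from typing import List, Tuple

def reshape_routes(in_shape: Tuple[int, int],
                   out_shape: Tuple[int, int],
                   in_rows_per_tile: int,
                   out_rows_per_tile: int) -> List[Tuple[int, int, int, int]]:
    """Return (src_tile, src_offset, dst_tile, dst_offset) for every element."""
    assert prod(in_shape) == prod(out_shape), "reshape must preserve the element count"
    routes = []
    for flat in range(prod(in_shape)):
        src_row, src_col = divmod(flat, in_shape[1])   # position in the input tensor
        dst_row, dst_col = divmod(flat, out_shape[1])  # same flat index in the output tensor
        routes.append((
            src_row // in_rows_per_tile,
            (src_row % in_rows_per_tile) * in_shape[1] + src_col,
            dst_row // out_rows_per_tile,
            (dst_row % out_rows_per_tile) * out_shape[1] + dst_col,
        ))
    return routes

# Example: reshape a 4x6 tensor to 8x3, with 2 input rows and 4 output rows per tile.
print(reshape_routes((4, 6), (8, 3), in_rows_per_tile=2, out_rows_per_tile=4)[:4])
```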

DATA PROCESSING METHOD AND APPARATUS
20230048031 · 2023-02-16 ·

Relating to the field of artificial intelligence, and specifically to the field of natural language processing, a data processing method and apparatus are provided. The method includes: determining original text samples, where mask processing has not been performed on the original text samples; and performing mask processing on the original text samples to obtain mask training samples, where the mask processing makes the mask proportions of the mask training samples unfixed, and each of the mask training samples is used to train a pretrained language model (PLM). Training the PLM by using mask training samples whose mask proportions are unfixed can enhance the pattern diversity of the training samples of the PLM. Therefore, the features learned by the PLM are also diversified, the generalization capability of the PLM can be improved, and the natural language understanding capability of the PLM obtained through training can be improved.
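
A minimal sketch of masking with an unfixed mask proportion: instead of a fixed ratio (e.g., the common 15%), a proportion is drawn per sample from a range, so different training samples exhibit different mask patterns. The range (5%-40%), the [MASK] token string, and the whitespace tokenization are illustrative assumptions, not values from the publication.

```python
# Minimal sketch: per-sample unfixed mask proportion for PLM pretraining data.
import random
from typing import List

def mask_sample(tokens: List[str],
                min_ratio: float = 0.05,
                max_ratio: float = 0.40,
                mask_token: str = "[MASK]") -> List[str]:
    """Mask a randomly drawn proportion of the tokens in one sample."""
    ratio = random.uniform(min_ratio, max_ratio)          # unfixed mask proportion
    num_to_mask = max(1, round(ratio * len(tokens)))
    positions = set(random.sample(range(len(tokens)), num_to_mask))
    return [mask_token if i in positions else tok for i, tok in enumerate(tokens)]

original = "the quick brown fox jumps over the lazy dog".split()
print(mask_sample(original))   # each call yields a different mask proportion and pattern
```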

METHOD FOR LEARNING REPRESENTATIONS FROM CLOUDS OF POINTS DATA AND A CORRESPONDING SYSTEM
20230050120 · 2023-02-16 ·

A method for learning representations from point cloud data includes encoding the point cloud data into at least one representation by creating at least one tensor representation from the point cloud data. The method further includes using a loss function that utilizes a noisy reconstruction to reduce overfitting.
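
A minimal sketch under one plausible reading of the abstract: the point cloud is encoded into a fixed-size tensor representation (per-point MLP plus pooling), and a denoising-autoencoder-style loss reconstructs the clean points from a noise-perturbed input, which acts as a regularizer against overfitting. The architecture sizes, the Gaussian noise scale, and the element-wise MSE (a Chamfer distance would be the more standard point-cloud choice) are illustrative assumptions.

```python
# Minimal sketch: point-cloud encoder with a noisy-reconstruction loss.
import torch
import torch.nn as nn

class PointCloudAE(nn.Module):
    def __init__(self, num_points: int = 128, latent_dim: int = 64):
        super().__init__()
        self.num_points = num_points
        self.encoder = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                                     nn.Linear(256, num_points * 3))

    def forward(self, points: torch.Tensor) -> torch.Tensor:
        # points: (batch, num_points, 3) -> tensor representation: (batch, latent_dim)
        per_point = self.encoder(points)
        latent = per_point.max(dim=1).values          # permutation-invariant pooling
        return self.decoder(latent).view(-1, self.num_points, 3)

def noisy_reconstruction_loss(model: PointCloudAE, points: torch.Tensor,
                              noise_std: float = 0.02) -> torch.Tensor:
    """Reconstruct the clean cloud from a noise-perturbed input (denoising objective)."""
    noisy = points + noise_std * torch.randn_like(points)
    return nn.functional.mse_loss(model(noisy), points)

model = PointCloudAE()
loss = noisy_reconstruction_loss(model, torch.randn(4, 128, 3))
print(loss.item())
```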

MOLECULAR GRAPH REPRESENTATION LEARNING METHOD BASED ON CONTRASTIVE LEARNING
20230052865 · 2023-02-16 ·

The present invention provides a molecular graph representation learning method based on contrastive learning, the method comprising: obtaining a molecular fingerprint representation of each molecule and calculating the similarity between each pair of molecular fingerprints; collecting a comprehensive set of chemical functional group information and matching a corresponding functional group to each atom in the molecule; modeling the molecular graph as a heterogeneous graph; using an RGCN in a structure-aware molecular encoder to encode the representation of each atom in the molecule and the representation of the functional group to which the atom belongs, and mapping the molecule to a feature space through an aggregation function to obtain a structure-aware feature representation; selecting positive and negative samples according to the fingerprint similarity between molecules and carrying out contrastive learning in the feature space; obtaining the structure-aware molecular encoder by training with the contrastive learning method on a large-sample molecular dataset, and applying the structure-aware molecular encoder to downstream molecular property prediction tasks. The present invention helps to capture richer molecular structure information and improves molecular property prediction.
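
A minimal sketch of the fingerprint-guided contrastive step: Tanimoto similarity between binary fingerprints selects positive and negative pairs, and an InfoNCE-style loss pulls positives together in the feature space produced by the structure-aware encoder (treated here as an opaque source of embeddings, so the RGCN itself is not shown). The similarity threshold and temperature are illustrative assumptions, not values from the publication.

```python
# Minimal sketch: fingerprint-similarity-based positive/negative selection
# plus an InfoNCE-style contrastive loss in the molecular feature space.
import torch

def tanimoto(fp_a: torch.Tensor, fp_b: torch.Tensor) -> torch.Tensor:
    """Tanimoto similarity between two binary fingerprint vectors."""
    inter = torch.minimum(fp_a, fp_b).sum()
    union = torch.maximum(fp_a, fp_b).sum()
    return inter / union.clamp(min=1)

def contrastive_loss(embeddings: torch.Tensor, fingerprints: torch.Tensor,
                     pos_threshold: float = 0.7, temperature: float = 0.1) -> torch.Tensor:
    """InfoNCE-style loss where pairs with high fingerprint similarity are positives."""
    z = torch.nn.functional.normalize(embeddings, dim=1)
    logits = z @ z.t() / temperature
    n = z.size(0)
    loss, count = z.new_zeros(()), 0
    for i in range(n):
        for j in range(n):
            if i != j and tanimoto(fingerprints[i], fingerprints[j]) >= pos_threshold:
                # treat j as the positive for anchor i; all other molecules are negatives
                mask = torch.ones(n, dtype=torch.bool)
                mask[i] = False
                loss = loss - logits[i, j] + torch.logsumexp(logits[i][mask], dim=0)
                count += 1
    return loss / max(count, 1)

# Hypothetical usage: embeddings from the structure-aware encoder, binary fingerprints.
emb = torch.randn(8, 32)
fps = (torch.rand(8, 128) > 0.5).float()
print(contrastive_loss(emb, fps, pos_threshold=0.3).item())
```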
