Patent classifications
G06N3/098
Training Method, Apparatus, and Device for Federated Neural Network Model, Computer Program Product, and Computer-Readable Storage Medium
Embodiments of this application provide a training method, apparatus, and device for a federated neural network model, a computer program product, and a computer-readable storage medium. The method includes: performing forward computation processing on sample data through a first bottom model, to obtain an output vector of the first bottom model; performing forward computation processing through an interaction layer on the output vector of the first bottom model, at least one model parameter of the interaction layer, and at least one encrypted model parameter, to obtain an output vector of the interaction layer; performing forward computation processing on the output vector of the interaction layer through a top model, to obtain an output vector of the federated neural network model; performing back propagation processing on the federated neural network model according to a loss result corresponding to the output vector of the federated neural network model, and updating the at least one model parameter of the interaction layer, a parameter of the first bottom model, and a parameter of the top model according to a back propagation processing result; and obtaining a trained federated neural network model based on the updated model parameters of the interaction layer, the first bottom model, and the top model.
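The staged bottom-model / interaction-layer / top-model pass described above can be sketched as follows. This is a minimal illustration that omits the encryption step entirely; the dimensions, the tanh activation, and the squared-error loss are all assumptions for the sketch, not the patent's actual construction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions for illustration only.
d_in, d_bottom, d_inter, d_out = 8, 4, 4, 1

# First bottom model, interaction-layer parameter, and top model.
W_bottom = rng.normal(size=(d_in, d_bottom))
W_inter = rng.normal(size=(d_bottom, d_inter))
W_top = rng.normal(size=(d_inter, d_out))

def forward(x):
    h_bottom = np.tanh(x @ W_bottom)   # output vector of the bottom model
    h_inter = h_bottom @ W_inter       # output vector of the interaction layer
    y_hat = h_inter @ W_top            # output vector of the federated model
    return h_bottom, h_inter, y_hat

def train_step(x, y, lr=0.01):
    """One forward pass plus back propagation through all three stages."""
    global W_bottom, W_inter, W_top
    h_bottom, h_inter, y_hat = forward(x)
    g_out = 2 * (y_hat - y) / len(x)               # d(MSE)/d(y_hat)
    g_top = h_inter.T @ g_out                      # top-model gradient
    g_inter_out = g_out @ W_top.T
    g_inter = h_bottom.T @ g_inter_out             # interaction-layer gradient
    g_bottom_out = g_inter_out @ W_inter.T
    g_bottom = x.T @ (g_bottom_out * (1 - h_bottom ** 2))  # bottom gradient
    W_top -= lr * g_top
    W_inter -= lr * g_inter
    W_bottom -= lr * g_bottom
    return float(np.mean((y_hat - y) ** 2))

x = rng.normal(size=(32, d_in))
y = rng.normal(size=(32, d_out))
losses = [train_step(x, y) for _ in range(200)]
```

In a real vertical federated setting the bottom model, interaction layer, and top model would live with different parties, and the interaction-layer inputs would be exchanged in encrypted form.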
DATA PROCESSING METHOD AND DEVICE
Embodiments of this application disclose a data processing method, and relate to the field of artificial intelligence. The method is applied to distributed parallel model training, for example, distributed training of a text translation model, a speech recognition model, a facial recognition model, a three-dimensional reconstruction model, and a virtual reality model. The method can implement hybrid parallelism in a distributed cluster. The method includes: inserting, based on tensor layouts of tensors of at least one operator in a deep neural network model, a redistribution operator between operators that have an input-output dependency relationship, to implement conversion between different tensor layouts; inserting the redistribution operator into a sliced computational graph; and determining an updated sliced computational graph to implement parallel model training of the deep neural network.
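A toy version of the redistribution-operator insertion can be sketched as below. The `Op` structure, the tuple-based tensor layout, and the naming scheme are all illustrative assumptions, not the framework's actual data model; the point is only the rule: when a producer's output layout differs from a consumer's expected input layout, a redistribution node is spliced between them in the sliced computational graph.

```python
from dataclasses import dataclass, field

# Toy tensor layout: which device-mesh axis each tensor dimension is
# sharded on (None = replicated). Purely illustrative.

@dataclass
class Op:
    name: str
    out_layout: tuple              # layout this operator produces
    in_layout: tuple               # layout this operator expects from inputs
    inputs: list = field(default_factory=list)

def insert_redistribution(ops):
    """Insert a redistribution op between producer/consumer pairs whose
    tensor layouts disagree, returning the updated sliced graph."""
    by_name = {op.name: op for op in ops}
    updated = []
    for op in ops:
        new_inputs = []
        for src in op.inputs:
            producer = by_name[src]
            if producer.out_layout != op.in_layout:
                redist = Op(
                    name=f"redistribute_{src}_to_{op.name}",
                    out_layout=op.in_layout,
                    in_layout=producer.out_layout,
                    inputs=[src],
                )
                by_name[redist.name] = redist
                updated.append(redist)
                new_inputs.append(redist.name)
            else:
                new_inputs.append(src)
        op.inputs = new_inputs
        updated.append(op)
    return updated

graph = [
    Op("matmul1", out_layout=(0, None), in_layout=(None, None)),
    Op("matmul2", out_layout=(None, 1), in_layout=(None, 1),
       inputs=["matmul1"]),
]
new_graph = insert_redistribution(graph)
```

Here `matmul1` produces a row-sharded tensor while `matmul2` expects a column-sharded one, so a single redistribution node is inserted on that edge.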
SYSTEM AND METHOD OF CONVOLUTIONAL NEURAL NETWORK
A method includes the following operations: downscaling an input image to generate a scaled image; performing, on the scaled image, a first convolutional neural network (CNN) modeling process with first non-local operations, to generate global parameters; and performing, on the input image, a second CNN modeling process with second non-local operations that are performed with the global parameters, to generate an output image corresponding to the input image. A system is also disclosed herein.
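The two-scale idea (cheap global context from a downscaled image, conditioning a full-resolution pass) can be sketched with plain NumPy. The average-pool downscaling, the softmax-pooled "non-local" response, and the additive conditioning are stand-ins chosen for the sketch; the patented method's operations are not specified here.

```python
import numpy as np

rng = np.random.default_rng(1)

def downscale(img, factor=4):
    """Average-pool downscale; assumes H and W are divisible by factor."""
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def nonlocal_global_params(feat):
    """Simplified non-local operation: every position attends to all
    others; only the globally pooled response is kept as the 'global
    parameters' (a single scalar, for illustration)."""
    x = feat.reshape(-1)
    attn = np.exp(x - x.max())
    attn /= attn.sum()
    return float(attn @ x)

def conv2d_valid(img, kernel):
    """Naive valid-mode 2-D convolution (correlation), for clarity."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

kernel = rng.normal(size=(3, 3))
img = rng.normal(size=(32, 32))

# First CNN pass on the downscaled image yields global parameters ...
g = nonlocal_global_params(conv2d_valid(downscale(img), kernel))
# ... which condition the second, full-resolution pass.
out = conv2d_valid(img, kernel) + g
```

The benefit being sketched is cost: the non-local (all-pairs) step runs on the 8x8 scaled image rather than the 32x32 input.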
PRIVACY-PRESERVING MACHINE LEARNING TRAINING BASED ON HOMOMORPHIC ENCRYPTION USING EXECUTABLE FILE PACKAGES IN AN UNTRUSTED ENVIRONMENT
Aspects of the present disclosure provide systems, methods, and computer-readable storage media that support secure training of machine learning (ML) models that preserves privacy in untrusted environments using distributed executable file packages. The executable file packages may include files, libraries, scripts, and the like that enable a cloud service provider configured to provide ML model training based on non-encrypted data to also support homomorphic encryption of data and ML model training with one or more clients, particularly for a diagnosis prediction model trained using medical data. Because the training is based on encrypted client data, private client data such as patient medical data may be used to train the diagnosis prediction model without exposing the client data to the cloud service provider or others. Using homomorphic encryption enables training of the diagnosis prediction model using encrypted data without requiring decryption prior to training.
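The core property that makes this work, computing on ciphertexts without decrypting, can be demonstrated with a toy Paillier cryptosystem, which is additively homomorphic: multiplying two ciphertexts yields an encryption of the sum of the plaintexts. The tiny primes below are for illustration only; real deployments use moduli thousands of bits long and vetted libraries, and the patent does not specify this particular scheme.

```python
from math import gcd, lcm

# Toy Paillier keypair (illustration only).
p, q = 61, 53
n = p * q
n2 = n * n
lam = lcm(p - 1, q - 1)
mu = pow(lam, -1, n)        # since g = n + 1, L(g^lam mod n^2) = lam mod n

def encrypt(m, r):
    """Encrypt m with randomness r coprime to n."""
    assert 0 <= m < n and gcd(r, n) == 1
    # g^m = (1 + n)^m = 1 + m*n (mod n^2), by the binomial theorem.
    return (1 + m * n) * pow(r, n, n2) % n2

def decrypt(c):
    u = pow(c, lam, n2)
    return (u - 1) // n * mu % n

# Additive homomorphism: an untrusted server can aggregate encrypted
# values (e.g., gradient contributions) without ever decrypting them.
c = encrypt(3, 17) * encrypt(4, 23) % n2
```

Decrypting `c` recovers 3 + 4 = 7 even though the two addends were never exposed in the clear, which is the property the training flow above relies on.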
K-QUANT GRADIENT COMPRESSOR FOR FEDERATED LEARNING
Techniques described herein relate to a method for model updating in a federated learning environment. The method may include distributing, by a model coordinator, a current model to a plurality of client nodes; and receiving, by the model coordinator and in response to distributing the current model, a set of gradient K-quant vectors, wherein each gradient K-quant vector of the set is received from one client node of the plurality of client nodes. The gradient K-quant vectors may be compressed representations of gradient vectors. The compression may be performed by determining a bin index value corresponding to each gradient vector value, based on a K value and a range received from the model coordinator. The model coordinator may use the gradient K-quant vectors to generate an updated model, and send the updated model to the client nodes for use in the next training cycle.
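The bin-index compression step described above can be sketched as uniform quantization over the coordinator-supplied range. The midpoint reconstruction on the coordinator side is an assumption of this sketch; the abstract only specifies that clients send bin indices derived from a K value and a range.

```python
import numpy as np

def compress(grad, k, lo, hi):
    """Map each gradient value to one of k bin indices over [lo, hi];
    k and the range are assumed to come from the model coordinator."""
    scaled = (np.clip(grad, lo, hi) - lo) / (hi - lo)        # -> [0, 1]
    return np.minimum((scaled * k).astype(np.int64), k - 1)  # -> {0..k-1}

def decompress(idx, k, lo, hi):
    """Reconstruct each value as its bin's midpoint."""
    return lo + (idx + 0.5) * (hi - lo) / k

grad = np.array([-0.9, -0.1, 0.0, 0.2, 0.95])
idx = compress(grad, k=16, lo=-1.0, hi=1.0)
approx = decompress(idx, k=16, lo=-1.0, hi=1.0)
```

With k = 16 each value needs only 4 bits on the wire, and the reconstruction error is bounded by half a bin width (here 0.0625) for values inside the range.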
METHODS AND DECENTRALIZED SYSTEMS THAT EMPLOY DISTRIBUTED MACHINE LEARNING TO AUTOMATICALLY INSTANTIATE AND MANAGE DISTRIBUTED APPLICATIONS
The current document is directed to methods and systems that automatically instantiate complex distributed applications by deploying distributed-application instances across the computational resources of one or more distributed computer systems and that automatically manage instantiated distributed applications. Automatic deployment of multiple instances of a distributed application across computational resources, such as distribution of microservices of a microservice-based application across one or more distributed computer systems, and scaling of instantiated distributed applications are computationally difficult optimization problems that are not amenable to traditional centralized approaches. The current document discloses decentralized, distributed automated methods and systems that instantiate and manage distributed applications. Reinforcement-learning-based agents are installed within the computational resources of one or more distributed computer systems. Distributed-application instances are initially distributed to one or more agents. The agents then exchange distributed-application instances among themselves in order to locally optimize the set of distributed-application instances that they each manage.
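The exchange mechanism, agents trading application instances to improve their own local objective, can be sketched with a deterministic toy. The quadratic load cost and the pairwise trial-move protocol are stand-ins for the reinforcement-learning-based behavior the document describes; they illustrate only the decentralized, locally-greedy shape of the optimization.

```python
# Each agent manages a set of application-instance resource demands and
# evaluates a local cost; the quadratic cost is an illustrative stand-in
# for whatever reward signal the actual RL agents learn.

class Agent:
    def __init__(self, capacity):
        self.capacity = capacity
        self.instances = []            # per-instance resource demands

    def cost(self):
        load = sum(self.instances)
        return (load / self.capacity) ** 2

def exchange_round(agents):
    """One decentralized round: each ordered pair moves one instance if
    doing so lowers the two agents' combined local cost."""
    moved = 0
    for a in agents:
        for b in agents:
            if a is b or not a.instances:
                continue
            inst = max(a.instances)
            before = a.cost() + b.cost()
            a.instances.remove(inst)   # trial move
            b.instances.append(inst)
            if a.cost() + b.cost() < before:
                moved += 1
            else:                      # revert: the move did not help
                b.instances.remove(inst)
                a.instances.append(inst)
    return moved

a, b = Agent(10.0), Agent(10.0)
a.instances = [4, 4, 4, 4]             # overloaded
b.instances = [2]                      # underloaded
total_before = a.cost() + b.cost()
while exchange_round([a, b]):          # iterate until no exchange helps
    pass
total_after = a.cost() + b.cost()
```

No central scheduler sees the whole placement; improvement emerges from repeated local exchanges, which is the property the document claims makes the approach scale where centralized optimization does not.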
Distributed Cache or Replay Service for Massively Scalable Distributed Reinforcement Learning
A computing system for performing distributed large scale reinforcement learning with improved efficiency can include a plurality of actor devices, wherein each actor device locally stores a local version of a machine-learned model, wherein each actor device is configured to implement the local version of the machine-learned model at the actor device to determine an action to take in an environment to generate an experience, a server computing system configured to perform one or more learning algorithms to learn an updated version of the machine-learned model based on the experiences generated by the plurality of actor devices, and a hierarchical and distributed data caching system including a plurality of layers of data caches that propagate data descriptive of the updated version of the machine-learned model from the server computing system to the plurality of actor devices to enable each actor device to update its respective local version of the model.
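The fan-out benefit of the cache hierarchy can be sketched with a pull-through cache: actors ask a leaf cache, which falls back to its parent, which falls back to the server, so the server serves each model version roughly once per cache subtree rather than once per actor. The class names and version protocol below are assumptions of the sketch.

```python
class Server:
    """Learner-side source of truth for model parameters."""
    def __init__(self):
        self.version = 0
        self.params = {"w": 0.0}
        self.fetches = 0               # how often the server is hit

    def get(self, want_version):
        self.fetches += 1
        return self.version, dict(self.params)

class Cache:
    """One layer of the hierarchical cache; pulls from its parent on miss."""
    def __init__(self, parent):
        self.parent = parent
        self.version = -1
        self.params = None

    def get(self, want_version):
        if self.version < want_version:          # stale: pull from above
            self.version, self.params = self.parent.get(want_version)
        return self.version, dict(self.params)

server = Server()
server.version, server.params = 1, {"w": 0.5}    # learner publishes update
mid = Cache(server)
leaves = [Cache(mid) for _ in range(4)]
# 32 actor reads (8 per leaf cache) of the updated model.
actors = [leaf.get(1) for leaf in leaves for _ in range(8)]
```

Thirty-two actor reads result in a single server fetch: each leaf misses once against the mid-tier cache, and the mid-tier cache misses once against the server.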
CHECKPOINT STATE STORAGE FOR MACHINE-LEARNING MODEL TRAINING
A method for training a machine-learning model. A plurality of nodes are assigned for training the machine-learning model. Nodes include agents comprising at least an agent processing unit and local memory. Each agent manages, via a local network, one or more workers that include a worker processing unit. Shards of a training data set are distributed for parallel processing by workers at different nodes. Each worker processing unit is configured to iteratively train on minibatches of a shard, and to report checkpoint states indicating updated parameters for storage in local memory. Based at least on recognizing a worker processing unit failing, the failed worker processing unit is reassigned and initialized based at least on a checkpoint state stored in local memory.
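The checkpoint-and-reassign flow can be sketched as below: a worker trains on minibatches of its shard, reports checkpoint states into the agent's local memory, and after a failure a replacement worker is initialized from the last stored checkpoint and resumes mid-shard instead of restarting. The update rule and state layout are illustrative assumptions.

```python
class TrainingAgent:
    """Agent managing workers; keeps checkpoint states in local memory."""
    def __init__(self):
        self.local_memory = {}         # worker_id -> latest checkpoint state

    def store_checkpoint(self, worker_id, step, params):
        self.local_memory[worker_id] = {"step": step, "params": dict(params)}

class Worker:
    def __init__(self, shard):
        self.shard = shard             # this worker's slice of the data
        self.step = 0
        self.params = {"w": 0.0}

    def train_minibatch(self, batch):
        self.params["w"] += sum(batch) * 0.01   # stand-in for a real update
        self.step += 1

    def restore(self, checkpoint):
        self.step = checkpoint["step"]
        self.params = dict(checkpoint["params"])

agent = TrainingAgent()
shard = [[1, 2], [3, 4], [5, 6]]
worker = Worker(shard)
for batch in shard[:2]:                # completes 2 of 3 minibatches ...
    worker.train_minibatch(batch)
    agent.store_checkpoint("worker-0", worker.step, worker.params)
# ... then fails. A replacement is initialized from the stored checkpoint
# and continues with only the remaining minibatch.
replacement = Worker(shard)
replacement.restore(agent.local_memory["worker-0"])
for batch in shard[replacement.step:]:
    replacement.train_minibatch(batch)
```

Because the checkpoint lives in the agent's local memory rather than remote storage, recovery avoids a round trip outside the node's local network.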
CONSTRUCTING PROCESSING PIPELINE AT EDGE COMPUTING DEVICE
A computing system including an edge computing device. The edge computing device may include an edge device processor configured to receive edge device contextual data including computing resource availability data. Based at least in part on the edge device contextual data, the edge device processor may select a processing stage machine learning model of a plurality of processing stage machine learning models and construct a runtime processing pipeline of one or more runtime processing stages including the processing stage machine learning model. The edge device processor may receive a runtime input, and, at the runtime processing pipeline, generate a runtime output based at least in part on the runtime input. The edge device processor may generate runtime pipeline metadata that indicates the one or more runtime processing stages included in the runtime processing pipeline. The edge device processor may output the runtime output and the runtime pipeline metadata.
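The contextual selection and pipeline assembly can be sketched as follows. The model catalog, the memory-only selection rule, and the string-based stages are hypothetical; the sketch shows only the shape of the flow: contextual data selects a stage model, the pipeline runs, and metadata records which stages were used.

```python
# Hypothetical catalog of processing-stage ML models and their needs.
STAGE_MODELS = {
    "detector-large": {"min_mem_mb": 2048},
    "detector-small": {"min_mem_mb": 256},
}

def select_stage_model(contextual_data):
    """Pick the most capable stage model that current memory allows."""
    avail = contextual_data["available_mem_mb"]
    fitting = [name for name, req in STAGE_MODELS.items()
               if req["min_mem_mb"] <= avail]
    return max(fitting, key=lambda name: STAGE_MODELS[name]["min_mem_mb"])

def build_pipeline(contextual_data):
    """Construct a runtime pipeline around the selected stage model."""
    stages = ["decode", select_stage_model(contextual_data), "encode"]
    def run(runtime_input):
        # Stand-in processing; a real pipeline would invoke each stage.
        output = f"{runtime_input}:" + "|".join(stages)
        metadata = {"stages": stages}      # runtime pipeline metadata
        return output, metadata
    return run

pipeline = build_pipeline({"available_mem_mb": 512})
out, meta = pipeline("frame-0")
```

With only 512 MB available the smaller detector is chosen, and the emitted metadata lets downstream consumers know exactly which stages produced the output.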