G06F9/3871

Chip and chip-based data processing method

Embodiments of the present specification provide chips and chip-based data processing methods. In an embodiment, a method comprises: obtaining data associated with one or more neural networks transmitted from a server; for each layer of a neural network of the one or more neural networks, configuring, based on the data, a plurality of operator units based on a type of computation each operator unit performs; and invoking the plurality of operator units to perform, for each neuron of the layer, computations on the data based on the neurons of the layer of the neural network immediately above, to produce a value of the neuron.
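The layer-by-layer flow in this abstract can be pictured with a small software sketch. It is only an illustration under stated assumptions: the operator unit names (`mac`, `add`, `relu`), the layer dictionary layout, and the functions `run_layer` and `run_network` are hypothetical and do not come from the specification, which describes hardware operator units rather than Python callables.

```python
from typing import Callable, Dict, List

# Hypothetical operator units, keyed by the type of computation each performs.
OPERATOR_UNITS: Dict[str, Callable] = {
    "mac": lambda weights, inputs: sum(w * x for w, x in zip(weights, inputs)),
    "add": lambda a, b: a + b,
    "relu": lambda v: max(0.0, v),
}


def run_layer(layer: dict, prev_values: List[float]) -> List[float]:
    """Produce each neuron's value from the values of the layer above."""
    mac = OPERATOR_UNITS["mac"]
    add = OPERATOR_UNITS["add"]
    act = OPERATOR_UNITS[layer["activation"]]  # chosen per layer from the data
    return [
        act(add(mac(weights, prev_values), bias))
        for weights, bias in zip(layer["weights"], layer["biases"])
    ]


def run_network(layers: List[dict], inputs: List[float]) -> List[float]:
    """Invoke the operator units layer by layer."""
    values = list(inputs)
    for layer in layers:
        values = run_layer(layer, values)
    return values


# Assumed network description, standing in for data received from a server.
network = [
    {"weights": [[1.0, -1.0], [0.5, 0.5]], "biases": [0.0, 1.0], "activation": "relu"},
    {"weights": [[1.0, 1.0]], "biases": [-0.5], "activation": "relu"},
]
print(run_network(network, [2.0, 3.0]))  # prints [3.0]
```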

Dynamic usage of storage and processing unit allocation

Systems and methods are provided for managing dynamically allocated storage and processing units. The systems and methods include operations for determining a usage pattern having a peak usage portion and a low usage portion; reserving a first collection of units on a dynamic unit allocation system during the peak usage portion; detecting a transition from the peak usage portion to the low usage portion; in response to detecting the transition, instructing the dynamic unit allocation system to reduce the first collection of units to reserve a second collection of units corresponding to a second amount of the low usage portion; selecting asynchronous tasks that consume a set of units greater than the second collection of units; and during a period of time that the dynamic unit allocation system is reducing the first collection of units, causing the asynchronous tasks to be executed by the dynamic unit allocation system.

LONG-TERM PROGRAMMATIC WORKFLOW MANAGEMENT
20220164224 · 2022-05-26

A system and method for long-term programmatic workflow execution, including: iteratively, while no suspension event is detected, executing a code block within a run using variable values passed from another code block; when a suspension event is detected, suspending run execution and persistently storing the run state; and resuming run execution responsive to receipt of a valid run resumption request.
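As a rough illustration of the suspend, persist, and resume cycle described here, the following sketch runs a list of code blocks, stores the run state as JSON when a block signals a suspension event, and resumes when a request with a matching token arrives. All names (`CODE_BLOCKS`, `execute`, `resume`, `resume_token`) and the dictionary used as a persistent store are assumptions made for the example, not details from the application.

```python
import json


def step_fetch(variables):
    variables["order_id"] = 42
    return variables, False  # (passed variable values, suspension flag)


def step_wait_for_approval(variables):
    # Signals a suspension event until an approval has been recorded.
    return variables, not variables.get("approved", False)


def step_ship(variables):
    variables["shipped"] = True
    return variables, False


CODE_BLOCKS = [step_fetch, step_wait_for_approval, step_ship]


def execute(run, store):
    """Execute code blocks in order until the run completes or suspends."""
    while run["next_block"] < len(CODE_BLOCKS):
        block = CODE_BLOCKS[run["next_block"]]
        run["variables"], suspend = block(run["variables"])
        if suspend:
            store[run["id"]] = json.dumps(run)  # persist the run state
            return "suspended"
        run["next_block"] += 1
    store.pop(run["id"], None)  # a completed run no longer needs stored state
    return "completed"


def resume(run_id, token, updates, store):
    """Resume a stored run if the resumption request is valid."""
    run = json.loads(store[run_id])
    if token != run["resume_token"]:
        raise PermissionError("invalid run resumption request")
    run["variables"].update(updates)
    return execute(run, store), run


store = {}  # stands in for durable storage
run = {"id": "run-1", "next_block": 0, "variables": {}, "resume_token": "secret"}
status = execute(run, store)
resumed_status, resumed_run = resume("run-1", "secret", {"approved": True}, store)
```

In this sketch the suspended block is re-executed on resumption, which is one possible design; resuming at the following block would be equally consistent with the abstract.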

ASYNCHRONOUS PIPELINE MERGING USING LONG VECTOR ARBITRATION
20220121577 · 2022-04-21

Devices and techniques for asynchronous pipeline merging are described herein. An apparatus includes a memory controller, which includes merge circuitry, where the memory controller chiplet is configured to perform operations including those to: perform a bitwise logical operation on a first logging bit vector and a second logging bit vector to obtain a result vector, wherein the first logging bit vector is associated with a first pipeline and the second logging bit vector is associated with a second pipeline, and wherein bits in respective index positions of the first and second logging bit vectors represent transactions; select a completed transaction from the result vector using a round-robin technique; and forward the completed transaction from the set of completed transactions to an output pipeline.
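The merge step can be sketched in software as follows. The abstract does not say which bitwise operation is used; this example assumes a bitwise AND, so that a bit is set in the result only when both pipelines have logged the transaction at that index. The function names, the vector width, and the choice to clear forwarded bits are likewise assumptions for illustration.

```python
def select_round_robin(result: int, width: int, start: int):
    """Return (index, next_start) for the first set bit at or after start."""
    for offset in range(width):
        idx = (start + offset) % width
        if result & (1 << idx):
            return idx, (idx + 1) % width
    return None, start


def drain(vec_a: int, vec_b: int, width: int, start: int = 0):
    """Forward completed transactions in round-robin order until none remain."""
    forwarded = []
    while True:
        result = vec_a & vec_b  # assumed operation: completed in both pipelines
        idx, start = select_round_robin(result, width, start)
        if idx is None:
            return forwarded
        forwarded.append(idx)  # stands in for the output pipeline
        vec_a &= ~(1 << idx)   # clear the forwarded transaction in both vectors
        vec_b &= ~(1 << idx)


# Transactions 1 and 4 are logged by both pipelines; selection starts at index 2.
order = drain(0b10110, 0b11010, width=8, start=2)
```

Starting the scan after the last selected index is what keeps low-numbered transactions from always winning arbitration.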

METHOD AND SYSTEM TO PROCESS ASYNCHRONOUS AND DISTRIBUTED TRAINING TASKS

This disclosure relates generally to a method and system for processing asynchronous and distributed training tasks. Training a large-scale deep neural network (DNN) model with large-scale training data is time-consuming. The method creates a work queue (Q) with a predefined number of tasks comprising training data. Information on a set of central processing units (CPUs) and a set of graphics processing units (GPUs) is fetched from the current environment to initiate a parallel process asynchronously on the work queue (Q) to train a set of deep learning models with optimized resources. The process uses a data pre-processing technique to compute transformed training data, and an asynchronous model training technique to train the set of deep learning models on each GPU asynchronously with the transformed training data, based on a set of asynchronous model parameters.

Handling an input/output store instruction

An input/output store instruction is handled. A data processing system includes a system nest communicatively coupled to at least one input/output bus by an input/output bus controller. The data processing system further includes at least a data processing unit including a core, system firmware and an asynchronous core-nest interface. The data processing unit is communicatively coupled to the system nest via an aggregation buffer. The system nest is configured to asynchronously load from and/or store data to an external device which is communicatively coupled to the input/output bus. The data processing unit is configured to complete the input/output store instruction before an execution of the input/output store instruction in the system nest is completed.

TRACKING ASYNCHRONOUS EVENT PROCESSING

A messaging system receives a registration from a first microservice for one or more event types to publish, and the registration includes an event report policy. The messaging system receives a first event, and the first event is described by the event report policy. The first event is monitored as it is processed by a second microservice. An event report describing the results of the monitoring is delivered to the first microservice.
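One way to picture this flow is the sketch below, in which a publisher registers an event type together with a report policy and a callback, and the messaging system monitors a consumer while it processes the event. The class `MessagingSystem`, the `include_payload` policy key, and the use of plain callables in place of microservices are all hypothetical choices for this example.

```python
class MessagingSystem:
    def __init__(self):
        # event type -> (event report policy, report delivery callback)
        self.registrations = {}

    def register(self, event_type, report_policy, deliver_report):
        """Record a publisher's registration for one event type."""
        self.registrations[event_type] = (report_policy, deliver_report)

    def publish(self, event_type, payload, consumer):
        """Hand the event to a consumer, monitor it, and report back."""
        policy, deliver_report = self.registrations[event_type]
        try:
            consumer(payload)
            status = "ok"
        except Exception as exc:  # monitoring records failures as well
            status = f"error: {exc}"
        report = {"event_type": event_type, "status": status}
        if policy.get("include_payload"):
            report["payload"] = payload
        deliver_report(report)  # delivered to the first microservice


def failing_consumer(payload):
    raise ValueError("boom")


reports = []  # stands in for the first microservice's inbox
bus = MessagingSystem()
bus.register("order.created", {"include_payload": False}, reports.append)
bus.publish("order.created", {"id": 1}, lambda payload: None)
bus.publish("order.created", {"id": 2}, failing_consumer)
```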

Method of dispatching instruction data when a number of available resource credits meets a resource requirement

The processor chip can have a pre-execution pipeline sharing a plurality of resources including at least one resource of interest, and a resource tracker having more than one credit unit associated with each one of said at least one resource of interest. The method can include: decoding the instruction data to determine a resource requirement including a quantity of virtual credits required from the credit units for the at least one resource of interest; checking the resource tracker for an availability of said quantity of virtual credits; and, if the availability of said quantity of virtual credits is established, i) dispatching the instruction data, and ii) subtracting the quantity of said credits from the resource tracker.
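The decode, check, dispatch, and subtract sequence can be modeled in a few lines. This is a minimal sketch assuming a lookup table in place of a real decoder; the opcodes, resource names, and credit counts in `REQUIREMENTS` are invented for the example, and returning `False` stands in for stalling the instruction.

```python
from typing import Dict

# Hypothetical decode table: opcode -> virtual credits needed per resource.
REQUIREMENTS: Dict[str, Dict[str, int]] = {
    "load": {"load_queue": 1},
    "vector_mul": {"vector_regs": 2, "issue_slots": 1},
}


class ResourceTracker:
    def __init__(self, credits: Dict[str, int]):
        self.credits = dict(credits)

    def available(self, requirement: Dict[str, int]) -> bool:
        """True only if every resource has at least the required credits."""
        return all(self.credits.get(res, 0) >= qty for res, qty in requirement.items())

    def subtract(self, requirement: Dict[str, int]) -> None:
        for res, qty in requirement.items():
            self.credits[res] -= qty


def dispatch(opcode: str, tracker: ResourceTracker) -> bool:
    """Dispatch only when the credit requirement is met; otherwise stall."""
    requirement = REQUIREMENTS[opcode]  # stands in for decoding
    if not tracker.available(requirement):
        return False
    tracker.subtract(requirement)
    return True


tracker = ResourceTracker({"load_queue": 1, "vector_regs": 3, "issue_slots": 2})
first = dispatch("vector_mul", tracker)   # enough credits: dispatched
second = dispatch("vector_mul", tracker)  # only 1 vector_regs credit left: stalls
```

Checking every resource before subtracting from any of them keeps a stalled instruction from holding partial credits.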
