Patent classifications
G06F9/463
PRE-INSTRUCTION SCHEDULING REMATERIALIZATION FOR REGISTER PRESSURE REDUCTION
Examples are disclosed herein that relate to performing rematerialization operation(s) on program source code prior to instruction scheduling. In one example, a method includes, prior to performing instruction scheduling on program source code and for each basic block of the program source code: determining a register pressure at a boundary of the basic block; determining whether the register pressure at the boundary is greater than a target register pressure; based on the register pressure at the boundary being greater than the target register pressure, identifying one or more candidate instructions in the basic block suitable for rematerialization to reduce the register pressure at the boundary; and performing a rematerialization operation on at least one of the one or more candidate instructions to reduce the register pressure at the boundary to be less than the target register pressure.
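To make the abstract's flow concrete, the following is a minimal sketch of such a pre-scheduling pass. The IR types, the boundary-pressure model (a live-out register count), and the candidate test are hypothetical simplifications, not details from the patent.

```python
# Minimal sketch of a pre-instruction-scheduling rematerialization pass.
# All classes and the pressure model are illustrative assumptions.

from dataclasses import dataclass, field

@dataclass
class Instr:
    dest: str                       # register defined by this instruction
    srcs: list                      # registers read by this instruction
    rematerializable: bool = False  # e.g., a constant or address materialization

@dataclass
class BasicBlock:
    instrs: list = field(default_factory=list)
    live_out: set = field(default_factory=set)  # registers live at the block boundary

def boundary_pressure(block: BasicBlock) -> int:
    # "Pressure at the boundary" is approximated here by the number of
    # live-out registers (assumption; a real pass uses a liveness analysis).
    return len(block.live_out)

def rematerialize_block(block: BasicBlock, target_pressure: int) -> None:
    # Act only when boundary pressure exceeds the target pressure.
    if boundary_pressure(block) <= target_pressure:
        return
    # Candidates: instructions whose value is live across the boundary but
    # cheap to recompute at its uses, so the register need not stay live.
    candidates = [i for i in block.instrs
                  if i.rematerializable and i.dest in block.live_out]
    for instr in candidates:
        # "Rematerializing" here just drops the value from the live-out set,
        # modeling that later uses recompute it instead of keeping it live.
        block.live_out.discard(instr.dest)
        if boundary_pressure(block) < target_pressure:
            break
```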
APPARATUS FOR AUTOMATED LOOP CHECKING
An apparatus is configured to be installed on a terminal block to make an electrical connection to at least one I/O loop. The apparatus includes a terminal section having at least one pair of electrical terminals. The electrical terminals are arranged to be connected to the terminal block and to the I/O loop. The apparatus further includes an electronic section electrically connected to the terminal section and adapted to communicate with the I/O loop through the terminal section.
Maintenance of local and global lists of task control blocks in a processor-specific manner for allocation to tasks
In a computing storage environment having multiple processor devices, lists of Task Control Blocks (TCBs) are maintained in a processor-specific manner, such that each of the multiple processor devices is assigned a local TCB list. The local TCB list of each of the multiple processor devices is populated with a respective number of TCBs from a global TCB list. The local TCB list of each of the multiple processor devices exchanges TCBs with the global TCB list during processing to maintain the local TCB list of each of the multiple processor devices at the respective number.
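As a rough illustration of the described exchange, the sketch below keeps one local list per processor and refills or drains it against a shared global list. The target size, the locking scheme, and the TCB contents are assumptions for illustration.

```python
# Per-processor TCB lists backed by a shared global list (illustrative sketch).

import threading
from collections import deque

class TCB:
    """Placeholder task control block."""

class TCBAllocator:
    def __init__(self, num_processors: int, per_cpu_target: int, global_size: int):
        self.per_cpu_target = per_cpu_target
        self.global_list = deque(TCB() for _ in range(global_size))
        self.global_lock = threading.Lock()
        # One local list per processor, pre-populated from the global list.
        self.local_lists = [deque() for _ in range(num_processors)]
        for local in self.local_lists:
            self._refill(local)

    def _refill(self, local: deque) -> None:
        with self.global_lock:
            while len(local) < self.per_cpu_target and self.global_list:
                local.append(self.global_list.popleft())

    def _drain(self, local: deque) -> None:
        with self.global_lock:
            while len(local) > self.per_cpu_target:
                self.global_list.append(local.pop())

    def allocate(self, cpu: int) -> TCB:
        local = self.local_lists[cpu]
        if not local:
            self._refill(local)  # exchange with the global list on underflow
        return local.popleft()   # (sketch assumes the global list never runs dry)

    def free(self, cpu: int, tcb: TCB) -> None:
        local = self.local_lists[cpu]
        local.append(tcb)
        self._drain(local)       # return surplus TCBs to the global list
```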
DATA PROCESSING METHOD, APPARATUS, AND SERVER
Implementations of the present specification describe a computer-implemented method, medium, and system. In one computer-implemented method, a data reading request sent by a client device is received, where the data reading request includes a code value. When the code value is matched in first code value configuration data, a location value corresponding to the code value is obtained based on the first code value configuration data, where the first code value configuration data includes at least one code value that corresponds to a location value. When the location value satisfies a location value determining condition, block data identified by the location value is obtained. A reading result is sent to the client device based on the block data obtained.
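The read path described above can be sketched as a lookup-validate-fetch sequence. All names and the particular location value determining condition below are assumptions; the abstract does not specify them.

```python
# Illustrative sketch of the described read path: the code value is looked
# up in the first code value configuration data, the resulting location
# value is validated, and the block data it identifies is returned.

def handle_read_request(request: dict,
                        code_value_config: dict,
                        blocks: dict) -> dict:
    code_value = request["code_value"]
    # First code value configuration data: maps code values to location values.
    location = code_value_config.get(code_value)
    if location is None:
        return {"status": "code value not configured"}
    # Location value determining condition (assumed here: the location
    # must identify an existing block).
    if location not in blocks:
        return {"status": "invalid location value"}
    block_data = blocks[location]
    return {"status": "ok", "data": block_data}
```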
METHOD AND APPARATUS FOR EXECUTION OF NEURAL NETWORK
The present disclosure relates to methods and apparatuses for execution of a neural network. An exemplary method can be implemented by a processing unit. The processing unit can include a command parser configured to dispatch commands and computing tasks, and at least one core communicatively coupled with the command parser and configured to process the dispatched computing task. Each core can include a convolution unit, a pooling unit, at least one operation unit, and a sequencer communicatively coupled with the convolution unit, the pooling unit, and the at least one operation unit and configured to distribute instructions of the dispatched computing task to the convolution unit, the pooling unit, and the at least one operation unit for execution. The method can include: reading, by the convolution unit, data from a local memory of the at least one operation unit; performing, by the convolution unit, a convolution operation on the data to generate a feature map; and performing, by the pooling unit, a pooling operation on the feature map.
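A toy model of this execution flow follows: a convolution stage reads data from local memory and produces a feature map, and a pooling stage reduces it. NumPy loops stand in for the dedicated convolution and pooling units; the shapes and the 2x2 max pooling are assumed, not taken from the disclosure.

```python
# Toy sketch of the described flow: read data, convolve to a feature map,
# then pool the feature map.

import numpy as np

def convolve2d(data: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    kh, kw = kernel.shape
    oh, ow = data.shape[0] - kh + 1, data.shape[1] - kw + 1
    feature_map = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            feature_map[i, j] = np.sum(data[i:i + kh, j:j + kw] * kernel)
    return feature_map

def max_pool2x2(feature_map: np.ndarray) -> np.ndarray:
    h, w = (feature_map.shape[0] // 2) * 2, (feature_map.shape[1] // 2) * 2
    fm = feature_map[:h, :w]
    return fm.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

local_memory = np.arange(36, dtype=float).reshape(6, 6)  # data in local memory
kernel = np.ones((3, 3)) / 9.0                           # averaging kernel
feature_map = convolve2d(local_memory, kernel)           # convolution unit
pooled = max_pool2x2(feature_map)                        # pooling unit
```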
APPARATUS AND SYSTEM FOR EXECUTION OF NEURAL NETWORK
The present disclosure relates to apparatuses and systems for processing a neural network. A processing unit includes: a command parser configured to dispatch commands and computing tasks; and at least one core communicatively coupled with the command parser and configured to process the dispatched computing task, each core comprising: a convolution unit having circuitry configured to perform a convolution operation; a pooling unit having circuitry configured to perform a pooling operation; at least one operation unit having circuitry configured to process data; and a sequencer communicatively coupled with the convolution unit, the pooling unit, and the at least one operation unit, and having circuitry configured to distribute instructions of the dispatched computing task to the convolution unit, the pooling unit, and the at least one operation unit for execution.
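The dispatch structure can be modeled schematically as follows: a command parser hands a task to a core, whose sequencer routes each instruction to the convolution unit, the pooling unit, or an operation unit. The instruction tags and unit interfaces are invented for illustration.

```python
# Schematic model of the dispatch path: command parser -> core -> sequencer
# -> functional units. All interfaces are illustrative assumptions.

class Unit:
    def __init__(self, name: str):
        self.name = name
        self.executed = []

    def execute(self, instr):
        self.executed.append(instr)

class Core:
    def __init__(self, num_operation_units: int = 2):
        self.conv_unit = Unit("conv")
        self.pool_unit = Unit("pool")
        self.op_units = [Unit(f"op{i}") for i in range(num_operation_units)]

    def sequencer(self, task):
        # Distribute each instruction of the dispatched task to a unit.
        for idx, (kind, payload) in enumerate(task):
            if kind == "conv":
                self.conv_unit.execute(payload)
            elif kind == "pool":
                self.pool_unit.execute(payload)
            else:
                self.op_units[idx % len(self.op_units)].execute(payload)

class CommandParser:
    def __init__(self, cores):
        self.cores = cores

    def dispatch(self, task, core_id: int = 0):
        self.cores[core_id].sequencer(task)

parser = CommandParser([Core()])
parser.dispatch([("conv", "layer1"), ("pool", "layer1"), ("add", "bias")])
```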
GENERALIZED MACHINE LEARNING PIPELINE
A machine learning pipeline includes an input block that receives a dataset from a data source. The dataset includes columns that respectively correspond to different features in the dataset. A feature selection block of the pipeline reduces a size of the dataset by removing a subset of non-correlated features from the dataset, creating a modified dataset having only columns corresponding to correlated features. A model selection block of the pipeline tests performance of a plurality of models against the modified dataset using validation data values. The model selection block selects, from the plurality of models, a candidate model having a measured performance that meets or exceeds measured performances of other models in the plurality of models. An output block of the pipeline provides an output to a computational device that identifies the candidate model as being a preferred model for processing the dataset.
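A condensed sketch of the four blocks follows, using scikit-learn for the candidate models. The correlation threshold, the candidate model set, and the cross-validation scoring are illustrative assumptions; the abstract does not fix them.

```python
# Sketch of the pipeline's input, feature selection, model selection, and
# output blocks. Thresholds and models are assumptions.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

def input_block(data_source):
    return data_source()  # returns (X, y); columns of X are the features

def feature_selection_block(X, y, threshold=0.1):
    # Keep only columns whose absolute correlation with the target meets
    # the threshold; the non-correlated subset of features is removed.
    keep = [j for j in range(X.shape[1])
            if abs(np.corrcoef(X[:, j], y)[0, 1]) >= threshold]
    return X[:, keep]

def model_selection_block(X, y):
    candidates = [LogisticRegression(max_iter=1000), DecisionTreeClassifier()]
    # Measure each model against validation folds and keep the best scorer.
    scored = [(cross_val_score(m, X, y, cv=5).mean(), m) for m in candidates]
    return max(scored, key=lambda s: s[0])[1]

def output_block(model):
    return f"preferred model: {type(model).__name__}"

# X, y = input_block(load_my_dataset)   # hypothetical data source
# print(output_block(model_selection_block(feature_selection_block(X, y), y)))
```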
Extended asynchronous data mover functions compatibility indication
A method is provided that is executable by a processor of a computer. Note that the processor is communicatively coupled to a memory of the computer, and the memory stores a response block of a call command. In implementing the method, the processor defines a sub-functions field in the response block of the call command. Further, the processor indicates that a set of functions of a set of instructions is installed and available at an interface based on a corresponding sub-functions flag within the sub-functions field being set. Note that the interface is also executed on the computer and that the set of functions is represented by the corresponding sub-functions flag. The processor further indicates that the set of functions of the set of instructions is not installed based on the corresponding sub-functions flag not being set.
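A small illustration of the flag scheme: one bit per set of functions within the sub-functions field, where a set bit means the corresponding function set is installed and available. The field width and the bit assignments below are invented for illustration.

```python
# Hypothetical sub-functions field: one flag bit per set of functions.

SUBFUNCTIONS = {          # assumed bit assignments within the field
    "move_long": 0x01,
    "move_indirect": 0x02,
    "compare_swap": 0x04,
}

def installed_functions(response_block: bytes, field_offset: int = 0) -> list:
    # Read the sub-functions field (modeled as one byte of the response
    # block); a set flag indicates the function set is installed.
    field = response_block[field_offset]
    return [name for name, flag in SUBFUNCTIONS.items() if field & flag]

# A response block whose sub-functions field has bits 0x01 and 0x04 set:
print(installed_functions(bytes([0x05])))  # ['move_long', 'compare_swap']
```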
AUTONOMOUS JOB QUEUEING SYSTEM FOR HARDWARE ACCELERATORS
Embodiments may relate to an electronic device that includes a processor communicatively coupled with a hardware accelerator. The processor may be configured to identify, based on an indication of a priority level in a task control block (TCB), a location at which the TCB should be inserted in a queue of TCBs. The hardware accelerator may perform jobs related to the queue of TCBs in an order related to the order of TCBs within the queue. Other embodiments may be described or claimed.
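The priority-based insertion can be sketched as below. The convention that a lower value means higher priority, and the linear scan that keeps FIFO order within a level, are assumptions; the accelerator is modeled as draining the queue front to back.

```python
# Priority-ordered insertion of a TCB into the accelerator's job queue.

from collections import deque

class TCB:
    def __init__(self, job, priority: int):
        self.job = job
        self.priority = priority  # indication of priority level in the TCB

def insert_tcb(queue: deque, tcb: TCB) -> None:
    # Identify the location at which the TCB should be inserted: before the
    # first queued TCB of lower priority (FIFO within a priority level).
    for i, queued in enumerate(queue):
        if queued.priority > tcb.priority:
            queue.insert(i, tcb)
            return
    queue.append(tcb)

queue: deque = deque()
for job, prio in [("checksum", 2), ("compress", 1), ("scrub", 2), ("parity", 0)]:
    insert_tcb(queue, TCB(job, prio))
# The accelerator performs jobs in queue order: parity, compress, checksum, scrub.
```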
COMPUTE TASK STATE ENCAPSULATION
One embodiment of the present invention sets forth a technique for encapsulating compute task state that enables out-of-order scheduling and execution of the compute tasks. The scheduling circuitry organizes the compute tasks into groups based on priority levels. The compute tasks may then be selected for execution using different scheduling schemes. Each group is maintained as a linked list of pointers to compute tasks that are encoded as task metadata (TMD) stored in memory. A TMD encapsulates the state and parameters needed to initialize, schedule, and execute a compute task.
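Schematically, the grouping might look like the following, where each priority level holds a linked list of pointers to TMD records and the scheduler pops from the highest non-empty group. The TMD fields and the highest-priority-first scheme are illustrative assumptions, not the patent's definitive design.

```python
# Priority groups maintained as linked lists of task metadata (TMD) records.

class TMD:
    """Task metadata: encapsulated state/parameters of one compute task."""
    def __init__(self, name, params):
        self.name = name
        self.params = params
        self.next = None          # link to the next TMD in its group

class PriorityGroups:
    def __init__(self, levels: int):
        self.heads = [None] * levels  # index 0 = highest priority (assumed)
        self.tails = [None] * levels

    def add(self, priority: int, tmd: TMD) -> None:
        # Append the TMD to the linked list for its priority group.
        if self.tails[priority] is None:
            self.heads[priority] = self.tails[priority] = tmd
        else:
            self.tails[priority].next = tmd
            self.tails[priority] = tmd

    def select(self) -> TMD:
        # One possible scheduling scheme: always take from the highest
        # non-empty priority group, independent of submission order.
        for level, head in enumerate(self.heads):
            if head is not None:
                self.heads[level] = head.next
                if head.next is None:
                    self.tails[level] = None
                return head
        raise IndexError("no compute tasks queued")
```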