Patent classifications
G06F9/3836
Instruction issuing, e.g. dynamic instruction scheduling or out-of-order instruction execution
Techniques for operating a computing system to perform neural network operations are disclosed. In one example, a method comprises receiving a neural network model, determining a sequence of neural network operations based on data dependencies in the neural network model, and determining a set of instructions to map the sequence of neural network operations to the processing resources of a neural network processor. The method further comprises determining, based on a set of memory access operations included in the set of instructions, a first set of memory references associated with a first location of an external memory to store input data and a second set of memory references associated with a second location of the external memory to store output data, and generating an instruction file including the set of instructions, the first set of memory references, and the second set of memory references.
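To make the compilation flow concrete, here is a minimal C++ sketch: it orders a toy model by data dependencies (Kahn's algorithm), emits one instruction per operation, and records input-side and output-side memory references. The Op and InstructionFile types, the 0x100 stride, and the base addresses are illustrative assumptions, not the patent's actual formats.

```cpp
#include <cstdint>
#include <iostream>
#include <queue>
#include <string>
#include <vector>

// Hypothetical model node: an operation plus the indices it depends on.
struct Op { std::string name; std::vector<int> deps; };

struct InstructionFile {
    std::vector<std::string> instructions; // one instruction per scheduled op
    std::vector<uint64_t> inputRefs;       // first set: input region of external memory
    std::vector<uint64_t> outputRefs;      // second set: output region of external memory
};

InstructionFile compile(const std::vector<Op>& model,
                        uint64_t inputBase, uint64_t outputBase) {
    // Kahn's algorithm: order ops so every op runs after its data dependencies.
    std::vector<int> indegree(model.size(), 0);
    std::vector<std::vector<int>> users(model.size());
    for (int i = 0; i < (int)model.size(); ++i)
        for (int d : model[i].deps) { users[d].push_back(i); ++indegree[i]; }

    std::queue<int> ready;
    for (int i = 0; i < (int)model.size(); ++i)
        if (indegree[i] == 0) ready.push(i);

    InstructionFile file;
    const uint64_t stride = 0x100;         // illustrative tensor slot size
    while (!ready.empty()) {
        int op = ready.front(); ready.pop();
        size_t slot = file.instructions.size();
        file.instructions.push_back("EXEC " + model[op].name);
        file.inputRefs.push_back(inputBase + slot * stride);
        file.outputRefs.push_back(outputBase + slot * stride);
        for (int u : users[op])
            if (--indegree[u] == 0) ready.push(u);
    }
    return file;
}

int main() {
    std::vector<Op> model = {{"conv1", {}}, {"relu1", {0}}, {"fc1", {1}}};
    InstructionFile f = compile(model, 0x1000, 0x8000);
    for (const std::string& ins : f.instructions) std::cout << ins << '\n';
}
```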
Estimate and control execution time of a utility command
A method, system, and computer program product to plan and schedule executions of various utility tasks of a utility command during a maintenance window, the method including receiving a utility command. The method may also include identifying possible utility tasks used to execute the utility command. The method may also include determining preferred utility tasks. The method may also include calculating a degree of parallelism for the preferred utility tasks. The method may also include generating a utility execution plan for the utility command. The method may also include analyzing the utility execution plan against resource constraints of a time window and sub time windows of the time window. The method may also include generating a time window execution plan for each sub time window of the sub time windows. The method may also include updating the utility execution plan with the time window execution plans.
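A rough C++ sketch of the scheduling idea, under invented assumptions (the task durations, a two-lane resource limit, and 60-minute sub-windows are all made up here): the degree of parallelism is capped by the resource constraint, and tasks are packed greedily into parallel lanes, from which each task's sub-window follows.

```cpp
#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical task: a utility sub-task with an estimated duration in minutes.
struct Task { std::string name; int minutes; };

int main() {
    std::vector<Task> tasks = {{"reorg", 40}, {"copy", 25}, {"runstats", 15}, {"check", 10}};
    int resourceLimit = 2;                      // max concurrent tasks the window allows
    int parallelism = std::min<int>(resourceLimit, (int)tasks.size());
    int subWindowBudget = 60;                   // minutes per sub-window

    // Longest task first tends to pack better.
    std::sort(tasks.begin(), tasks.end(),
              [](const Task& a, const Task& b) { return a.minutes > b.minutes; });

    std::vector<int> laneLoad(parallelism, 0);  // minutes used per parallel lane
    for (const Task& t : tasks) {
        // Place each task in the least-loaded lane (one lane per parallel slot).
        auto lane = std::min_element(laneLoad.begin(), laneLoad.end());
        int window = *lane / subWindowBudget;   // which sub-window it lands in
        std::cout << t.name << " -> lane " << (lane - laneLoad.begin())
                  << ", sub-window " << window << '\n';
        *lane += t.minutes;
    }
}
```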
Processor Power Management Using Instruction Throttling
Systems and methods are disclosed for processor power management using instruction throttling. For example, an integrated circuit may include a processor core including a processor pipeline configured to execute instructions; a register configured to store a power dial value that indicates a portion of available clock cycles for throttling of instruction flow through the processor pipeline; and an instruction throttling circuit configured to periodically stall removal of instructions from a queue in the processor pipeline for a number of clock cycles that is determined based on the power dial value.
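The throttling scheme can be illustrated with a short simulation. The sketch below assumes a hypothetical 16-cycle window in which a power dial value of 4 stalls queue removal for the first 4 cycles of each window; the real circuit's period and dial encoding are not specified in the abstract.

```cpp
#include <cstdint>
#include <iostream>

// Simulates periodic instruction throttling: out of every `period` clock
// cycles, the power dial value selects how many cycles dequeue is stalled.
int main() {
    const uint32_t period = 16;       // assumed throttling window in clock cycles
    const uint32_t powerDial = 4;     // stall 4 of every 16 cycles (25% throttle)

    uint32_t issued = 0, stalled = 0;
    for (uint32_t cycle = 0; cycle < 160; ++cycle) {
        if (cycle % period < powerDial) {
            ++stalled;                // instruction held in the queue this cycle
        } else {
            ++issued;                 // instruction leaves the queue this cycle
        }
    }
    std::cout << "issued " << issued << " cycles, stalled " << stalled
              << " cycles (" << 100.0 * stalled / (issued + stalled)
              << "% throttled)\n";
}
```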
Computing device and method
The present disclosure provides a computation device. The computation device is configured to perform a machine learning computation, and includes a storage unit, a controller unit, an operation unit, and a conversion unit. The storage unit is configured to obtain input data and a computation instruction. The controller unit is configured to extract and parse the computation instruction from the storage unit to obtain one or more operation instructions, and to send the one or more operation instructions and the input data to the operation unit. The operation unit is configured to perform operations on the input data according to the one or more operation instructions to obtain a computation result of the computation instruction. In the examples of the present disclosure, the input data involved in machine learning computations is represented by fixed-point data, thereby improving the processing speed and efficiency of training operations.
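To show why fixed-point data speeds such units up, here is a self-contained Q8.8 fixed-point example in C++ (the Q8.8 format and the tiny dot product are illustrative choices, not the device's actual representation): multiplies widen to 32 bits and shift the scale back out, so only integer hardware is needed.

```cpp
#include <cstdint>
#include <cmath>
#include <iostream>

// Q8.8 fixed point: 8 integer bits, 8 fractional bits, stored in int16_t.
constexpr int FRAC_BITS = 8;

int16_t toFixed(float x)   { return (int16_t)std::lround(x * (1 << FRAC_BITS)); }
float   toFloat(int16_t q) { return (float)q / (1 << FRAC_BITS); }

// Fixed-point multiply: widen to 32 bits, then shift the scale back out.
int16_t mulFixed(int16_t a, int16_t b) {
    return (int16_t)(((int32_t)a * b) >> FRAC_BITS);
}

int main() {
    // Dot product of a tiny weight/activation pair, as an operation unit might run.
    int16_t w[2] = {toFixed(0.5f), toFixed(-1.25f)};
    int16_t x[2] = {toFixed(2.0f), toFixed(0.8f)};
    int16_t acc = 0;
    for (int i = 0; i < 2; ++i) acc += mulFixed(w[i], x[i]);
    std::cout << "fixed-point result: " << toFloat(acc) << '\n';  // approximately 0 after rounding
}
```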
HANDLING OF SINGLE-COPY-ATOMIC LOAD/STORE INSTRUCTION
In response to a single-copy-atomic load/store instruction for requesting an atomic transfer of a target block of data between the memory system and the registers, where the target block has a given size greater than a maximum data size supported for a single load/store micro-operation by a load/store data path, instruction decoding circuitry maps the single-copy-atomic load/store instruction to two or more mapped load/store micro-operations each for requesting transfer of a respective portion of the target block of data. In response to the mapped load/store micro-operations, load/store circuitry triggers issuing of a shared memory access request to the memory system to request the atomic transfer of the target block of data of said given size to or from the memory system, and triggers separate transfers of respective portions of the target block of data over the load/store data path.
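A simplified C++ model of the decode-and-split behaviour, assuming a 16-byte target block and an 8-byte data path (both sizes invented for illustration): the instruction decodes into two micro-ops, the first of which triggers the single shared memory request, while each micro-op moves its own portion over the data path.

```cpp
#include <array>
#include <cstdint>
#include <cstring>
#include <iostream>
#include <vector>

// One decoded micro-op: which half of the 16-byte block it transfers.
struct MicroOp { int portion; };   // 0 = low 8 bytes, 1 = high 8 bytes

// Decoder: one 16-byte single-copy-atomic load becomes two 8-byte micro-ops
// because the data path is assumed to be only 8 bytes wide.
std::vector<MicroOp> decodeAtomicLoad16() { return {{0}, {1}}; }

int main() {
    std::array<uint8_t, 16> memoryBlock{};   // target block in the memory system
    for (int i = 0; i < 16; ++i) memoryBlock[i] = (uint8_t)i;

    uint64_t regs[2];                        // destination register pair
    bool requestIssued = false;
    for (const MicroOp& uop : decodeAtomicLoad16()) {
        // The first micro-op triggers a single shared 16-byte memory request,
        // so the block is observed atomically; later micro-ops reuse it.
        if (!requestIssued) {
            std::cout << "single shared 16-byte memory request issued\n";
            requestIssued = true;
        }
        // Each micro-op then moves its own 8-byte portion over the data path.
        std::memcpy(&regs[uop.portion], memoryBlock.data() + 8 * uop.portion, 8);
    }
    std::cout << std::hex << regs[0] << ' ' << regs[1] << '\n';
}
```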
TASK MANAGING SYSTEM FOR TESTING-CONFIGURING VEHICLES BASED ON A TASK ORDER AND METHOD THEREOF
A task managing system (TMS) test-configures one or more vehicles based on a task order for a selected vehicle from a vehicle management system (VMS). The TMS includes a plurality of task execution controllers configured to communicate with the vehicles. The TMS further includes a processor configured to execute instructions stored in a non-transitory computer-readable medium to operate as a task delegation module and a task status module. The task delegation module is configured to assign a selected vehicle from among the one or more vehicles to a selected task execution controller, where the selected task execution controller is configured to execute the task order for the selected vehicle. The task status module is configured to monitor a status of the task order being executed by the selected task execution controller based on an update message from the selected task execution controller.
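A minimal C++ sketch of the two modules, with invented identifiers (TEC-1, VIN123, order-7): the delegation module picks the least-loaded controller for the selected vehicle, and the status module folds update messages into an order-status map.

```cpp
#include <iostream>
#include <map>
#include <string>
#include <vector>

// Hypothetical controller that executes a task order on one vehicle.
struct TaskExecutionController {
    std::string id;
    int activeOrders = 0;
};

int main() {
    std::vector<TaskExecutionController> controllers = {{"TEC-1"}, {"TEC-2"}};
    std::map<std::string, std::string> orderStatus;  // task order -> status

    // Task delegation module: pick the less-loaded controller for the vehicle.
    auto& tec = controllers[0].activeOrders <= controllers[1].activeOrders
                    ? controllers[0] : controllers[1];
    ++tec.activeOrders;
    std::cout << "vehicle VIN123 order assigned to " << tec.id << '\n';

    // Task status module: apply update messages from the controller.
    orderStatus["VIN123/order-7"] = "RUNNING";   // first update message
    orderStatus["VIN123/order-7"] = "COMPLETE";  // later update message
    std::cout << "order-7 status: " << orderStatus["VIN123/order-7"] << '\n';
}
```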
Load balancing of machine learning algorithms
A computer-implemented method of executing a plurality of discrete software modules, each including: a machine learning algorithm as an executable software component configurable to approximate a function relating a domain data set to a range data set; a data store; and a message handler as an executable software component arranged to receive input data and communicate output data for the module. The message handler is adapted to determine domain parameters for the algorithm based on the input data and to generate the output data based on a result generated by the algorithm, and each module has an associated metric of resource utilization by the module. The method includes receiving a request for a machine learning task and selecting a module from the plurality of modules for the task based on the metric associated with the module.
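The selection step reduces to picking the module whose resource-utilization metric is most favourable. A small C++ sketch, assuming a scalar utilization metric where lower means more headroom (the abstract leaves the metric itself unspecified):

```cpp
#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical module descriptor: a learner plus its resource-utilization metric.
struct Module {
    std::string name;
    double utilization;   // e.g. fraction of a CPU/memory budget currently in use
};

// Select the module whose metric indicates the most headroom for the task.
const Module& selectModule(const std::vector<Module>& modules) {
    return *std::min_element(modules.begin(), modules.end(),
        [](const Module& a, const Module& b) { return a.utilization < b.utilization; });
}

int main() {
    std::vector<Module> pool = {{"regressor-a", 0.72}, {"regressor-b", 0.31},
                                {"regressor-c", 0.55}};
    std::cout << "task routed to " << selectModule(pool).name << '\n';  // regressor-b
}
```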
Maintaining sequentiality for media management of a memory sub-system
Methods, systems, and devices for maintaining sequentiality for media management of a memory sub-system are described. A plurality of read commands in connection with a set of media management operations for a plurality of transfer units are issued according to a read sequence. A plurality of entries associated with the set of media management operations are stored. A plurality of write commands in connection with the set of media management operations are then issued, based on the plurality of entries, according to the read sequence.
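A compact C++ illustration of the sequencing discipline, using made-up transfer-unit ids and addresses: reads are issued in a fixed sequence, one entry is stored per read, and writes are then issued by walking the stored entries so they inherit the read order.

```cpp
#include <cstdint>
#include <iostream>
#include <vector>

// One bookkeeping entry per transfer unit, recorded in read-issue order.
struct Entry { uint32_t transferUnit; uint64_t sourceAddr; };

int main() {
    std::vector<uint32_t> transferUnits = {7, 3, 9, 1};  // arbitrary unit ids
    std::vector<Entry> entries;

    // Phase 1: issue reads according to the read sequence and record an entry
    // for each, preserving the order in which the reads were issued.
    for (size_t i = 0; i < transferUnits.size(); ++i) {
        entries.push_back({transferUnits[i], 0x1000 + 0x10 * (uint64_t)i});
        std::cout << "read  TU " << transferUnits[i] << '\n';
    }

    // Phase 2: issue writes by walking the stored entries, so write order
    // matches read order even if read completions arrived out of order.
    for (const Entry& e : entries)
        std::cout << "write TU " << e.transferUnit
                  << " (from 0x" << std::hex << e.sourceAddr << std::dec << ")\n";
}
```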
Execution circuits using discardable state
There is provided execution circuitry. Storage circuitry retains a stored state of the execution circuitry. Operation receiving circuitry receives, from issue circuitry, an operation signal corresponding to an operation to be performed that accesses the stored state of the execution circuitry from the storage circuitry. Functional circuitry seeks to perform the operation in response to the operation signal by accessing the stored state of the execution circuitry from the storage circuitry. Delete request receiving circuitry receives a deletion signal and in response to the deletion signal, deletes the stored state of the execution circuitry from the storage circuitry. State loss indicating circuitry responds to the operation signal when the stored state of the execution circuitry is not present and is required for the operation by indicating an error. In addition, there is provided a data processing apparatus comprising issue circuitry to issue an operation to execution circuitry. The execution circuitry stores a stored state that is accessed during performance of the operation and error detecting circuitry detects an indication of an error from the execution circuitry that the stored state is required for performance of the operation and that the stored state has been deleted.
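The discardable-state contract can be modelled in a few lines of C++ with std::optional standing in for the storage circuitry (a deliberate simplification): an operation that finds the state deleted signals a state-loss error instead of computing.

```cpp
#include <iostream>
#include <optional>

// Storage circuitry modelled as optional state: either present or deleted.
struct ExecutionUnit {
    std::optional<int> storedState;   // e.g. an accumulated partial result

    // Operation signal: the functional circuitry needs the stored state.
    bool performOperation() {
        if (!storedState) {
            std::cout << "state-loss error signalled\n";  // state was discarded
            return false;                                 // issue side must recover
        }
        *storedState += 1;                                // use and update the state
        return true;
    }

    // Deletion signal: discard the state, e.g. to save power.
    void deleteState() { storedState.reset(); }
};

int main() {
    ExecutionUnit unit;
    unit.storedState = 41;
    unit.performOperation();   // succeeds, state becomes 42
    unit.deleteState();        // state discarded
    unit.performOperation();   // signals the state-loss error
}
```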
Inhibiting load instruction execution based on reserving a resource of a load and store queue but failing to reserve a resource of a store data queue
A calculation processing apparatus includes a decoder that decodes memory access instructions including a store instruction and a load instruction; a first queue that stores the decoded memory access instructions; a second queue that stores store data related to the store instruction; a storage circuit that stores target address information of the store instruction for which the first queue is reserved but the second queue is not reserved; and an inhibitor that inhibits execution of the load instruction when, during processing of the load instruction, address information matching the target address information of the load instruction is stored in the storage circuit. This configuration prevents the order of a store instruction and a load instruction from being switched.
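A sketch of the inhibit check in C++, with the storage circuit reduced to a set of pending store addresses (an assumption for illustration): a load whose address matches a store that holds a load/store-queue slot but no store-data-queue slot is inhibited until the store's data slot is reserved.

```cpp
#include <cstdint>
#include <iostream>
#include <unordered_set>

// Addresses of stores that hold a load/store-queue slot but whose store
// *data* queue slot could not yet be reserved (data not yet available).
std::unordered_set<uint64_t> pendingStoreAddrs;

// A load must not bypass an older store to the same address whose data
// is still missing, so it is inhibited (retried later) on a match.
bool mayExecuteLoad(uint64_t loadAddr) {
    return pendingStoreAddrs.count(loadAddr) == 0;
}

int main() {
    // A store to 0x100 got an LSQ slot but no store-data-queue slot yet.
    pendingStoreAddrs.insert(0x100);

    std::cout << "load 0x100: "
              << (mayExecuteLoad(0x100) ? "execute" : "inhibit") << '\n';  // inhibit
    std::cout << "load 0x200: "
              << (mayExecuteLoad(0x200) ? "execute" : "inhibit") << '\n';  // execute

    // Once the store's data-queue slot is reserved, the address is cleared
    // and the inhibited load can execute in order.
    pendingStoreAddrs.erase(0x100);
    std::cout << "load 0x100: "
              << (mayExecuteLoad(0x100) ? "execute" : "inhibit") << '\n';  // execute
}
```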