Patent classifications
G06F2209/509
OVERLAPPED GEOMETRY PROCESSING IN A MULTICORE GPU
A multicore graphics processing unit (GPU) and a method of operating a GPU having at least a first core and a second core. A client driver writes a series of geometry commands in a command buffer, along with associated dependency data that indicates the extent to which correct execution of the geometry commands is dependent on the completion of execution of other commands. The first core reads a first geometry command from the command buffer and executes it. The second core reads a second geometry command from the command buffer. The second core determines that the second geometry command is not dependent on the results of the first geometry command, and, in response, executes the second geometry command.
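The dependency check described in this abstract can be illustrated with a small, deterministic simulation. This is only a sketch under stated assumptions: the names `GeometryCommand`, `depends_on`, and `schedule` are hypothetical, the dependency data is modelled as a set of command IDs, and real hardware would run the cores concurrently rather than in lock-step rounds.

```python
from dataclasses import dataclass


@dataclass
class GeometryCommand:
    cmd_id: int
    # Hypothetical dependency data: IDs of commands that must finish first.
    depends_on: frozenset = frozenset()


def schedule(commands, num_cores=2):
    """Simulate cores pulling commands from a shared buffer in rounds.

    Returns one dict per round, mapping core index -> executed command ID.
    A core only executes a command whose dependencies completed in an
    earlier round; otherwise it stays idle for that round.
    """
    completed = set()
    pending = list(commands)
    timeline = []
    while pending:
        step = {}
        for core in range(num_cores):
            for cmd in pending:
                if cmd.depends_on <= completed:
                    step[core] = cmd.cmd_id
                    pending.remove(cmd)
                    break  # this core is busy for the rest of the round
        if not step:
            raise ValueError("remaining commands have unsatisfiable dependencies")
        completed.update(step.values())
        timeline.append(step)
    return timeline


# Commands 0 and 1 are independent, so they overlap on two cores;
# command 2 depends on command 0 and runs in the next round.
buffer = [
    GeometryCommand(0),
    GeometryCommand(1),
    GeometryCommand(2, frozenset({0})),
]
timeline = schedule(buffer)
```

If the second command instead depended on the first, the second core would idle in the first round and the two commands would run back to back.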
DATA PROCESSING METHOD AND APPARATUS AND HETEROGENEOUS SYSTEM
A data processing method and apparatus, and a heterogeneous system, pertaining to the field of computer technologies are provided. The heterogeneous system includes a processor connected to an accelerator. A secondary memory is connected to the accelerator. The processor is configured to write to-be-processed data into the secondary memory and trigger the accelerator to access and process the to-be-processed data stored in the secondary memory according to a processing instruction. The accelerator is configured to write a processing result of the to-be-processed data into the secondary memory and to trigger the processor to read the processing result. Processing efficiency is enhanced by reducing the number of times of interaction between the processor and the accelerator and simplifying the procedure for data processing.
KERNEL OPTIMIZATION AND DELAYED EXECUTION
A kernel comprising at least one dynamically configurable parameter is submitted by a processor. The kernel is to be executed at a later time. Data is received after the kernel has been submitted. The at least one dynamically configurable parameter of the kernel is updated based on the data. The kernel having the at least one updated dynamically configurable parameter is executed after the at least one dynamically configurable parameter has been updated.
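As a rough illustration of the submit, update, execute sequence in this abstract, the sketch below models a kernel as a plain Python callable. The class `DeferredKernel`, its methods, and the `scale` function are all hypothetical names chosen for the example; the abstract does not say how a real runtime exposes these steps.

```python
class DeferredKernel:
    """A submitted kernel whose parameters may change until it executes."""

    def __init__(self, fn, **params):
        self.fn = fn
        self.params = dict(params)  # dynamically configurable parameters
        self.executed = False

    def update(self, **new_params):
        # Only parameters declared at submission time may be updated,
        # and only before the kernel has run.
        if self.executed:
            raise RuntimeError("kernel has already executed")
        unknown = set(new_params) - set(self.params)
        if unknown:
            raise KeyError(f"unknown parameters: {sorted(unknown)}")
        self.params.update(new_params)

    def execute(self, data):
        self.executed = True
        return self.fn(data, **self.params)


def scale(values, factor):
    return [v * factor for v in values]


kernel = DeferredKernel(scale, factor=1)   # submitted for later execution
incoming = [1, 2, 3]                       # data received after submission
kernel.update(factor=len(incoming))        # parameter updated based on the data
result = kernel.execute(incoming)          # executed with the updated parameter
```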
QUANTUM COMPUTING SERVICE WITH QUALITY OF SERVICE (QoS) ENFORCEMENT VIA OUT-OF-BAND PRIORITIZATION OF QUANTUM TASKS
A quantum computing service includes a quality of service (QoS) and out-of-band prioritization module. The QoS and out-of-band prioritization module enforces QoS guarantees for quantum tasks and quantum jobs submitted to the quantum computing service while allowing for processing of the quantum jobs and quantum tasks based on QoS guarantees and not necessarily in an order in which the quantum jobs or quantum tasks are received. Also, the QoS and out-of-band prioritization module determines updated priorities out-of-band based on quantum resource usage information for previously executed quantum tasks such that submittal of pending quantum tasks is not delayed while updated priorities are being determined.
REGULATING CLOUD BUDGET CONSUMPTION
An approach for optimizing storage on a local storage device. The approach receives a cloud resource budget limit and a cloud budget time interval. The approach estimates future cloud resource requests expected to arrive before the end of the cloud budget time interval. The approach calculates definitive and estimated costs of cloud resource usage types. The approach calculates a total estimated resource budget consumption. The approach determines if the total estimated resource budget consumption exceeds the cloud resource budget limit. If the approach determines the cloud resource budget limit is not exceeded, then the approach outputs a set of existing unfulfilled cloud resource requests for fulfillment. If the approach determines the cloud resource budget limit is exceeded, then the approach outputs a subset of the set of existing unfulfilled cloud resource requests that do not exceed the cloud resource budget limit for fulfillment.
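The budget decision in this abstract can be sketched as a single function. The abstract does not say how the subset is chosen when the limit is exceeded, so the greedy first-fit selection in arrival order below is an assumption, as are the function name `select_requests` and the representation of requests as `(name, cost)` pairs.

```python
def select_requests(budget_limit, definitive_cost, pending, estimated_future_cost):
    """Return the pending requests that may be fulfilled within the budget.

    pending: list of (name, cost) pairs for existing unfulfilled requests.
    definitive_cost: cost already incurred in the current budget interval.
    estimated_future_cost: estimated cost of requests expected to arrive
    before the interval ends.
    """
    total_estimated = (
        definitive_cost + sum(cost for _, cost in pending) + estimated_future_cost
    )
    if total_estimated <= budget_limit:
        return list(pending)  # limit not exceeded: fulfil everything

    # Limit exceeded: keep a subset that fits (assumed greedy first-fit).
    remaining = budget_limit - definitive_cost - estimated_future_cost
    chosen = []
    for name, cost in pending:
        if cost <= remaining:
            chosen.append((name, cost))
            remaining -= cost
    return chosen


pending = [("a", 30), ("b", 40), ("c", 10)]
within_limit = select_requests(200, 40, pending, 20)
over_limit = select_requests(100, 40, pending, 20)
```

With a limit of 100, the total estimate is 140, so only the requests that fit in the remaining 40 units are output.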
Remote product invocation framework
A method for remote product invocation includes configuring an invocation framework that includes an integration module and an endpoint/handler module. Once configured, the integration module is configured to: receive a source object; format data from said source object for a desired operation; and utilize said endpoint/handler module to make a connection to an external service that executes said desired operation using said data from said source object. A system for remote invocation of external services includes a calling entity which generates a source object containing data for execution of a remote operation; and an integration module configured to receive the source object, interpret the source object, and pass the data to an endpoint/handler which opens a connection with an external service and executes the remote operation.
Identifying dependencies in a control sequence for execution on a hardware accelerator
Provided are embodiments for a computer-implemented method, system and computer program product for identifying dependencies in a control sequence. Embodiments include receiving a control block that comprises a first error dependency (EDEP) level, maintaining the first EDEP level, and determining whether the received control block was successfully executed. Embodiments also include receiving a subsequent control block that comprises a second EDEP level, comparing the first EDEP level and the second EDEP level, and providing the subsequent control block for execution based at least in part on the successful execution of the received control block, and on the second EDEP level being less than or equal to the first EDEP level.
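Read literally, the dispatch condition in this abstract reduces to a two-part test. The sketch below encodes only that stated condition; the function name `should_dispatch` is hypothetical, and the abstract does not specify what happens to a control block that fails the test, so that case is simply reported as `False`.

```python
def should_dispatch(first_edep, first_succeeded, second_edep):
    """Decide whether a subsequent control block is provided for execution.

    first_edep: EDEP level maintained from the previously received block.
    first_succeeded: whether that block executed successfully.
    second_edep: EDEP level of the subsequent control block.
    """
    return first_succeeded and second_edep <= first_edep


decisions = [
    should_dispatch(2, True, 1),   # lower level after success
    should_dispatch(2, True, 3),   # higher level than the maintained one
    should_dispatch(2, False, 1),  # previous block did not succeed
]
```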
ARTIFICIAL INTELLIGENCE OPERATION PROCESSING METHOD AND APPARATUS, SYSTEM, TERMINAL, AND NETWORK DEVICE
Described are an artificial intelligence operation processing method and apparatus, a system, a terminal, and a network device. In the method, a terminal receives indication information sent by a network device, wherein the indication information is used for indicating information about an artificial intelligence/machine learning (AI/ML) task performed by the terminal. The present invention solves the technical problems in the related art of unmet needs and wasted resources when a terminal performs an AI/ML operation locally, thereby achieving the effect of fully utilizing various resources of the terminal, such as computing power, storage, power supply, and communication rate, according to actual changes.
Executing cross-core copy instructions in an accelerator to temporarily store an operand that cannot be accommodated by on-chip memory of a primary core into a secondary core
An acceleration unit including a primary core and a secondary core is provided. The primary core includes a first on-chip memory, a primary core sequencer adapted to decode a received first cross-core copy instruction, and a primary core memory copy engine adapted to acquire a first operand from a first address in the first on-chip memory and copy the acquired first operand to a second address in a second on-chip memory of the secondary core. Further, the secondary core includes a second on-chip memory, a secondary core sequencer adapted to decode a received second cross-core copy instruction, and a secondary core memory copy engine adapted to acquire the first operand from the second address in the second on-chip memory and copy the acquired first operand back to the first address in the first on-chip memory.
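The round trip described in this abstract can be modelled with two dictionaries standing in for the on-chip memories. This is a sketch only: the `Core` class, the `cross_core_copy` helper, the addresses, and the step that frees the primary core's slot in between are all assumptions made for illustration, not details taken from the abstract.

```python
class Core:
    """A core with an on-chip memory modelled as address -> operand."""

    def __init__(self, name):
        self.name = name
        self.on_chip = {}


def cross_core_copy(src, src_addr, dst, dst_addr):
    """Copy the operand at src_addr on src into dst_addr on dst."""
    dst.on_chip[dst_addr] = src.on_chip[src_addr]


primary = Core("primary")
secondary = Core("secondary")
FIRST_ADDR, SECOND_ADDR = 0x10, 0x80  # hypothetical addresses

primary.on_chip[FIRST_ADDR] = [1.0, 2.0, 3.0]  # the first operand

# First cross-core copy: park the operand in the secondary core.
cross_core_copy(primary, FIRST_ADDR, secondary, SECOND_ADDR)
del primary.on_chip[FIRST_ADDR]  # assumed: the slot is reused meanwhile
parked = FIRST_ADDR not in primary.on_chip

# Second cross-core copy: bring the operand back to its original address.
cross_core_copy(secondary, SECOND_ADDR, primary, FIRST_ADDR)
restored = primary.on_chip[FIRST_ADDR]
```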
Compiler for optimizing filter sparsity for neural network implementation configuration
Some embodiments provide a compiler for optimizing the implementation of a machine-trained network (e.g., a neural network) on an integrated circuit (IC). In some embodiments, the compiler determines whether sparsity requirements of channels implemented on individual cores are met on each core. If the sparsity requirement is not met, the compiler, in some embodiments, determines whether the channels of the filter can be rearranged to meet the sparsity requirements on each core and, based on the determination, either rearranges the filter channels or implements a solution to non-sparsity.