G06F2209/509

ARTIFICIAL NEURAL NETWORKS ON A DEEP LEARNING ACCELERATOR
20230004786 · 2023-01-05 ·

Multiple artificial neural networks can be compiled as a single workload. A respective throughput for each of the artificial neural networks can be changed at runtime. The multiple artificial neural networks can be partially compiled individually and then later compiled just-in-time according to changing throughput demands for the artificial neural networks. The multiple artificial neural networks can be deployed on a deep learning accelerator hardware device.

RATE LIMITING COMMANDS FOR SHARED WORK QUEUES
20230004503 · 2023-01-05 ·

A memory management unit of a processor may receive a command associated with a process. The command may specify an operation to be performed by another device. The memory management unit may determine a counter value associated with a shared work queue of the other device, an indication of the shared work queue being specified by the command. The memory management unit may determine whether to accept or reject the command based on the counter value and a threshold for the process.
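The admission logic described above can be sketched as a per-process counter checked against a threshold. All names here are illustrative assumptions, not taken from the patent:

```python
# Hypothetical sketch: track outstanding commands per (process, shared work
# queue) pair and reject a new command once the process's counter reaches
# its threshold; a completion from the device frees one slot.
from collections import defaultdict

class SharedWorkQueueLimiter:
    def __init__(self, threshold_per_process):
        self.threshold = threshold_per_process
        self.counters = defaultdict(int)   # (process_id, queue_id) -> count

    def submit(self, process_id, queue_id):
        """Accept or reject a command targeting a shared work queue."""
        key = (process_id, queue_id)
        if self.counters[key] >= self.threshold:
            return False                   # reject: threshold reached
        self.counters[key] += 1
        return True                        # accept and count the command

    def complete(self, process_id, queue_id):
        """The device signalled completion; free one slot for the process."""
        key = (process_id, queue_id)
        if self.counters[key] > 0:
            self.counters[key] -= 1
```

Because the counter is keyed per process, one process hitting its threshold does not block submissions from other processes sharing the same queue.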

PARALLEL METHOD AND DEVICE FOR CONVOLUTION COMPUTATION AND DATA LOADING OF NEURAL NETWORK ACCELERATOR

Disclosed are a parallel method and device for convolution computation and data loading of a neural network accelerator. The method uses two input feature map cache blocks and two convolution kernel cache blocks, and sequentially stores the input feature maps and 64 convolution kernels into cache sub-blocks according to a loading length, so that convolution computation executes while the data of the next group of 64 convolution kernels is loaded simultaneously.
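The overlap of computation and loading reads like classic ping-pong (double) buffering. A minimal sketch, assuming the two kernel cache blocks simply alternate roles (the group size of 64 comes from the abstract; everything else is assumed):

```python
# Ping-pong buffering sketch: while one cache block's 64 kernels are
# consumed by convolution, the next group of 64 kernels is loaded into
# the other block. On hardware the two steps run in parallel; this
# sequential model only shows the alternation of the two blocks.
GROUP = 64

def load_group(kernels, start, cache_block):
    """Copy the next group of kernels into a cache block."""
    cache_block[:] = kernels[start:start + GROUP]

def convolve_group(feature_map, cache_block, results):
    """Stand-in for the accelerator's convolution over one kernel group."""
    for k in cache_block:
        results.append(sum(x * k for x in feature_map))

def run(feature_map, kernels):
    blocks = [[], []]                      # two kernel cache blocks
    results = []
    load_group(kernels, 0, blocks[0])      # prime the first block
    for i, start in enumerate(range(0, len(kernels), GROUP)):
        cur = blocks[i % 2]                # block being computed on
        nxt = blocks[(i + 1) % 2]          # block being refilled
        load_group(kernels, start + GROUP, nxt)
        convolve_group(feature_map, cur, results)
    return results
```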

5G-NR MULTI-CELL SOFTWARE FRAMEWORK
20220413928 · 2022-12-29 ·

Apparatuses, systems, and techniques to perform multi-cell physical layer (PHY) processing in a fifth generation (5G) new radio (NR) network. In at least one embodiment, a PHY library implementing a PHY pipeline groups multi-user and/or multi-cell 5G-NR PHY operations for parallel execution as a result of one or more function calls to an application programming interface provided by said PHY library.
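The grouping idea can be sketched as a pipeline object that collects per-cell or per-user operations via individual function calls and then dispatches the whole group for parallel execution. The API shape below is an assumption for illustration, not the actual library interface:

```python
# Minimal sketch: enqueue() is the per-operation function call into the
# PHY library; execute_batch() runs all grouped operations in parallel.
from concurrent.futures import ThreadPoolExecutor

class PhyPipeline:
    def __init__(self):
        self._pending = []            # (cell_id, operation) tuples

    def enqueue(self, cell_id, operation):
        """Group one cell/user PHY operation; nothing runs yet."""
        self._pending.append((cell_id, operation))

    def execute_batch(self):
        """Run all grouped operations in parallel, then clear the group."""
        with ThreadPoolExecutor() as pool:
            results = list(pool.map(lambda t: t[1](t[0]), self._pending))
        self._pending.clear()
        return results
```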

PLATFORM FRAMEWORK CONFIGURATION STATE MANAGEMENT

Embodiments of systems and methods for platform framework configuration state management are described. A platform framework of an IHS (Information Handling System) generates a resource dependency graph based on registrations of a plurality of platform framework participants, wherein the registrations of the participants specify use of resources accessed via the platform framework. A change in context of operation of the IHS is determined. Based on the context change, a change is determined in the availability of resources accessed via the platform framework. Based on the resource dependency graph, registered participants are identified that are affected by the change in platform framework resource availability. The affected participants are notified of the change in platform framework resource availability. In some embodiments, the registrations of the participants may specify a communication handle for notifying the participant of changes in the resource dependency graph.
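The registration and notification flow described above can be sketched as a mapping from resources to the participants that registered for them. Names and the handle-as-callable convention are assumptions for illustration:

```python
# Hypothetical sketch: participants register the resources they use
# together with a communication handle; when a resource's availability
# changes, only the dependent participants are notified via that handle.
from collections import defaultdict

class PlatformFramework:
    def __init__(self):
        self.dependents = defaultdict(list)   # resource -> [notify handles]

    def register(self, resources, notify_handle):
        """A participant registers its resource uses plus a handle."""
        for r in resources:
            self.dependents[r].append(notify_handle)

    def resource_changed(self, resource, available):
        """Notify only the participants affected by this resource change."""
        for handle in self.dependents.get(resource, []):
            handle(resource, available)
```

Participants with no dependency on the changed resource receive no notification, which is the point of building the dependency graph.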

TECHNIQUES TO ENABLE QUALITY OF SERVICE CONTROL FOR AN ACCELERATOR DEVICE
20220413909 · 2022-12-29 ·

Examples include techniques to enable quality of service (QoS) control for an accelerator device. Circuitry at an accelerator device implements QoS control responsive to receipt of a submission descriptor for a work request to execute a workload for an application hosted by a compute device coupled with the accelerator device. An example QoS control accepts the submission descriptor to a work queue at the accelerator device when the work size of the application's submission descriptor submissions to the work queue over a unit of time does not exceed a submission rate threshold. The work queue is associated with an operational unit at the accelerator device that executes the workload based on information included in the submission descriptor, and is shared with at least one other application hosted by the compute device.
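The admission rule can be sketched as a per-application sliding-window budget of work size. The window mechanics and all names below are assumptions, not the patent's:

```python
# Illustrative QoS check: sum the work sizes an application submitted to a
# shared work queue within the current time window, and reject a descriptor
# once the submission rate threshold would be exceeded.
class QosWorkQueue:
    def __init__(self, rate_threshold, window):
        self.rate_threshold = rate_threshold  # max work size per window
        self.window = window                  # length of the time window
        self.history = {}                     # app_id -> [(time, work_size)]

    def accept(self, app_id, work_size, now):
        # Keep only submissions still inside the window.
        recent = [(t, w) for (t, w) in self.history.get(app_id, [])
                  if now - t < self.window]
        if sum(w for _, w in recent) + work_size > self.rate_threshold:
            self.history[app_id] = recent
            return False                      # reject the descriptor
        recent.append((now, work_size))
        self.history[app_id] = recent
        return True                           # enqueue for the operational unit
```

Keying the history by application is what lets one application be throttled without affecting others sharing the same work queue.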

RESERVOIR SIMULATION UTILIZING HYBRID COMPUTING
20220414285 · 2022-12-29 ·

Hybrid computing that utilizes a computer processor coupled to one or more graphics processing units (GPUs) is configured to perform computations that generate outputs related to reservoir simulations associated with formations that may include natural gas and oil reservoirs.

MANAGING COMPUTE RESOURCES AND RUNTIME OBJECT LOAD STATUS IN A PLATFORM FRAMEWORK

Embodiments of systems and methods for managing compute resources and runtime object load status in a platform framework are described. In some embodiments, an Information Handling System (IHS) may include a processor and a memory coupled to the processor, the memory having program instructions stored thereon that, upon execution, cause the IHS to: receive, at a platform framework via an Application Programming Interface (API), an arbitration policy; notify an application, by the platform framework via the API, of a state change with respect to the arbitration policy based upon a change in context; receive, at the platform framework from the application via the API, an identification of at least one compute resource to execute a workload associated with the arbitration policy; and offload the workload to the compute resource.

Application runtime determined dynamical allocation of heterogeneous compute resources

The present invention provides a method of operating a heterogeneous computing system comprising a plurality of computation nodes and a plurality of booster nodes, at least one of which is arranged to compute a computation task comprising a plurality of sub-tasks. In a first computing iteration, the sub-tasks are assigned to and processed by ones of the computation nodes and booster nodes in a first distribution. Information relating to the processing of the sub-tasks by the computation nodes and booster nodes is then used to generate a further distribution of the sub-tasks between the computation nodes and booster nodes for processing in a further computing iteration.
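The measure-then-redistribute step can be sketched as follows. The proportional rebalancing rule is an assumed illustration of "information relating to the processing", not the method actually claimed:

```python
# Sketch: given how many sub-tasks each node processed in the previous
# iteration and how long it took, derive a further distribution that
# shifts sub-tasks toward the nodes with higher observed throughput.
def rebalance(distribution, times):
    """distribution: node -> number of sub-tasks; times: node -> seconds."""
    # Per-node throughput observed in the previous iteration.
    rates = {n: distribution[n] / times[n] for n in distribution}
    total_tasks = sum(distribution.values())
    total_rate = sum(rates.values())
    # Assign sub-tasks proportionally to measured throughput.
    return {n: round(total_tasks * rates[n] / total_rate) for n in rates}
```

For example, a 50/50 split between a slow computation node and a fast booster node would shift most sub-tasks to the booster in the next iteration.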

Low latency remoting to accelerators

A method of offloading performance of a workload includes receiving, on a first computing system acting as an initiator, a first function call from a caller, the first function call to be executed by an accelerator on a second computing system acting as a target, the first computing system coupled to the second computing system by a network; determining a type of the first function call; and generating a list of parameter values of the first function call.
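The initiator-side steps (receive the call, determine its type, generate a parameter list) can be sketched as a small marshalling routine. The wire format, the `send` transport, and the naming convention for the call type are all assumptions for illustration:

```python
# Hypothetical sketch of the initiator: flatten a function call into a
# request the target system can execute on its accelerator.
import json

def remote_call(send, func_name, *args, **kwargs):
    """Marshal a call for execution by an accelerator on the target."""
    # Assumed convention: an "_async" suffix marks asynchronous calls.
    call_type = "async" if func_name.endswith("_async") else "sync"
    # Generate the list of parameter values of the function call.
    params = list(args) + [v for _, v in sorted(kwargs.items())]
    request = json.dumps({
        "function": func_name,   # which accelerator routine to run
        "type": call_type,       # determined type of the function call
        "params": params,        # generated list of parameter values
    })
    return send(request)         # transmit over the network to the target
```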