Patent classifications
G06F2209/509
Method for Scheduling Hardware Accelerator and Task Scheduler
A task scheduler is connected between a central processing unit (CPU) and each hardware accelerator. The task scheduler first obtains a target task (for example, from a memory) and a dependency relationship between the target task and its associated tasks. When the task scheduler determines, based on the dependency relationship, that a first associated task among the associated tasks has been executed (for example, when the prerequisite for executing the target task is that both a task 1 and a task 2 have been executed), the target task meets its execution condition, and the task scheduler schedules the related hardware accelerators to execute it. Based on the dependency relationships between tasks, the task scheduler uses hardware scheduling to have each hardware accelerator execute each task, and each task is delivered through direct hardware access.
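The dependency-gated dispatch described in this abstract can be illustrated in software. The following is only a minimal sketch, not the patented hardware scheduler: the function name `schedule`, the `deps` mapping, and the `execute` callback (standing in for handing a task to an accelerator) are all hypothetical.

```python
from collections import deque


def schedule(tasks, deps, execute):
    """Dispatch each task only after all of its prerequisite tasks have run.

    tasks:   iterable of task ids (hypothetical names)
    deps:    dict mapping a task id to the set of task ids it depends on
    execute: callable standing in for delivery of a task to an accelerator
    """
    tasks = list(tasks)
    # Prerequisites that have not yet completed, per task.
    remaining = {t: set(deps.get(t, ())) for t in tasks}
    # Reverse index: which tasks are waiting on a given task.
    dependents = {t: [] for t in tasks}
    for t, prerequisites in remaining.items():
        for p in prerequisites:
            dependents.setdefault(p, []).append(t)

    ready = deque(t for t in tasks if not remaining[t])
    order = []
    while ready:
        t = ready.popleft()
        execute(t)
        order.append(t)
        for d in dependents[t]:
            remaining[d].discard(t)
            if not remaining[d]:  # execution condition is now met
                ready.append(d)

    if len(order) != len(tasks):
        raise ValueError("unsatisfiable or cyclic dependencies")
    return order


dispatched = []
schedule(["task1", "task2", "target"], {"target": {"task1", "task2"}}, dispatched.append)
```

Here `target` is dispatched only after both `task1` and `task2` have been executed, mirroring the example in the abstract.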
Hardware offload support for an operating system offload interface using operation code verification
A method may include receiving, by a privileged component executed by a processing device, bytecode of a packet processing component from an unprivileged component executed by the processing device, analyzing, by the privileged component, the bytecode of the packet processing component to identify whether the bytecode comprises a first command that returns a redirect, analyzing, by the privileged component, the bytecode of the packet processing component to identify whether the bytecode comprises a second command that returns a runtime computed value, and responsive to determining that the bytecode comprises the first command or the second command, setting a redirect flag maintained by the privileged component.
MIGRATION OF VIRTUAL COMPUTING STORAGE RESOURCES USING SMART NETWORK INTERFACE CONTROLLER ACCELERATION
An information handling system may include a processor; a network interface; and a physical storage resource having data stored thereon that is usable by a virtual resource that is executable on the processor. The network interface may accelerate migration of the data to a destination system by, in response to a command from a virtual machine manager: offloading, from the processor, a copying process configured to copy the data to the destination system; tracking portions of the data that are changed by the virtual resource during the copying process; notifying the virtual machine manager that a designated checkpoint has been reached in the copying process; causing the virtual resource to pause; completing the copying process; and causing the virtual resource to resume and use the copied data at the destination instead of the data on the physical storage resource.
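The copy, track, pause, and complete sequence in this abstract resembles a pre-copy migration loop. The sketch below is a simplified model under stated assumptions, not the claimed network-interface implementation: storage is a Python list of blocks, `guest_writes` is a hypothetical map that simulates writes issued by the virtual resource while copying is in progress, and `checkpoint` is the block index at which the virtual resource is paused.

```python
def migrate(source, guest_writes, checkpoint):
    """Copy `source` to a new list while tracking blocks changed mid-copy.

    source:       list of blocks on the physical storage resource (mutated here)
    guest_writes: dict mapping a copy step to (block_index, new_value), simulating
                  a write made by the virtual resource during that step
    checkpoint:   copy step at which the virtual resource is paused
    Returns (dest, dirty): the copied data and the set of re-sent block indices.
    """
    dest = [None] * len(source)
    dirty = set()
    paused = False

    for step in range(len(source)):
        if step == checkpoint:
            paused = True  # checkpoint reached: the virtual resource is paused
        dest[step] = source[step]
        # Model assumption: a paused virtual resource issues no writes.
        if not paused and step in guest_writes:
            index, value = guest_writes[step]
            source[index] = value
            dirty.add(index)  # track the portion changed during copying

    # Complete the copy by re-sending every block changed since it was copied.
    for index in sorted(dirty):
        dest[index] = source[index]
    return dest, dirty


copied, resent = migrate(["a", "b", "c", "d"], {1: (0, "A")}, checkpoint=3)
```

After the final pass the destination matches the source, so the virtual resource could resume against the copied data.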
SYSTEMS AND METHODS WITH INTEGRATED MEMORY POOLING AND DIRECT SWAP CACHING
Systems and methods related to integrated memory pooling and direct swap caching are described. A system includes a compute node comprising a local memory and a pooled memory. The system further includes a host operating system (OS) having initial access to: (1) a first swappable range of memory addresses associated with the local memory and a non-swappable range of memory addresses associated with the local memory, and (2) a second swappable range of memory addresses associated with the pooled memory. The system further includes a data-mover offload engine configured to perform a cleanup operation, including: (1) restoring a state of any memory content swapped out from a memory location within the first swappable range of memory addresses to the pooled memory, and (2) moving from the local memory any memory content swapped in from a memory location within the second swappable range of memory addresses back out to the pooled memory.
Scheduling processing of machine learning tasks on heterogeneous compute circuits
Scheduling work of a machine learning application includes instantiating kernel objects by a computer processor in response to input of kernel definitions. Each kernel object is of a kernel type indicating a compute circuit. The computer processor generates a graph in a memory. Each node represents a task and specifies an assignment of the task to one or more of the kernel objects, and each edge represents a data dependency. Task queues are created in the memory and assigned to queue tasks represented by the nodes. Kernel objects are assigned to the task queues, and the tasks are enqueued by threads executing the kernel objects, based on assignments of the kernel objects to the task queues and assignments of the tasks to the kernel objects. Tasks are dequeued by the threads, and the compute circuits are activated to initiate processing of the dequeued tasks.
Control of offloading of calculation tasks in multi-access edge computing
Method for offloading calculation tasks between a user terminal and an edge host device in a communication network according to a multi-access edge computing technique, including steps of: offloading data necessary for the execution of the calculation from the user terminal to the edge host device, and transmitting data resulting from the calculation carried out by the edge host device, from the edge host device to the user terminal, wherein the offloading of data is controlled on the basis of joint criteria of energy efficiency and of minimization of exposure of a user of the user terminal to electromagnetic fields.
METHOD AND DEVICE FOR PROCESSING SERVICE USING REQUEST, AND COMPUTER READABLE STORAGE MEDIUM
A method and apparatus for processing a service use request, and a computer readable storage medium. An embodiment of the method comprises: acquiring service use requests for a vehicle-mounted device; determining a to-be-processed request from the unprocessed service use requests; determining, using a preset navigation status service layer, a service type to which the to-be-processed request belongs; and calling, using the navigation status service layer, a service module corresponding to the service type to process the to-be-processed request, the navigation status service layer recording corresponding relationships between different service types and different service modules.
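The service layer in this abstract acts as a lookup from service type to service module. A minimal sketch of that dispatch pattern follows. The class name, the `"type"` field on a request, and the example modules are all assumptions made for illustration and do not come from the patent.

```python
from collections import deque


class NavigationStatusServiceLayer:
    """Records which service module handles which service type (illustrative)."""

    def __init__(self):
        self._modules = {}

    def register(self, service_type, module):
        self._modules[service_type] = module

    def service_type_of(self, request):
        # Assumption: each request carries its type in a "type" field.
        return request["type"]

    def handle(self, request):
        service_type = self.service_type_of(request)
        module = self._modules.get(service_type)
        if module is None:
            raise LookupError(f"no service module for type {service_type!r}")
        return module(request)


def process_pending(layer, requests):
    """Take unprocessed requests in arrival order and route each through the layer."""
    pending = deque(requests)
    results = []
    while pending:
        results.append(layer.handle(pending.popleft()))
    return results


layer = NavigationStatusServiceLayer()
layer.register("route", lambda req: "routing to " + req["destination"])
layer.register("traffic", lambda req: "traffic for " + req["area"])
outputs = process_pending(
    layer,
    [{"type": "route", "destination": "airport"}, {"type": "traffic", "area": "downtown"}],
)
```

Keeping the type-to-module mapping in one place means a new service type only requires one more `register` call.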
Executing a Quantum Logic Circuit on Multiple Processing Nodes
In a general aspect, a quantum logic circuit is executed on multiple processing nodes in a computing system that includes quantum computing resources. In some aspects, methods of operating the computing system may include obtaining a computer program that includes a quantum logic circuit. The methods may include obtaining hardware resource metadata specifying properties of processing nodes in the computing system. The processing nodes include at least a subset of the quantum computing resources, and the hardware resource metadata includes error rate information and availability information for the respective processing nodes. The methods may include generating execution tasks configured to execute the quantum logic circuit on the processing nodes based on the hardware resource metadata; dispatching the execution tasks to the processing nodes; receiving output data generated by the processing nodes; and producing an output of the computer program based on the output data.
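One way to picture task generation from hardware resource metadata is sketched below. This is a hypothetical illustration, not the method claimed: the node dictionaries, the `max_error_rate` threshold, the even split of shots, and the `execute` callback are all assumptions, and no real quantum SDK is used.

```python
from collections import Counter


def plan_tasks(nodes, shots, max_error_rate):
    """Split `shots` evenly across nodes that are available and accurate enough.

    nodes: list of dicts with hypothetical keys "name", "available", "error_rate"
    Returns a list of (node_name, shots_for_node) execution tasks.
    """
    eligible = [n for n in nodes if n["available"] and n["error_rate"] <= max_error_rate]
    if not eligible:
        raise RuntimeError("no processing node satisfies the metadata constraints")
    base, extra = divmod(shots, len(eligible))
    # The first `extra` nodes each take one additional shot.
    return [(n["name"], base + (1 if i < extra else 0)) for i, n in enumerate(eligible)]


def run_program(circuit, nodes, shots, max_error_rate, execute):
    """Dispatch the tasks and merge per-node measurement counts into one output."""
    counts = Counter()
    for name, node_shots in plan_tasks(nodes, shots, max_error_rate):
        counts.update(execute(name, circuit, node_shots))
    return dict(counts)


example_nodes = [
    {"name": "q0", "available": True, "error_rate": 0.01},
    {"name": "q1", "available": False, "error_rate": 0.001},
    {"name": "q2", "available": True, "error_rate": 0.2},
    {"name": "q3", "available": True, "error_rate": 0.02},
]
example_plan = plan_tasks(example_nodes, shots=5, max_error_rate=0.05)
```

In this example `q1` is skipped because it is unavailable and `q2` because its error rate exceeds the threshold.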
ACCELERATING INFERENCES PERFORMED BY ENSEMBLE MODELS OF BASE LEARNERS
A method is provided for accelerating machine learning inferences. The method uses an ensemble model run on input data. This ensemble model involves several base learners, where each of the base learners has been trained. The method first schedules tasks for execution. As a result of the task scheduling, one of the base learners is executed based on a subset of the input data. The execution of the tasks is then started to obtain respective task outcomes. An exit condition is repeatedly evaluated while executing the tasks by computing a deterministic function of the task outcomes obtained so far. The output values of this deterministic function indicate whether an inference result of the ensemble model has converged. Accordingly, the execution of the tasks can be interrupted if the most recently evaluated exit condition is found to be fulfilled. Eventually, an inference result of the ensemble model is estimated based on the task outcomes.
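A simple instance of such an exit condition is majority voting with early stopping: once the leading class can no longer be overtaken by the learners still to run, the result has converged. The sketch below assumes a classification ensemble combined by majority vote and runs every learner on the same input; both are simplifying assumptions, and the function name is hypothetical.

```python
from collections import Counter


def ensemble_predict(learners, x):
    """Run base learners one at a time and stop once the majority vote is decided.

    learners: list of callables, each returning a class label for input x
    Returns (label, learners_executed).
    """
    if not learners:
        raise ValueError("at least one base learner is required")
    votes = Counter()
    total = len(learners)
    for done, learner in enumerate(learners, start=1):
        votes[learner(x)] += 1
        ranked = votes.most_common(2)
        lead = ranked[0][1]
        runner_up = ranked[1][1] if len(ranked) > 1 else 0
        # Deterministic exit condition: the remaining learners cannot change the winner.
        if lead - runner_up > total - done:
            return ranked[0][0], done
    return votes.most_common(1)[0][0], total


ensemble = [lambda x: 1, lambda x: 1, lambda x: 1, lambda x: 0, lambda x: 0]
label, executed = ensemble_predict(ensemble, x=None)
```

In this example three of five learners agree first, so the last two are never executed.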
DYNAMIC CROSS-ARCHITECTURE APPLICATION ADAPTION
Embodiments described herein are generally directed to improving performance of high-performance computing (HPC) or artificial intelligence (AI) workloads on cluster computer systems. According to one embodiment, a section of an HPC or AI workload executing on a cluster computer system is identified as significant to a figure of merit (FOM) of the workload. An alternate placement among multiple heterogeneous compute resources of a node of the cluster computer system is determined for a portion of the section currently executing on a given compute resource of the multiple heterogeneous compute resources. After predicting an improvement to the FOM based on the alternate placement, the portion is relocated to the alternate placement.
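The predict-then-relocate decision can be sketched as a comparison of predicted FOM values across the compute resources of a node. The code below is illustrative only: `predict_fom` is a hypothetical predictor supplied by the caller, a higher FOM is assumed to be better, and `min_gain` is an assumed threshold that does not appear in the abstract.

```python
def choose_placement(portion, current, resources, predict_fom, min_gain=0.0):
    """Return the resource the portion should run on.

    portion:     identifier of the workload portion (hypothetical)
    current:     resource the portion is currently executing on
    resources:   candidate heterogeneous compute resources of the node
    predict_fom: callable (portion, resource) -> predicted figure of merit
    min_gain:    minimum predicted improvement required before relocating
    """
    baseline = predict_fom(portion, current)
    best, best_fom = current, baseline
    for resource in resources:
        if resource == current:
            continue
        fom = predict_fom(portion, resource)
        if fom > best_fom:
            best, best_fom = resource, fom
    # Relocate only when the predicted improvement is large enough.
    if best != current and best_fom - baseline > min_gain:
        return best
    return current


predicted = {"cpu": 10.0, "gpu": 25.0, "fpga": 18.0}
placement = choose_placement(
    "stencil_loop", "cpu", list(predicted), lambda portion, resource: predicted[resource]
)
```

With the sample predictions above, the portion would move from the CPU to the GPU.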