G06F9/5061

Compute optimization mechanism for deep neural networks

Embodiments provide mechanisms to facilitate compute operations for deep neural networks. One embodiment comprises a graphics processing unit comprising one or more multiprocessors, at least one of the one or more multiprocessors including a register file to store a plurality of different types of operands and a plurality of processing cores. The plurality of processing cores includes a first set of processing cores of a first type and a second set of processing cores of a second type. The first set of processing cores are associated with a first memory channel and the second set of processing cores are associated with a second memory channel.

Storage management for machine learning at autonomous machines

A mechanism is described for facilitating storage management for machine learning at autonomous machines. A method of embodiments, as described herein, includes detecting one or more components associated with machine learning, where the one or more components include memory and a processor coupled to the memory, and where the processor includes a graphics processor. The method may further include allocating a storage portion of the memory and a hardware portion of the processor to a machine learning training set, where the storage and hardware portions are precise for implementation and processing of the training set.

Application program management method and apparatus
11507427 · 2022-11-22

This application provides an application program management method and apparatus. The method is performed in a database cluster system including at least two database nodes, at least one database object is stored in each database node, and the method includes: running an application program on a first database node in a first time period; determining a target database node based on at least one historical database object accessed by the application program in the first time period, where the target database node stores the historical database object; and running the application program on the target database node in a second time period. According to this application, a database node on which an application program runs can be dynamically adjusted, to avoid overload of the database node.
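The node-selection step can be sketched as follows. This is a minimal illustration, not the claimed method: the function name `pick_target_node`, the `object_locations` mapping, and the rule "pick the node that stores the most frequently accessed historical objects" are all assumptions, since the abstract only says the target node stores the historical database object.

```python
from collections import Counter


def pick_target_node(accessed_objects, object_locations):
    """Return the node holding most of the objects accessed in the first period.

    accessed_objects: object names the application touched (repeats allowed).
    object_locations: hypothetical mapping of object name -> node that stores it.
    """
    votes = Counter()
    for obj in accessed_objects:
        node = object_locations.get(obj)
        if node is not None:
            votes[node] += 1  # one vote per access to an object on that node
    if not votes:
        return None  # no history: caller keeps the current node
    return votes.most_common(1)[0][0]


# The application ran on node-a in the first period but mostly read "orders".
history = ["orders", "orders", "customers", "orders"]
locations = {"orders": "node-b", "customers": "node-a"}
target = pick_target_node(history, locations)
```

Here `target` is `"node-b"`, so under this assumed rule the application would run on node-b in the second time period.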

Resource pool management method and apparatus, resource pool control unit, and communications device

This application provides a resource pool management method and apparatus, a resource pool control unit, and a communications device. The method is applied to a resource pool system including a plurality of communications devices, and one resource pool control unit is deployed on each communications device. A first resource pool control unit that is responsible for managing a resource pool at a current moment receives a resource application request of an application program on any communications device, allocates, from the resource pool according to a preset rule, a first resource including one or more logical hardware devices, and sends a resource configuration request to a second resource pool control unit, so that the second resource pool control unit completes configuration of the first resource based on the resource configuration request, to provide a required hardware device resource for the application program.

Reservation-based high-performance computing system and method

A method includes communicatively coupling a shared computing resource to core computing resources associated with a first project. The core computing resources associated with the first project are configured to use the shared computing resource to perform data processing operations associated with the first project. The method also includes reassigning the shared computing resource to a second project by (i) powering down the shared computing resource, (ii) disconnecting the shared computing resource from the core computing resources associated with the first project, (iii) communicatively coupling the shared computing resource to core computing resources associated with the second project, and (iv) powering up the shared computing resource. The core computing resources associated with the second project are configured to use the shared computing resource to perform data processing operations associated with the second project. The shared computing resource lacks non-volatile memory to store data related to the first and second projects.
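The four-step reassignment order can be sketched as a small state model. The class `SharedResource` and its method names are hypothetical; the only facts taken from the abstract are the order of steps (i) to (iv) and that the shared resource keeps no non-volatile data.

```python
class SharedResource:
    """Hypothetical model of a shared compute node with volatile memory only."""

    def __init__(self):
        self.powered = False
        self.project = None
        self.volatile_data = {}

    def power_down(self):
        self.powered = False
        self.volatile_data.clear()  # nothing persists across a power cycle

    def disconnect(self):
        if self.powered:
            raise RuntimeError("power down before disconnecting")
        self.project = None

    def connect(self, project):
        if self.powered:
            raise RuntimeError("power down before connecting")
        self.project = project

    def power_up(self):
        if self.project is None:
            raise RuntimeError("connect to a project before powering up")
        self.powered = True


def reassign(resource, new_project):
    """Apply steps (i)-(iv) from the abstract in order."""
    resource.power_down()          # (i)
    resource.disconnect()          # (ii)
    resource.connect(new_project)  # (iii)
    resource.power_up()            # (iv)


node = SharedResource()
node.connect("project-1")
node.power_up()
node.volatile_data["scratch"] = "project-1 intermediate results"
reassign(node, "project-2")
```

After `reassign`, the node is powered, attached to project-2, and holds none of project-1's data, which is the property the missing non-volatile memory is meant to provide.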

Cost-savings using ephemeral hosts in infrastructure as a service environments based on health score

Various examples are disclosed for placing virtual machine (VM) workloads in a computing environment. Ephemeral workloads can be placed onto reserved instances or reserved hosts in a cloud-based VM environment. If a request to place a guaranteed workload is received, ephemeral workloads can be evacuated to make way for the guaranteed workload.
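The evacuation behavior can be sketched as below. The `Host` class, the single integer capacity, and the "evict the most recently placed ephemeral workload first" order are assumptions for illustration; the abstract does not specify an eviction order, and the health score named in the title is not modeled here.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Hypothetical reserved host with a single capacity dimension."""
    capacity: int
    guaranteed: list = field(default_factory=list)  # (name, size) pairs
    ephemeral: list = field(default_factory=list)   # (name, size) pairs

    def used(self):
        return sum(size for _, size in self.guaranteed + self.ephemeral)

    def place_ephemeral(self, name, size):
        """Ephemeral workloads only use capacity that is currently spare."""
        if self.used() + size > self.capacity:
            return False
        self.ephemeral.append((name, size))
        return True

    def place_guaranteed(self, name, size):
        """Place a guaranteed workload, evacuating ephemeral ones if needed."""
        guaranteed_used = sum(s for _, s in self.guaranteed)
        if guaranteed_used + size > self.capacity:
            # Even a full evacuation would not help, so evict nothing.
            raise RuntimeError("host cannot fit the guaranteed workload")
        evacuated = []
        while self.used() + size > self.capacity:
            evacuated.append(self.ephemeral.pop()[0])  # assumed order: newest first
        self.guaranteed.append((name, size))
        return evacuated


host = Host(capacity=8)
host.place_ephemeral("e1", 4)
host.place_ephemeral("e2", 3)
evacuated = host.place_guaranteed("g1", 4)
```

In this run only `e2` has to be evacuated; `e1` keeps running next to the guaranteed workload because the host is then exactly full.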

Systems and methods for virtually partitioning a machine perception and dense algorithm integrated circuit

Systems and methods for virtually partitioning an integrated circuit may include identifying dimensional attributes of a target input dataset and selecting a data partitioning scheme from a plurality of distinct data partitioning schemes for the target input dataset based on the dimensional attributes of the target dataset and architectural attributes of an integrated circuit. The methods described herein may also include disintegrating the target dataset into a plurality of distinct subsets of data based on the selected data partitioning scheme and identifying a virtual processing core partitioning scheme from a plurality of distinct processing core partitioning schemes for an architecture of the integrated circuit based on the disintegration of the target input dataset. Additionally, the architecture of the integrated circuit may be virtually partitioned into a plurality of distinct partitions of processing cores and each of the plurality of distinct subsets of data may be mapped to one of the plurality of distinct partitions of processing cores.
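A rough sketch of the data-to-core mapping follows. The scheme-selection rule (split along the longer dimension), the even split, and the names `split_even` and `virtual_partition` are assumptions made for illustration; the abstract does not say how a scheme is chosen from the dimensional and architectural attributes.

```python
def split_even(items, parts):
    """Split a list into `parts` contiguous chunks whose sizes differ by at most 1."""
    base, rem = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        size = base + (1 if i < rem else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def virtual_partition(rows, cols, cores, parts):
    """Pick a data partitioning axis, split the data, and map subsets to core groups."""
    # Assumed selection rule: partition along the longer dimension of the input.
    axis = "rows" if rows >= cols else "cols"
    extent = rows if axis == "rows" else cols
    data_subsets = split_even(list(range(extent)), parts)
    core_partitions = split_even(list(cores), parts)
    # Each data subset is mapped to exactly one partition of processing cores.
    return axis, list(zip(data_subsets, core_partitions))


axis, mapping = virtual_partition(rows=6, cols=4, cores=range(8), parts=2)
```

With a 6x4 input and 8 cores, the rows are split into two subsets of three, each mapped to a partition of four cores.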

Information handling system and method to allocate peripheral component interconnect express (PCIe) bus resources
11507421 · 2022-11-22

Information handling systems (IHSs) and methods are provided herein to allocate Peripheral Component Interconnect Express (PCIe) bus resources to a plurality of PCIe slots according to various PCIe bus resource allocation option settings. At least one host processor is included within the IHS for executing program instructions to detect a PCIe bus allocation option setting selected from a plurality of options provided in a boot firmware setup menu; determine if the PCIe bus allocation option setting has changed since the IHS was last booted; and allocate PCIe bus resources to the plurality of PCIe slots according to the detected PCIe bus allocation option setting. The plurality of options provided in the boot firmware setup menu include at least an auto detect option, which when selected, enables the at least one host processor to automatically detect unused PCIe slots and reallocate PCIe bus resources to used PCIe slots.
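The auto detect option can be illustrated with a simplified model. Everything concrete here is an assumption: the option names `"auto"` and `"equal"`, the abstract "resource units", and the even split are hypothetical stand-ins, since the abstract does not describe how resources are divided among used slots.

```python
def allocate_bus_resources(slots_occupied, total_units, option):
    """Divide `total_units` of bus resources among PCIe slots.

    slots_occupied: mapping of slot name -> True if a device is present.
    option: "auto" gives resources only to used slots (the auto detect idea);
            "equal" is a hypothetical fixed policy that serves every slot.
    """
    if option == "auto":
        targets = [slot for slot, used in slots_occupied.items() if used]
    elif option == "equal":
        targets = list(slots_occupied)
    else:
        raise ValueError(f"unknown option: {option}")

    allocation = {slot: 0 for slot in slots_occupied}
    if not targets:
        return allocation
    base, rem = divmod(total_units, len(targets))
    for i, slot in enumerate(targets):
        allocation[slot] = base + (1 if i < rem else 0)  # spread any remainder
    return allocation


slots = {"slot1": True, "slot2": False, "slot3": True}
auto_alloc = allocate_bus_resources(slots, 16, "auto")
equal_alloc = allocate_bus_resources(slots, 16, "equal")
```

Under `"auto"` the empty `slot2` receives nothing and its share goes to the two used slots, whereas `"equal"` reserves resources for it regardless.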

Method and system for generating latency aware workloads using resource devices in a resource device pool

A method for managing data includes: obtaining, by a management module, a workload generation request, wherein the workload generation request specifies a plurality of resource devices; identifying available resource devices in a resource device pool based on the plurality of resource devices; performing a latency analysis on the available resource devices to obtain a plurality of resource device combinations and a total latency cost of each resource device combination; and selecting a resource device combination of the plurality of resource device combinations based on the total latency cost of each resource device combination, wherein the resource device combination comprises a second plurality of resource devices and wherein each of the second plurality of resource devices is one of the plurality of resource devices.
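The latency analysis and selection can be sketched as an exhaustive search. The cost model (sum of pairwise latencies between the chosen devices), the device names, and the function `select_combination` are assumptions; the abstract only states that each combination has a total latency cost and that one is selected on that basis.

```python
from itertools import combinations, product


def select_combination(requested_types, pool, latency):
    """Return (devices, total_cost) with the lowest total latency, or None.

    requested_types: device types named in the workload generation request.
    pool: mapping of available device name -> device type.
    latency: mapping of frozenset({dev_a, dev_b}) -> latency between the two.
    """
    # Candidate devices for each requested type, taken from the available pool.
    candidates = [[d for d, t in pool.items() if t == rt] for rt in requested_types]
    best = None
    for combo in product(*candidates):
        if len(set(combo)) != len(combo):
            continue  # the same device cannot fill two requested roles
        cost = sum(latency[frozenset(pair)] for pair in combinations(combo, 2))
        if best is None or cost < best[1]:
            best = (combo, cost)
    return best


pool = {"gpu0": "gpu", "gpu1": "gpu", "nic0": "nic"}
latency = {
    frozenset({"gpu0", "nic0"}): 5,
    frozenset({"gpu1", "nic0"}): 2,
    frozenset({"gpu0", "gpu1"}): 1,
}
best = select_combination(["gpu", "nic"], pool, latency)
```

For a request of one GPU and one NIC, the pairing `gpu1` with `nic0` wins with a total cost of 2. An exhaustive search is only practical for small pools; a larger pool would need pruning or a heuristic.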

Method and system for managing electronic design automation on cloud

Existing techniques for managing Electronic Design Automation (EDA) on cloud are based on pre-defined policies, which result in costly burst patterns and server farm tilt. Embodiments of the present disclosure overcome these drawbacks with a method and system for managing EDA on cloud that employ machine learning to predict optimal resource configurations for deploying EDA jobs and a configuration circuit on cloud that holds the resources required by the optimal resource configuration. Further, different Cloud Service Providers (CSPs) are evaluated to determine the least-cost CSP that has the desired configuration circuit. The completion time of the jobs and the time required to burst the jobs on cloud are calculated, and a wait time is determined from them. The jobs are retained in the queue for the corresponding wait time before being deployed on the cloud. The jobs are deployed on the on-prem infrastructure if resources are freed up before the wait time elapses.
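The CSP selection and the queue decision can be sketched as two small functions. The price table, configuration labels, and function names are hypothetical, and the wait time is taken as an input because the abstract does not give the formula that derives it from completion time and burst time.

```python
def cheapest_csp(offers, required_config):
    """Return the least-cost CSP that offers `required_config`, or None.

    offers: hypothetical mapping of CSP name -> {configuration label: cost}.
    """
    eligible = {
        csp: prices[required_config]
        for csp, prices in offers.items()
        if required_config in prices
    }
    if not eligible:
        return None
    return min(eligible, key=eligible.get)


def decide_placement(wait_time, onprem_free_in):
    """Choose where a queued job runs.

    wait_time: how long the job is held in the queue before bursting to cloud.
    onprem_free_in: time until on-prem resources free up, or None if unknown.
    """
    if onprem_free_in is not None and onprem_free_in < wait_time:
        return "on-prem"  # resources freed up before the wait time elapsed
    return "cloud"


offers = {
    "csp-a": {"16c-64g": 3.2, "32c-128g": 6.0},
    "csp-b": {"16c-64g": 2.9},
    "csp-c": {"32c-128g": 5.5},
}
chosen = cheapest_csp(offers, "16c-64g")
placement = decide_placement(wait_time=10, onprem_free_in=4)
```

Here `csp-b` is the cheapest provider offering the required configuration, and the job stays on-prem because resources free up at 4 time units, before the 10-unit wait time runs out.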