G06F2209/5013

SYSTEMS AND METHODS FOR DISTRIBUTED RESOURCE MANAGEMENT

Methods, computer readable media, and systems service a queue, comprising a plurality of jobs, by identifying nodes satisfying a hardware requirement for at least a subset of jobs in the queue. Each job indicates when it was submitted to the queue and one or more node resource requirements. A current availability score for each node class in a plurality of node classes is determined and nodes of a first node class in the plurality of node classes are reserved when a demand score for the class satisfies the current availability score for the first node class by a first threshold amount. Reserved nodes are permitted to draw jobs from the queue in accordance with satisfaction by such nodes of the node resource requirements of the jobs but are terminated, without completing the jobs, when the current availability score for their node class exceeds a second threshold amount.
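The reserve/terminate logic described above can be sketched as follows. This is a minimal illustration, not the patented implementation; the function name, the dictionary fields, and the exact comparison policy are assumptions.

```python
# Hypothetical sketch: reserve nodes of a class when its demand score exceeds
# its current availability score by a first threshold, and release (terminate)
# reserved nodes when availability later exceeds a second threshold.

def plan_reservations(classes, reserve_margin, release_threshold):
    """classes: dict of class name -> {"demand": float, "availability": float,
    "reserved": bool}. Returns the updated reservation state in place."""
    for info in classes.values():
        if not info["reserved"] and info["demand"] - info["availability"] >= reserve_margin:
            info["reserved"] = True       # reserve nodes of this class
        elif info["reserved"] and info["availability"] > release_threshold:
            info["reserved"] = False      # terminate without completing jobs
    return classes
```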

DATA PROCESSING METHOD AND APPARATUS
20220121912 · 2022-04-21

A processor-implemented data processing method includes: receiving a request for executing a neural network model on an accelerator; generating a plurality of candidate kernels for each of a plurality of layers comprised in the model; and allocating, to the accelerator, a single candidate kernel that is selected from among a plurality of candidate kernels for a layer to run on the accelerator based on corresponding kernel information and status information of the accelerator.
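A selection step like the one in this abstract might look like the sketch below. The scoring fields (`latency`, `mem_bytes`, `free_mem`) are illustrative assumptions; the patent does not specify which kernel information or accelerator status fields are used.

```python
# Illustrative only: choose one candidate kernel for a layer by comparing
# kernel metadata against the accelerator's current status.

def select_kernel(candidates, accel_status):
    """candidates: list of dicts with 'latency' and 'mem_bytes'.
    Returns the lowest-latency kernel that fits in free accelerator memory,
    or None when no candidate fits."""
    feasible = [k for k in candidates if k["mem_bytes"] <= accel_status["free_mem"]]
    return min(feasible, key=lambda k: k["latency"]) if feasible else None
```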

Method, storage medium storing instructions, and apparatus for implementing hardware resource allocation according to user-requested resource quantity

The aspects of the present disclosure provide a method and an apparatus for implementing hardware resource allocation. For example, the apparatus includes processing circuitry. The processing circuitry obtains a first value that is indicative of an allocable resource quantity of a hardware resource in a computing device. The processing circuitry also receives a second value that is indicative of a requested resource quantity of the hardware resource by a user, and then determines whether the second value is greater than the first value. When the second value is determined to be less than or equal to the first value, the processing circuitry requests the computing device to allocate the hardware resource of the requested resource quantity to the user, and subtracts the second value from the first value to update the allocable resource quantity of the hardware resource in the computing device.
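The compare-and-decrement step the abstract describes reduces to a short routine; the function and variable names below are hypothetical.

```python
# Minimal sketch of the described check: grant the request only when the
# requested quantity (second value) does not exceed the allocable quantity
# (first value), then subtract to update the allocable pool.

def try_allocate(allocable, requested):
    """Returns (granted, remaining_allocable)."""
    if requested > allocable:           # second value greater than first: reject
        return False, allocable
    return True, allocable - requested  # grant and update the pool
```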

Enforce changes in session behavior based on updated machine learning model with detected risk behavior during session

Systems and methods are provided for managing dynamic controls over access to computer resources and, even more particularly, for evaluating and re-evaluating dynamic conditions and changes associated with user sessions. The systems and methods are configured to automatically make a determination as to whether new or additional authentication credentials are required for a user that is already authorized for accessing resources in a user session, in response to triggering events such as the identification of a new or changed condition associated with the user session.
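One way to picture the re-evaluation trigger is the sketch below. The risk field, threshold, and event shape are all assumptions added for illustration, not details from the patent.

```python
# Hedged sketch: re-evaluate an already-authorized session when a triggering
# event (e.g. a new or changed condition) arrives, and demand fresh
# credentials if the updated risk exceeds what the original authentication
# covered. The 0.7 threshold is arbitrary.

def needs_reauth(session, event, risk_limit=0.7):
    """session: dict with a 'risk' score; event: dict that may raise it.
    Returns True when new or additional credentials should be required."""
    session["risk"] = max(session["risk"], event.get("risk", 0.0))
    return session["risk"] > risk_limit
```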

Resource allocation using distributed segment processing credits
11765099 · 2023-09-19

Systems and methods for allocating resources are disclosed. Resources such as processing time, writes, or reads are allocated. Credits are issued to clients in a manner that ensures the system is operating in a safe allocation state. The credits can be used not only to allocate resources but also to throttle clients where necessary. Credits can be granted fully, partially, or in a number greater than requested. Zero or negative credits can also be issued to throttle clients. Segment credits are associated with identifying unique fingerprints or segments and may be allocated by determining how many credits a CPU or its cores can support. This maximum number may be divided among the clients connected to the server.
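The grant policy the abstract enumerates (full, partial, zero, or negative credits, with a CPU-derived maximum split per client) can be sketched as below; the specific function names and the overload signal are assumptions.

```python
# Illustrative credit issuance: credits may be granted fully, partially, or
# as a negative value that throttles the client. The per-client split of a
# CPU-derived maximum is a simplification of the described scheme.

def per_client_cap(max_credits_for_cpu, num_clients):
    # divide the maximum the CPU/cores can support among connected clients
    return max_credits_for_cpu // max(num_clients, 1)

def grant_credits(requested, cap, overload=False):
    if overload:
        return -1                   # negative credit throttles the client
    return min(requested, cap)      # full grant if it fits, else partial
```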

Transparent memory management for over-subscribed accelerators

Transparent memory management for over-subscribed accelerators is disclosed. A request from a remote initiator to execute a workload on a shared accelerator is received at a host system comprising the shared accelerator. A determination is made that there is insufficient physical memory of the accelerator to accommodate the request from the remote initiator. Responsive to determining that there is insufficient physical memory of the accelerator, an allocation of host system memory is requested for the remote initiator from the host system. A mapping between the remote initiator and the allocation of host system memory is then created.
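The spill-to-host path can be pictured with the hypothetical sketch below: when the accelerator cannot hold the workload, host memory is borrowed and a mapping from the initiator to that allocation is recorded. All names and the byte-counting policy are illustrative assumptions.

```python
# Sketch under assumptions: admit a remote initiator's workload, spilling the
# portion that does not fit in accelerator memory into host memory and
# recording the initiator -> host-allocation mapping.

def admit_workload(initiator, needed, accel_free, host, mappings):
    """host: dict with a 'free' byte count; mappings: initiator -> spill bytes.
    Returns the accelerator memory remaining after admission."""
    if needed <= accel_free:
        return accel_free - needed    # fits entirely in accelerator memory
    spill = needed - accel_free
    host["free"] -= spill             # request a host-memory allocation
    mappings[initiator] = spill       # map the initiator to that allocation
    return 0                          # accelerator memory fully consumed
```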

Method and apparatus for processing resource request

A method and an apparatus for processing a resource request are provided. The method includes: classifying, by a computing device, access virtual objects into a plurality of density grades according to a density of interaction virtual objects in the current interactive range of each access virtual object; and allocating a resource request quota to each density grade. The method also includes: when a resource request sent by an access virtual object in a first density grade is received within a first preset duration, processing the resource request if the resource request quota corresponding to the density grade is greater than a preset quota threshold, and subtracting a preset value from the resource request quota corresponding to the density grade; and rejecting the resource request if the resource request quota corresponding to the density grade is not greater than the preset quota threshold.
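The per-grade quota check reduces to a few lines; this is a simplified sketch with hypothetical names, and the default threshold and decrement are arbitrary placeholders.

```python
# Hypothetical sketch of the quota check: each density grade holds a request
# quota; a request is processed (and the quota decremented by a preset value)
# only while the quota is above a preset threshold, otherwise it is rejected.

def handle_request(quotas, grade, threshold=0, decrement=1):
    """quotas: dict of density grade -> remaining quota. Returns True when
    the request is processed, False when it is rejected."""
    if quotas[grade] > threshold:
        quotas[grade] -= decrement    # accept and charge the grade's quota
        return True
    return False                      # reject: quota not above the threshold
```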

VERSIONED PROGRESSIVE CHUNKED QUEUE FOR A SCALABLE MULTI-PRODUCER AND MULTI-CONSUMER QUEUE
20210342190 · 2021-11-04

A method includes receiving, by a producer thread of a plurality of producer threads, an offer request associated with an item. The producer thread increases a sequence and determines (i) a chunk identifier of a memory chunk from a pool of memory chunks and (ii) a first slot position in the memory chunk to offer the item. The producer thread also writes the item into the memory chunk at the first slot position. Then, a first consumer thread of a plurality of consumer threads determines the first slot position of the item and consumes the item at the first slot position. A second consumer thread consumes another item at a second slot position in the memory chunk and recycles the memory chunk.

Scheduling system for computational work on heterogeneous hardware

The technology includes methods, processes, and systems for virtualizing graphics processing unit (GPU) memory. Example embodiments of the technology include managing an amount of GPU memory used by one or more processes, such as Application Programming Interfaces (APIs), that directly or indirectly impact one or more other processes running on the same GPU. Managing and/or virtualizing the amount of GPU memory may ensure that an end user does not receive a GPU out-of-memory error when an API request is impacted by the processing of other API requests. A virtual machine with access to a GPU may be organized with one or more job slots that specify the number of processes able to run concurrently on a specific virtual machine. A process configured on each virtual machine running a software program or API may be used to schedule work based on GPU memory requirements.
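A per-VM job-slot admission policy of the kind described might be sketched as follows. The class name, the admission rule, and the defer-rather-than-fail behavior are illustrative assumptions, not the patented mechanism.

```python
# Hedged sketch: a per-VM scheduler with a fixed number of job slots admits a
# process only when a slot is free and enough GPU memory remains, so a
# request is deferred instead of triggering a GPU out-of-memory error.

class VmGpuScheduler:
    def __init__(self, job_slots, gpu_mem):
        self.free_slots = job_slots   # max processes running concurrently
        self.free_mem = gpu_mem       # GPU memory still unclaimed

    def admit(self, mem_required):
        if self.free_slots > 0 and mem_required <= self.free_mem:
            self.free_slots -= 1
            self.free_mem -= mem_required
            return True
        return False                  # defer the work instead of raising OOM
```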