Patent classifications
G06F2209/503
PRE-EMPTIVE CONTAINER LOAD-BALANCING, AUTO-SCALING AND PLACEMENT
A resource usage platform is disclosed. The platform performs preemptive container load balancing, auto scaling, and placement in a computing system. Resource usage data is collected from containers and used to train a model that generates inferences regarding resource usage. The resource usage operations are performed based on the inferences and on environment data such as available resources, service needs, and hardware requirements.
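The preemptive scaling loop described above — infer future usage from collected data, then act before resources are exhausted — can be sketched roughly as follows. A simple linear trend stands in for the trained model, and all names (`predict_next_usage`, `decide_scaling`) are illustrative assumptions, not the patent's actual components.

```python
# Hypothetical sketch: a linear-trend extrapolation plays the role of the
# trained inference model; the decision acts before a threshold is breached.

def predict_next_usage(samples):
    """Extrapolate the next usage value from the last two samples."""
    if len(samples) < 2:
        return samples[-1] if samples else 0.0
    return samples[-1] + (samples[-1] - samples[-2])

def decide_scaling(samples, capacity, headroom=0.8):
    """Scale out *before* predicted usage exceeds the headroom threshold."""
    predicted = predict_next_usage(samples)
    if predicted > capacity * headroom:
        return "scale_out"
    if predicted < capacity * 0.3:
        return "scale_in"
    return "steady"
```

In a real platform the prediction would come from a model trained on container telemetry, and the decision would also weigh environment data such as available resources and hardware requirements.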
Resource scheduling method, scheduling server, cloud computing system, and storage medium
Embodiments of this application disclose a resource scheduling method performed at a scheduling server. Virtual machine (VM) information of a to-be-created VM is obtained, and common resource information is obtained. A preset resource information private copy is updated according to the common resource information. The resource information private copy includes host machine information corresponding to a preset host machine. Finally, according to the resource information private copy, at least one candidate host machine meeting the VM information is obtained, a target host machine is selected from the at least one candidate host machine, and the VM is created on the target host machine. In this solution, the resource information private copy can be updated in time before resource scheduling is performed, which keeps the resource information private copy synchronized with the common resource information, so that a better resource scheduling result is achieved.
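The scheduling flow above can be sketched in a few lines: refresh the private copy from the common resource information, filter candidate host machines that can fit the VM, then pick a target. The data structures and the selection policy (most free memory) are assumptions for illustration only.

```python
# Illustrative sketch of the scheduling method; all names are assumed.

def update_private_copy(private_copy, common_info):
    """Synchronize the private copy with the common resource information."""
    private_copy.update(common_info)
    return private_copy

def schedule_vm(vm, private_copy, common_info):
    hosts = update_private_copy(private_copy, common_info)
    # Candidate host machines: those whose free resources meet the VM info.
    candidates = [h for h, free in hosts.items()
                  if free["cpu"] >= vm["cpu"] and free["mem"] >= vm["mem"]]
    if not candidates:
        return None
    # One plausible policy: pick the candidate with the most free memory.
    return max(candidates, key=lambda h: hosts[h]["mem"])
```

Because the private copy is refreshed immediately before filtering, the decision never runs against stale host data.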
Placement of virtual GPU requests in virtual GPU enabled systems using a requested memory requirement of the virtual GPU request
Disclosed are aspects of memory-aware placement in systems that include graphics processing units (GPUs) that are virtual GPU (vGPU) enabled. vGPU data is identified for the GPUs. A configured GPU list and an unconfigured GPU list are generated using the GPU data. The configured GPU list specifies configured vGPU profiles for configured GPUs. The unconfigured GPU list specifies a total GPU memory for unconfigured GPUs. A vGPU request is assigned to a vGPU of a GPU, where the GPU is the first fit, from the configured GPU list or the unconfigured GPU list, that satisfies the GPU memory requirement of the vGPU request.
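The first-fit logic above can be sketched as follows. Configured GPUs already carry a vGPU profile with a fixed per-vGPU memory size, while unconfigured GPUs only expose total memory; field names here are illustrative, not the patent's actual structures.

```python
# Minimal first-fit sketch of memory-aware vGPU placement.

def place_vgpu_request(request_mem, configured, unconfigured):
    """Return the id of the first GPU that satisfies the memory requirement."""
    # First fit over configured GPUs: the profile must match the request
    # and a free vGPU slot must remain.
    for gpu in configured:
        if gpu["profile_mem"] == request_mem and gpu["free_slots"] > 0:
            gpu["free_slots"] -= 1
            return gpu["id"]
    # Then over unconfigured GPUs: configuring one consumes total memory.
    for gpu in unconfigured:
        if gpu["total_mem"] >= request_mem:
            gpu["total_mem"] -= request_mem
            return gpu["id"]
    return None
```

Scanning the configured list first keeps already-partitioned GPUs full before a fresh GPU is configured, which is one sensible reading of the two-list design.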
Methods and systems for application program interface call management
Disclosed are systems and methods for application program interface (API) call management. For example, a method may include obtaining API call information for one or more API endpoints, the API call information including a number of API calls to the one or more API endpoints; obtaining resource utilization (RU) information, the RU information including project RU information for one or more projects; analyzing the API call information and the RU information to obtain API cost information, the API cost information including cost per API call for the one or more API endpoints; and managing subsequent API calls to the one or more API endpoints based on the cost per API call.
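The cost derivation above amounts to dividing the resource utilization attributed to an endpoint by its call count, then gating subsequent calls. The names and the budget-based throttling policy below are assumptions for illustration.

```python
# Hypothetical sketch of API cost computation and call management.

def cost_per_call(call_counts, ru_by_endpoint):
    """API cost information: RU consumed per call for each endpoint."""
    return {ep: ru_by_endpoint[ep] / n
            for ep, n in call_counts.items() if n > 0}

def allow_call(endpoint, costs, budget):
    """Manage subsequent calls: reject endpoints whose per-call cost
    exceeds the remaining budget."""
    return costs.get(endpoint, 0.0) <= budget
```

A production system would attribute RU to endpoints per project rather than globally, but the shape of the calculation is the same.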
Capacity management for global resources
A service provider network may provide one or more global cloud services across multiple regions. A client may submit a request to create multiple replicas of a service resource in respective instantiations of a service in the multiple regions. The receiving region of the request may determine the capacities of the multiple regions as to serving respective replicas of the service resource. The receiving region may provide a response to the client based on the determined capacities of the regions.
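The request flow above reduces to the receiving region checking each target region's capacity for the requested replica and answering the client accordingly. Function and field names here are assumptions, not the service's actual API.

```python
# Illustrative sketch of the capacity check performed by the receiving region.

def handle_replica_request(regions, required_capacity):
    """Return which regions can host a replica of the service resource."""
    capable = [name for name, free in regions.items()
               if free >= required_capacity]
    # The response reflects whether every requested region has capacity.
    return {"granted": len(capable) == len(regions),
            "capable_regions": capable}
```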
JUST-IN-TIME (JIT) SCHEDULER FOR MEMORY SUBSYSTEMS
The present disclosure describes a just-in-time (JIT) scheduling system and method for memory sub-systems. In one embodiment, a system receives a request to perform a memory operation using a hardware resource associated with a memory device. The system identifies a traffic class corresponding to the memory operation. The system determines a number of available quality of service (QoS) credits for the traffic class during a current scheduling time frame. The system determines a number of QoS credits associated with a type of the memory operation. Responsive to determining that the number of QoS credits associated with the type of the memory operation is less than the number of available QoS credits, the system submits the memory operation to be processed at the memory device.
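The credit check above can be sketched as follows, with assumed numbers: each traffic class gets a credit allowance per scheduling time frame, each operation type has a credit cost, and an operation is submitted only when its cost fits within the class's remaining credits (reading "less than" as "within the allowance").

```python
# Sketch of a JIT credit-based scheduler; costs and classes are illustrative.

CREDIT_COST = {"read": 1, "write": 4}  # assumed per-operation-type costs

class JitScheduler:
    def __init__(self, credits_per_frame):
        # Remaining QoS credits per traffic class for the current frame.
        self.available = dict(credits_per_frame)

    def submit(self, traffic_class, op_type):
        """Submit the operation if its credit cost is within the class's
        remaining credits for this time frame; otherwise defer it."""
        cost = CREDIT_COST[op_type]
        if cost <= self.available.get(traffic_class, 0):
            self.available[traffic_class] -= cost
            return "submitted"
        return "deferred"
```

Deferred operations would be retried in the next scheduling time frame, when the per-class credits are replenished.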
DATA PARALLEL PROGRAMMING-BASED TRANSPARENT TRANSFER ACROSS HETEROGENEOUS DEVICES
An apparatus to facilitate data parallel programming-based transparent transfer across heterogeneous devices is disclosed. The apparatus includes a processor to: identify a change in device status that triggers a device transfer process from an original device, wherein the original device is associated with a queue of an application program of a data parallel programming runtime; identify a new device that is compatible with the original device; migrate at least one of a state or data of the original device to the new device; logically map, without user intervention, the queue to the new device in the data parallel programming runtime; and initiate execution of the application program on the new device using the queue.
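The transfer process above — detect a device-status change, find a compatible device, migrate state, remap the queue without user intervention, resume — can be sketched as follows. Every structure and function here is a hypothetical stand-in for the data parallel programming runtime, not its real API.

```python
# Illustrative sketch of transparent device transfer in a runtime.

def transfer_device(queue, devices, is_compatible):
    """Migrate the queue's work from its original device to a new one."""
    original = queue["device"]
    # Identify a new device compatible with the original.
    new = next((d for d in devices
                if d is not original and is_compatible(original, d)), None)
    if new is None:
        return False
    # Migrate state/data, then logically remap the queue; the application
    # keeps using the same queue handle with no user intervention.
    new["state"] = original.pop("state", None)
    queue["device"] = new
    return True
```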
METHOD AND SYSTEM FOR MEMORY MANAGEMENT ON THE BASIS OF ZONE ALLOCATIONS AND OPTIMIZATION USING IMPROVED LMK
The present disclosure provides a description of systems and methods for memory management on the basis of zone allocations. A computing device analyzes the monitored memory usage information of the user device and outputs an analysis of the monitored memory usage information. The analysis of the monitored memory usage information includes a system level memory usage view of the user device, a memory usage view of a high zone portion of the memory of the user device, and a memory usage view of a low zone portion of the memory. A user device receives a request to allocate memory for a new process. The user device, in response to determining that there are insufficient free pages in either the high zone portion or the low zone portion of the memory, sends a memory pressure notification to a low memory killer daemon of the user device.
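The allocation path above can be sketched minimally: try the requested zone's free pages, and on insufficiency emit a memory pressure notification toward the low memory killer (LMK) daemon. The names and the notification shape are assumptions.

```python
# Sketch of zone-based allocation with an LMK memory-pressure signal.

def allocate_pages(zones, zone, pages, notifications):
    """Allocate from a zone, or signal memory pressure when pages run short."""
    if zones.get(zone, 0) >= pages:
        zones[zone] -= pages
        return True
    # Insufficient free pages: notify the LMK daemon instead of allocating.
    notifications.append({"event": "memory_pressure", "zone": zone})
    return False
```

On receiving the notification, the LMK daemon would reclaim memory by killing low-priority processes, after which the allocation can be retried.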
MANAGING RESOURCE ALLOCATION IN A SOFTWARE-DEFINED SYSTEM
Resource allocation can be managed in a software-defined system. For example, a computing device can receive, for a container in a software-defined system, a container limit specifying a maximum value for the container. The computing device can receive, for the container, one or more benefit functions that assign a weight to a resource in the software-defined system. The computing device can determine that a value for the resource is less than the container limit. In response to determining that the value for the resource is less than the container limit, the computing device can allocate the resource to the container based on the weight from the one or more benefit functions.
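The mechanism above can be sketched as follows: each container has a limit and one or more benefit functions weighting the resource, and a unit of the resource goes to the eligible container (still under its limit) with the highest total benefit weight. All names and the unit-at-a-time policy are assumptions for illustration.

```python
# Sketch of benefit-function-weighted allocation under container limits.

def allocate_resource(containers):
    """Give one unit of the resource to the container with the highest
    total benefit weight among containers still under their limit."""
    eligible = [c for c in containers if c["value"] < c["limit"]]
    if not eligible:
        return None
    best = max(eligible,
               key=lambda c: sum(f(c["value"]) for f in c["benefit_fns"]))
    best["value"] += 1
    return best["name"]
```

Because the benefit functions take the current value as input, a container's weight can decay as it accumulates the resource, naturally spreading allocation across containers.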
Intelligent resource allocation agent for cluster computing
A resource allocation module may be configured to monitor an input queue of a cluster computing framework for a batch of one or more programs for processing. The resource allocation module parses commands in each of the one or more programs to determine an input/output (I/O) complexity parameter and at least one operation complexity parameter corresponding to each program. The resource allocation module triggers execution of the one or more programs by the cluster computing framework via a network communication, where the cluster computing framework is configured based on the I/O complexity parameter and the at least one operation complexity parameter. Based on analysis of feedback from the cluster computing framework, the resource allocation module modifies a calculation for determining the I/O complexity parameter and/or a calculation for determining the operation complexity parameter.
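The parsing and feedback steps above can be sketched as: count I/O-heavy and compute-heavy commands to derive the two complexity parameters, then adjust the I/O weighting from framework feedback. The command categories and the feedback rule are illustrative assumptions.

```python
# Hypothetical sketch of complexity-parameter derivation with feedback.

IO_CMDS = {"read", "write", "shuffle"}     # assumed I/O-heavy commands
OP_CMDS = {"map", "join", "aggregate"}     # assumed compute-heavy commands

def complexity_parameters(commands, io_weight=1.0):
    """Derive (I/O complexity, operation complexity) from a program's commands."""
    io = sum(io_weight for c in commands if c in IO_CMDS)
    ops = sum(1 for c in commands if c in OP_CMDS)
    return io, ops

def adjust_io_weight(io_weight, observed_runtime, predicted_runtime):
    """Modify the I/O calculation based on feedback from the framework:
    scale the weight by how far the prediction missed reality."""
    return io_weight * (observed_runtime / predicted_runtime)
```

Over repeated batches, the feedback loop nudges the weights until predicted complexity tracks observed cluster behavior.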