G06F2209/504

Resource management techniques for dialog-driven applications

A resource of a dialog-driven management service is allocated for a first set of requests based on determining that a population of capacity indicators in a throttling data structure exceeds a threshold. One or more capacity indicator deduction iterations associated with the resource are conducted during a time interval for which the resource remains allocated for the first set of requests. In a given iteration, a number of capacity indicators is deducted from the throttling data structure based on a resource throttling setting. A second set of requests is rejected based on the population of the throttling data structure.
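
As a rough illustration of the abstract above, the following minimal Python sketch models the throttling data structure as a simple counter of capacity indicators. The class name `ThrottleBucket`, the field `deduct_per_iteration` (standing in for the resource throttling setting), and the helper `serve` are hypothetical names chosen for this example and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class ThrottleBucket:
    """Hypothetical throttling data structure holding capacity indicators."""
    population: int            # current number of capacity indicators
    threshold: int             # population must exceed this to allocate
    deduct_per_iteration: int  # assumed stand-in for the throttling setting

    def can_allocate(self) -> bool:
        return self.population > self.threshold

    def deduct(self) -> None:
        # One deduction iteration while the resource stays allocated.
        self.population = max(0, self.population - self.deduct_per_iteration)


def serve(bucket: ThrottleBucket, iterations: int) -> str:
    """Allocate if the population exceeds the threshold, then run deductions."""
    if not bucket.can_allocate():
        return "rejected"
    for _ in range(iterations):
        bucket.deduct()
    return "allocated"
```

With a population of 10, a threshold of 3, and 4 indicators deducted per iteration, a first request set held for two iterations is allocated and leaves 2 indicators, so a following request set is rejected.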

Scheduling usage of oversubscribed computing resources

Generally described, one or more aspects of the present application relate to an instance resource oversubscription service for scheduling a burst period for a running virtual machine instance based on a time window specified by a user of the virtual machine instance. For example, the instance resource oversubscription service can predict future resource usage and identify the appropriate timing and physical host machine for letting the user burst (e.g., temporarily use the virtual machine instance in a manner that consumes a higher amount of computing resources such as CPU cycles, memory, network bandwidth, etc.). In doing so, the instance resource oversubscription service may consider, for example, the historical and current resource utilization levels of the virtual machine instances running on a set of available physical host machines and the burst period scheduling requests from other users of the instance resource oversubscription service.

Limiting container CPU usage based on network traffic

In an approach to limiting container CPU usage based on network traffic, a packet CPU usage for each packet type of one or more packet types is determined. A network CPU usage is calculated for a specific container based on the packet CPU usage for each packet type. A container process CPU usage and a total container CPU usage are calculated for the specific container, where the total container CPU usage is the sum of the network CPU usage and the container process CPU usage. Responsive to determining that the total container CPU usage exceeds a threshold, the container CPU consumption quota and the container network bandwidth setting for the specific container are adjusted to reduce the total container CPU usage, using a set of pre-configured parameters.
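
The arithmetic in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the per-type packet counts, the single `scale` factor standing in for the pre-configured parameters, and both function names are hypothetical and do not come from the patent.

```python
def network_cpu_usage(packet_counts, cpu_per_packet):
    """Sum per-type packet counts weighted by the CPU cost of each packet type."""
    return sum(count * cpu_per_packet[ptype]
               for ptype, count in packet_counts.items())


def adjust_container(process_cpu, packet_counts, cpu_per_packet,
                     threshold, quota, bandwidth, scale=0.5):
    """Return (total_cpu, new_quota, new_bandwidth).

    `scale` is an assumed pre-configured parameter: when the total exceeds
    the threshold, both the CPU quota and the bandwidth setting are reduced.
    """
    total = process_cpu + network_cpu_usage(packet_counts, cpu_per_packet)
    if total > threshold:
        return total, quota * scale, bandwidth * scale
    return total, quota, bandwidth
```

For instance, 100 packets at a cost of 0.5 and 50 packets at 0.25 give a network CPU usage of 62.5; added to a process CPU usage of 30, the total of 92.5 exceeds a threshold of 80, so the quota and bandwidth are both scaled down.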

System and Method for Normalization of GPU Workloads Based on Real-Time GPU Data
20170262953 · 2017-09-14

An information handling system includes a host processing system and a management controller. The host processing system includes a main processor that instantiates a management controller agent, a graphics processing unit (GPU), and a GPU throttle module. The management controller accesses the management controller agent via a first interface to obtain a performance status from the GPU, determine that the performance status is outside of a status threshold, and direct, via a second interface of the information handling system, the GPU throttle module to throttle the GPU to bring the performance status to within the status threshold.

Resource allocation device, resource allocation method, and resource allocation program

Resource use efficiency is improved while realizing quality guarantee of an application. A resource allocation device includes a storage unit that stores resource capacity information indicating a capacity of each of server resources, an SLI information collection unit that acquires information regarding an SLI at a predetermined time interval with regard to each of a plurality of applications, and a resource allocation determination unit that calculates an allocation resource amount of each application using a moving average and a standard deviation of the acquired information regarding the SLI during a predetermined period, and determines server resources which are allocation destinations of the applications by sorting the applications in descending order of the allocation resource amounts and sequentially adding the allocation resource amounts of the sorted applications within a range which does not exceed a capacity of each server resource in descending order of the allocation resource amounts.
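
One possible reading of the allocation step is sketched below in Python. The abstract does not say how the moving average and standard deviation are combined, so the formula "mean plus k standard deviations" and the first-fit placement are assumptions made for illustration, as are the function names and the safety factor `k`.

```python
from statistics import mean, pstdev


def allocation_amount(samples, k=2.0):
    """Mean plus k standard deviations over the window (k is an assumption)."""
    return mean(samples) + k * pstdev(samples)


def place(apps, capacities, k=2.0):
    """Assign each app to the first server with room, largest allocation first.

    apps maps an application name to its recent SLI-derived samples;
    capacities lists the capacity of each server resource.
    """
    amounts = {name: allocation_amount(s, k) for name, s in apps.items()}
    remaining = list(capacities)
    placement = {}
    for name in sorted(amounts, key=amounts.get, reverse=True):
        for i, free in enumerate(remaining):
            if amounts[name] <= free:
                remaining[i] -= amounts[name]
                placement[name] = i
                break
        else:
            placement[name] = None  # no server can hold this application
    return placement
```

Sorting in descending order before placing means the largest allocations claim space first, and smaller ones fill the remaining gaps without exceeding any server's capacity.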

System and method for service limit increase for a multi-tenant cloud infrastructure environment

Systems and methods are described herein for automatic service limit increase in a multi-tenant cloud infrastructure environment. The systems and methods described herein provide for automatic approval of service limit increase requests that are either automatically generated based upon a tenant's usage of resources within the cloud infrastructure environment or received via, e.g., a user portal. Such automatic approval can be based upon a set of maximal limits that are computed based upon the tenant's current usage, level of subscription, or hard limit values.

AUTO-SIZING FOR STREAM PROCESSING APPLICATIONS

Techniques are provided for automatically resizing applications. In one technique, policy data that indicates an order of multiple policies is stored. The policies include (1) a first policy that corresponds to a first computer resource and a first resizing action and (2) a second policy that is lower in priority than the first policy and that corresponds to a second resizing action and a second computer resource. Resource utilization data is received from at least one application executing in a cloud environment. Based on the order, the first policy is identified. Based on the resource utilization data, it is determined whether criteria associated with the first policy are satisfied with respect to the application. If satisfied, then the first resizing action is performed with respect to the application; otherwise, based on the computer resource utilization data, it is determined whether criteria associated with the second policy are satisfied.
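
The priority-ordered evaluation described above might look like the following Python sketch. The `Policy` class, its fields, and `pick_action` are hypothetical names for this example, and the sketch generalizes the two-policy case in the abstract to a list of any length.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Policy:
    resource: str                     # e.g. "memory" or "cpu"
    action: str                       # label for the resizing action
    criteria: Callable[[dict], bool]  # test applied to utilization data


def pick_action(policies, utilization) -> Optional[str]:
    """Walk policies in priority order; return the first action whose criteria hold."""
    for policy in policies:
        if policy.criteria(utilization):
            return policy.action
    return None  # no policy matched, so no resizing is performed
```

A lower-priority policy is consulted only when every higher-priority policy's criteria fail, which mirrors the "otherwise" branch in the abstract.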

Elastic resource pooling for dynamic throughput rebalancing

A method for utilizing elastic resource pooling techniques to dynamically rebalance throughput includes determining, for each of multiple tenants leasing computing resources of a shared resource pool, a desired claim to resources in the shared resource pool. The desired claim is based on a number of resource access requests received in association with each of the multiple tenants. The method further includes determining, for each of the multiple tenants, a guaranteed claim and a maximum potential claim on the shared resource pool; and allocating a surplus resource pool among the multiple tenants based on the determined maximum potential claim and the desired claim for each one of the multiple tenants, the surplus resource pool representing a remainder of the shared resource pool after the guaranteed claim for each of the tenants is satisfied via an initial resource allocation from the shared resource pool.
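
A minimal Python sketch of the surplus allocation follows. The abstract does not give the exact sharing rule, so proportional sharing of the surplus is an assumption, as are the function name `allocate` and the tuple layout (guaranteed, maximum, desired).

```python
def allocate(pool, tenants):
    """tenants maps name -> (guaranteed, maximum, desired); returns name -> allocation.

    Assumed rule: every tenant first receives its guaranteed claim; the
    surplus is then shared in proportion to what each tenant still wants,
    capped by its maximum potential claim.
    """
    surplus = pool - sum(g for g, _, _ in tenants.values())
    if surplus < 0:
        raise ValueError("guaranteed claims exceed the shared pool")
    # Extra each tenant wants beyond its guarantee, capped by its maximum.
    extra = {n: max(0, min(d, m) - g) for n, (g, m, d) in tenants.items()}
    total_extra = sum(extra.values())
    share = 1.0 if total_extra <= surplus else surplus / total_extra
    return {n: tenants[n][0] + extra[n] * share for n in tenants}
```

For example, with a pool of 100 and three tenants each guaranteed 20, the surplus is 40; tenants wanting 40 and 10 extra units would receive 32 and 8 of it, and a tenant whose desired claim is below its guarantee receives no surplus.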

RESOURCE-USAGE NOTIFICATION FRAMEWORK IN A DISTRIBUTED COMPUTING ENVIRONMENT
20210397478 · 2021-12-23

A resource-usage notification framework can be implemented for distributed computing environments. For example, a system can determine the resource usage of a software application in a distributed computing environment. The system can determine if the resource usage is within a predefined range of a predefined resource-consumption limit. If so, the system can generate an event notification and transmit the event notification to the software application. The software application can receive the event notification and perform a mitigation operation in response. The mitigation operation can be configured to prevent the resource usage from exceeding the predefined resource-consumption limit or to mitigate an impact of the resource usage exceeding the predefined resource-consumption limit.
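
The threshold check at the heart of this framework can be illustrated with a short Python sketch. The names `check_usage` and `margin`, and the shape of the event dictionary, are hypothetical; the abstract specifies only that a notification is generated when usage is within a predefined range of the limit.

```python
def check_usage(usage, limit, margin):
    """Return an event dict when usage is within `margin` of `limit`, else None.

    Usage at or above the limit also triggers the event, so the application
    can mitigate the impact of having exceeded it (an assumed choice).
    """
    if usage >= limit - margin:
        return {"event": "near_limit", "usage": usage, "limit": limit}
    return None


def notify(usage, limit, margin, handler):
    """Deliver the event to the application's handler; return True if it fired."""
    event = check_usage(usage, limit, margin)
    if event is not None:
        handler(event)  # the application performs its mitigation here
        return True
    return False
```

An application would pass its own mitigation routine as `handler`, for example one that flushes caches or sheds load before the limit is reached.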

Stream allocation using stream credits
11201828 · 2021-12-14

Systems and methods for allocating resources are disclosed. Resources such as streams are allocated using a stream credit system. Credits are issued to the clients in a manner that ensures the system is operating in a safe allocation state. The credits can be used not only to allocate resources but also to throttle clients where necessary. Credits can be granted fully, partially, and in a number greater than a request. Zero or negative credits can also be issued to throttle clients.
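
To make the full, partial, zero, and negative grants concrete, here is a small Python sketch. The rule it uses (a per-client `safe_limit` on held credits) is an assumption for illustration, the function name and parameters are hypothetical, and the case of granting more than requested is not covered.

```python
def grant_credits(requested, available, held, safe_limit):
    """Return the number of stream credits to issue to a client.

    requested:  credits the client asks for
    available:  credits the server can still hand out
    held:       credits the client already holds
    safe_limit: assumed per-client cap that keeps allocation safe
    """
    if held > safe_limit:
        # Negative grant: the client is asked to give credits back.
        return safe_limit - held
    # Full grant when possible, otherwise partial or zero.
    return min(requested, available, safe_limit - held)
```

A result equal to the request is a full grant, a smaller positive result is a partial grant, and a zero or negative result throttles the client.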