Patent classifications
G06F2209/504
APPLICATION PROGRAMMING INTERFACE TO LIMIT MEMORY
Apparatuses, systems, and techniques to limit memory during execution of one or more kernels and/or thread groups during PPU execution. In at least one embodiment, a process indicates a memory limit for one or more kernels and/or thread groups to a parallel processing library, and said parallel processing library restricts memory allocation for said one or more kernels and/or thread groups according to said memory limit.
AUTOSCALING AND THROTTLING IN AN ELASTIC CLOUD SERVICE
Techniques described herein can optimize usage of computing resources in a data system. Dynamic throttling can be performed locally on a computing resource in the foreground and autoscaling can be performed in a centralized fashion in the background. Dynamic throttling can lower the load without overshooting while minimizing oscillation and reducing the throttle quickly. Autoscaling may involve scaling in or out the number of computing resources in a cluster as well as scaling up or down the type of computing resources to handle different types of situations.
Efficient resource utilization in data centers
A method includes identifying high-availability jobs and low-availability jobs that demand usage of resources of a distributed system. The method includes determining a first quota of the resources available to low-availability jobs as a quantity of the resources available during normal operations, and determining a second quota of the resources available to high-availability jobs as a quantity of the resources available during normal operations minus a quantity of the resources lost due to a tolerated event. The method includes executing the jobs on the distributed system and constraining a total usage of the resources by both the high-availability jobs and the low-availability jobs to the quantity of the resources available during normal operations.
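The two quotas described above can be sketched with simple arithmetic. This is a minimal illustration, not the claimed method: the names `Quotas`, `compute_quotas`, and `can_admit`, and the use of a single scalar resource, are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Quotas:
    low_availability: float   # capacity available during normal operations
    high_availability: float  # normal capacity minus capacity lost to a tolerated event


def compute_quotas(normal_capacity: float, tolerated_loss: float) -> Quotas:
    """Derive both quotas from the normal capacity and the tolerated loss."""
    if not 0 <= tolerated_loss <= normal_capacity:
        raise ValueError("tolerated_loss must lie between 0 and normal_capacity")
    return Quotas(
        low_availability=normal_capacity,
        high_availability=normal_capacity - tolerated_loss,
    )


def can_admit(demand: float, high_availability: bool,
              used_high: float, used_low: float, quotas: Quotas) -> bool:
    """Check a job against its class quota and against the shared total."""
    if high_availability:
        within_class = used_high + demand <= quotas.high_availability
    else:
        within_class = used_low + demand <= quotas.low_availability
    # Total usage by both classes is capped at the normal-operation capacity.
    within_total = used_high + used_low + demand <= quotas.low_availability
    return within_class and within_total
```

For example, with a normal capacity of 100 units and a tolerated loss of 20, high-availability jobs may use at most 80 units while both classes together may use at most 100.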
Co-operative memory management system
Systems and methods for computer memory management by a memory coordinator and a plurality of memory consumers. An urgency and a memory quota for each memory consumer are initialized by the memory coordinator, which then adjusts the memory quotas such that their sum does not exceed a finite amount of computer memory. Each memory consumer adjusts its memory usage in response to the quota input and urgency input from the memory coordinator.
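One way to picture the coordinator is shown below. The abstract does not say how quotas are derived from urgency, so the proportional split here is an assumption, as are the names `assign_quotas` and `Consumer`.

```python
def assign_quotas(total_memory: int, urgencies: dict) -> dict:
    """Split total_memory among consumers in proportion to urgency.

    Proportional sharing is an assumption of this sketch. Floor division
    guarantees that the quotas never sum to more than total_memory.
    """
    if not urgencies:
        return {}
    total_urgency = sum(urgencies.values())
    if total_urgency == 0:
        share = total_memory // len(urgencies)
        return {name: share for name in urgencies}
    return {name: total_memory * u // total_urgency
            for name, u in urgencies.items()}


class Consumer:
    """Hypothetical memory consumer that trims its usage to the quota it receives."""

    def __init__(self, name: str, usage: int):
        self.name = name
        self.usage = usage

    def on_quota(self, quota: int) -> None:
        # React to the coordinator's input by releasing memory above the quota.
        self.usage = min(self.usage, quota)
```

With 1000 units of memory and urgencies of 1 and 3, the two consumers would receive 250 and 750 units respectively.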
Dynamic rate limiting of operation executions for accounts
The present disclosure relates to computer-implemented methods, software, and systems for dynamic rate limiting of the execution of operations. A request from a user account for execution of an operation by an application service is received. A total number of operations registered at an operations registry is determined. In response to determining that the total number of registered operations exceeds a first threshold value, a number of registered operations associated with a group account of the user account is determined. The operation is registered at the operations registry if either (i) the total number of registered operations exceeds the first threshold value and the number of registered operations associated with the group account does not exceed a second threshold value, or (ii) the total number of registered operations does not exceed the first threshold value. An instruction to execute the registered operation is sent.
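The registration rule can be sketched as an in-memory registry. The class name `OperationsRegistry`, its methods, and the per-group counter are assumptions for illustration; the abstract does not describe a data structure.

```python
from collections import Counter


class OperationsRegistry:
    """Hypothetical registry applying the two-threshold rule described above."""

    def __init__(self, total_threshold: int, group_threshold: int):
        self.total_threshold = total_threshold
        self.group_threshold = group_threshold
        self._per_group = Counter()  # registered operations per group account

    @property
    def total(self) -> int:
        return sum(self._per_group.values())

    def try_register(self, group_account: str) -> bool:
        """Register an operation unless both thresholds are exceeded."""
        over_total = self.total > self.total_threshold
        over_group = self._per_group[group_account] > self.group_threshold
        if over_total and over_group:
            return False  # system is busy and this group already holds its share
        self._per_group[group_account] += 1
        return True

    def complete(self, group_account: str) -> None:
        """Remove one finished operation for the group, if any is registered."""
        if self._per_group[group_account] > 0:
            self._per_group[group_account] -= 1
```

The per-group check only matters once the total exceeds the first threshold, so a lightly loaded system never throttles any group.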
Methods and apparatus for state objects in cluster computing
Embodiments of a mobile state object for storing and transporting job metadata on a cluster computing system may use a database as an envelope for the metadata. A state object may include a database that stores the job metadata and wrapper methods. A small database engine may be employed. Since the entire database exists within a single file, complex, extensible applications may be created on the same base state object, and the state object can be sent across the network with the state intact, along with history of the object. An SQLite technology database engine, or alternatively other single file relational database engine technologies, may be used as the database engine. To support the database engine, compute nodes on the cluster may be configured with a runtime library for the database engine via which applications or other entities may access the state file database.
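A minimal sketch of such a state object, using Python's standard `sqlite3` module, is shown below. The table layout, the `StateObject` class, and the `demo` function are assumptions for illustration, and a local file copy stands in for sending the object across the network.

```python
import os
import shutil
import sqlite3
import tempfile


class StateObject:
    """Job metadata stored in a single SQLite file, with wrapper methods."""

    def __init__(self, path: str):
        self.path = path
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS metadata (key TEXT PRIMARY KEY, value TEXT)")
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS history "
            "(id INTEGER PRIMARY KEY AUTOINCREMENT, key TEXT, value TEXT)")
        self._db.commit()

    def set(self, key: str, value: str) -> None:
        with self._db:  # commits both statements as one transaction
            self._db.execute(
                "INSERT OR REPLACE INTO metadata (key, value) VALUES (?, ?)",
                (key, value))
            self._db.execute(
                "INSERT INTO history (key, value) VALUES (?, ?)", (key, value))

    def get(self, key: str):
        row = self._db.execute(
            "SELECT value FROM metadata WHERE key = ?", (key,)).fetchone()
        return row[0] if row else None

    def history(self, key: str) -> list:
        rows = self._db.execute(
            "SELECT value FROM history WHERE key = ? ORDER BY id", (key,))
        return [r[0] for r in rows]

    def close(self) -> None:
        self._db.close()


def demo():
    """Write state on one 'node', copy the file, and read it on another."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "job.state")
        dst = os.path.join(tmp, "job_copy.state")
        sender = StateObject(src)
        sender.set("status", "queued")
        sender.set("status", "running")
        sender.close()
        shutil.copy(src, dst)  # stand-in for a network transfer
        receiver = StateObject(dst)
        result = (receiver.get("status"), receiver.history("status"))
        receiver.close()
        return result
```

Because the whole database lives in one file, copying that file moves both the current state and its history.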
Server-to-container migration
For each server under consideration for container migration, whether the server has a value for a first parameter that precludes the server from being migrated to a container is determined. Each server having a value that precludes the server from being migrated to a container is removed from further consideration. For each server remaining under consideration, a value of the server for each second parameter of a number of second parameters is determined, and the values of the server for the second parameters are weighted to yield a weight for the server. The servers remaining under consideration for migration are ranked based at least on the weights for the servers, yielding an order in which the servers are to be migrated.
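The filter-then-rank procedure can be sketched as follows. The parameter names (`os`, `cpu`, `mem`), the weighted-sum scoring, and the choice to migrate the highest-weighted server first are all assumptions for the example.

```python
def migration_order(servers, first_param, precluding_values, weights):
    """Return server names in migration order.

    servers: list of dicts with a "name", the first parameter, and numeric
             values for each second parameter named in `weights`.
    precluding_values: values of the first parameter that rule out migration.
    weights: multiplier per second parameter (hypothetical scoring scheme).
    """
    # Step 1: drop servers whose first-parameter value precludes migration.
    candidates = [s for s in servers if s[first_param] not in precluding_values]

    # Step 2: combine the second-parameter values into one weight per server.
    def score(server):
        return sum(weights[p] * server[p] for p in weights)

    # Step 3: rank by weight, highest first (an assumption of this sketch).
    ranked = sorted(candidates, key=score, reverse=True)
    return [s["name"] for s in ranked]


servers = [
    {"name": "a", "os": "legacy", "cpu": 0.9, "mem": 0.9},
    {"name": "b", "os": "linux", "cpu": 0.2, "mem": 0.4},
    {"name": "c", "os": "linux", "cpu": 0.8, "mem": 0.5},
]
order = migration_order(servers, "os", {"legacy"}, {"cpu": 2.0, "mem": 1.0})
```

Here server `a` is excluded outright despite its high scores, and `c` is ranked ahead of `b`.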
Redistributing update resources during update campaigns
Disclosed are various embodiments for controlling the number of active updates that can occur during a given time on devices that are associated with tenants (e.g., organizations) and subtenants (e.g., sub-organizations) in a multi-tenant environment. In particular, each tenant and subtenant is assigned a throttle corresponding to different update parameters (e.g., a number of devices executing an active update, an amount of data to be downloaded during a campaign, a time for completing the update campaign, etc.). When an update campaign is established, the update campaign can define the different devices that are to be updated. In some situations, the number of active updates required may exceed the allotted resources for a given subtenant. When a subtenant requires more resources than it has been assigned to complete the update, the subtenant can borrow resources defined by the update parameters from a subtenant peer that has a surplus.
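The borrowing step can be sketched for a single update parameter, the number of concurrent active updates. The `Subtenant` fields and the first-come order in which peers are visited are assumptions; the abstract does not specify how a lending peer is chosen.

```python
from dataclasses import dataclass


@dataclass
class Subtenant:
    name: str
    allotted: int  # throttle: maximum concurrent active updates assigned
    required: int  # active updates needed for the current campaign


def borrow(needy: Subtenant, peers: list) -> int:
    """Move surplus allotment from peers to `needy`; return the amount borrowed."""
    shortfall = needy.required - needy.allotted
    borrowed = 0
    for peer in peers:
        if shortfall <= 0:
            break
        surplus = peer.allotted - peer.required
        if surplus <= 0:
            continue  # this peer has nothing to lend
        take = min(surplus, shortfall)
        peer.allotted -= take
        needy.allotted += take
        shortfall -= take
        borrowed += take
    return borrowed
```

A peer never lends below its own requirement, so borrowing cannot push a lender into a shortfall of its own.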
Job scheduling method
A method for scheduling a single subset of jobs of a set of jobs satisfying a range constraint of number of jobs, wherein the jobs of the set of jobs share resources in a computing system, each job being assigned a weight, w, indicative of the memory usage of the job in case of its execution in the computer system, the method including: for each number of jobs, x, satisfying the range constraint, determining from the set of jobs a first subset of jobs using a knapsack problem, wherein the knapsack problem is adapted to select by using the weights the first subset of jobs having the number of jobs and having a maximal total memory usage below the current available memory of the computer system, and selecting the single subset from the first subset.
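The per-count knapsack step can be sketched with a small dynamic program over (job count, total memory) states. This is an illustrative formulation, not the claimed one: the function names are hypothetical, "below the available memory" is read as "not exceeding" it, and the final selection rule (largest total memory, ties broken by more jobs) is an assumption.

```python
def best_subsets(weights, capacity, min_jobs, max_jobs):
    """For each job count x in [min_jobs, max_jobs], find a subset of exactly
    x jobs whose total memory is maximal without exceeding `capacity`.

    Returns {x: (total_memory, tuple_of_job_indices)} for each feasible x.
    """
    # reachable[(count, total)] = one tuple of job indices achieving that state
    reachable = {(0, 0): ()}
    for i, w in enumerate(weights):
        # Iterate over a snapshot so each job is used at most once.
        for (count, total), chosen in list(reachable.items()):
            state = (count + 1, total + w)
            if state[0] <= max_jobs and state[1] <= capacity and state not in reachable:
                reachable[state] = chosen + (i,)

    best = {}
    for (count, total), chosen in reachable.items():
        if min_jobs <= count <= max_jobs:
            if count not in best or total > best[count][0]:
                best[count] = (total, chosen)
    return best


def select_schedule(best):
    """Pick one subset among the per-count winners (assumed rule: most memory
    used, then most jobs). Returns None when no count is feasible."""
    if not best:
        return None
    return max(best.values(), key=lambda tc: (tc[0], len(tc[1])))


per_count = best_subsets([4, 3, 2, 5], capacity=9, min_jobs=2, max_jobs=3)
schedule = select_schedule(per_count)
```

With job weights 4, 3, 2, and 5 and 9 units of available memory, both two jobs (4 + 5) and three jobs (4 + 3 + 2) can fill the memory exactly; the tie-break then prefers the three-job subset.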
DYNAMIC CAPACITY OPTIMIZATION FOR SHARED COMPUTING RESOURCES
Systems, methods, devices, and other techniques for managing a computing resource shared by a set of online entities. A system can receive a request from a first online entity to reserve capacity of the computing resource. The system determines a relative priority of the first online entity and identifies a reservation zone that corresponds to the relative priority of the first online entity. The system determines whether to satisfy the request based on comparing (i) an amount of the requested capacity of the computing resource and (ii) an amount of the portion of unused capacity of the computing resource designated by the reservation zone that online entities having relative priorities at or below the relative priority of the first online entity are permitted to reserve.
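The admission check can be sketched as below. The abstract does not define how a reservation zone maps to capacity, so modeling each zone as a fraction of the unused capacity is an assumption, as are the priority labels and the name `can_reserve`.

```python
# Hypothetical zones: the share of unused capacity that entities at or below
# each relative priority are permitted to reserve.
ZONE_FRACTION = {"low": 0.25, "medium": 0.5, "high": 1.0}


def can_reserve(requested: float, unused_capacity: float, priority: str,
                zone_fraction: dict = ZONE_FRACTION) -> bool:
    """Compare the requested amount with the portion of unused capacity
    designated by the requester's reservation zone."""
    reservable = unused_capacity * zone_fraction[priority]
    return requested <= reservable
```

Under these assumed zones, a request for 30 units against 100 unused units would be refused for a low-priority entity and granted for a medium-priority one.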