G06F2209/5011

Cluster management

In response to receiving a parallel processing job from a customer, a system operated by a computing resource service provider allocates and configures a cluster of computer systems capable of executing the job. In an embodiment, each computer system is configured with a first network stack that allows access to resources of the computing resource service provider and a second network stack that allows access to resources of the customer. In an embodiment, the state of the cluster is monitored by the system via the first network stack. In an embodiment, the system deploys a set of tasks on the cluster for fulfilling the processing job. In an embodiment, the tasks have access to the second network stack so that each task can access resources of the customer.

Thread Creation on Local or Remote Compute Elements by a Multi-Threaded, Self-Scheduling Processor
20230091432 · 2023-03-23

Representative apparatus, method, and system embodiments are disclosed for a self-scheduling processor which also provides additional functionality. Representative embodiments include a self-scheduling processor, comprising: a processor core adapted to execute a received instruction; and a core control circuit adapted to automatically schedule an instruction for execution by the processor core in response to a received work descriptor data packet. In another embodiment, the core control circuit is also adapted to schedule a fiber create instruction for execution by the processor core, to reserve a predetermined amount of memory space in a thread control memory to store return arguments, and to generate one or more work descriptor data packets to another processor or hybrid threading fabric circuit for execution of a corresponding plurality of execution threads. Event processing, data path management, system calls, memory requests, and other new instructions are also disclosed.

Resource Manager Integration in Cloud Computing Environments

In one embodiment, a system includes first host machines implementing a public-cloud computing environment, wherein at least one of the first host machines includes a resource manager that provides a public-cloud resource interface through which one or more public-cloud clients interact with one or more virtual machines, and second host machines implementing a private-cloud computing environment, wherein at least one of the second host machines includes one or more private-cloud virtual machines, wherein at least one of the first host machines further includes a private-cloud VM resource provider through which the resource manager interacts with the private-cloud virtual machines, wherein the VM resource provider translates requests to perform virtual machine operations from a public-cloud-resource interface to a private-cloud virtual machine interface, and the private-cloud virtual machines perform the requested virtual machine operations in response to receiving the translated requests from the VM resource provider.
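The translation role of the VM resource provider can be pictured as an adapter. The sketch below is illustrative only: the class names, the operation names (`start`, `deallocate`) and the private-cloud method names (`power_on`, `power_off`) are assumptions, since the abstract does not name either interface.

```python
class PrivateCloudVMInterface:
    """Hypothetical private-cloud API with its own verb names."""

    def __init__(self):
        self.state = {}  # VM name -> power state

    def power_on(self, vm_name):
        self.state[vm_name] = "running"
        return self.state[vm_name]

    def power_off(self, vm_name):
        self.state[vm_name] = "stopped"
        return self.state[vm_name]


class PrivateCloudVMResourceProvider:
    """Translates public-cloud operations into private-cloud calls."""

    # Assumed mapping from public-cloud operation names to private-cloud methods.
    _TRANSLATION = {"start": "power_on", "deallocate": "power_off"}

    def __init__(self, private_api):
        self.private_api = private_api

    def handle(self, operation, vm_name):
        # Look up the private-cloud method that corresponds to the request.
        method = getattr(self.private_api, self._TRANSLATION[operation])
        return method(vm_name)


private_api = PrivateCloudVMInterface()
provider = PrivateCloudVMResourceProvider(private_api)
started = provider.handle("start", "vm-1")
stopped = provider.handle("deallocate", "vm-1")
```

In this sketch the resource manager would only ever call `handle`, so the private-cloud interface could change without affecting public-cloud clients.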

Blockchain transaction processing systems and methods

Disclosed are computer-implemented methods, non-transitory computer-readable media, and systems for processing blockchain transactions. One computer-implemented method includes receiving a number of blockchain transactions to be executed by a blockchain node. The blockchain node allocates one or more threads and one or more coroutines for processing the number of blockchain transactions based on whether the number of blockchain transactions are CPU-bound or I/O-bound. The blockchain node executes the number of blockchain transactions using the one or more threads and one or more coroutines, generates a blockchain block including the number of blockchain transactions, and adds the blockchain block to the blockchain.
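A minimal sketch of the thread/coroutine split, assuming a hypothetical `Tx` record whose `kind` field marks a transaction as CPU-bound or I/O-bound. The abstract does not say how transactions are classified or how many threads are used, so the names, the pool size and the placeholder workloads are all assumptions.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Tx:
    tx_id: int
    kind: str  # "cpu" or "io" (hypothetical classification)


def run_cpu_tx(tx: Tx) -> str:
    # Stand-in for CPU-heavy work such as signature verification.
    sum(i * i for i in range(1000))
    return f"tx{tx.tx_id}:cpu"


async def run_io_tx(tx: Tx) -> str:
    # Stand-in for an I/O wait such as a state-database read.
    await asyncio.sleep(0)
    return f"tx{tx.tx_id}:io"


async def process(txs: list[Tx], max_threads: int = 4) -> list[str]:
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        pending = [
            # CPU-bound work goes to a thread, I/O-bound work to a coroutine.
            loop.run_in_executor(pool, run_cpu_tx, tx) if tx.kind == "cpu"
            else asyncio.ensure_future(run_io_tx(tx))
            for tx in txs
        ]
        # gather returns results in input order, so transaction order is kept.
        return list(await asyncio.gather(*pending))


block = asyncio.run(process([Tx(1, "cpu"), Tx(2, "io"), Tx(3, "cpu")]))
```

Note that CPython threads do not execute Python bytecode in parallel, so this only illustrates the routing decision, not a performance gain.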

Dynamic load balancing and configuration management for heterogeneous compute accelerators in a data center

An example method of managing a plurality of hardware accelerators in a computing system includes executing workload management software in the computing system configured to allocate a plurality of jobs in a job queue among a pool of resources in the computer system; monitoring the job queue to determine required hardware functionalities for the plurality of jobs; provisioning at least one hardware accelerator of the plurality of hardware accelerators to provide the required hardware functionalities; configuring a programmable device of each provisioned hardware accelerator to implement at least one of the required hardware functionalities; and notifying the workload management software that each provisioned hardware accelerator is an available resource in the pool of resources.
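The monitor, provision, configure and notify steps can be sketched as follows. This is an illustration under stated assumptions: jobs are represented only by the name of the functionality they need, and `Accelerator`, `WorkloadManager` and the "most demanded first" policy are hypothetical, not taken from the abstract.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Accelerator:
    name: str
    function: Optional[str] = None  # functionality configured on the programmable device


@dataclass
class WorkloadManager:
    available: List[str] = field(default_factory=list)

    def notify_available(self, acc: Accelerator) -> None:
        # Record the accelerator as a usable resource in the pool.
        self.available.append(f"{acc.name}:{acc.function}")


def provision(job_queue, idle, manager):
    """Configure idle accelerators for the functionalities the queue needs."""
    demand = Counter(job_queue)  # monitor: required functionality -> job count
    for function, _count in demand.most_common():
        if not idle:
            break
        acc = idle.pop(0)        # provision an idle accelerator
        acc.function = function  # configure its programmable device
        manager.notify_available(acc)


manager = WorkloadManager()
idle = [Accelerator("fpga0"), Accelerator("fpga1")]
provision(["fft", "crypto", "fft", "fft", "crypto", "compress"], idle, manager)
```

With two idle accelerators and three requested functionalities, the sketch serves the two most demanded ones and leaves the third queued.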

Hot-plug events in a pool of reconfigurable data flow resources

A data processing system comprises a pool of reconfigurable data flow resources with arrays of physical configurable units, a controller, and a runtime processor. The controller is configured to generate a hot-plug event in response to detecting a removal of an unallocated array of physical configurable units from the pool of reconfigurable data flow resources. The runtime processor is configured to execute user applications on a subset of the arrays of physical configurable units and to receive the hot-plug event from the controller. The runtime processor is further configured to make the removed unallocated array of physical configurable units unavailable for subsequent allocations of subsequent virtual data flow resources and subsequent executions of subsequent user applications, while the subset of the arrays of physical configurable units continues the execution of the user applications.
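A minimal sketch of how a runtime might react to such a hot-plug event, assuming arrays are identified by integers and that the pool tracks free and allocated arrays separately. The `Pool` class and its method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Pool:
    allocated: dict = field(default_factory=dict)  # array id -> application name
    free: set = field(default_factory=set)         # unallocated array ids

    def allocate(self, app: str) -> int:
        array_id = min(self.free)  # pick the lowest free id (arbitrary policy)
        self.free.remove(array_id)
        self.allocated[array_id] = app
        return array_id

    def on_hot_unplug(self, array_id: int) -> bool:
        """Handle a removal event; only unallocated arrays are dropped."""
        if array_id in self.free:
            self.free.remove(array_id)  # no longer offered to later allocations
            return True
        return False                    # allocated arrays are left untouched


pool = Pool(free={0, 1, 2})
first = pool.allocate("app-a")
removed = pool.on_hot_unplug(2)   # array 2 is unallocated, so it is removed
ignored = pool.on_hot_unplug(0)   # array 0 is in use, so nothing changes
second = pool.allocate("app-b")
```

The running application on array 0 is never interrupted; only the set of arrays offered to later allocations shrinks.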

METHOD AND APPARATUS FOR DYNAMICALLY MANAGING SHARED MEMORY POOL
20230085979 · 2023-03-23

A method and an apparatus for dynamically managing a shared memory pool are provided, to determine, based on different service scenarios, a shared memory pool mechanism applicable to a current service scenario, and then dynamically adjust a memory pool mechanism based on the determining result. The method for dynamically managing a shared memory pool includes: determining a first shared memory pool mechanism, where the first shared memory pool mechanism is a fixed memory pool mechanism or a dynamic memory pool mechanism; determining a second shared memory pool mechanism suitable for a second service scenario based on the second service scenario, where the second shared memory pool mechanism is a fixed memory pool mechanism or a dynamic memory pool mechanism; and when the second shared memory pool mechanism is different from the first shared memory pool mechanism, adjusting the first shared memory pool mechanism to the second shared memory pool mechanism.
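The compare-and-switch logic can be sketched as below. The scenario names and the mapping in `mechanism_for` are assumptions made for illustration; the abstract does not say which service scenarios favour which mechanism.

```python
from enum import Enum


class Mechanism(Enum):
    FIXED = "fixed"
    DYNAMIC = "dynamic"


def mechanism_for(scenario: str) -> Mechanism:
    # Hypothetical rule: steady traffic suits a fixed pool, anything else a dynamic one.
    return Mechanism.FIXED if scenario == "steady_traffic" else Mechanism.DYNAMIC


class SharedPoolManager:
    def __init__(self, first: Mechanism):
        self.current = first  # the first shared memory pool mechanism
        self.switches = 0

    def on_scenario(self, scenario: str) -> Mechanism:
        second = mechanism_for(scenario)
        # Adjust only when the suitable mechanism differs from the current one.
        if second is not self.current:
            self.current = second
            self.switches += 1
        return self.current


mgr = SharedPoolManager(Mechanism.FIXED)
after_steady = mgr.on_scenario("steady_traffic")
after_burst = mgr.on_scenario("bursty_traffic")
```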

METHOD FOR RESOURCE EXCLUSION, TERMINAL DEVICE, AND STORAGE MEDIUM
20220350659 · 2022-11-03

A method for resource exclusion, a terminal device, and a storage medium are disclosed. The method includes obtaining a first candidate resource set by performing a first resource exclusion operation on a first resource set in a resource selection window, where the first resource set includes available resources in a resource pool used by the terminal device in the resource selection window, the first resource exclusion operation includes performing resource exclusion according to a non-sensing slot in a resource sensing window, and the non-sensing slot represents a slot in which the terminal device performs no sensing. The method further includes determining a second resource set on condition that a first percentage is smaller than X%, where the first percentage is the ratio of the number of resources in the first candidate resource set to the number M_total of resources in the first resource set.
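The exclusion step and the percentage check can be illustrated with a small sketch. Several things here are assumptions: resources are modelled as `(slot, subchannel)` pairs, a non-sensed slot is taken to block the slots one reservation period later, and the `periods` list is hypothetical. The abstract does not give the exclusion rule in this detail, nor how the second resource set is built, so the sketch only reports whether one is needed.

```python
def exclude_non_sensed(first_set, non_sensing_slots, periods):
    """Drop candidates whose slot could carry a reservation made in a non-sensed slot."""
    # Assumed rule: a transmission in non-sensed slot s may recur at s + period.
    blocked = {s + p for s in non_sensing_slots for p in periods}
    return [r for r in first_set if r[0] not in blocked]


def first_percentage(first_set, candidates):
    m_total = len(first_set)  # M_total: number of resources in the first set
    return 100 * len(candidates) / m_total


def needs_second_set(first_set, candidates, x_percent):
    # A second resource set is determined only when too few candidates remain.
    return first_percentage(first_set, candidates) < x_percent


# Selection window of 10 slots with 2 subchannels each: M_total = 20.
first_set = [(slot, sub) for slot in range(100, 110) for sub in range(2)]
candidates = exclude_non_sensed(first_set, non_sensing_slots=[0, 5], periods=[100, 102])
pct = first_percentage(first_set, candidates)
```

Here slots 100, 102, 105 and 107 are blocked, so 12 of the 20 resources survive and the first percentage is 60.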

Systems and Methods to Leverage Unused Compute Resource for Machine Learning Tasks
20230090320 · 2023-03-23

Systems and methods relating to leveraging inactive computing resources are discussed. An example system may include one or more computing nodes having an active state and an inactive state, one or more processors, and a memory. The memory may contain instructions therein that, when executed, cause the one or more processors to identify a task to be performed by the one or more computing nodes based upon a received request. The instructions may further cause the one or more processors to create one or more sub-tasks based upon the task and schedule the one or more sub-tasks for execution on the one or more computing nodes during the inactive state. The instructions may further cause the one or more processors to collate the one or more sub-tasks into a completed task, and generate a completed task notification based upon the completed task.
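A toy sketch of the identify, split, schedule, collate and notify flow, assuming the task is "sum a list of numbers" and that each node exposes a simple `active` flag. The `Node` type, the splitting rule and the notification text are hypothetical stand-ins for the machine learning workloads the title refers to.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    active: bool  # True while the node is busy with its primary workload


def split(task: list[int], parts: int) -> list[list[int]]:
    # Create sub-tasks by dealing the items out round-robin.
    return [task[i::parts] for i in range(parts)]


def run_on_inactive(task: list[int], nodes: list[Node]):
    idle = [n for n in nodes if not n.active]
    if not idle:
        return None  # nothing can be scheduled until a node goes inactive
    sub_tasks = split(task, len(idle))
    # Each idle node processes one sub-task (here: a partial sum).
    partials = [sum(chunk) for chunk in sub_tasks]
    # Collate the partial results into the completed task and build a notification.
    return {
        "result": sum(partials),
        "notification": f"task completed on {len(idle)} idle node(s)",
    }


nodes = [Node("a", True), Node("b", False), Node("c", False)]
done = run_on_inactive(list(range(1, 11)), nodes)
```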

CONSISTENT HASHING FOR COMMUNICATION DEVICES

A method for allocating a device-specific resource from one or more databases is provided. The method includes receiving, at an interface, a coupling identifier including a pool identifier and a resource identifier, as part of a processing request from a requesting entity, the processing request including a request for the device-specific resource, wherein the coupling identifier associates the requesting entity with the device-specific resource based on the resource identifier, extracting, at the interface, the pool identifier from the coupling identifier, identifying, by the interface, the processing service in which the device-specific resource associated with the resource identifier is cached, based on the pool identifier, and transmitting, from the interface to the identified processing service, at least a part of the processing request to process the cached requested device-specific resource.
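The extract-and-route steps can be sketched as follows, assuming the coupling identifier is a string of the form `pool:resource`. That encoding, the `Interface` class and the in-memory caches are assumptions for illustration; the abstract does not specify the identifier layout or how pool identifiers are assigned to processing services.

```python
from typing import NamedTuple


class CouplingId(NamedTuple):
    pool_id: str
    resource_id: str


def parse_coupling_id(raw: str) -> CouplingId:
    # Assumed wire format: "<pool id>:<resource id>".
    pool_id, sep, resource_id = raw.partition(":")
    if not sep or not pool_id or not resource_id:
        raise ValueError(f"malformed coupling identifier: {raw!r}")
    return CouplingId(pool_id, resource_id)


class Interface:
    def __init__(self, services: dict):
        # pool id -> cache of device-specific resources held by that service
        self.services = services

    def handle(self, raw_coupling_id: str):
        cid = parse_coupling_id(raw_coupling_id)
        cache = self.services[cid.pool_id]  # identify the processing service
        return cache[cid.resource_id]       # process the cached resource there


interface = Interface({
    "pool-a": {"dev-1": "session for dev-1"},
    "pool-b": {"dev-2": "session for dev-2"},
})
resource = interface.handle("pool-b:dev-2")
```

Because the pool identifier travels with every request, the interface can route without a database lookup, which is the point of carrying it inside the coupling identifier.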