G06F2209/5018

DYNAMIC WORKLOAD DISTRIBUTION FOR DATA PROCESSING
20230030808 · 2023-02-02

A computer-implemented method, according to one embodiment, includes: receiving a data process that includes a plurality of sub-processes. A unique subset of the sub-processes is assigned to each of: a managing thread, and at least one other thread. Moreover, performance characteristics of each of the threads are evaluated while the respective subsets of sub-processes are being performed, and a determination is made as to whether the performance characteristics of each of the threads are substantially equal to the performance characteristics of each of the other threads. In response to determining that the performance characteristics of the threads are not substantially equal, the subsets of the sub-processes are dynamically adjusted such that the performance characteristics of the threads become more equal. Moreover, the adjusted subsets of the sub-processes are reassigned to the managing thread and the at least one other thread.
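As a rough illustration of the rebalancing loop described above, the Go sketch below uses goroutines in place of the managing and worker threads; the 20% equality tolerance, the move-one-sub-process policy, and all timings are invented for illustration and are not taken from the patent.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// subProcess simulates one sub-process with a known cost.
type subProcess struct{ cost time.Duration }

// runSubset executes a subset of sub-processes and reports how long it took,
// standing in for the per-thread performance characteristics.
func runSubset(subset []subProcess) time.Duration {
	start := time.Now()
	for _, sp := range subset {
		time.Sleep(sp.cost) // stand-in for real work
	}
	return time.Since(start)
}

// rebalance moves a sub-process from the slowest worker to the fastest
// whenever the measured durations are not substantially equal (here: >20% apart).
func rebalance(subsets [][]subProcess, durations []time.Duration) {
	slow, fast := 0, 0
	for i, d := range durations {
		if d > durations[slow] {
			slow = i
		}
		if d < durations[fast] {
			fast = i
		}
	}
	if durations[fast] == 0 || float64(durations[slow])/float64(durations[fast]) <= 1.2 {
		return // performance characteristics already substantially equal
	}
	if n := len(subsets[slow]); n > 1 {
		// Shift one sub-process from the slow subset to the fast one.
		subsets[fast] = append(subsets[fast], subsets[slow][n-1])
		subsets[slow] = subsets[slow][:n-1]
	}
}

func main() {
	// One "managing" worker plus two others, each with a unique subset.
	subsets := [][]subProcess{
		{{30 * time.Millisecond}, {30 * time.Millisecond}, {30 * time.Millisecond}},
		{{10 * time.Millisecond}},
		{{10 * time.Millisecond}},
	}
	for round := 0; round < 3; round++ {
		durations := make([]time.Duration, len(subsets))
		var wg sync.WaitGroup
		for i := range subsets {
			wg.Add(1)
			go func(i int) { // worker i executes its assigned subset
				defer wg.Done()
				durations[i] = runSubset(subsets[i])
			}(i)
		}
		wg.Wait()
		fmt.Println("round", round, "durations:", durations)
		rebalance(subsets, durations)
	}
}
```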

Blockchain transaction processing systems and methods

Disclosed are computer-implemented methods, non-transitory computer-readable media, and systems for processing blockchain transactions. One computer-implemented method includes receiving a number of blockchain transactions to be executed by a blockchain node. The blockchain node allocates one or more threads and one or more coroutines for processing the number of blockchain transactions based on whether the number of blockchain transactions are CPU-bound or I/O-bound. The blockchain node executes the number of blockchain transactions using the one or more threads and one or more coroutines, generates a blockchain block including the number of blockchain transactions, and adds the blockchain block to the blockchain.
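A minimal sketch of the CPU-bound versus I/O-bound split, with Go goroutines standing in for coroutines and a fixed worker pool standing in for the allocated threads; the transaction mix and simulated costs are assumptions for illustration.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"time"
)

type tx struct {
	id      int
	ioBound bool
}

// execute simulates running one blockchain transaction.
func execute(t tx) {
	if t.ioBound {
		time.Sleep(5 * time.Millisecond) // e.g. waiting on state storage
	} else {
		for i := 0; i < 1e6; i++ { // e.g. signature verification
			_ = i * i
		}
	}
}

func main() {
	txs := make([]tx, 100)
	for i := range txs {
		txs[i] = tx{id: i, ioBound: i%2 == 0}
	}

	var wg sync.WaitGroup

	// I/O-bound transactions: one lightweight coroutine (goroutine) each,
	// since they mostly block and cost little while waiting.
	for _, t := range txs {
		if t.ioBound {
			wg.Add(1)
			go func(t tx) { defer wg.Done(); execute(t) }(t)
		}
	}

	// CPU-bound transactions: a fixed pool of workers, one per CPU,
	// so they do not oversubscribe the cores.
	cpuTxs := make(chan tx)
	for w := 0; w < runtime.NumCPU(); w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for t := range cpuTxs {
				execute(t)
			}
		}()
	}
	for _, t := range txs {
		if !t.ioBound {
			cpuTxs <- t
		}
	}
	close(cpuTxs)
	wg.Wait()
	fmt.Println("all transactions executed; block can now be assembled")
}
```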

WORK SCHEDULING ON PROCESSING UNITS

In some examples, a system receives a first unit of work to be scheduled in the system that includes a plurality of collections of processing units to execute units of work, where each respective collection of processing units is associated with a corresponding scheduling queue. The system selects, for the first unit of work according to a first criterion, candidate collections from among the plurality of collections of processing units, and enqueues the first unit of work in the scheduling queue associated with a collection of processing units that is selected, according to a selection criterion, from among the candidate collections.
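The two-stage selection might look like the following Go sketch; the abstract leaves both criteria open, so the "matching class" first criterion and "shortest scheduling queue" selection criterion here are placeholder choices.

```go
package main

import "fmt"

type unitOfWork struct {
	name  string
	class int
}

type collection struct {
	id    int
	class int          // type of processing units in this collection
	queue []unitOfWork // the collection's scheduling queue
}

// schedule picks candidate collections by a first criterion (here: a class
// match) and then, by a selection criterion (here: shortest queue), one
// collection whose scheduling queue receives the unit of work.
func schedule(w unitOfWork, cols []*collection) *collection {
	var candidates []*collection
	for _, c := range cols {
		if c.class == w.class { // first criterion
			candidates = append(candidates, c)
		}
	}
	if len(candidates) == 0 {
		return nil
	}
	best := candidates[0]
	for _, c := range candidates[1:] { // selection criterion
		if len(c.queue) < len(best.queue) {
			best = c
		}
	}
	best.queue = append(best.queue, w) // enqueue
	return best
}

func main() {
	cols := []*collection{{id: 0, class: 1}, {id: 1, class: 1}, {id: 2, class: 2}}
	c := schedule(unitOfWork{name: "first", class: 1}, cols)
	fmt.Println("enqueued on collection", c.id)
}
```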

OPTIMIZED NETWORKING THREAD ASSIGNMENT
20220350647 · 2022-11-03

Some embodiments provide a method for scheduling networking threads associated with a data compute node (DCN) executing at a host computer. When a virtual networking device is instantiated for the DCN, the method assigns the virtual networking device to a particular non-uniform memory access (NUMA) node of multiple NUMA nodes associated with the DCN. Based on the assignment of the virtual networking device to the particular NUMA node, the method assigns networking threads associated with the DCN to the same particular NUMA node and provides information to the DCN regarding the particular NUMA node in order for the DCN to assign a thread associated with an application executing on the DCN to the same particular NUMA node.
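A Linux-only Go sketch of the thread-to-NUMA-node binding idea, using golang.org/x/sys/unix for the affinity call; the node-to-CPU mapping is hard-coded (a real implementation would read /sys/devices/system/node/node<N>/cpulist), and none of this is claimed to be the patent's actual mechanism.

```go
//go:build linux

package main

import (
	"fmt"
	"runtime"

	"golang.org/x/sys/unix"
)

// cpusOfNUMANode would normally be read from sysfs; hard-coded here
// purely for illustration.
func cpusOfNUMANode(node int) []int {
	if node == 0 {
		return []int{0, 1, 2, 3}
	}
	return []int{4, 5, 6, 7}
}

// pinToNode locks the calling goroutine to an OS thread and restricts that
// thread to the CPUs of the given NUMA node, mirroring the idea of keeping
// a DCN's networking threads on the node its virtual device was assigned to.
func pinToNode(node int) error {
	runtime.LockOSThread()
	var set unix.CPUSet
	for _, cpu := range cpusOfNUMANode(node) {
		set.Set(cpu)
	}
	return unix.SchedSetaffinity(0, &set) // 0 = current thread
}

func main() {
	deviceNode := 0 // NUMA node the virtual networking device was placed on
	if err := pinToNode(deviceNode); err != nil {
		fmt.Println("affinity not set:", err)
		return
	}
	fmt.Println("networking thread pinned to NUMA node", deviceNode)
	// The DCN would also be told deviceNode so that application threads
	// can request the same node.
}
```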

FPGA-based dynamic graph processing method

The present disclosure relates to an FPGA-based dynamic graph processing method, comprising: where graph mirrors of a dynamic graph that have successive timestamps define an increment therebetween, a pre-processing module dividing the graph mirror having the latter timestamp into at least one path unit, in a manner that incremental computing for any vertex depends only on the preorder vertex of that vertex; an FPGA processing module storing at least two of said path units into an on-chip memory directly linked to thread units, in a manner that each thread unit is able to process its path unit independently; and each thread unit determining an increment value between the successive timestamps of the preorder vertex while updating a state value of the preorder vertex, and transferring the increment value to a succeeding vertex adjacent to the preorder vertex in a transfer direction determined by the path unit, so as to update the state value of the succeeding vertex.
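A rough software model of the path-unit scheme, with Go goroutines standing in for FPGA thread units; the two paths, initial state values, and the damped transfer function are invented placeholders for the real vertex update.

```go
package main

import (
	"fmt"
	"sync"
)

// pathUnit is an ordered chain of vertices in which each vertex depends
// only on its preorder (previous) vertex, so one worker can process it alone.
type pathUnit []int

func main() {
	state := []float64{1, 2, 3, 4, 5, 6}    // vertex states at timestamp t
	delta := []float64{0.5, 0, 0, -1, 0, 0} // increment between mirrors t and t+1

	// Two disjoint path units; the division guarantees each vertex's update
	// depends only on its predecessor within the same path.
	paths := []pathUnit{{0, 1, 2}, {3, 4, 5}}

	var wg sync.WaitGroup
	for _, p := range paths {
		wg.Add(1)
		go func(p pathUnit) { // stand-in for one FPGA thread unit
			defer wg.Done()
			for i := 0; i < len(p); i++ {
				v := p[i]
				state[v] += delta[v] // update the preorder vertex's state
				if i+1 < len(p) {
					// Transfer the increment to the succeeding vertex in the
					// direction defined by the path unit; the 0.5 damping is
					// a placeholder for the actual vertex function.
					delta[p[i+1]] += delta[v] * 0.5
				}
			}
		}(p)
	}
	wg.Wait()
	fmt.Println("updated states:", state)
}
```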

APPLICATION PROGRAMMING INTERFACE TO CONFIGURE PROCESSOR PARTITIONING
20230131961 · 2023-04-27

Apparatuses, systems, and techniques to configure processor partitioning for a multi-process service. In at least one embodiment, a multi-process service configures a set of streaming multiprocessors of one or more parallel processing units to execute one or more threads in response to an application programming interface (API).
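For context, one publicly documented way to partition streaming multiprocessors under NVIDIA's Multi-Process Service is the CUDA_MPS_ACTIVE_THREAD_PERCENTAGE environment variable; the Go sketch below shows that mechanism with a hypothetical ./cuda_worker client binary, and may or may not correspond to the API this application describes.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

func main() {
	// Under NVIDIA MPS, CUDA_MPS_ACTIVE_THREAD_PERCENTAGE limits the
	// fraction of SMs a client process may use; launching each worker
	// with a different value partitions the device between processes.
	cmd := exec.Command("./cuda_worker") // hypothetical CUDA client binary
	cmd.Env = append(os.Environ(), "CUDA_MPS_ACTIVE_THREAD_PERCENTAGE=25")
	cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
	if err := cmd.Run(); err != nil {
		fmt.Println("worker failed:", err)
	}
}
```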

System and Method for Lock-free Shared Data Access for Processing and Management Threads
20230128503 · 2023-04-27

A method, computer program product, and computing system for defining a first flow for one or more processing threads with access to shared data within a storage system. The one or more processing threads may be executed using the first flow. A processing thread reference count may be determined for the one or more processing threads being executed using the first flow. One or more management threads may be executed on the shared data within the storage system based upon, at least in part, the processing thread reference count.
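A compact sketch of the reference-count gating in Go using sync/atomic; the single global counter and busy-wait quiescence check are simplifications, and a full scheme would presumably publish a new flow before draining the old one.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

var (
	refCount int64                    // processing-thread reference count
	shared   = map[string]int{"x": 1} // shared data within the storage system
)

func main() {
	// Processing threads: each is counted while it executes the first flow.
	for i := 0; i < 4; i++ {
		atomic.AddInt64(&refCount, 1) // registered under the first flow
		go func() {
			defer atomic.AddInt64(&refCount, -1)
			_ = shared["x"] // lock-free read access to the shared data
			time.Sleep(time.Millisecond)
		}()
	}

	// Management thread: operates on the shared data only once the
	// reference count shows no processing thread still inside the flow.
	for atomic.LoadInt64(&refCount) != 0 {
		time.Sleep(100 * time.Microsecond)
	}
	shared["x"] = 42 // safe: no concurrent processing-thread access
	fmt.Println("management update applied:", shared["x"])
}
```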

Performance threshold

Example systems relate to system call acceleration. A system may include a processor and a non-transitory computer readable medium. The non-transitory computer readable medium may include instructions to cause the processor to run a plurality of benchmarks for a hardware configuration. The non-transitory computer readable medium may further include instructions to determine a benchmark matrix based on the plurality of benchmarks. The non-transitory computer readable medium may include instructions to determine an input/output (I/O) bandwidth ceiling for the hardware configuration based on the benchmark matrix. Additionally, the non-transitory computer readable medium may include instructions to determine a performance threshold of an I/O access parameter for the hardware configuration based on the bandwidth ceiling.
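A toy version of that pipeline in Go; the runBenchmark stand-in, the block-size/queue-depth axes of the matrix, and the 80%-of-ceiling threshold rule are all invented for illustration.

```go
package main

import "fmt"

// runBenchmark is a stand-in for one I/O benchmark of the hardware
// configuration, returning a measured bandwidth in MB/s.
func runBenchmark(blockKB, depth int) float64 {
	return float64(blockKB*depth) / (1 + 0.01*float64(depth)) // synthetic numbers
}

func main() {
	blockSizes := []int{4, 64, 1024} // KB
	queueDepths := []int{1, 8, 32}

	// Benchmark matrix: one measurement per (block size, queue depth) pair.
	matrix := make([][]float64, len(blockSizes))
	ceiling := 0.0
	for i, b := range blockSizes {
		matrix[i] = make([]float64, len(queueDepths))
		for j, d := range queueDepths {
			matrix[i][j] = runBenchmark(b, d)
			if matrix[i][j] > ceiling {
				ceiling = matrix[i][j] // bandwidth ceiling = best observed
			}
		}
	}

	// Performance threshold of an I/O access parameter, e.g. accelerate a
	// system call only while measured bandwidth stays under 80% of ceiling.
	threshold := 0.8 * ceiling
	fmt.Printf("ceiling %.1f MB/s, threshold %.1f MB/s\n", ceiling, threshold)
}
```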

CONFIGURING HARDWARE MULTITHREADING IN CONTAINERS

As part of a container initialization procedure, a maximum number of hardware threads per processor core in a set of cores of a computer system is enabled, the container initialization procedure configuring an operating system executing on the computer system for container execution and configuring a first container for execution on the operating system. From a set of available cores in the set of cores, an execution core is selected. In the selected execution core, a number of threads per core to be used during execution of the first container is configured, the number of threads per core specified for the container initialization procedure by a first simultaneous multithreading (SMT) parameter. Using the configured execution core, the first container is executed, the execution virtualizing the operating system.
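On Linux, one concrete way to realize such an SMT parameter is to offline sibling hardware threads through sysfs; the Go sketch below assumes that mechanism (root privileges, a comma-separated thread_siblings_list, and cpu0 lacking an online file are Linux specifics, not details from this application).

```go
//go:build linux

package main

import (
	"fmt"
	"os"
	"strings"
)

// setThreadsPerCore approximates an SMT parameter by offlining all but the
// first `smt` hardware threads of the chosen core via standard sysfs files.
// Assumes a comma-separated sibling list (e.g. "0,4"); real lists may also
// use ranges such as "0-1", which this sketch does not handle.
func setThreadsPerCore(core, smt int) error {
	path := fmt.Sprintf("/sys/devices/system/cpu/cpu%d/topology/thread_siblings_list", core)
	data, err := os.ReadFile(path)
	if err != nil {
		return err
	}
	siblings := strings.Split(strings.TrimSpace(string(data)), ",")
	for i, s := range siblings {
		f := fmt.Sprintf("/sys/devices/system/cpu/cpu%s/online", s)
		if i >= smt {
			if err := os.WriteFile(f, []byte("0"), 0644); err != nil {
				return err // disable threads beyond the SMT level
			}
		} else if _, err := os.Stat(f); err == nil {
			// Re-enable threads up to the SMT level (cpu0 has no online file).
			if err := os.WriteFile(f, []byte("1"), 0644); err != nil {
				return err
			}
		}
	}
	return nil
}

func main() {
	// e.g. run the container's execution core with SMT-2
	if err := setThreadsPerCore(0, 2); err != nil {
		fmt.Println("SMT configuration failed:", err)
	}
}
```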

DYNAMIC ADAPTIVE THREADING USING IDLE TIME ANALYSIS

An embodiment includes initiating a first cycle of a process using a first number of threads that operate in parallel to collectively execute the process and collect performance data. The embodiment aggregates the performance data and computes a first idle duration and a first cycle duration based at least in part on the aggregated performance data. The embodiment projects a thread-count recommendation based at least in part on a mathematical model that includes the first number of threads as an input number of threads, the first idle and cycle durations as input idle and cycle durations, respectively, and a second number of threads as an output variable representative of an output number of threads, where the output number of threads is determined as a function of the input idle duration. The embodiment initiates a second cycle of the process using the second number of threads output as a projection by the mathematical model.
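The abstract does not give the model's exact form; the Go sketch below assumes one plausible choice, scaling the thread count by the measured busy fraction of the cycle, with simulated work and idle times.

```go
package main

import (
	"fmt"
	"math"
	"sync"
	"time"
)

// runCycle executes one cycle of the process on n parallel workers and
// returns the mean idle duration and the cycle (wall-clock) duration.
func runCycle(n int) (idle, cycle time.Duration) {
	start := time.Now()
	idles := make([]time.Duration, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			time.Sleep(10 * time.Millisecond) // working
			w := time.Now()
			time.Sleep(time.Duration(i) * time.Millisecond) // idle, waiting on peers
			idles[i] = time.Since(w)
		}(i)
	}
	wg.Wait()
	for _, d := range idles {
		idle += d
	}
	return idle / time.Duration(n), time.Since(start)
}

// project implements one plausible form of the model: scale the thread
// count by the busy fraction of the cycle, so high idle time shrinks it.
func project(n int, idle, cycle time.Duration) int {
	busy := float64(cycle-idle) / float64(cycle)
	return int(math.Max(1, math.Round(float64(n)*busy)))
}

func main() {
	n1 := 8
	idle, cycle := runCycle(n1)    // first cycle + performance data
	n2 := project(n1, idle, cycle) // thread-count recommendation
	fmt.Printf("idle %v of %v -> %d threads\n", idle, cycle, n2)
	runCycle(n2) // second cycle using the projected thread count
}
```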