G06F9/524

Data processing

Apparatus comprises a data memory to store lock data for each of a set of processing resources, the lock data representing lock status data and tag data indicating a resource type selected from a plurality of resource types; and a processing element to execute an atomic operation with respect to the lock data for a given processing resource, the atomic operation comprising at least: a detection of whether the given processing resource is of a required resource type; a detection from the lock status data whether the given processing resource is currently unlocked; and when the given processing resource is detected to be currently unlocked and of the required resource type, performance of a predetermined action with respect to one or both of the lock status data and the tag data.
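
The combined check-and-act step can be illustrated with a minimal sketch. The word layout (`LOCK_BIT`, `TAG_SHIFT`), the `LockTable` class, and the use of a software mutex to stand in for hardware atomicity are all assumptions made for illustration, not details from the abstract; the predetermined action shown is simply setting the lock bit.

```python
import threading

LOCK_BIT = 0x1   # assumed layout: bit 0 holds the lock status
TAG_SHIFT = 1    # assumed layout: the remaining bits hold the resource-type tag


class LockTable:
    """Lock words for a set of resources; a mutex emulates the atomic operation."""

    def __init__(self, tags):
        # Every resource starts unlocked, with its type stored in the tag bits.
        self._words = [tag << TAG_SHIFT for tag in tags]
        self._guard = threading.Lock()

    def try_acquire(self, index, required_tag):
        """As one indivisible step: check the tag, check unlocked, set the lock bit."""
        with self._guard:
            word = self._words[index]
            if (word >> TAG_SHIFT) != required_tag:
                return False          # wrong resource type
            if word & LOCK_BIT:
                return False          # already locked
            self._words[index] = word | LOCK_BIT
            return True

    def release(self, index):
        """Clear the lock bit and leave the tag unchanged."""
        with self._guard:
            self._words[index] &= ~LOCK_BIT
```

A caller would request, for example, `table.try_acquire(0, required_tag=2)` and proceed only when it returns `True`.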

Task priority processing method and processing device

In a multitask computing system, multiple tasks include a first task, a second task, and a third task, and the first task has a higher priority than the second task and the third task. A method includes raising the priority of the second task, which shares a first critical section with the first task and is accessing the first critical section, when the first task is blocked due to failure to access the first critical section; determining whether there is a third task that shares a second critical section with the second task and is accessing the second critical section; and raising, when the third task is present, the priority of the third task. The techniques of the present disclosure prevent a low-priority third task from delaying the execution of the second task, thus avoiding the priority inversion caused by the delayed execution of the high-priority first task.
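
The chained priority raise can be sketched as a walk along the holders of contended critical sections. The `Task` and `CriticalSection` types, the convention that a larger number means higher priority, and the assumption that the second task is itself waiting on the section held by the third task are illustrative choices, not taken from the abstract.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    name: str
    priority: int                                  # assumption: larger = higher priority
    blocked_on: Optional["CriticalSection"] = None


@dataclass
class CriticalSection:
    name: str
    holder: Optional[Task] = None


def propagate_priority(blocked: Task) -> list:
    """Raise every task in the holder chain to the blocked task's priority."""
    raised = []
    seen = set()
    section = blocked.blocked_on
    while section is not None and section.holder is not None:
        holder = section.holder
        if id(holder) in seen:        # stop if the wait chain loops back on itself
            break
        seen.add(id(holder))
        if holder.priority < blocked.priority:
            holder.priority = blocked.priority
            raised.append(holder.name)
        section = holder.blocked_on   # follow the chain to the next critical section
    return raised
```

With a first task blocked on a section held by the second task, and the second task blocked on a section held by the third, the function returns the names of both raised tasks.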

Determining an optimum number of threads to make available per core in a multi-core processor complex to execute tasks

Provided are a computer program product, system, and method for determining an optimum number of threads to make available per core in a multi-core processor complex to execute tasks. A determination is made of a first processing measurement based on threads executing on the cores of the processor chip, wherein each core includes circuitry to independently execute a plurality of threads. A determination is made of a number of threads to execute on the cores based on the first processing measurement. A determination is made of a second processing measurement based on the threads executing on the cores of the processor chip. A determination is made of an adjustment to the determined number of threads to execute based on the second processing measurement resulting in an adjusted number of threads. The adjusted number of threads on the cores is utilized to execute instructions.
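
The measure, decide, measure again, adjust loop might look like the following sketch. The abstract does not say what the processing measurement is or how it maps to a thread count, so the utilization-style value in [0, 1], the `low` and `high` thresholds, and both function names are assumptions.

```python
def choose_threads(measurement, max_threads, low=0.5, high=0.9):
    """First decision: map a utilization-like measurement to a thread count per core."""
    if measurement >= high:
        return max_threads
    if measurement <= low:
        return 1
    return max(1, max_threads // 2)


def adjust_threads(current, measurement, max_threads, low=0.5, high=0.9):
    """Later decision: nudge the thread count by one based on a new measurement."""
    if measurement >= high:
        return min(max_threads, current + 1)
    if measurement <= low:
        return max(1, current - 1)
    return current
```

For instance, `choose_threads(0.7, 4)` picks a middle setting and a later `adjust_threads(2, 0.95, 4)` raises it by one thread.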

Adaptive synchronization for redo writing to persistent memory storage

A computer's processes and/or threads generate and store in memory data to reimplement or reverse a transaction on a database, so that the database can be recovered. This data is written to persistent memory storage (“persisted”) by another process, for which the processes and/or threads may wait. This wait includes at least a sleep phase and, additionally, a spin phase, which is entered if, after awakening from sleep and checking (the “on-awakening” check), the data to be persisted is found not to have been persisted. To sleep in the sleep phase, each process/thread specifies a sleep duration determined based at least partially on previous results of on-awakening checks. Previous results in which to-be-persisted data was found not to be persisted indicate that the sleeps were insufficient; these indications are counted and used to determine the sleep duration. Repeated determination of the sleep duration makes the sleep phase adaptive.
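
A minimal sketch of the sleep-then-spin wait follows. The `AdaptiveWaiter` class, the `is_persisted` callback, and the rule of lengthening the sleep by a fixed `step` after each insufficient sleep are assumptions; the abstract only states that insufficient sleeps are counted and used to determine the duration.

```python
import time


class AdaptiveWaiter:
    def __init__(self, initial_sleep=0.001, step=0.0005, max_sleep=0.05):
        self.sleep_s = initial_sleep   # current sleep duration in seconds
        self.step = step               # assumed increment per insufficient sleep
        self.max_sleep = max_sleep     # assumed upper bound on the sleep duration
        self.insufficient = 0          # on-awakening checks that found data not persisted

    def wait(self, is_persisted):
        """Sleep, check once on awakening, then spin if needed. Returns True if it spun."""
        time.sleep(self.sleep_s)               # sleep phase
        if is_persisted():                     # on-awakening check
            return False
        self.insufficient += 1                 # the sleep was too short
        self.sleep_s = min(self.max_sleep, self.sleep_s + self.step)
        while not is_persisted():              # spin phase
            pass
        return True
```

Each call that has to spin lengthens the next sleep, so repeated waits drift toward a duration that usually covers the persist latency.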

Bottleneck detection for processes

Systems and methods for analyzing an event log for a plurality of instances of execution of a process to identify a bottleneck are provided. An event log for a plurality of instances of execution of a process is received, and segments executed during one or more of the plurality of instances of execution are identified from the event log. Each segment represents a pair of activities of the process. For each particular segment of the identified segments, a measure of performance is calculated for each of the one or more instances of execution of the particular segment based on the event log, each of the one or more instances of execution of the particular segment is classified based on the calculated measures of performance, and one or more metrics are computed for the particular segment based on the classified one or more instances of execution of the particular segment. The identified segments are compared with each other based on the one or more metrics to identify the segment that is most likely to have a bottleneck.
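
The segment analysis can be sketched in a few steps. The event tuple format `(case_id, activity, timestamp)`, the use of elapsed time as the performance measure, the fixed `slow_threshold` classifier, and the share of slow instances as the metric are all assumptions chosen for illustration.

```python
from collections import defaultdict


def find_bottleneck(event_log, slow_threshold):
    """Return the segment with the largest share of slow instances, plus all metrics."""
    # Group events by process instance (case).
    by_case = defaultdict(list)
    for case_id, activity, timestamp in event_log:
        by_case[case_id].append((timestamp, activity))

    # A segment is a pair of consecutive activities; its measure is the elapsed time.
    durations = defaultdict(list)
    for events in by_case.values():
        events.sort()
        for (t0, a0), (t1, a1) in zip(events, events[1:]):
            durations[(a0, a1)].append(t1 - t0)

    # Classify each instance as slow or not, then compute one metric per segment.
    metrics = {}
    for segment, values in durations.items():
        slow = sum(1 for d in values if d > slow_threshold)
        metrics[segment] = slow / len(values)

    # Compare segments on the metric; raises ValueError if the log has no segments.
    worst = max(metrics, key=metrics.get)
    return worst, metrics
```

Called on a log in which every `A` to `B` transition is slow and every `B` to `C` transition is fast, the function reports `("A", "B")` as the likely bottleneck.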

OPTIMIZATIONS FOR LONG-LIVED STATEMENTS IN A DATABASE SYSTEM
20230244655 · 2023-08-03 ·

The subject technology performs a search for a key in a regular space to locate a first visible version of the key. The subject technology determines that the first visible version of the key is not one of the N newest versions of the key. The subject technology performs a search of an undo space to locate a second visible version of the key. The subject technology determines whether the first visible version or the second visible version of the key is newer. The subject technology provides the newer of the first visible version and the second visible version of the key.

METHOD OF COMPLETING A PROGRAMMABLE ATOMIC TRANSACTION
20220121474 · 2022-04-21 ·

Disclosed in some examples are methods, systems, computing devices, and machine-readable media that define an instruction for a programmable atomic transaction that is executed as the last instruction and that terminates the executing thread, waits for all outstanding store operations to finish, clears the programmable atomic lock, and sends a completion response back to the issuing process. This guarantees that the programmable atomic lock is cleared when the transaction completes. Coupling thread termination with clearing the lock bit guarantees that the thread cannot terminate without clearing the lock.

Thread scheduling on SIMT architectures with busy-wait synchronization
11768715 · 2023-09-26 ·

A system and method detect that a group of threads has executed a spin-inducing branch in a single-instruction multithreaded processor and schedule groups of threads based on the detection, marking the group as backed-off and deprioritizing the group for scheduling. When the group is scheduled, a back-off counter is initialized and decremented each clock cycle. The group of threads is prevented from being scheduled if the spin-inducing branch is executed again before the counter reaches zero. A hardware system and method for labeling spin-inducing branches determine that a profiled thread is in a spinning state and detect that a backward branch is executed while spinning. The detection is based on executions of a loop in which the operand values for the exit condition do not change. A confidence level can be used that increases with each execution of a backward branch while in the spinning state.
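
One possible reading of the back-off scheduling can be simulated in software as below. The `Group` and `BackoffScheduler` names, the `stalled` flag, and the exact interplay between the counter and scheduling eligibility are assumptions; the abstract describes hardware and leaves these details open.

```python
from dataclasses import dataclass


@dataclass
class Group:
    name: str
    backed_off: bool = False   # set once a spin-inducing branch has been seen
    counter: int = 0           # back-off counter, in clock cycles
    stalled: bool = False      # True while the group may not be scheduled


class BackoffScheduler:
    def __init__(self, groups, backoff_cycles):
        self.groups = list(groups)
        self.backoff_cycles = backoff_cycles

    def pick(self):
        """Choose the next group: skip stalled groups, prefer ones not backed off."""
        ready = [g for g in self.groups if not g.stalled]
        if not ready:
            return None
        chosen = min(ready, key=lambda g: g.backed_off)   # False sorts before True
        if chosen.backed_off:
            chosen.counter = self.backoff_cycles          # counter starts when scheduled
        return chosen

    def on_spin_branch(self, group):
        """Record a spin-inducing branch; stall the group if its counter is still running."""
        if group.backed_off and group.counter > 0:
            group.stalled = True
        group.backed_off = True

    def tick(self):
        """Advance one clock cycle and lift stalls whose counter has reached zero."""
        for g in self.groups:
            if g.counter > 0:
                g.counter -= 1
            if g.counter == 0:
                g.stalled = False
```

In this sketch a group that spins again before its counter expires is skipped by `pick` until enough calls to `tick` have elapsed.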

Computational graph critical sections
11188395 · 2021-11-30 ·

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for implementing critical section subgraphs in a computational graph system. One of the methods includes executing a lock operation including providing, by a task server, a request to a value server to create a shared critical section object. If the task server determines that the shared critical section object was created by the value server, the task server executes one or more other operations of the critical section subgraph in serial. The task server executes an unlock operation including providing, by the task server, a request to the value server to delete the shared critical section object.
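
The create-run-delete protocol can be sketched with an in-process stand-in for the value server. The `ValueServer` class, its `create` and `delete` methods, and the `run_critical_section` helper are hypothetical names; a real system would issue these requests over the network, and what a task server does when creation fails is not stated in the abstract.

```python
import threading


class ValueServer:
    """Stand-in for the value server: creates and deletes named shared objects."""

    def __init__(self):
        self._objects = set()
        self._guard = threading.Lock()

    def create(self, name):
        """Return True only if this call created the shared object."""
        with self._guard:
            if name in self._objects:
                return False
            self._objects.add(name)
            return True

    def delete(self, name):
        """Remove the shared object if it exists."""
        with self._guard:
            self._objects.discard(name)


def run_critical_section(server, name, operations):
    """Lock, run the operations one after another, unlock. None means the lock was taken."""
    if not server.create(name):                 # lock operation
        return None
    try:
        return [op() for op in operations]      # operations of the subgraph, in serial
    finally:
        server.delete(name)                     # unlock operation
```

Because the unlock runs in a `finally` clause, the shared object is deleted even if one of the operations raises.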

PREVENTING DEADLOCKS IN RUNTIME
20220027213 · 2022-01-27 ·

Provided is a method for preventing deadlocks between competing threads. The method includes receiving a lock request from a first thread and, in response, identifying a potential deadlock with a second thread. In response, the method includes determining whether to deny the lock request, which includes: determining whether a first duration for which the first thread will hold the lock to complete its job is longer than a second duration for which the second thread will hold the lock to complete its job; determining whether the second thread will start to use the lock soon relative to the first duration; and determining whether both the first and second threads will complete their respective jobs within a time limit if the lock is denied to the first thread while the second thread completes its job. The method further includes denying the request for the requested lock from the first thread.
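
The three determinations can be combined into a single predicate, sketched below. The abstract does not say how the determinations are weighed, so requiring all three to hold, the `soon_fraction` parameter, and the argument names are assumptions; durations are in arbitrary but consistent time units.

```python
def should_deny(first_hold, second_hold, second_start_in, time_limit, soon_fraction=0.5):
    """Decide whether to deny the first thread's lock request."""
    # 1. Would the first thread hold the lock longer than the second thread?
    first_is_longer = first_hold > second_hold

    # 2. Will the second thread start using the lock soon, relative to the first duration?
    second_starts_soon = second_start_in <= soon_fraction * first_hold

    # 3. If the first thread waits for the second to finish, do both meet the time limit?
    second_done = second_start_in + second_hold
    first_done = second_done + first_hold
    both_finish_in_time = first_done <= time_limit

    return first_is_longer and second_starts_soon and both_finish_in_time
```

For example, `should_deny(10, 2, 1, 20)` denies the request because the second thread's short job can run first and both jobs still finish within the limit.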