Patent classifications
G06F2209/521
Memory barrier elision for multi-threaded workloads
A system includes a memory, at least one physical processor in communication with the memory, and a plurality of hardware threads executing on the at least one physical processor. A first thread of the plurality of hardware threads is configured to execute a plurality of instructions that includes a restartable sequence. Responsive to a different second thread in communication with the first thread being pre-empted while the first thread is executing the restartable sequence, the first thread is configured to restart the restartable sequence prior to reaching a memory barrier.
Atomic Operation Predictor
In an embodiment, a processor comprises an atomic predictor circuit to predict whether or not an atomic operation will complete successfully. The prediction may be used when a subsequent load operation to the same memory location as the atomic operation is executed, to determine whether or not to forward store data from the atomic operation to the subsequent load operation. If the prediction is successful, the store data may be forwarded. If the prediction is unsuccessful, the store data may not be forwarded. In cases where an atomic operation has been failing (not successfully performing the store operation), the prediction may prevent the forwarding of the store data and thus may prevent a subsequent flush of the load.
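The forwarding decision described in this abstract can be illustrated with a small software model. This is only a sketch: the abstract does not say how the predictor is built, so the per-address 2-bit saturating counter, the class name `AtomicPredictor`, and its method names are all assumptions made for illustration.

```python
class AtomicPredictor:
    """Hypothetical model: one 2-bit saturating counter per memory address."""

    def __init__(self, initial=2, maximum=3):
        self.initial = initial      # assumed starting value (weakly "will succeed")
        self.maximum = maximum      # counter saturates at this value
        self.counters = {}          # address -> current counter value

    def predict_success(self, address):
        # Values in the upper half of the range predict a successful atomic.
        return self.counters.get(address, self.initial) >= 2

    def update(self, address, succeeded):
        # Train the counter with the actual outcome of the atomic operation.
        value = self.counters.get(address, self.initial)
        if succeeded:
            value = min(self.maximum, value + 1)
        else:
            value = max(0, value - 1)
        self.counters[address] = value

    def may_forward_store_data(self, address):
        # Forward store data to a later load only when success is predicted,
        # so a failing atomic does not trigger a flush of that load.
        return self.predict_success(address)
```

In this model, an address whose atomic operations keep failing drifts toward zero, and forwarding to subsequent loads of that address is suppressed until the operation starts succeeding again.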
Enhanced atomics for workgroup synchronization
A technique for synchronizing workgroups is provided. The technique comprises detecting that one or more non-executing workgroups are ready to execute, placing the one or more non-executing workgroups into one or more ready queues based on the synchronization status of the one or more workgroups, detecting that computing resources are available for execution of one or more ready workgroups, and scheduling for execution one or more ready workgroups from the one or more ready queues in an order that is based on the relative priority of the ready queues.
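The queue-based scheduling order in this abstract can be sketched as follows. The synchronization status names, the convention that a lower number means higher priority, and the `WorkgroupScheduler` class itself are hypothetical; the abstract only states that ready workgroups are grouped by synchronization status and drained by relative queue priority.

```python
from collections import deque


class WorkgroupScheduler:
    """Hypothetical model of prioritized ready queues keyed by sync status."""

    def __init__(self, queue_priorities):
        # queue_priorities maps a status name to a priority (lower runs first).
        self.queue_priorities = dict(queue_priorities)
        self.ready_queues = {status: deque() for status in queue_priorities}

    def mark_ready(self, workgroup_id, sync_status):
        # Place a non-executing workgroup into the queue for its status.
        self.ready_queues[sync_status].append(workgroup_id)

    def schedule(self, free_slots):
        # Fill the available slots, visiting queues from highest priority down.
        scheduled = []
        for status in sorted(self.ready_queues, key=self.queue_priorities.get):
            queue = self.ready_queues[status]
            while queue and len(scheduled) < free_slots:
                scheduled.append(queue.popleft())
        return scheduled
```

For example, with priorities `{"holds_lock": 0, "woken_waiter": 1, "new": 2}`, a workgroup that already holds a lock would be scheduled ahead of a newly launched one when only some slots are free.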
System, Apparatus and Methods for Performing Shared Memory Operations
In an embodiment, an apparatus for memory access may include: a memory comprising at least one atomic memory region, and a control circuit coupled to the memory. The control circuit may be to: for each submission queue of a plurality of submission queues, identify an atomic memory location specified in a first entry of the submission queue, wherein each submission queue is to store access requests from a different requester; determine whether the atomic memory location includes existing requester information; and in response to a determination that the atomic memory location does not include existing requester information, perform an atomic operation for the atomic memory location based at least in part on the first entry of the submission queue. Other embodiments are described and claimed.
System and Method for Managing Multi-Core Accesses to Shared Ports
A port is provided that utilizes various techniques to manage contention for the port by controlling data that is written to and read from the port in a multi-core assembly within a usable computing system. When the port is a sampling port, the assembly may include at least two cores, a plurality of buffers in operative communication with the at least one sampling port, and a non-blocking contention management unit comprising a plurality of pointers that collectively operate to manage contention of shared ports in a multi-core computing system. When the port is a queuing port, the assembly may include buffers in communication with the queuing port, and the buffers are configured to hold multiple messages in the queuing port. The assembly may manage contention of shared queuing ports in a multi-core computing system.
Memory system, operation method thereof, and database system including the memory system
A method for operating a multi-transaction memory system includes: storing Logical Block Address (LBA) information changed in response to a request from a host and a transaction identification (ID) of the request into one page of a memory block; and performing a transaction commit in response to a transaction commit request including the transaction ID from the host, wherein performing the transaction commit includes changing a valid block bitmap in a controller of the multi-transaction memory system based on the LBA information.
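The commit flow in this abstract can be modeled in a few lines. This is a sketch under stated assumptions: the class `MultiTransactionMemory`, the representation of a page as a `(transaction_id, lbas)` tuple, and the choice to set a bitmap bit to `True` on commit are all hypothetical, since the abstract does not say how the bitmap is changed.

```python
class MultiTransactionMemory:
    """Hypothetical model: pages log (transaction ID, LBAs); commit updates a bitmap."""

    def __init__(self, block_count):
        self.valid_bitmap = [False] * block_count   # controller-side bitmap
        self.logged_pages = []                      # each entry models one page

    def write(self, transaction_id, lbas):
        # Store the changed LBA information together with the transaction ID.
        self.logged_pages.append((transaction_id, tuple(lbas)))

    def commit(self, transaction_id):
        # Apply every logged page of this transaction to the valid bitmap
        # (assumed here to mean marking the LBAs valid) and drop those pages.
        committed = 0
        remaining = []
        for page_transaction_id, lbas in self.logged_pages:
            if page_transaction_id == transaction_id:
                for lba in lbas:
                    self.valid_bitmap[lba] = True
                committed += len(lbas)
            else:
                remaining.append((page_transaction_id, lbas))
        self.logged_pages = remaining
        return committed
```

Keeping the transaction ID next to the LBA information is what lets the model commit one transaction without touching pages logged by another.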
Atomicity Assurance Device and Atomicity Assurance Method
An atomicity securing apparatus that secures atomicity of collaborative services includes: an atomicity determination unit configured to determine, in a case in which there is an error response to a first service among a plurality of types of services configuring the collaborative services in response to a request to execute the plurality of types of services, whether or not a process for updating second services other than the first service in the plurality of types of services is completed in consideration of inquiry to a collaborative service execution apparatus that executes the collaborative services; a cancellation API request generation unit configured to generate a cancellation API request for canceling the process for updating the second services that is completed; and a cancellation API request transmission unit configured to transmit the generated cancellation API request to a server that provides the second services.
Instructions controlling access to shared registers of a multi-threaded processor
Atomic instructions, including a Compare And Swap Register, a Load and AND Register, and a Load and OR Register instruction, use registers instead of storage to communicate and share information in a multi-threaded processor. The registers are accessible to multiple threads of the multi-threaded processor, and the instructions operate on these shared registers. Access to the shared registers is controlled by the instructions via interlocking.
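The semantics of the three instructions named in this abstract can be approximated in software. The sketch below is not the patented hardware mechanism: the class `SharedRegisters`, its method names, the 64-bit register width, and the use of a single `threading.Lock` to stand in for interlocking are assumptions made for illustration.

```python
import threading


class SharedRegisters:
    """Hypothetical model of registers shared by the threads of one processor."""

    def __init__(self, count, width_bits=64):
        self._mask = (1 << width_bits) - 1
        self._regs = [0] * count
        self._interlock = threading.Lock()   # stands in for hardware interlocking

    def compare_and_swap(self, index, expected, new_value):
        # Store new_value only if the register still holds expected;
        # always return the value observed before the operation.
        with self._interlock:
            old = self._regs[index]
            if old == expected:
                self._regs[index] = new_value & self._mask
            return old

    def load_and_and(self, index, operand):
        # Return the old value and replace it with (old AND operand).
        with self._interlock:
            old = self._regs[index]
            self._regs[index] = old & operand & self._mask
            return old

    def load_and_or(self, index, operand):
        # Return the old value and replace it with (old OR operand).
        with self._interlock:
            old = self._regs[index]
            self._regs[index] = (old | operand) & self._mask
            return old

    def read(self, index):
        with self._interlock:
            return self._regs[index]
```

In this model, several threads can each set their own flag bit with `load_and_or` on the same register without losing one another's updates, because each read-modify-write completes under the interlock.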
Compact NUMA-aware Locks
A computer comprising multiple processors and non-uniform memory implements multiple threads that perform a lock operation using a shared lock structure that includes a pointer to a tail of a first-in-first-out (FIFO) queue of threads waiting to acquire the lock. To acquire the lock, a thread allocates and appends a data structure to the FIFO queue. The lock is released by selecting and notifying a waiting thread to which control is transferred, with the thread selected executing on the same processor socket as the thread controlling the lock. A secondary queue of threads is managed for threads deferred during the selection process and maintained within the data structures of the waiting threads such that no memory is required within the lock structure. If no threads executing on the same processor socket are waiting for the lock, entries in the secondary queue are transferred to the FIFO queue preserving FIFO order.
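The hand-off policy in this abstract can be traced with a single-threaded model. It is a sketch only: the class `HandoffModel` and its method names are hypothetical, the secondary queue is kept as a separate deque for readability (the abstract says it lives inside the waiting threads' own data structures), and placing the secondary entries ahead of the remaining main-queue entries is an assumption about how FIFO order is preserved.

```python
from collections import deque


class HandoffModel:
    """Hypothetical model of socket-aware lock hand-off with a secondary queue."""

    def __init__(self):
        self.main = deque()        # (thread_id, socket) in arrival order
        self.secondary = deque()   # waiters deferred during selection

    def enqueue(self, thread_id, socket):
        # A thread that wants the lock appends itself to the FIFO queue.
        self.main.append((thread_id, socket))

    def next_holder(self, holder_socket):
        # Find the first waiter running on the releasing thread's socket.
        position = next(
            (i for i, (_, socket) in enumerate(self.main) if socket == holder_socket),
            None,
        )
        if position is not None:
            # Defer every waiter that was skipped, keeping arrival order.
            for _ in range(position):
                self.secondary.append(self.main.popleft())
            return self.main.popleft()
        # No same-socket waiter: move deferred waiters back to the FIFO queue,
        # ahead of later arrivals, then hand off to the head of the queue.
        self.main.extendleft(reversed(self.secondary))
        self.secondary.clear()
        return self.main.popleft() if self.main else None
```

With waiters arriving as t1 (socket 1), t2 (socket 0), t3 (socket 1), t4 (socket 0) and a holder on socket 0, the model hands the lock to t2, then t4, and only then returns to t1 and t3 in their original order.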