Patent classifications
G06F2209/521
Compact NUMA-aware Locks
A computer comprising multiple processors and non-uniform memory implements multiple threads that perform a lock operation using a shared lock structure that includes a pointer to a tail of a first-in-first-out (FIFO) queue of threads waiting to acquire the lock. To acquire the lock, a thread allocates and appends a data structure to the FIFO queue. The lock is released by selecting and notifying a waiting thread to which control is transferred, with the thread selected executing on the same processor socket as the thread controlling the lock. A secondary queue of threads is managed for threads deferred during the selection process and maintained within the data structures of the waiting threads such that no memory is required within the lock structure. If no threads executing on the same processor socket are waiting for the lock, entries in the secondary queue are transferred to the FIFO queue preserving FIFO order.
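The hand-over policy in this abstract can be sketched as a small single-threaded model. This is an illustration under stated assumptions, not the patented implementation: the names `Waiter`, `CompactNumaLockModel`, `acquire`, and `release` are hypothetical, and no real blocking or atomics are involved. The model keeps the secondary queue as a field of the lock object for readability, whereas the abstract keeps it inside the waiting threads' own data structures. It also assumes that deferred waiters are placed ahead of the remaining FIFO entries when they are moved back.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Waiter:
    name: str
    socket: int  # processor socket the thread runs on


class CompactNumaLockModel:
    """Single-threaded model of the hand-over policy only (hypothetical names)."""

    def __init__(self):
        self.holder = None
        self.main = deque()       # FIFO queue of waiting threads
        self.secondary = deque()  # waiters deferred while looking for a same-socket successor

    def acquire(self, waiter):
        # Take the lock if it is free, otherwise append to the FIFO queue.
        if self.holder is None:
            self.holder = waiter
        else:
            self.main.append(waiter)

    def release(self):
        socket = self.holder.socket
        # Find the first waiter running on the same socket as the current holder.
        idx = next((i for i, w in enumerate(self.main) if w.socket == socket), None)
        if idx is not None:
            # Defer every waiter skipped over, keeping their relative order.
            for _ in range(idx):
                self.secondary.append(self.main.popleft())
            self.holder = self.main.popleft()
        else:
            # No same-socket waiter: move deferred waiters back, preserving FIFO order.
            # Assumption: they go in front of the remaining FIFO entries.
            self.main.extendleft(reversed(self.secondary))
            self.secondary.clear()
            self.holder = self.main.popleft() if self.main else None
        return self.holder
```

With arrivals t0 (socket 0), t1 (socket 1), t2 (socket 0), t3 (socket 1), this model hands the lock over in the order t0, t2, t1, t3: the same-socket waiter t2 is preferred over t1, and t1 is served ahead of t3 once socket 0 has no waiters left.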
Dynamic modification of coherent atomic memory operations
A processing device determines a scope indicating at least a portion of the processing system and target data from an atomic memory operation to be performed. Based on the scope, the processing device determines one or more hardware parameters for at least a portion of the processing system. The processing device then compares the hardware parameters to the scope and target data to determine one or more corrections. The processing device then provides the scope, target data, hardware parameters, and corrections to a plurality of hardware lookup tables. The hardware lookup tables are configured to receive the scope, target data, hardware parameters, and corrections as inputs and output values indicating one or more coherency actions and one or more orderings. The processing device then executes one or more of the indicated coherency actions and the atomic memory operation based on the indicated ordering.
CASCADING EXECUTION OF ATOMIC OPERATIONS
Cascading execution of atomic operations, including: receiving a request for each thread of a plurality of threads to perform an atomic operation, wherein the plurality of threads comprises a plurality of thread subsets each corresponding to a local memory, wherein the local memory for a thread subset is accessible by the thread subset and inaccessible to a remainder of threads in the plurality of threads; generating a plurality of intermediate results by performing, by each thread subset, the atomic operation in the local memory corresponding to the thread subset; and generating a result for the request by aggregating the plurality of intermediate results in a shared memory accessible to all threads in the plurality of threads.
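The two-level aggregation in this abstract can be illustrated with ordinary threads. The sketch below is an assumption-laden analogy rather than the claimed method: the function name `cascaded_add` is hypothetical, the atomic operation is taken to be an addition, and a `threading.Lock` stands in for a hardware atomic on each memory.

```python
import threading


def cascaded_add(values_by_subset):
    """Add every value, first per subset (local), then across subsets (shared)."""
    shared = {"total": 0}            # stands in for the shared memory
    shared_lock = threading.Lock()

    def run_subset(values):
        local = {"sum": 0}           # stands in for the subset's local memory
        local_lock = threading.Lock()

        def worker(value):
            # Each thread performs the atomic operation in its subset's local memory.
            with local_lock:
                local["sum"] += value

        threads = [threading.Thread(target=worker, args=(v,)) for v in values]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

        # One shared-memory update per subset aggregates the intermediate result.
        with shared_lock:
            shared["total"] += local["sum"]

    subsets = [threading.Thread(target=run_subset, args=(vals,)) for vals in values_by_subset]
    for s in subsets:
        s.start()
    for s in subsets:
        s.join()
    return shared["total"]
```

In this sketch the shared lock is taken once per subset rather than once per thread, which is the contention saving the cascade is meant to provide.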
Arithmetic processing device, arithmetic processing system, and method for controlling arithmetic processing device
An arithmetic processing device includes an arithmetic core, wherein the arithmetic core comprises: an instruction controller configured to request processing corresponding to an instruction; a memory configured to store lock information indicating that a locking target address is locked, the locking target address, and priority information of the instruction; and a cache controller configured to, when storing data of a first address in a cache memory to execute a first instruction including locking of the first address from the instruction controller, suppress updating of the memory if the lock information is stored in the memory and a priority of the priority information is higher than a first priority of the first instruction.
SYSTEMS AND METHODS FOR PROCESSING ATOMIC COMMANDS
A method for executing atomic commands may include receiving, by an interface of an atomic command execution unit and from a plurality of requestors, a plurality of memory mapped atomic commands. The method may also include executing the plurality of memory mapped atomic commands to provide output values. The method may further include storing, in a first memory unit of the atomic command execution unit, requestor specific information. Different entries of a plurality of entries of the first memory unit may be allocated to different requestors of the plurality of requestors. The method may also include storing, in a second memory unit of the atomic command execution unit, the output values of the plurality of memory mapped atomic commands, and outputting, by the interface and to at least one of the plurality of requestors, at least one indication indicating a completion of at least one of the atomic commands.
Method and processor system for executing a TELT instruction to access a data item during execution of an atomic primitive
The present disclosure relates to a method for a computer system comprising a plurality of processor cores including a first processor core and a second processor core, wherein a data item is exclusively assigned to the first processor core, of the plurality of processor cores, for executing an atomic primitive by the first processor core. The method includes receiving by the first processor core, from the second processor core, a request for accessing the data item, and in response to determining by the first processor core that the executing of the atomic primitive is not completed by the first processor core, returning a rejection message to the second processor core.
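The request handling in this abstract reduces to a single check on the owning core. A minimal sketch, assuming hypothetical names (`OwningCore`, `begin_atomic`, `end_atomic`, `handle_access_request`) and assuming that a request is granted once the primitive has completed, which the abstract does not state:

```python
class OwningCore:
    """Hypothetical model of the first core, which holds the data item exclusively."""

    def __init__(self):
        self.atomic_in_progress = False

    def begin_atomic(self):
        self.atomic_in_progress = True

    def end_atomic(self):
        self.atomic_in_progress = False

    def handle_access_request(self, requester):
        # Per the abstract: reject while the atomic primitive has not completed.
        if self.atomic_in_progress:
            return ("REJECT", requester)
        # Assumption: after completion the request is granted.
        return ("GRANT", requester)
```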
MEMORY SYSTEM, OPERATION METHOD THEREOF, AND DATABASE SYSTEM INCLUDING THE MEMORY SYSTEM
A method for operating a multi-transaction memory system includes: storing Logical Block Address (LBA) information changed in response to a request from a host and a transaction identification (ID) of the request into one page of a memory block; and performing a transaction commit in response to a transaction commit request including the transaction ID from the host, wherein the performing of the transaction commit includes changing a valid block bitmap in a controller of the multi-transaction memory system based on the LBA information.
METHOD AND SYSTEM FOR IMPLEMENTING LOCK FREE SHARED MEMORY WITH SINGLE WRITER AND MULTIPLE READERS
A method and a system for implementing a lock-free shared memory accessible by a plurality of readers and a single writer are provided herein. The method includes: maintaining a memory accessible by the readers and the writer, wherein the memory is a hash table having at least one linked list of buckets, each bucket in the linked list having: a bucket ID, a pointer to an object, and a pointer to another bucket; calculating a pointer to one bucket of the linked list of buckets based on a hash function in response to a read request by any of the readers; and traversing the linked list of buckets, to read a series of objects corresponding with the traversed buckets, while checking that the writer has not added, amended, or deleted objects pointed to by any of said traversed buckets, wherein said checking is carried out in a single atomic action.
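One way to picture the reader-side check is a sequence counter that the single writer bumps around every change. This is a swapped-in, seqlock-style technique chosen for illustration; the abstract does not say how its single atomic check is realized. The names `Bucket`, `SingleWriterTable`, `write`, and `read_chain` are hypothetical, and the sketch relies on CPython executing the integer reads and writes of `version` without tearing rather than on real hardware atomics.

```python
class Bucket:
    def __init__(self, bucket_id, obj, next_bucket=None):
        self.bucket_id = bucket_id      # bucket ID
        self.obj = obj                  # pointer to an object
        self.next_bucket = next_bucket  # pointer to another bucket


class SingleWriterTable:
    """Hash table of bucket chains with one writer and many readers (sketch)."""

    def __init__(self, slots=8):
        self.slots = [None] * slots
        self.version = 0  # even: stable, odd: a write is in progress

    def _index(self, key):
        return hash(key) % len(self.slots)

    def write(self, key, obj):
        # Must only be called by the single writer.
        self.version += 1  # mark write in progress (odd)
        i = self._index(key)
        self.slots[i] = Bucket(key, obj, self.slots[i])
        self.version += 1  # mark stable again (even)

    def read_chain(self, key):
        # Readers take no lock; they retry if the writer interfered.
        while True:
            before = self.version
            if before % 2:
                continue  # writer active, try again
            objs = []
            bucket = self.slots[self._index(key)]
            while bucket is not None:
                objs.append(bucket.obj)
                bucket = bucket.next_bucket
            # A single comparison detects any add, amend, or delete during the traversal.
            if self.version == before:
                return objs
```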
CLOUD-BASED SYSTEMS AND METHODS FOR DETECTING AND REMOVING ROOTKIT
An exemplary method includes: obtaining, at one or more cloud servers, endpoint data of an endpoint computing device; based on the endpoint data, determining, by the one or more cloud servers, a plurality of script-language rules, wherein: each of the plurality of script-language rules corresponds to an atomic operation of detecting and/or removing at least one rootkit, the at least one rootkit comprises a target rootkit, and the plurality of script-language rules comprise a set of one or more rootkit rules corresponding to the target rootkit; and transmitting, by the one or more cloud servers to the endpoint computing device, the plurality of script-language rules, wherein the set of rootkit rules is executable at the endpoint computing device to detect and/or remove the target rootkit by, for each of the set of rootkit rules, executing a corresponding atomic operation.
Compare and exchange operation using sleep-wakeup mechanism
A method, apparatus, and system are provided for performing compare and exchange operations using a sleep-wakeup mechanism. According to one embodiment, an instruction at a processor is executed to help acquire a lock on behalf of the processor. If the lock is unavailable to be acquired by the processor, the instruction is put to sleep until an event has occurred.
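A software analogy of this mechanism is a lock whose failed compare-and-exchange puts the caller to sleep instead of spinning. The sketch below assumes hypothetical names (`SleepingCasLock`, `_compare_exchange`, `acquire`, `release`) and uses a `threading.Condition` in place of the processor-level sleep and wake-up event described in the abstract.

```python
import threading


class SleepingCasLock:
    """Lock built on compare-and-exchange; waiters sleep until a release event."""

    def __init__(self):
        self._cond = threading.Condition()
        self._owner = None  # None means the lock is free

    def _compare_exchange(self, expected, new):
        # Caller must hold self._cond, which makes the compare and the store one step.
        if self._owner == expected:
            self._owner = new
            return True
        return False

    def acquire(self, me):
        with self._cond:
            # If the lock is unavailable, sleep until an event (a release) occurs.
            while not self._compare_exchange(None, me):
                self._cond.wait()

    def release(self, me):
        with self._cond:
            if not self._compare_exchange(me, None):
                raise RuntimeError("lock released by a thread that does not own it")
            self._cond.notify()  # wake one sleeping waiter
```

The loop around `wait()` retries the exchange after every wake-up, so a waiter that loses the race to another thread simply goes back to sleep.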