Patent classifications
G06F12/0828
In-memory computing with cache coherent protocol
A system for computing. In some embodiments, the system includes: a memory, the memory including one or more function-in-memory circuits; and a cache coherent protocol interface circuit having a first interface and a second interface. A function-in-memory circuit of the one or more function-in-memory circuits may be configured to perform an operation on operands including a first operand retrieved from the memory, to form a result. The first interface of the cache coherent protocol interface circuit may be connected to the memory, and the second interface of the cache coherent protocol interface circuit may be configured as a cache coherent protocol interface on a bus interface.
Cache coherency engine
A method for operating a database and a cache of at least a portion of the database may include receiving a plurality of read requests to read a data entity from the database and counting respective quantities of the requests serviced from the database and from the cache. The method may further include receiving a write request to alter the data entity in the database and determining whether to update the cache to reflect the alteration to the data entity in the write request according to the quantity of the requests serviced from the database and the quantity of the requests serviced from the cache. In an embodiment, the method further includes causing the cache to be updated when a ratio of the quantity of the requests serviced from the database to the quantity of the requests serviced from the cache exceeds a predetermined threshold.
CONTENDED LOCK REQUEST ELISION SCHEME
A system and method for network traffic management between multiple nodes are described. A computing system includes multiple nodes connected to one another. When a home node determines a number of nodes requesting read access for a given data block assigned to the home node exceeds a threshold and a copy of the given data block is already stored at a first node of the multiple nodes in the system, the home node sends a command to the first node. The command directs the first node to forward a copy of the given data block to the home node. The home node then maintains a copy of the given data block and forwards copies of the given data block to other requesting nodes until the home node detects a write request or a lock release request for the given data block.
PROCESSING COMMANDS IN A DIRECTORY-BASED COMPUTER MEMORY MANAGEMENT SYSTEM
A method for processing commands in a directory-based computer memory management system includes receiving a command to perform an operation on data stored in a set of one or more computer memory locations associated with an entry in a directory of a computer memory, the entry is associated with an indicator for indicating whether the set of one or more computer memory locations is busy, a head tag, and a tail tag. The command is associated with a command tag and a predecessor tag, and checking the indicator to determine whether the set of one or more computer memory locations is busy.
Network cache injection for coherent GPUs
Methods, devices, and systems for GPU cache injection. A GPU compute node includes a network interface controller (NIC) which includes NIC receiver circuitry which can receive data for processing on the GPU, NIC transmitter circuitry which can send the data to a main memory of the GPU compute node and which can send coherence information to a coherence directory of the GPU compute node based on the data. The GPU compute node also includes a GPU which includes GPU receiver circuitry which can receive the coherence information; GPU processing circuitry which can determine, based on the coherence information, whether the data satisfies a heuristic; and GPU loading circuitry which can load the data into a cache of the GPU from the main memory if on the data satisfies the heuristic.
Virtual network pre-arbitration for deadlock avoidance and enhanced performance
A device includes a data path, a first interface configured to receive a first memory access request from a first peripheral device, and a second interface configured to receive a second memory access request from a second peripheral device. The device further includes an arbiter circuit configured to, in a first clock cycle, a pre-arbitration winner between a first memory access request and a second memory access request based on a first number of credits allocated to a first destination device and a second number of credits allocated to a second destination device. The arbiter circuit is further configured to, in a second clock cycle select a final arbitration winner from among the pre-arbitration winner and a subsequent memory access request based on a comparison of a priority of the pre-arbitration winner and a priority of the subsequent memory access request.
METHOD FOR ACCESSING DATA VISITOR DIRECTORY IN MULTI-CORE SYSTEM AND DEVICE
The present disclosure discloses a method for accessing a data visitor directory in a multi-core system, a directory cache device, a multi-core system, and a directory storage unit. The method includes: receiving a first access request sent by a first processor core, where the first access request is used to access an entry, corresponding to a first data block, in a directory; determining, according to the first access request, that a single-pointer entry array has a first single-pointer entry corresponding to the first data block; when determining, according to the first single-pointer entry, that a sharing entry array has a first sharing entry associated with the first single-pointer entry, determining multiple visitors of the first data block according to the first sharing entry. According to embodiments of the present disclosure, storage resources occupied by a directory can be reduced.
SYNCHRONIZATION LOGIC FOR MEMORY REQUESTS
In an embodiment, a processor includes a plurality of cores and synchronization logic. The synchronization logic includes circuitry to: receive a first memory request and a second memory request; determine whether the second memory request is in contention with the first memory request; and in response to a determination that the second memory request is in contention with the first memory request, process the second memory request using a non-blocking cache coherence protocol. Other embodiments are described and claimed.
SYSTEM, DEVICE AND METHOD FOR ACCESSING DEVICE-ATTACHED MEMORY
A device connected to a host processor via a bus includes: an accelerator circuit configured to operate based on a message received from the host processor; and a controller configured to control an access to a memory connected to the device, wherein the controller is further configured to, in response to a read request received from the accelerator circuit, provide a first message requesting resolution of coherence to the host processor and prefetch first data from the memory.
ADAPTIVE CREDIT-BASED REPLENISHMENT THRESHOLD USED FOR TRANSACTION ARBITRATION IN A SYSTEM THAT SUPPORTS MULTIPLE LEVELS OF CREDIT EXPENDITURE
A device includes an arbiter circuit configured to receive a first request for a resource. The first request is associated with a first credit cost. The arbiter circuit is further configured to receive a second request for the resource. The second request is associated with a second credit cost. The arbiter circuit is further configured to select the first request for the resource as an arbitration winner. The arbiter circuit is further configured to decrement a number of available credits associated with the resource by the first credit cost. The arbiter circuit is further configured to, in response to the number of available credits associated with the resource falling to a lower credit threshold, wait until the number of available credits associated with the resource reaches an upper credit threshold to select an additional arbitration winner for the resource.