Patent classifications
G06F12/0824
MERGING DATA FOR WRITE ALLOCATE
A method includes receiving, by a level two (L2) controller, a write request for an address that is not allocated as a cache line in an L2 cache. The write request specifies write data. The method also includes generating, by the L2 controller, a read request for the address; reserving, by the L2 controller, an entry in a register file for read data returned in response to the read request; updating, by the L2 controller, a data field of the entry with the write data; updating, by the L2 controller, an enable field of the entry associated with the write data; and receiving, by the L2 controller, the read data and merging the read data into the data field of the entry.
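The merge step described above can be illustrated with a minimal Python sketch. It assumes the enable field holds one flag per byte and that the line is 8 bytes long; `FillEntry`, `LINE_BYTES`, `apply_write`, and `merge_read` are hypothetical names chosen for illustration and do not come from the abstract.

```python
from dataclasses import dataclass, field

LINE_BYTES = 8  # hypothetical line size, kept small for readability


@dataclass
class FillEntry:
    """One reserved register-file entry for a pending line fill (illustrative)."""
    data: bytearray = field(default_factory=lambda: bytearray(LINE_BYTES))
    # Assumption: the enable field is modeled as one flag per byte.
    enable: list = field(default_factory=lambda: [False] * LINE_BYTES)

    def apply_write(self, offset: int, write_data: bytes) -> None:
        # Place the write data in the entry and mark those bytes as written.
        for i, byte in enumerate(write_data):
            self.data[offset + i] = byte
            self.enable[offset + i] = True

    def merge_read(self, read_data: bytes) -> None:
        # Fill only the bytes the write did not cover, so newer data wins.
        for i, byte in enumerate(read_data):
            if not self.enable[i]:
                self.data[i] = byte


entry = FillEntry()
entry.apply_write(2, b"\xAA\xBB")       # write arrives before the fill data
entry.merge_read(bytes(range(8)))       # read data returns later
merged = bytes(entry.data)
```

In this sketch the write data survives the later fill because the enable flags mask the returned read data.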
OBJECT MEMORY DATA FLOW INSTRUCTION EXECUTION
Embodiments of the invention provide systems and methods for managing processing, memory, storage, network, and cloud computing to significantly improve the efficiency and performance of processing nodes. More specifically, embodiments of the present invention are directed to an instruction set of an object memory fabric. This object memory fabric instruction set can be used to provide a unique instruction model based on triggers defined in metadata of the memory objects. This model represents a dynamic dataflow method of execution in which processes are performed based on actual dependencies of the memory objects. This provides a high degree of memory and execution parallelism which in turn provides tolerance of variations in access delays between memory objects. In this model, sequences of instructions are executed and managed based on data access. These sequences can be of arbitrary length but short sequences are more efficient and provide greater parallelism.
REMOTE ATOMIC OPERATIONS IN MULTI-SOCKET SYSTEMS
Disclosed embodiments relate to remote atomic operations (RAO) in multi-socket systems. In one example, a method, performed by a cache control circuit of a requester socket, includes: receiving an RAO instruction from a requester CPU core; determining a home agent in a home socket for the addressed cache line; providing a request for ownership (RFO) of the addressed cache line to the home agent; waiting for the home agent either to invalidate and retrieve the latest copy of the addressed cache line from a cache or to fetch the addressed cache line from memory; receiving an acknowledgement and the addressed cache line; executing the RAO instruction on the received cache line atomically; subsequently receiving multiple local RAO instructions to the addressed cache line from one or more requester CPU cores; and executing the multiple local RAO instructions on the received cache line independently of the home agent.
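The following Python sketch illustrates the flow in a simplified, single-threaded form: the requester obtains ownership once and then serves later RAO instructions locally. `HomeAgent`, `RequesterCache`, `request_for_ownership`, and `rao_add` are hypothetical names, and an atomic add stands in for whatever RAO instructions an implementation supports.

```python
class HomeAgent:
    """Illustrative home agent that hands out ownership of a line."""

    def __init__(self, memory: dict):
        self.memory = memory
        self.rfo_count = 0  # counts how often the requester had to ask

    def request_for_ownership(self, addr: int) -> int:
        # A real agent would invalidate other copies or fetch from memory;
        # this sketch only returns the current value.
        self.rfo_count += 1
        return self.memory.get(addr, 0)


class RequesterCache:
    """Illustrative cache control logic on the requester socket."""

    def __init__(self, home: HomeAgent):
        self.home = home
        self.owned = {}  # lines this socket currently owns

    def rao_add(self, addr: int, value: int) -> int:
        if addr not in self.owned:
            # First RAO to this line: obtain ownership from the home agent.
            self.owned[addr] = self.home.request_for_ownership(addr)
        # Later RAOs run locally, independently of the home agent.
        self.owned[addr] += value
        return self.owned[addr]


home = HomeAgent({0x40: 10})
cache = RequesterCache(home)
results = [cache.rao_add(0x40, 1) for _ in range(3)]
```

Only the first of the three operations contacts the home agent in this sketch, which is the saving the abstract describes.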
CACHING TECHNIQUES
Techniques for caching may include: determining an update to a first data page of a first cache on a first node, wherein a second node includes a second cache and wherein the second cache includes a copy of the first data page; determining, in accordance with one or more criteria, whether to send the update from the first node to the second node; responsive to determining, in accordance with the one or more criteria, to send the update, sending the update from the first node to the second node; and responsive to determining not to send the update, sending an invalidate request from the first node to the second node, wherein the invalidate request instructs the second node to invalidate the copy of the first data page stored in the second cache of the second node.
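A minimal Python sketch of the update-or-invalidate decision follows. The abstract does not say what the criteria are, so the sketch assumes a single hypothetical criterion: send the update only when it is small. `Node`, `write_page`, and `max_update_bytes` are illustrative names.

```python
class Node:
    """Illustrative node holding a page cache keyed by page id."""

    def __init__(self):
        self.cache = {}

    def apply_update(self, page_id, offset, data):
        page = self.cache.get(page_id)
        if page is not None:
            page[offset:offset + len(data)] = data

    def invalidate(self, page_id):
        self.cache.pop(page_id, None)


def write_page(local, peer, page_id, offset, data, max_update_bytes=16):
    """Apply a write locally, then either forward it or invalidate the peer copy."""
    local.cache[page_id][offset:offset + len(data)] = data
    # Assumed criterion: small updates are cheaper to ship than to refetch.
    if len(data) <= max_update_bytes:
        peer.apply_update(page_id, offset, data)
        return "update"
    peer.invalidate(page_id)
    return "invalidate"


first, second = Node(), Node()
first.cache["page0"] = bytearray(64)
second.cache["page0"] = bytearray(64)
small = write_page(first, second, "page0", 0, b"\x01\x02")
large = write_page(first, second, "page0", 0, bytes(32))
```

Other criteria, such as link load or how often the peer reads the page, would slot into the same branch.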
Access of named data elements in coordination namespace
An approach is described that provides access to a named data element in a Coordination Namespace that is stored in a memory that is distributed amongst a set of nodes. A request of a name corresponding to the named data element is received from a requesting process and the approach responsively searches for the name in the Coordination Namespace. In response to determining an absence of data corresponding to the named data element, a pending state is indicated to the requesting process. In response to determining that the data corresponding to the named data element exists, a successful state is returned to the requesting process. In one embodiment, the successful state also includes providing the requesting process with access to the data corresponding to the named data element.
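The lookup behavior can be sketched in Python as below. The sketch assumes names are placed on nodes by a simple hash, which the abstract does not specify; `CoordinationNamespace`, `State`, `create`, and `read` are hypothetical names.

```python
from enum import Enum


class State(Enum):
    PENDING = "pending"
    SUCCESS = "success"


class CoordinationNamespace:
    """Illustrative namespace whose entries are spread across several nodes."""

    def __init__(self, node_count: int):
        self.nodes = [dict() for _ in range(node_count)]

    def _home(self, name: str) -> int:
        # Assumption: a deterministic hash picks the node that holds a name.
        return sum(name.encode()) % len(self.nodes)

    def create(self, name: str, data) -> None:
        self.nodes[self._home(name)][name] = data

    def read(self, name: str):
        store = self.nodes[self._home(name)]
        if name in store:
            # Data exists: report success and hand back the data.
            return State.SUCCESS, store[name]
        # Data absent: tell the requester the request is pending.
        return State.PENDING, None


ns = CoordinationNamespace(node_count=4)
before = ns.read("result")
ns.create("result", 42)
after = ns.read("result")
```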
HARDWARE COHERENT COMPUTATIONAL EXPANSION MEMORY
Embodiments herein describe transferring ownership of data (e.g., cachelines or blocks of data comprising multiple cachelines) from a host to hardware in an I/O device. In one embodiment, the host and I/O device (e.g., an accelerator) are part of a cache-coherent system where ownership of data can be transferred from a home agent (HA) in the host to a local HA in the I/O device—e.g., a computational slave agent (CSA). That way, a function on the I/O device (e.g., an accelerator function) can request data from the local HA without these requests having to be sent to the host HA. Further, the accelerator function can indicate whether the local HA tracks the data on a cacheline-basis or by a data block (e.g., multiple cachelines). This provides flexibility that can reduce overhead from tracking the data, depending on the function's desired use of the data.
Global coherence operations
A method includes receiving, by an L2 controller, a request to perform a global operation on an L2 cache and preventing new blocking transactions from entering a pipeline coupled to the L2 cache while permitting new non-blocking transactions to enter the pipeline. Blocking transactions include read transactions and non-victim write transactions. Non-blocking transactions include response transactions, snoop transactions, and victim transactions. The method further includes, in response to an indication that the pipeline does not contain any pending blocking transactions, preventing new snoop transactions from entering the pipeline while permitting new response transactions and victim transactions to enter the pipeline; in response to an indication that the pipeline does not contain any pending snoop transactions, preventing all new transactions from entering the pipeline; and, in response to an indication that the pipeline does not contain any pending transactions, performing the global operation on the L2 cache.
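The staged draining described above can be sketched in Python as a function of the pending transaction counts. The transaction type strings and the names `admissible` and `ready_for_global_op` are hypothetical; the sketch assumes a global operation has already been requested.

```python
from collections import Counter

BLOCKING = {"read", "non_victim_write"}


def admissible(pending: Counter) -> set:
    """Transaction types allowed to enter the pipeline at this moment."""
    if any(pending[t] > 0 for t in BLOCKING):
        # Stage 1: blocking transactions are still draining.
        return {"response", "snoop", "victim"}
    if pending["snoop"] > 0:
        # Stage 2: no blocking transactions remain, so stop new snoops too.
        return {"response", "victim"}
    # Stage 3: no snoops remain, so nothing new may enter.
    return set()


def ready_for_global_op(pending: Counter) -> bool:
    # The global operation runs only once the pipeline is empty.
    return sum(pending.values()) == 0


stage1 = admissible(Counter(read=1, snoop=2))
stage2 = admissible(Counter(snoop=1))
stage3 = admissible(Counter(victim=1))
```

Because each stage blocks the types that the previous stage drained, the counts alone are enough to identify the stage in this sketch.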
SCALABLE REGION-BASED DIRECTORY
Disclosed is a cache directory including one or more cache directories configurable to interchange, within each cache directory entry, at least one bit between a first field and a second field, thereby changing the size of the region of memory represented and the number of cache lines tracked in the cache subsystem.
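The arithmetic behind such a bit interchange can be sketched in Python. The abstract does not name the two fields, so this sketch assumes they are a region tag and a per-region line counter, with 48-bit addresses and 64-byte cache lines; all names and constants are hypothetical.

```python
ADDRESS_BITS = 48      # assumed physical address width
LINE_OFFSET_BITS = 6   # assumed 64-byte cache lines


def entry_layout(region_bits: int) -> dict:
    """Field widths when one entry covers 2**region_bits cache lines."""
    tag_bits = ADDRESS_BITS - LINE_OFFSET_BITS - region_bits
    count_bits = region_bits + 1  # enough to count 0..2**region_bits lines
    return {
        "tag_bits": tag_bits,
        "count_bits": count_bits,
        "region_bytes": 2 ** (region_bits + LINE_OFFSET_BITS),
        "lines_tracked": 2 ** region_bits,
    }


small_region = entry_layout(4)   # 1 KiB regions, 16 lines each
large_region = entry_layout(5)   # one bit moved from the tag to the counter
```

Under these assumptions, moving one bit from the tag to the counter doubles both the region size and the number of lines tracked while the entry width stays constant.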
Method For PRP/SGL Handling For Out-Of-Order NVME Controllers
Read latency for a read operation to a host implementing a PRP/SGL buffer is reduced by generating an address table representing the linked-list structure defining the PRP/SGL buffer. The address table may be generated concurrently with reading of data referenced by the read command from a NAND storage device. A block table for tracking status of LBAs referenced by IO commands may include a reference to the address table which is used to transfer LBAs to host memory as soon as the address table is complete and a block of data referenced by an LBA has been read from the NAND storage device.
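A simplified Python sketch of the idea follows: the linked list of buffer segments is flattened into an address table, and a block is transferred only once the table is complete and the block has been read. The segment representation and the names `build_address_table` and `ready_to_transfer` are hypothetical and do not reflect the NVMe data structures in detail.

```python
def build_address_table(segments: dict, first):
    """Flatten linked segments into one ordered list of host addresses.

    `segments` maps a segment id to (addresses, next_segment_id); the last
    segment uses None as its next id. This layout is an assumption.
    """
    table = []
    segment_id = first
    while segment_id is not None:
        addresses, segment_id = segments[segment_id]
        table.extend(addresses)
    return table


def ready_to_transfer(table_complete: bool, block_read: bool) -> bool:
    # A block moves to host memory once both conditions hold.
    return table_complete and block_read


segments = {
    0: ([0x1000, 0x2000], 1),
    1: ([0x3000], None),
}
address_table = build_address_table(segments, first=0)
```

With the table built ahead of time, each block can be matched to its host address by index as soon as its data arrives, without walking the list again.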
SYSTEM AND METHOD FOR SCALING COMMAND ORCHESTRATION THROUGH ADDRESS MAPPING
A device for processing commands to manage non-volatile memory includes a controller configured to obtain address information from a command, read, based on the address information, an entry of a metadata table, and determine, based on the entry of the metadata table, whether a metadata page corresponding to the address information is being processed by the controller. In response to determining that the metadata page corresponding to the address information is being processed, the controller determines a processing status of the metadata page, among a plurality of processing statuses, based on the entry of the metadata table and processes the command according to the processing status of the metadata page. In response to determining that the metadata page corresponding to the address information is not being processed, the controller reads the metadata page from the non-volatile memory based on the entry of the metadata table.
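The table lookup and status dispatch can be sketched in Python as below. The abstract does not list the processing statuses, so `LOADING` and `CACHED` are assumed examples; `PAGE_SPAN`, `TableEntry`, and `handle_command` are likewise hypothetical.

```python
from dataclasses import dataclass
from enum import Enum, auto

PAGE_SPAN = 1024  # assumed number of logical addresses per metadata page


class Status(Enum):
    NOT_IN_PROCESS = auto()
    LOADING = auto()   # assumed status: page read is in flight
    CACHED = auto()    # assumed status: page is available to the controller


@dataclass
class TableEntry:
    nvm_location: int
    status: Status = Status.NOT_IN_PROCESS


def handle_command(table: dict, lba: int) -> str:
    """Return the action taken for a command that targets `lba`."""
    entry = table[lba // PAGE_SPAN]
    if entry.status is Status.NOT_IN_PROCESS:
        # Page is not being processed: start reading it from non-volatile memory.
        entry.status = Status.LOADING
        return f"read metadata page from NVM location {entry.nvm_location}"
    if entry.status is Status.LOADING:
        return "queue command until the page load completes"
    return "process command against the cached page"


table = {0: TableEntry(nvm_location=7)}
first_action = handle_command(table, lba=5)
second_action = handle_command(table, lba=9)
```

Keeping the status in the table entry lets two commands for the same page coordinate through a single lookup in this sketch.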