G06F12/08

Data processing system having masters that adapt to agents with differing retry behaviors

A data processing system includes a plurality of snoopers, a processing unit including a master, and a system fabric communicatively coupling the master and the plurality of snoopers. The master sets a retry operating mode for an interconnect operation to one of alternative first and second operating modes. The first operating mode is associated with a first type of snooper, and the second operating mode is associated with a different second type of snooper. The master issues a memory access request of the interconnect operation on the system fabric of the data processing system. Based on receipt of a combined response representing a systemwide coherence response to the request, the master delays for an interval whose duration depends on the retry operating mode and thereafter reissues the memory access request on the system fabric.
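
As a rough illustration of the mode-dependent back-off, the following C sketch reissues a request after a delay selected by the configured retry mode. All names, delay values, and the success condition are hypothetical, not taken from the patent:

```c
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical retry modes: one per snooper type described in the abstract. */
typedef enum { MODE_TYPE_A, MODE_TYPE_B } retry_mode_t;

/* Stub for issuing the memory access request on the system fabric;
 * returns true when the combined response indicates success. */
static bool issue_request(int attempt) {
    return attempt >= 2;               /* succeed on the third attempt (demo) */
}

/* Delay duration depends on the configured retry operating mode. */
static unsigned delay_usec(retry_mode_t mode) {
    return (mode == MODE_TYPE_A) ? 50 : 400;   /* illustrative values */
}

int main(void) {
    retry_mode_t mode = MODE_TYPE_B;   /* chosen per target snooper type */
    for (int attempt = 0; ; attempt++) {
        if (issue_request(attempt)) {
            printf("request succeeded on attempt %d\n", attempt + 1);
            break;
        }
        usleep(delay_usec(mode));      /* mode-dependent back-off interval */
    }
    return 0;
}
```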

Data updating technology
11698728 · 2023-07-11

A storage system includes a management node and a plurality of storage nodes forming a redundant array of independent disks (RAID). When the management node determines, based on a received write request, that not all data in an entire stripe is updated, the management node sends an update data chunk obtained from the to-be-written data to a corresponding storage node. The storage node does not directly update, based on the received update data chunk, a data block stored in a storage device of the storage node; instead, it stores the update data chunk in a non-volatile memory (NVM) cache of the storage node and sends the update data chunk to another storage node for backup. This data updating method reduces the write amplification caused in a stripe update process, thereby improving the update performance of the storage system.
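
A minimal C sketch of the staging step, assuming hypothetical structures (`update_chunk`, `nvm_cache`) that stand in for the NVM cache and the backup replication described above:

```c
#include <stdio.h>
#include <string.h>

#define NVM_CACHE_SLOTS 8
#define CHUNK_SIZE 16

/* Hypothetical update chunk staged in a storage node's NVM cache. */
struct update_chunk {
    int  stripe_id;
    int  block_index;
    char data[CHUNK_SIZE];
};

static struct update_chunk nvm_cache[NVM_CACHE_SLOTS];  /* stands in for NVM */
static int nvm_used;

/* Stage a partial-stripe update: buffer it in the NVM cache and mirror it
 * to a backup node instead of rewriting the whole RAID stripe. */
static int stage_update(const struct update_chunk *c) {
    if (nvm_used == NVM_CACHE_SLOTS)
        return -1;                      /* cache full: flush to RAID first */
    nvm_cache[nvm_used++] = *c;
    printf("replicating stripe %d block %d to backup node\n",
           c->stripe_id, c->block_index);
    return 0;
}

int main(void) {
    struct update_chunk c = { .stripe_id = 7, .block_index = 2 };
    memcpy(c.data, "partial update", 15);
    return stage_update(&c) == 0 ? 0 : 1;
}
```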

Data set and node cache-based scheduling method and device

Disclosed is a data set and node cache-based scheduling method, which includes: obtaining storage resource information of each host node; in response to receiving a training task, obtaining operation information of the training task and, according to the operation information and the storage resource information, screening for host nodes that satisfy the space required by the training task; in response to no host node satisfying the space required by the training task, scoring each host node according to the storage resource information; according to the scoring results, selecting, from among all of the host nodes, a host node on which to execute the training task; and obtaining and deleting an obsolete data set cache in the selected host node, and executing the training task on the selected host node.
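
A toy C sketch of the scoring and selection steps, under the assumption (not stated in the abstract) that space reclaimable by deleting obsolete data set caches counts toward a node's score; all names and numbers are illustrative:

```c
#include <stdio.h>

/* Hypothetical per-node storage resource view. */
struct host_node {
    const char *name;
    long free_bytes;      /* unused cache space */
    long stale_bytes;     /* obsolete data-set caches that can be evicted */
};

/* Score a node when none has enough free space outright: space that can be
 * reclaimed by deleting obsolete caches counts toward the score. */
static long score(const struct host_node *n) {
    return n->free_bytes + n->stale_bytes;
}

int main(void) {
    struct host_node nodes[] = {
        { "node-a", 10, 40 },
        { "node-b", 25, 5  },
        { "node-c", 15, 30 },
    };
    int n = sizeof nodes / sizeof nodes[0];
    long required = 45;   /* space the training task needs */
    struct host_node *best = &nodes[0];
    for (int i = 1; i < n; i++)
        if (score(&nodes[i]) > score(best))
            best = &nodes[i];
    if (score(best) >= required)
        printf("run task on %s after evicting %ld stale bytes\n",
               best->name, best->stale_bytes);
    return 0;
}
```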

Copy and restore of page in byte-addressable chunks of cluster memory

Disclosed are various embodiments for improving the resiliency and performance of cluster memory. First, a computing device can submit a write request to a byte-addressable chunk of memory stored by a memory host, wherein the byte-addressable chunk of memory is read-only. Then, the computing device can determine that a page-fault occurred in response to the write request. Next, the computing device can copy a page associated with the write request from the byte-addressable chunk of memory to the memory of the computing device. Subsequently, the computing device can free the page from the memory host. Then, the computing device can update a page table entry for the page to refer to a location of the page in the memory of the computing device.
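
The fault-handling sequence might look like the following C sketch, where `malloc`/`free` stand in for allocating local memory and releasing the page on the memory host, and `struct pte` is a simplified page-table entry:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE 4096

/* Hypothetical page-table entry: points either at a read-only page in a
 * remote memory host or at a writable local copy. */
struct pte {
    void *addr;
    int   remote;   /* 1 = read-only chunk on the memory host */
};

/* Handle the write fault: copy the page local, free the remote copy,
 * and repoint the page-table entry, as the abstract describes. */
static void handle_write_fault(struct pte *e) {
    if (!e->remote)
        return;
    void *local = malloc(PAGE_SIZE);
    memcpy(local, e->addr, PAGE_SIZE);  /* copy from byte-addressable chunk */
    free(e->addr);                      /* release page on the memory host */
    e->addr = local;
    e->remote = 0;                      /* subsequent writes hit local RAM */
}

int main(void) {
    struct pte e = { .addr = calloc(1, PAGE_SIZE), .remote = 1 };
    handle_write_fault(&e);             /* triggered by a write request */
    memset(e.addr, 0xAB, PAGE_SIZE);    /* write now succeeds locally */
    free(e.addr);
    return 0;
}
```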

Method and apparatus for smart store operations with conditional ownership requests

Method and apparatus implementing smart store operations with conditional ownership requests. One aspect includes a method implemented in a multi-core processor, the method comprising: receiving a conditional read for ownership (CondRFO) from a requester in response to execution of an instruction to modify a target cache line (CL) with a new value, the CondRFO identifying the target CL and the new value; determining from a local cache a local CL corresponding to the target CL; determining a local value from the local CL; comparing the local value with the new value; setting a coherency state of the local CL to (S)hared when the local value is the same as the new value; setting the coherency state of the local CL to (I)nvalid when the local value is different from the new value; and sending a response and a copy of the local CL to the requester. Other embodiments include an apparatus configured to perform the actions of the method.
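
A condensed C sketch of the CondRFO decision; state names follow MESI, and the response-sending step is elided as a comment:

```c
#include <stdio.h>
#include <stdint.h>

/* MESI-style coherency states used by the abstract. */
typedef enum { INVALID, SHARED, EXCLUSIVE, MODIFIED } coh_state_t;

struct cache_line {
    uint64_t    value;
    coh_state_t state;
};

/* Handle a conditional RFO (CondRFO): if the cached value already equals
 * the value the requester wants to store, keep the line Shared (the store
 * is a no-op); otherwise invalidate it so the requester gains ownership. */
static void handle_cond_rfo(struct cache_line *local, uint64_t new_value) {
    local->state = (local->value == new_value) ? SHARED : INVALID;
    /* ...send response plus a copy of the line back to the requester... */
}

int main(void) {
    struct cache_line cl = { .value = 42, .state = EXCLUSIVE };
    handle_cond_rfo(&cl, 42);   /* same value -> line stays Shared */
    printf("state after CondRFO: %s\n", cl.state == SHARED ? "S" : "I");
    return 0;
}
```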

Computer Memory Expansion Device and Method of Operation

A memory expansion device operable with a host computer system (host) comprises a non-volatile memory (NVM) subsystem, cache memory, and control logic configurable to receive a submission from the host that includes a read command and specifies a payload in the NVM subsystem and demand data in the payload. The control logic is configured to request ownership of a set of cache lines corresponding to the payload, to indicate completion of the submission after acquiring ownership of the cache lines, and to load the payload into the cache memory. The set of cache lines corresponds to a set of cache lines in a coherent destination memory space accessible by the host. The control logic is further configured to, after indicating completion of the submission and in response to a request from the host to read the demand data in the payload, return the demand data after determining that the demand data is in the cache memory.
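
The ordering described above, where completion is signaled once ownership is acquired and demand reads are answered only after the data is cached, might be sketched in C as follows (all structures and function names are invented for illustration):

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical flow of the read submission described in the abstract. */
struct payload { int id; bool in_cache; bool owned; };

static void acquire_ownership(struct payload *p) { p->owned = true; }
static void load_to_cache(struct payload *p)     { p->in_cache = true; }

/* Completion is signaled once the device owns the destination cache lines,
 * before the payload itself has necessarily finished loading. */
static void handle_read_submission(struct payload *p) {
    acquire_ownership(p);             /* own lines in coherent destination */
    printf("submission %d complete\n", p->id);
    load_to_cache(p);                 /* pull payload from NVM into cache */
}

/* Demand reads return data only after verifying it is cached. */
static bool read_demand_data(const struct payload *p) {
    return p->in_cache;
}

int main(void) {
    struct payload p = { .id = 1 };
    handle_read_submission(&p);
    printf("demand data %s\n", read_demand_data(&p) ? "returned" : "pending");
    return 0;
}
```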

Paging and disk storage for document store
11550485 · 2023-01-10

Provided are systems and methods for paging data into main memory from checkpoint data stored on disk. In one example, the method may include one or more of receiving a request for a database record in main memory, determining whether the database record has been previously stored in the main memory, in response to determining that the database record has been previously stored in the main memory, identifying a slice where the database record was stored from among a plurality of slices included in the main memory, and paging content of the identified slice including a copy of the requested database record into the main memory from a snapshot captured of content included in the identified slice and previously stored on disk. Accordingly, documents can be paged into main memory on-demand from snapshots of slice content rather than paging an entire partition of content.
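
A small C sketch of on-demand slice paging, assuming a hypothetical `slice_of` placement function and using an in-memory string as a stand-in for reading the slice snapshot from disk:

```c
#include <stdio.h>

#define NUM_SLICES 4

/* Hypothetical slice directory: each slice pages in independently. */
struct slice { int loaded; char data[64]; };

static struct slice slices[NUM_SLICES];

/* Which slice a record was stored in (illustrative hash placement). */
static int slice_of(int record_id) { return record_id % NUM_SLICES; }

/* Page only the identified slice in from its on-disk snapshot, rather
 * than the whole partition. */
static struct slice *fetch_record_slice(int record_id) {
    int s = slice_of(record_id);
    if (!slices[s].loaded) {
        /* stand-in for reading the slice snapshot from disk */
        snprintf(slices[s].data, sizeof slices[s].data,
                 "snapshot of slice %d", s);
        slices[s].loaded = 1;
    }
    return &slices[s];
}

int main(void) {
    struct slice *s = fetch_record_slice(9);
    printf("%s\n", s->data);
    return 0;
}
```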

Dynamically sized redundant write buffer with sector-based tracking
11550725 · 2023-01-10

Exemplary methods, apparatuses, and systems include detecting an operation to write dirty data to a cache. The cache is divided into a plurality of channels. In response to the operation, the dirty data is written to a first cache line in the cache, the first cache line being accessed via a first channel. Additionally, a redundant copy of the dirty data is written to a second cache line in the cache. The second cache line serves as a redundant write buffer and is accessed via a second channel, the first and second channels differing from one another. A metadata entry for the second cache line is updated to reference a location of the dirty data in the first cache line.
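
A simplified C sketch of the two-channel write path, assuming one line per channel and a next-channel placement policy for the redundant copy; both assumptions simplify the sector-based tracking in the abstract:

```c
#include <stdio.h>
#include <string.h>

#define LINE_SIZE 64
#define NUM_CHANNELS 4

/* Hypothetical cache organized as one line per channel (simplified). */
static char channel_line[NUM_CHANNELS][LINE_SIZE];

/* Metadata for the redundant copy: where the primary dirty data lives. */
struct redundancy_meta { int primary_channel; };

/* Write dirty data to a line on one channel and a redundant copy to a
 * line on a *different* channel, so a single-channel failure loses
 * neither copy. The metadata entry records the primary location. */
static struct redundancy_meta write_dirty(const char *data, int channel) {
    int backup = (channel + 1) % NUM_CHANNELS;      /* must differ */
    memcpy(channel_line[channel], data, LINE_SIZE);
    memcpy(channel_line[backup],  data, LINE_SIZE); /* redundant buffer */
    struct redundancy_meta m = { .primary_channel = channel };
    return m;
}

int main(void) {
    char dirty[LINE_SIZE] = "dirty sector";
    struct redundancy_meta m = write_dirty(dirty, 2);
    printf("primary copy on channel %d\n", m.primary_channel);
    return 0;
}
```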

Victim cache that supports draining write-miss entries

A caching system including a first sub-cache and a second sub-cache in parallel with the first sub-cache, wherein the second sub-cache includes a set of cache lines, line type bits configured to store an indication that a corresponding cache line of the set of cache lines is configured to store write-miss data, and an eviction controller configured to flush stored write-miss data based on the line type bits.
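
In C, the drain step might be sketched like this, with a boolean array standing in for the per-line type bits:

```c
#include <stdio.h>
#include <stdbool.h>

#define NUM_LINES 8

/* Hypothetical second sub-cache: per-line type bit marks write-miss data. */
struct sub_cache {
    bool is_write_miss[NUM_LINES];   /* the abstract's line type bits */
    bool valid[NUM_LINES];
};

/* Eviction controller: drain (flush) only the lines whose type bits say
 * they hold write-miss data, leaving ordinary victim lines in place. */
static void drain_write_misses(struct sub_cache *c) {
    for (int i = 0; i < NUM_LINES; i++)
        if (c->valid[i] && c->is_write_miss[i]) {
            printf("flushing write-miss line %d to memory\n", i);
            c->valid[i] = false;
        }
}

int main(void) {
    struct sub_cache c = { 0 };
    c.valid[3] = true; c.is_write_miss[3] = true;   /* write-miss entry */
    c.valid[5] = true;                              /* ordinary victim line */
    drain_write_misses(&c);
    return 0;
}
```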

Resource-aware compression

Systems, apparatuses, and methods for implementing a multi-tiered approach to cache compression are disclosed. A cache includes a cache controller, a light compressor, and a heavy compressor. The decision on which compressor to use for a given cache line is made based on resource availability, such as cache capacity or memory bandwidth. This allows the cache to opportunistically use complex compression algorithms while limiting the adverse effects of high decompression latency on system performance. The design employs the heavy compressor to reduce memory bandwidth on high bandwidth memory (HBM) interfaces whenever doing so does not sacrifice system performance. Accordingly, the cache combines light and heavy compressors with a decision-making unit to achieve reduced off-chip memory traffic without sacrificing system performance.
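
A minimal C sketch of such a decision-making unit, with invented utilization thresholds standing in for whatever policy the actual design uses:

```c
#include <stdio.h>

/* Hypothetical thresholds for the decision-making unit. */
#define BW_BUSY_PCT    80   /* memory-bandwidth utilization cutoff */
#define CACHE_FULL_PCT 90   /* cache-occupancy cutoff */

typedef enum { LIGHT, HEAVY } compressor_t;

/* Pick the heavy (high-ratio, high-latency) compressor only when resource
 * pressure makes the extra compression worthwhile; otherwise stay light
 * so decompression latency does not hurt performance. */
static compressor_t choose_compressor(int bw_util_pct, int occupancy_pct) {
    if (bw_util_pct > BW_BUSY_PCT || occupancy_pct > CACHE_FULL_PCT)
        return HEAVY;
    return LIGHT;
}

int main(void) {
    printf("idle system -> %s\n",
           choose_compressor(30, 50) == LIGHT ? "light" : "heavy");
    printf("saturated HBM -> %s\n",
           choose_compressor(95, 50) == LIGHT ? "light" : "heavy");
    return 0;
}
```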