G06F12/0859

Managing multiple cache memory circuit operations

A cache memory circuit capable of dealing with multiple conflicting requests to a given cache line is disclosed. In response to receiving an acquire request for the given cache line from a particular lower-level cache memory circuit, the cache memory circuit sends probe requests regarding the given cache line to other lower-level cache memory circuits. In situations where a different lower-level cache memory circuit is trying to evict the given cache line while the particular lower-level cache memory circuit is trying to obtain a copy of the cache line, the cache memory circuit performs a series of operations to service both requests and ensure that the particular lower-level cache memory circuit receives a copy of the given cache line that includes any changes in the evicted copy of the given cache line.

Garbage collection prefetching state machine

Garbage collection or other computational work accesses memory which is located outside processor registers. Some embodiments specify at least some of the memory accesses and separate them from other computations, and utilize a memory access state machine to control the execution of both kinds of computation. Code that employs memory access results is placed in a run routine which is divided between respective states of the state machine. The specified memory accesses are invoked from a state code, and overlap other computation. A prefetch buffer may be dynamically sized based on the availability of space in the prefetch buffer. Code for shared work, such as address relocation code, may be placed in its own state structure. Candidate code for possible separation into a specified memory access routine may be automatically recognized.

DATA PROCESSING SYSTEM HAVING A CACHE WITH A STORE BUFFER
20190317897 · 2019-10-17

In a data processing system, a store request is provided having corresponding store data and a corresponding access address, and a memory coherency required attribute corresponding to the access address of the store request is provided. When the store request results in a write-through store due to a cache hit or results in a cache miss, the corresponding access address and store data are stored in a selected entry of the store buffer, and a merge allowed indicator is stored in the selected entry which indicates whether or not the selected entry is a candidate for merging. The merge allowed indicator is determined based on the memory coherency required attribute from the MMU and a store buffer coherency enable control bit of the cache. Entries of the store buffer which include an asserted merge allowed indicator and share a memory line in the memory are merged.

Data processing system having a cache with a store buffer
10445237 · 2019-10-15

In a data processing system, a store request is provided having corresponding store data and a corresponding access address, and a memory coherency required attribute corresponding to the access address of the store request is provided. When the store request results in a write-through store due to a cache hit or results in a cache miss, the corresponding access address and store data are stored in a selected entry of the store buffer, and a merge allowed indicator is stored in the selected entry which indicates whether or not the selected entry is a candidate for merging. The merge allowed indicator is determined based on the memory coherency required attribute from the MMU and a store buffer coherency enable control bit of the cache. Entries of the store buffer which include an asserted merge allowed indicator and share a memory line in the memory are merged.
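
The merge rule in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions: the 32-byte line size, the names `StoreBuffer`, `Entry`, and `compute_merge_allowed`, and in particular the policy inside `compute_merge_allowed` are hypothetical. The abstract says only that the indicator is determined from the coherency required attribute and the store buffer coherency enable bit, not how the two are combined.

```python
from dataclasses import dataclass, field

LINE_BYTES = 32  # assumed memory line size; the abstract does not give one


@dataclass
class Entry:
    line_addr: int                            # base address of the memory line
    data: dict = field(default_factory=dict)  # byte offset in line -> byte value
    merge_allowed: bool = False               # the "merge allowed indicator"


def compute_merge_allowed(coherency_required: bool, sb_coherency_enable: bool) -> bool:
    # ASSUMED policy, for illustration only: merging is allowed when the
    # access does not require coherency, or when the store buffer
    # coherency enable control bit is set.
    return (not coherency_required) or sb_coherency_enable


class StoreBuffer:
    def __init__(self):
        self.entries = []

    def store(self, addr, payload, coherency_required, sb_coherency_enable):
        """Buffer a store; merge it into an existing entry when permitted.

        Assumes the payload does not cross a memory line boundary.
        """
        offset = addr % LINE_BYTES
        line = addr - offset
        allowed = compute_merge_allowed(coherency_required, sb_coherency_enable)
        new_bytes = {offset + i: b for i, b in enumerate(payload)}
        if allowed:
            for entry in self.entries:
                # Merge only when both sides carry an asserted indicator
                # and share the same memory line.
                if entry.merge_allowed and entry.line_addr == line:
                    entry.data.update(new_bytes)
                    return entry
        entry = Entry(line, new_bytes, allowed)
        self.entries.append(entry)
        return entry
```

In this model, two non-coherent stores to the same line collapse into one entry, while a store whose indicator is deasserted always occupies an entry of its own.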

PREFETCHER BASED SPECULATIVE DYNAMIC RANDOM-ACCESS MEMORY READ REQUEST TECHNIQUE
20190294546 · 2019-09-26

A method includes monitoring a request rate of speculative memory read requests from a penultimate-level cache to a main memory. The speculative memory read requests correspond to data read requests that missed in the penultimate-level cache. A hit rate of searches of a last-level cache for data requested by the data read requests is monitored. Core demand speculative memory read requests to the main memory are selectively enabled in parallel with searching of the last-level cache for data of a corresponding core demand data read request based on the request rate and the hit rate. Prefetch speculative memory read requests to the main memory are selectively enabled in parallel with searching of the last-level cache for data of a corresponding prefetch data read request based on the request rate and the hit rate.
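
The selective enabling described above can be sketched as a pair of threshold checks. Everything concrete here is an assumption made for illustration: the abstract states only that the decision is based on the request rate and the hit rate, so the comparison directions, the threshold values, and the names `Thresholds`, `speculation_enabled`, and `select_enables` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Thresholds:
    max_request_rate: float  # speculative requests per sampling window
    max_llc_hit_rate: float  # fraction of last-level cache searches that hit


def speculation_enabled(request_rate, llc_hit_rate, t):
    # ASSUMED policy: speculate only while the memory path is not already
    # saturated with speculative reads and the last-level cache misses
    # often enough for the early main-memory read to be useful.
    return request_rate <= t.max_request_rate and llc_hit_rate <= t.max_llc_hit_rate


# Hypothetical thresholds; prefetches are throttled more aggressively
# than core demand requests in this sketch.
CORE_DEMAND = Thresholds(max_request_rate=1000.0, max_llc_hit_rate=0.5)
PREFETCH = Thresholds(max_request_rate=500.0, max_llc_hit_rate=0.25)


def select_enables(request_rate, llc_hit_rate):
    """Return the two independent enable bits for one sampling window."""
    return {
        "core_demand": speculation_enabled(request_rate, llc_hit_rate, CORE_DEMAND),
        "prefetch": speculation_enabled(request_rate, llc_hit_rate, PREFETCH),
    }
```

Keeping separate thresholds for the two request classes mirrors the abstract's separate enables for core demand and prefetch speculative reads.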

Profiling cache replacement
10387329 · 2019-08-20

Profiling cache replacement is a technique for managing data migration between a main memory and a cache memory to improve overall system performance. A profiler maintains counters that count memory requests for access to the pages maintained in both the cache memory and the main memory. Based on this access-request count information, a mover moves pages between the main and cache memories. For example, the mover can swap little-requested pages of the cache memory with highly-requested pages of the main memory. The mover can do so, for instance, when the counters indicate that the number of page access requests for highly-requested pages of the main memory is greater than the number of page access requests for little-requested pages of the cache memory. To avoid impeding the operations of memory users, the mover can perform page swapping in the background at predetermined time intervals, such as once every microsecond (µs).
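
The profiler and mover roles can be modeled in a few lines. This is a minimal sketch, assuming integer page identifiers and a single swap per invocation; the class name `ProfilingCache` and its methods are hypothetical, and the background timer that would call `swap_once` at each interval is left out.

```python
from collections import Counter


class ProfilingCache:
    def __init__(self, cache_pages, main_pages):
        self.cache = set(cache_pages)  # pages currently in cache memory
        self.main = set(main_pages)    # pages currently in main memory
        self.counts = Counter()        # profiler: access requests per page

    def record_access(self, page):
        # Profiler role: count requests for pages in either memory.
        self.counts[page] += 1

    def swap_once(self):
        """Mover role: swap the least-requested cache page with the
        most-requested main-memory page, if the latter is hotter.

        Returns (evicted_page, promoted_page), or None when no swap occurs.
        """
        if not self.cache or not self.main:
            return None
        # Ties are broken by page id so the choice is deterministic.
        cold = min(self.cache, key=lambda p: (self.counts[p], p))
        hot = max(self.main, key=lambda p: (self.counts[p], p))
        if self.counts[hot] > self.counts[cold]:
            self.cache.remove(cold)
            self.cache.add(hot)
            self.main.remove(hot)
            self.main.add(cold)
            return (cold, hot)
        return None
```

A real mover would presumably also age or reset the counters between intervals; the abstract does not say, so the sketch leaves them untouched.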

Apparatuses and methods for transferring data
10387299 · 2019-08-20

The present disclosure includes apparatuses and methods related to shifting data. An example apparatus comprises a cache coupled to an array of memory cells and a controller. The controller is configured to perform a first operation beginning at a first address to transfer data from the array of memory cells to the cache, and perform a second operation concurrently with the first operation, the second operation beginning at a second address.

Controller, storage device, and computer program product for writing and transfer process

According to an embodiment, a controller is connected to an external storage device and controls access to a semiconductor storage device including blocks each including memory cell groups each having memory cells. The block includes pages associated with each memory cell group. A writing process for each memory cell group includes writing stages. The controller includes a determining unit configured to determine data to be transferred to the page required in the writing process for a first memory cell group before the writing stage first starts when the writing stage is performed; a reading unit configured to read the determined data from the semiconductor storage device and to store the read data in the external storage device before the writing stage starts; and a writing unit configured to perform the writing process using the data stored in the external storage device when the writing stage is performed.

CACHE MISS THREAD BALANCING

A simultaneous multithread (SMT) processor having a shared dispatch pipeline includes a first circuit that detects a cache miss thread. A second circuit determines a first cache hierarchy level at which the detected cache miss occurred. A third circuit determines a Next To Complete (NTC) group in the thread and a plurality of additional groups (X) in the thread. The additional groups (X) are dynamically configured based on the detected cache miss. A fourth circuit determines whether any groups in the thread are younger than the determined NTC group and the plurality of additional groups (X), and flushes all the determined younger groups from the cache miss thread.

Cache miss thread balancing

A simultaneous multithread (SMT) processor having a shared dispatch pipeline includes a first circuit that detects a cache miss thread. A second circuit determines a first cache hierarchy level at which the detected cache miss occurred. A third circuit determines a Next To Complete (NTC) group in the thread and a plurality of additional groups (X) in the thread. The additional groups (X) are dynamically configured based on the detected cache miss. A fourth circuit determines whether any groups in the thread are younger than the determined NTC group and the plurality of additional groups (X), and flushes all the determined younger groups from the cache miss thread.
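
The flush decision made by the fourth circuit can be sketched as a slice over the thread's in-flight groups. The sketch assumes the groups are held oldest first, so the first element is the NTC group; the table `EXTRA_GROUPS_BY_MISS_LEVEL`, its values, and the function name `balance_flush` are hypothetical, since the abstract says only that X is configured dynamically based on the detected cache miss.

```python
# ASSUMED mapping from the cache level at which the miss occurred to X,
# the number of additional groups kept behind the NTC group. The idea
# modeled here is that a longer-latency miss keeps fewer groups.
EXTRA_GROUPS_BY_MISS_LEVEL = {"L2": 3, "L3": 1}


def balance_flush(groups, miss_level):
    """Split a cache miss thread's groups into (kept, flushed).

    `groups` is ordered oldest first, so groups[0] is the Next To
    Complete (NTC) group. The NTC group and the next X groups are kept;
    every younger group is flushed.
    """
    x = EXTRA_GROUPS_BY_MISS_LEVEL[miss_level]
    keep = 1 + x  # the NTC group plus X additional groups
    return groups[:keep], groups[keep:]
```

For example, with six in-flight groups and a miss at the hypothetical "L3" level, the NTC group and one additional group are kept and the remaining four are flushed from the cache miss thread.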