G06F2212/6032

STORE-TO-LOAD FORWARDING CORRECTNESS CHECKS USING PHYSICAL ADDRESS PROXIES STORED IN LOAD QUEUE ENTRIES
20220358044 · 2022-11-10

A microprocessor includes a load/store unit that performs store-to-load forwarding, a PIPT L2 set-associative cache, a store queue having store entries, and a load queue having load entries. Each L2 entry is uniquely identified by a set index and a way. Each store/load entry holds, for an associated store/load instruction, a store/load physical address proxy (PAP) for a store/load physical memory line address (PMLA). The store/load PAP specifies the set index and the way of the L2 entry into which a cache line specified by the store/load PMLA is allocated. Each load entry also holds store-to-load forwarding information for its associated load instruction. The load/store unit compares the store PAP with the load PAP of each valid load entry whose associated load instruction is younger in program order than the store instruction, and uses the comparison and the associated forwarding information to check store-to-load forwarding correctness with respect to each younger load instruction.
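The check described above can be sketched roughly as follows. This is a minimal illustration, not the patented logic: the `age` field, the `forwarded_from` field, and the line-granularity rule inside `check_forwarding` are all assumptions introduced here, and byte-level overlap is ignored.

```python
from dataclasses import dataclass
from typing import List, NamedTuple, Optional


class PAP(NamedTuple):
    """Physical address proxy: the L2 set index and way holding the line."""
    set_index: int
    way: int


@dataclass
class LoadEntry:
    valid: bool
    age: int                       # hypothetical: larger value = younger in program order
    pap: PAP                       # load PAP captured when the load executed
    forwarded_from: Optional[int]  # hypothetical: age of the forwarding store, or None


def check_forwarding(store_age: int, store_pap: PAP,
                     load_queue: List[LoadEntry]) -> List[int]:
    """Return the ages of younger loads whose forwarding looks incorrect.

    Simplified rule (an assumption made for this sketch):
      * the load forwarded from this store but the PAPs differ, or
      * the PAPs match but the load forwarded from nothing or from an older store.
    """
    suspects = []
    for load in load_queue:
        if not load.valid or load.age <= store_age:
            continue  # only valid loads younger than the store are checked
        same_line = load.pap == store_pap
        src = load.forwarded_from
        wrong_source = src == store_age and not same_line
        missed_forward = same_line and (src is None or src < store_age)
        if wrong_source or missed_forward:
            suspects.append(load.age)
    return suspects
```

Comparing the short (set index, way) pair instead of a full physical line address is the point of the proxy: the comparators in the load queue are narrower, while equality still identifies the same L2 entry.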

USING PHYSICAL ADDRESS PROXIES TO HANDLE SYNONYMS WHEN WRITING STORE DATA TO A VIRTUALLY-INDEXED CACHE

A microprocessor includes a virtually-indexed L1 data cache that has an allocation policy that permits multiple synonyms to be co-resident. Each L2 entry is uniquely identified by a set index and a way number. A store unit, during a store instruction execution, receives a store physical address proxy (PAP) for a store physical memory line address (PMLA) from an L1 entry hit upon by a store virtual address, and writes the store PAP to a store queue entry. The store PAP comprises the set index and the way number of an L2 entry that holds a line specified by the store PMLA. The store unit, during the store commit, reads the store PAP from the store queue, looks up the store PAP in the L1 to detect synonyms, writes the store data to one or more of the detected synonyms, and evicts the non-written detected synonyms.
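A rough sketch of the commit-time synonym handling follows. It assumes a 64-byte line, models the L1 as a flat list of entries, and writes only the first detected synonym while evicting the rest. The abstract allows writing "one or more" synonyms, so that policy and the names `L1Entry` and `commit_store` are illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

PAP = Tuple[int, int]  # (L2 set index, L2 way number)


@dataclass
class L1Entry:
    valid: bool
    pap: PAP  # proxy for the physical line this L1 entry caches
    data: bytearray = field(default_factory=lambda: bytearray(64))


def commit_store(l1: List[L1Entry], store_pap: PAP,
                 offset: int, store_data: bytes) -> Tuple[int, int]:
    """Write store data to one synonym and evict the others.

    Returns (number of entries written, number of entries evicted).
    """
    # Entries sharing the store PAP map to the same physical line: synonyms.
    synonyms = [e for e in l1 if e.valid and e.pap == store_pap]
    if not synonyms:
        return (0, 0)
    keep, rest = synonyms[0], synonyms[1:]
    keep.data[offset:offset + len(store_data)] = store_data
    for entry in rest:
        entry.valid = False  # evict the synonyms that were not written
    return (1, len(rest))
```

Evicting the unwritten synonyms keeps stale copies of the line from surviving in the L1 after the store commits.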

METHODS AND APPARATUS FOR ALLOCATION IN A VICTIM CACHE SYSTEM

Methods, apparatus, systems and articles of manufacture are disclosed for allocation in a victim cache system. An example apparatus includes a first cache storage, a second cache storage, and a cache controller coupled to the first cache storage and the second cache storage. The cache controller is operable to receive a memory operation that specifies an address, determine, based on the address, that the memory operation evicts a first set of data from the first cache storage, determine that the first set of data is unmodified relative to an extended memory, and cause the first set of data to be stored in the second cache storage.
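The allocation step can be illustrated with a small model. The sketch below assumes a direct-mapped first cache and a FIFO second cache; the class name, capacities, and replacement policy are hypothetical. The abstract only covers the clean-line case, so handling of modified lines is deliberately left out.

```python
from collections import OrderedDict
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Line:
    address: int
    data: int
    dirty: bool = False  # True if modified relative to extended memory


class VictimCacheSystem:
    def __init__(self, num_sets: int = 4, victim_capacity: int = 2):
        self.num_sets = num_sets
        self.main: Dict[int, Line] = {}  # first cache storage: set index -> line
        self.victim: "OrderedDict[int, Line]" = OrderedDict()  # second cache storage
        self.victim_capacity = victim_capacity

    def access(self, address: int, data: int) -> Optional[Line]:
        """Allocate `address`; return the line moved to the victim cache, if any."""
        index = address % self.num_sets
        resident = self.main.get(index)
        if resident is not None and resident.address == address:
            return None  # hit: nothing is evicted
        self.main[index] = Line(address, data)
        if resident is None or resident.dirty:
            return None  # dirty-line handling is outside this sketch
        # The evicted line is unmodified, so it is stored in the victim cache.
        if len(self.victim) >= self.victim_capacity:
            self.victim.popitem(last=False)  # drop the oldest victim (FIFO)
        self.victim[resident.address] = resident
        return resident
```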

Prediction confirmation for cache subsystem

A cache subsystem is disclosed. The cache subsystem includes a cache configured to store information in cache lines arranged in a plurality of ways. A requestor circuit generates a request to access a particular cache line in the cache. A prediction circuit is configured to generate a prediction of which of the ways includes the particular cache line. A comparison circuit verifies the prediction by comparing a particular address tag associated with the particular cache line to a cache tag corresponding to a predicted one of the ways. Responsive to determining that the prediction was correct, a confirmation indication is stored indicating the correct prediction. For a subsequent request for the particular cache line, the cache is configured to forgo verification of the prediction that the particular cache line is included in the predicted one of the ways, based on the confirmation indication.
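The confirm-then-skip behavior can be sketched in software as follows. Everything here is an illustrative assumption: the predictor is a plain dictionary keyed by (set, tag), `tag_compares` merely counts tag comparisons to make the saving visible, and `replace` shows one way confirmations might be cleared when a line leaves the cache.

```python
from typing import Dict, List, Optional, Set, Tuple


class PredictedCache:
    def __init__(self, tags: List[List[int]]):
        self.tags = tags                                # tags[set_index][way]
        self.prediction: Dict[Tuple[int, int], int] = {}  # (set, tag) -> predicted way
        self.confirmed: Set[Tuple[int, int]] = set()      # predictions already verified
        self.tag_compares = 0                           # counts tag comparisons performed

    def lookup(self, set_index: int, tag: int) -> Optional[int]:
        key = (set_index, tag)
        way = self.prediction.get(key)
        if way is not None and key in self.confirmed:
            return way  # confirmation stored: verification is skipped
        if way is not None:
            self.tag_compares += 1  # verify the predicted way only
            if self.tags[set_index][way] == tag:
                self.confirmed.add(key)
                return way
        # No prediction or misprediction: compare against every way.
        for w, stored in enumerate(self.tags[set_index]):
            self.tag_compares += 1
            if stored == tag:
                self.prediction[key] = w
                return w
        return None

    def replace(self, set_index: int, way: int, new_tag: int) -> None:
        """Install a new line and drop any prediction state for the old one."""
        old = self.tags[set_index][way]
        self.confirmed.discard((set_index, old))
        self.prediction.pop((set_index, old), None)
        self.tags[set_index][way] = new_tag
```

In this model the third access to the same line performs no tag comparison at all, which is the saving the confirmation indication is meant to deliver.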

Write combining using physical address proxies stored in a write combine buffer

A microprocessor includes a physically-indexed-and-tagged second-level set-associative cache. Each cache entry is uniquely identified by a set index and a way number. Each entry of a write-combine buffer (WCB) holds write data to be written to a write physical memory address, a portion of which is a write physical line address. Each WCB entry also holds a write physical address proxy (PAP) for the write physical line address. The write PAP specifies the set index and the way number of the cache entry into which a cache line specified by the write physical line address is allocated. In response to receiving a store instruction that is being committed and that specifies a store PAP, the WCB compares the store PAP with the write PAP of each WCB entry and requires a match as a necessary condition for merging store data of the store instruction into a WCB entry.
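A minimal sketch of the merge check follows. It assumes a 64-byte line and a per-byte valid mask, and treats "fits within the line" as the only additional merge condition. The abstract states only that a PAP match is necessary, so any further conditions, along with the names `WCBEntry` and `try_merge`, are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

PAP = Tuple[int, int]  # (L2 set index, L2 way number)
LINE_BYTES = 64        # hypothetical cache line size


@dataclass
class WCBEntry:
    write_pap: PAP
    data: bytearray = field(default_factory=lambda: bytearray(LINE_BYTES))
    byte_valid: List[bool] = field(default_factory=lambda: [False] * LINE_BYTES)


def try_merge(wcb: List[WCBEntry], store_pap: PAP,
              offset: int, store_data: bytes) -> bool:
    """Merge store data into a WCB entry whose write PAP matches the store PAP."""
    end = offset + len(store_data)
    if offset < 0 or end > LINE_BYTES:
        return False  # the store must fit inside one line
    for entry in wcb:
        if entry.write_pap != store_pap:
            continue  # a PAP match is a necessary condition for merging
        entry.data[offset:end] = store_data
        entry.byte_valid[offset:end] = [True] * len(store_data)
        return True
    return False
```

A caller that receives `False` would presumably allocate a new WCB entry for the store; that path is not modeled here.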

Apparatuses and methods for compute enabled cache
11599475 · 2023-03-07

An example includes a compute component, a memory, and a controller coupled to the memory. The controller is configured to operate on a block select and a subrow select as metadata to a cache line to control placement of the cache line in the memory to allow for a compute enabled cache.

Methods and apparatus to facilitate read-modify-write support in a coherent victim cache with parallel data paths

Methods, apparatus, systems and articles of manufacture are disclosed to facilitate read-modify-write support in a coherent victim cache with parallel data paths. An example apparatus includes a random-access memory configured to be coupled to a central processing unit via a first interface and a second interface, the random-access memory configured to obtain a read request indicating a first address to read via a snoop interface; an address encoder coupled to the random-access memory, the address encoder to, when the random-access memory indicates a hit of the read request, generate a second address corresponding to a victim cache based on the first address; and a multiplexer coupled to the victim cache to transmit a response including data obtained from the second address of the victim cache.

METHOD FOR ACCESSING DATA VISITOR DIRECTORY IN MULTI-CORE SYSTEM AND DEVICE

The present disclosure discloses a method for accessing a data visitor directory in a multi-core system, a directory cache device, a multi-core system, and a directory storage unit. The method includes: receiving a first access request sent by a first processor core, where the first access request is used to access an entry, corresponding to a first data block, in a directory; determining, according to the first access request, that a single-pointer entry array has a first single-pointer entry corresponding to the first data block; and, when determining, according to the first single-pointer entry, that a sharing entry array has a first sharing entry associated with the first single-pointer entry, determining multiple visitors of the first data block according to the first sharing entry. According to embodiments of the present disclosure, storage resources occupied by a directory can be reduced.
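The two-level lookup can be sketched as below. The entry layout is an assumption made for illustration: a single-pointer entry either names one visitor core directly or points at a sharing entry, and a sharing entry is modeled as a bit vector with one bit per core.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class SinglePointerEntry:
    has_sharing_entry: bool  # hypothetical flag: does `pointer` index the sharing array?
    pointer: int             # a core ID, or an index into the sharing entry array


def visitors(block: int,
             single_pointer_array: Dict[int, SinglePointerEntry],
             sharing_array: List[int]) -> Set[int]:
    """Return the set of cores recorded as visitors of `block`."""
    entry = single_pointer_array.get(block)
    if entry is None:
        return set()  # the block has no directory entry
    if not entry.has_sharing_entry:
        return {entry.pointer}  # a single visitor is stored in the entry itself
    vector = sharing_array[entry.pointer]  # bit i set => core i is a visitor
    return {core for core in range(vector.bit_length()) if (vector >> core) & 1}
```

Under this layout, only blocks with several visitors consume a wide sharing entry, which is consistent with the storage saving the abstract claims.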

Create page locality in cache controller cache allocation

Integrated circuits are provided which create page locality in cache controllers that allocate entries to a set-associative cache, which includes data storage for a plurality of Sets of Ways. A plurality of cache controllers may be interleaved with a processor and device(s), and allocate to any pages in the cache. A cache controller may select a Way from a Set to which to allocate new entries in the set-associative cache and bias selection of the Way according to a plurality of upper address bits (or other function). These bits may be identical at the cache controller during sequential memory transactions. A processor may determine the bias centrally, and inform the cache controllers of the selected Set and Way. Other functions, algorithms or approaches may be chosen to influence bias of Way selection, such as based on analysis of metadata belonging to cache controllers used for making Way allocation selections.
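One possible form of the Way bias is sketched below. The 4 KiB page shift, the modulo mapping from upper address bits to a preferred Way, and the fallback search order are all assumptions; the abstract leaves the bias function open.

```python
from typing import List

PAGE_SHIFT = 12  # hypothetical: bits above bit 11 count as the "upper address bits"


def select_way(address: int, way_valid: List[bool]) -> int:
    """Pick a Way for a new entry, biased by the upper address bits.

    `way_valid[i]` is True when Way i of the target Set already holds a line.
    """
    num_ways = len(way_valid)
    preferred = (address >> PAGE_SHIFT) % num_ways
    # Start at the preferred Way and take the first free Way found.
    for step in range(num_ways):
        way = (preferred + step) % num_ways
        if not way_valid[way]:
            return way
    return preferred  # every Way is occupied: replace the preferred one
```

Because sequential transactions within one page share the same upper bits, they are steered toward the same Way, which is the page locality the abstract describes.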