G06F2212/6032

Methods and apparatus to facilitate an atomic operation and/or a histogram operation in cache pipeline

Methods, apparatus, systems and articles of manufacture to facilitate an atomic operation and/or a histogram operation in cache pipeline are disclosed. An example system includes a cache storage coupled to an arithmetic component; and a cache controller coupled to the cache storage, wherein the cache controller is operable to: receive a memory operation that specifies a set of data; retrieve the set of data from the cache storage; utilize the arithmetic component to determine a set of counts of respective values in the set of data; generate a vector representing the set of counts; and provide the vector.

WRITE COMBINING USING PHYSICAL ADDRESS PROXIES STORED IN A WRITE COMBINE BUFFER
20220358043 · 2022-11-10 ·

A microprocessor includes a physically-indexed-and-tagged second-level set-associative cache. Each cache entry is uniquely identified by a set index and a way number. Each entry of a write-combine buffer (WCB) holds write data to be written to a write physical memory address, a portion of which is a write physical line address. Each WCB entry also holds a write physical address proxy (PAP) for the write physical line address. The write PAP specifies the set index and the way number of the cache entry into which a cache line specified by the write physical line address is allocated. In response to receiving a store instruction that is being committed and that specifies a store PAP, the WCB compares the store PAP with the write PAP of each WCB entry and requires a match as a necessary condition for merging store data of the store instruction into a WCB entry.

Set associative cache memory with heterogeneous replacement policy

A set associative cache memory, comprising: an array of storage elements arranged as M sets by N ways; an allocation unit that allocates the storage elements in response to memory accesses that miss in the cache memory. Each memory access selects a set; for each parcel of a plurality of parcels, a parcel specifier specifies: a subset of ways of the N ways included in the parcel. The subsets of ways of parcels associated with a selected set are mutually exclusive; a replacement scheme associated with the parcel from among a plurality of predetermined replacement schemes. For each memory access, the allocation unit: selects the parcel specifier in response to the memory access; and uses the replacement scheme associated with the parcel to allocate into the subset of ways of the selected set included in the parcel.

CACHING DATA FROM A NON-VOLATILE MEMORY
20170308478 · 2017-10-26 ·

A data processing system 2 includes interconnect circuitry 10 providing a plurality of memory transaction paths between one or more transaction masters, including a processor 4, debugging circuitry 6 and a DMA unit 8, and one or more transaction slaves including a non-volatile memory 12, a DRAM memory 18 and an I/O interface 20. A cache memory 26 is provided between the interconnect circuitry 10 and the non-volatile memory 12. This cache memory 26 may be a two way set associative cache memory. The cache memory 26 may serve as a read-only cache memory. A cache miss will result in a line fill of a cache line including the target data which was missed. If prefetching is enabled for the cache memory 26 and the transaction was attempting to read a program instruction, then a prefetch operation may be performed in which a further contiguous cache line of data is also fetched into the cache memory 26 upon the cache miss.

Multi-mode set associative cache memory dynamically configurable to selectively select one or a plurality of its sets depending upon the mode

A cache memory stores 2^J-byte cache lines and includes an array of 2^N sets each holding tags each X bits, an input receives a Q-bit memory address, MA[(Q−1):0], having: a tag MA[(Q−1):(Q−X)] and an index MA[(Q−X−1):J]. Q is an integer at least (N+J+X−1). In a first mode: set selection logic selects one set using the index and LSB of the tag; comparison logic compares all but LSB of the tag with all but LSB of each tag in the selected set and indicates a hit if a match; otherwise allocation logic allocates into the selected set. In a second mode: the set selection logic selects two sets using the index; the comparison logic compares the tag with each tag in the selected two sets and indicates a hit if a match; and otherwise allocates into one set of the two selected sets.

DYNAMIC POWERING OF CACHE MEMORY BY WAYS WITHIN MULTIPLE SET GROUPS BASED ON UTILIZATION TRENDS
20170300418 · 2017-10-19 ·

A set associative cache memory comprises an M×N memory array of storage entries arranged as M sets by N ways, both M and N are integers greater than one. Within each group of P mutually exclusive groups of the M sets, the N ways are separately powerable. A controller, for each group of the P groups, monitors a utilization trend of the group and dynamically causes power to be provided to a different number of ways of the N ways of the group during different time instances based on the utilization trend.

INDEXING ENTRIES OF A STORAGE STRUCTURE SHARED BETWEEN MULTIPLE THREADS
20170286421 · 2017-10-05 ·

An apparatus has processing circuitry for processing instructions from multiple threads. A storage structure is shared between the threads and has a number of entries. Indexing circuitry generates a target index value identifying an entry of the storage structure to be accessed in response to a request from the processing circuitry specifying a requested index value corresponding to information to be accessed from the storage structure. The indexing circuitry generates the target index value as a function of the requested index value and a key value selected depending on which of the threads trigger the request. The key value for at least one of the threads is updated from time to time.

Methods and apparatus for data cache way prediction based on classification as stack data

A method of way prediction for a data cache having a plurality of ways is provided. Responsive to an instruction to access a stack data block, the method accesses identifying information associated with a plurality of most recently accessed ways of a data cache to determine whether the stack data block resides in one of the plurality of most recently accessed ways of the data cache, wherein the identifying information is accessed from a subset of an array of identifying information corresponding to the plurality of most recently accessed ways; and when the stack data block resides in one of the plurality of most recently accessed ways of the data cache, the method accesses the stack data block from the data cache.

Counter-based victim selection in a cache memory

A set-associative cache memory includes a plurality of congruence classes each including multiple entries for storing cache lines of data. A respective one of a plurality of counters is maintained for each cache line stored in the multiple entries. In response to a memory access request, the cache memory selects a victim cache line stored in a particular entry of a particular congruence class for eviction from the cache memory by reference to at least a counter value of the victim cache line. The cache memory also receives a new cache line of data for insertion into the particular entry and an indication of a distance from the cache memory to a data source from which the cache memory received the new cache line. The cache memory installs the new cache line in the particular entry and sets an initial counter value of the counter for the new cache line based on the received indication of the distance.

Managing synonyms in virtual-address caches
09772943 · 2017-09-26 · ·

A virtual-address cache module receives at least a portion of a virtual address and in response indicates a hit or a miss. A first cache structure stores only memory blocks with virtual addresses that are members of a set of multiple synonym virtual addresses that have all been previously received by the virtual-address cache module during the operating period, where each member of a particular set of multiple synonym virtual addresses translates to a common physical address, and a memory block with the common physical address is stored in at most a single storage location within the first cache structure. A second cache structure stores only memory blocks with virtual addresses that do not have any synonym virtual addresses that have been previously received by the virtual-address cache during the operating period.