Patent classifications
G06F12/0811
ZERO LATENCY PREFETCHING IN CACHES
This invention involves a cache system in a digital data processing apparatus including: a central processing unit core; a level one instruction cache; and a level two cache. The cache lines in the second level cache are twice the size of the cache lines in the first level instruction cache. The central processing unit core requests additional program instructions when needed via a request address. Upon a miss in the level one instruction cache that causes a hit in the upper half of a level two cache line, the level two cache supplies the upper half level cache line to the level one instruction cache. On a following level two cache memory cycle, the level two cache supplies the lower half of the cache line to the level one instruction cache. This cache technique thus prefetches the lower half level two cache line employing fewer resources than an ordinary prefetch.
ZERO LATENCY PREFETCHING IN CACHES
This invention involves a cache system in a digital data processing apparatus including: a central processing unit core; a level one instruction cache; and a level two cache. The cache lines in the second level cache are twice the size of the cache lines in the first level instruction cache. The central processing unit core requests additional program instructions when needed via a request address. Upon a miss in the level one instruction cache that causes a hit in the upper half of a level two cache line, the level two cache supplies the upper half level cache line to the level one instruction cache. On a following level two cache memory cycle, the level two cache supplies the lower half of the cache line to the level one instruction cache. This cache technique thus prefetches the lower half level two cache line employing fewer resources than an ordinary prefetch.
BULK MEMORY INITIALIZATION
The disclosure relates to technology for bulk initialization of memory in a computer system. The computer system comprises a processor core comprising a load store unit and a last level cache in communication with the processor core. The last level cache is configured to receive bulk store operations from the load store unit. Each bulk store operation includes a physical address in the memory to be initialized. The last level cache is configured to send multiple write transactions to the memory for each bulk store operation to perform a bulk initialization of the memory for each bulk store operation. The last level cache is configured to track status of the bulk store operations.
BULK MEMORY INITIALIZATION
The disclosure relates to technology for bulk initialization of memory in a computer system. The computer system comprises a processor core comprising a load store unit and a last level cache in communication with the processor core. The last level cache is configured to receive bulk store operations from the load store unit. Each bulk store operation includes a physical address in the memory to be initialized. The last level cache is configured to send multiple write transactions to the memory for each bulk store operation to perform a bulk initialization of the memory for each bulk store operation. The last level cache is configured to track status of the bulk store operations.
AGGRESSIVE WRITE FLUSH SCHEME FOR A VICTIM CACHE
A caching system including a first sub-cache and a second sub-cache in parallel with the first sub-cache, wherein the second sub-cache includes: line type bits configured to store an indication that a corresponding cache line of the second sub-cache is configured to store write-miss data, and an eviction controller configured to evict a cache line of the second sub-cache storing write-miss data based on an indication that the cache line has been fully written.
AGGRESSIVE WRITE FLUSH SCHEME FOR A VICTIM CACHE
A caching system including a first sub-cache and a second sub-cache in parallel with the first sub-cache, wherein the second sub-cache includes: line type bits configured to store an indication that a corresponding cache line of the second sub-cache is configured to store write-miss data, and an eviction controller configured to evict a cache line of the second sub-cache storing write-miss data based on an indication that the cache line has been fully written.
VIRTUALIZED CACHES
Systems and methods are disclosed for virtualized caches. For example, an integrated circuit (e.g., a processor) for executing instructions includes a virtually indexed physically tagged first-level (L1) cache configured to output to an outer memory system one or more bits of a virtual index of a cache access as one or more bits of a requestor identifier. For example, the L1 cache may be configured to operate as multiple logical L1 caches with a cache way of a size less than or equal to a virtual memory page size. For example, the integrated circuit may include an L2 cache of the outer memory system that is configured to receive the requestor identifier and implement a cache coherency protocol to disambiguate an L1 synonym occurring in multiple portions of the virtually indexed physically tagged L1 cache associated with different requestor identifier values.
VIRTUALIZED CACHES
Systems and methods are disclosed for virtualized caches. For example, an integrated circuit (e.g., a processor) for executing instructions includes a virtually indexed physically tagged first-level (L1) cache configured to output to an outer memory system one or more bits of a virtual index of a cache access as one or more bits of a requestor identifier. For example, the L1 cache may be configured to operate as multiple logical L1 caches with a cache way of a size less than or equal to a virtual memory page size. For example, the integrated circuit may include an L2 cache of the outer memory system that is configured to receive the requestor identifier and implement a cache coherency protocol to disambiguate an L1 synonym occurring in multiple portions of the virtually indexed physically tagged L1 cache associated with different requestor identifier values.
ZERO BITS IN L3 TAGS
In one embodiment, a microprocessor, comprising: plural cores, each of the cores comprising a level 1 (L1) cache and a level 2 (L2) cache; and a shared level 3 (L3) cache comprising plural L3 tag array entries, wherein a first portion of the plural L3 tag array entries is associated with data and a second portion of the plural L3 tag array entries is decoupled from data, wherein each L3 tag array entry comprises tag information and data zero information, the data zero information indicating whether any data associated with the tag information is known to be zero or not.
ZERO BITS IN L3 TAGS
In one embodiment, a microprocessor, comprising: plural cores, each of the cores comprising a level 1 (L1) cache and a level 2 (L2) cache; and a shared level 3 (L3) cache comprising plural L3 tag array entries, wherein a first portion of the plural L3 tag array entries is associated with data and a second portion of the plural L3 tag array entries is decoupled from data, wherein each L3 tag array entry comprises tag information and data zero information, the data zero information indicating whether any data associated with the tag information is known to be zero or not.