Patent classifications
G06F12/0817
AGGRESSIVE WRITE FLUSH SCHEME FOR A VICTIM CACHE
A caching system including a first sub-cache and a second sub-cache in parallel with the first sub-cache, wherein the second sub-cache includes: line type bits configured to store an indication that a corresponding cache line of the second sub-cache is configured to store write-miss data, and an eviction controller configured to evict a cache line of the second sub-cache storing write-miss data based on an indication that the cache line has been fully written.
ZERO BITS IN L3 TAGS
In one embodiment, a microprocessor, comprising: plural cores, each of the cores comprising a level 1 (L1) cache and a level 2 (L2) cache; and a shared level 3 (L3) cache comprising plural L3 tag array entries, wherein a first portion of the plural L3 tag array entries is associated with data and a second portion of the plural L3 tag array entries is decoupled from data, wherein each L3 tag array entry comprises tag information and data zero information, the data zero information indicating whether any data associated with the tag information is known to be zero or not.
ZERO BITS IN L3 TAGS
In one embodiment, a microprocessor, comprising: plural cores, each of the cores comprising a level 1 (L1) cache and a level 2 (L2) cache; and a shared level 3 (L3) cache comprising plural L3 tag array entries, wherein a first portion of the plural L3 tag array entries is associated with data and a second portion of the plural L3 tag array entries is decoupled from data, wherein each L3 tag array entry comprises tag information and data zero information, the data zero information indicating whether any data associated with the tag information is known to be zero or not.
METHOD, DEVICE AND COMPUTER PROGRAM PRODUCT FOR STORAGE SYSTEM MANAGEMENT
A technique for managing a storage system involves determining, in response to a first write operation on a first data block on a persistent storage device, whether a first group of data corresponding to the first data block is included in a cache; updating the first group of data in the cache if it is determined that the first group of data is included in the cache; and adding the first group of data to an associated data set of the cache to serve as a first record. Accordingly, such a technique can associatively manage different types of cached data corresponding to a data block, thereby optimizing the system performance.
APPROACH FOR SUPPORTING MEMORY-CENTRIC OPERATIONS ON CACHED DATA
A technical solution to the technical problem of how to support memory-centric operations on cached data uses a novel memory-centric memory operation that invokes write back functionality on cache controllers and memory controllers. The write back functionality enforces selective flushing of dirty, i.e., modified, cached data that is needed for memory-centric memory operations from caches to the completion level of the memory-centric memory operations, and updates the coherence state appropriately at each cache level. The technical solution ensures that commands to implement the selective cache flushing are ordered before the memory-centric memory operation at the completion level of the memory-centric memory operation.
Disaggregated system domain
An approach is disclosed that configures a computer system node from components that are each connected to an intra-node network. The configuring is performed by selecting a set of components, including at least one processor, and assigning each of the components a different address range within the node. An operating system is run on the processor included in the node with the operating system accessing each of the assigned components.
PREFETCH UNIT FILTER FOR MICROPROCESSOR
A method, programming product, processor, and/or system for prefetching data is disclosed that includes: receiving a request for data at a cache; identifying whether the request for data received at the cache is a demand request or a prefetch request; and determining, in response to identifying that the request for data received at the cache is a prefetch request, whether to terminate the prefetch request, wherein determining whether to terminate the prefetch request comprises: determining how many hits have occurred for a prefetch stream corresponding to the prefetch request received at the cache; and determining, based upon the number of hits that have occurred for the prefetch stream corresponding to the prefetch request received by the cache, whether to terminate the prefetch request.
PREFETCH UNIT FILTER FOR MICROPROCESSOR
A method, programming product, processor, and/or system for prefetching data is disclosed that includes: receiving a request for data at a cache; identifying whether the request for data received at the cache is a demand request or a prefetch request; and determining, in response to identifying that the request for data received at the cache is a prefetch request, whether to terminate the prefetch request, wherein determining whether to terminate the prefetch request comprises: determining how many hits have occurred for a prefetch stream corresponding to the prefetch request received at the cache; and determining, based upon the number of hits that have occurred for the prefetch stream corresponding to the prefetch request received by the cache, whether to terminate the prefetch request.
Data processing system having masters that adapt to agents with differing retry behaviors
A data processing system includes a plurality of snoopers, a processing unit including master, and a system fabric communicatively coupling the master and the plurality of snoopers. The master sets a retry operating mode for an interconnect operation in one of alternative first and second operating modes. The first operating mode is associated with a first type of snooper, and the second operating mode is associated with a different second type of snooper. The master issues a memory access request of the interconnect operation on the system fabric of the data processing system. Based on receipt of a combined response representing a systemwide coherence response to the request, the master delays an interval having a duration dependent on the retry operating mode and thereafter reissues the memory access request on the system fabric.
Lock-free work-stealing thread scheduler
Systems and methods are provided for lock-free thread scheduling. Threads may be placed in a ring buffer shared by all computer processing units (CPUs), e.g., in a node. A thread assigned to a CPU may be placed in the CPU's local run queue. However, when a CPU's local run queue is cleared, that CPU checks the shared ring buffer to determine if any threads are waiting to run on that CPU, and if so, the CPU pulls a batch of threads related to that ready-to-run thread to execute. If not, an idle CPU randomly selects another CPU to steal threads from, and the idle CPU attempts to dequeue a thread batch associated with the CPU from the shared ring buffer. Polling may be handled through the use of a shared poller array to dynamically distribute polling across multiple CPUs.