PREFETCH OF RANDOM DATA USING APPLICATION TAGS

A processor may boot a system. The processor may determine a type of operation to perform on data based on an application tag. The processor may analyze at least one specific table for the application tag. The processor may perform an operation associated with the application tag.
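
A minimal Python sketch of the tag-driven dispatch this abstract outlines; the table contents, tag values, and names (APP_TAG_TABLE, operation_for_tag) are hypothetical illustrations, not structures from the patent.

# Hypothetical mapping from application tags to prefetch behavior;
# real entries would be established when the processor boots the system.
APP_TAG_TABLE = {
    0x01: "prefetch_random",       # tag marking randomly accessed data
    0x02: "prefetch_sequential",   # tag marking sequentially accessed data
}

def operation_for_tag(app_tag):
    # Analyze the table for the application tag; unknown tags get no
    # special prefetch treatment.
    return APP_TAG_TABLE.get(app_tag, "no_prefetch")

assert operation_for_tag(0x01) == "prefetch_random"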

SERVICING CPU DEMAND REQUESTS WITH INFLIGHT PREFETCHES
20230078414 · 2023-03-16

Disclosed embodiments provide a technique in which a memory controller determines whether a fetch address misses in an L1 cache and, when a miss occurs, allocates a way of the L1 cache, determines whether the allocated way matches a scoreboard entry of pending service requests, and, when such a match is found, determines whether the request address of the matching scoreboard entry matches the fetch address. When the matching scoreboard entry also has a request address matching the fetch address, the scoreboard entry is converted to a demand request.
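
A behavioral Python sketch of the two-step scoreboard check described above, assuming the scoreboard is a list of entries; ScoreboardEntry and service_l1_miss are invented names, not the patent's implementation.

from dataclasses import dataclass

@dataclass
class ScoreboardEntry:
    way: int                 # L1 way the pending service request will fill
    request_address: int     # address of the inflight (prefetch) request
    is_demand: bool = False  # False while the entry is still a prefetch

def service_l1_miss(fetch_address, allocated_way, scoreboard):
    for entry in scoreboard:
        if entry.way != allocated_way:
            continue                       # step 1: allocated way must match
        if entry.request_address == fetch_address:
            entry.is_demand = True         # step 2: promote to demand request
            return entry
    return None                            # no inflight request covers the miss

sb = [ScoreboardEntry(way=3, request_address=0x1000)]
assert service_l1_miss(0x1000, allocated_way=3, scoreboard=sb).is_demand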

Prefetch signaling in memory system or subsystem

Methods, systems, and devices for prefetch signaling in a memory system or sub-system are described. A memory device (e.g., a local memory controller of the memory device) of a main memory may transmit a prefetch indicator indicating a size of prefetch data associated with a first set of data requested by an interface controller. The size of the prefetch data may be equal to or different from the size of the first set of data. The main memory may, in some examples, store the size of the prefetch data along with the first set of data. The memory device may transmit the prefetch indicator (e.g., an indicator signal) to the interface controller using a pin compatible with an industry standard or specification and/or a separate pin configured for transmitting command or control information. The memory device may transmit the prefetch indicator while the first set of data is being transmitted.
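
The pin-level signaling cannot be shown in software, but a rough Python model of the idea, returning a prefetch-size indicator alongside the requested data, might look like the following; the dictionary layout and function name are assumptions for illustration.

def read_with_indicator(memory, address, request_size):
    # Return the first set of data together with the prefetch-size
    # indicator; in hardware the indicator would be driven on a pin
    # while this data is being transmitted.
    data = memory["data"][address][:request_size]
    prefetch_size = memory["prefetch_size"].get(address, request_size)
    return data, prefetch_size

mem = {"data": {0x40: b"x" * 64}, "prefetch_size": {0x40: 128}}
_, hint = read_with_indicator(mem, 0x40, 64)
assert hint == 128   # prefetch size may differ from the requested size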

Storage device processing stream data, system including the same, and operation method

A storage device which is connected to a host using a virtual memory includes a solid state drive that receives a streaming access command including a logical block address (LBA) list and a chunk size, and prefetches stream data requested according to the LBA list and the chunk size from a nonvolatile memory device without an additional command. The prefetched stream data is sequentially loaded onto a buffer, and an in-storage computing block accesses a streaming region registered on the virtual memory to sequentially read the stream data loaded onto the buffer in units of the chunk size. The buffer is mapped onto a virtual memory address of the streaming region.
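
A simplified Python model of the streaming access flow, assuming the nonvolatile memory is a mapping from LBA to bytes; stream_prefetch is a hypothetical name, and the buffer here stands in for the virtual-memory-mapped streaming region.

from collections import deque

def stream_prefetch(nvm, lba_list, chunk_size):
    # Prefetch every requested chunk without further host commands,
    # loading the stream data sequentially onto the buffer.
    buffer = deque()
    for lba in lba_list:
        buffer.append(nvm[lba][:chunk_size])
    return buffer

nvm = {10: b"A" * 4096, 11: b"B" * 4096}
buf = stream_prefetch(nvm, lba_list=[10, 11], chunk_size=4096)
chunk = buf.popleft()   # in-storage compute reads in units of the chunk size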

Low-power cached ambient computing
11599471 · 2023-03-07

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing a prefetch process to prepare an ambient computing device to operate in a low-power state without waking a memory device. One of the methods includes performing, by an ambient computing device, a prefetch process that populates a cache with prefetched instructions and data required for the ambient computing device to process inputs to the system while in the low-power state; entering the low-power state; and processing, by the ambient computing device in the low-power state, inputs to the system using the prefetched instructions and data stored in the cache.
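
A minimal Python sketch of the prefetch-then-power-down idea, with the cache modeled as a dictionary; prefetch_for_low_power, process_input, and the sample contents are illustrative names only.

def prefetch_for_low_power(memory, needed_addresses):
    # Populate the cache with everything the ambient device will need,
    # so the memory device can stay asleep afterwards.
    return {addr: memory[addr] for addr in needed_addresses}

def process_input(cache, addr):
    if addr not in cache:
        raise RuntimeError("cache miss would require waking the memory device")
    return cache[addr]

memory = {0x0: "wake-word model", 0x1: "input handler code"}
cache = prefetch_for_low_power(memory, [0x0, 0x1])
assert process_input(cache, 0x0) == "wake-word model"   # served from cache only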

Cache replacement mechanisms for speculative execution
11663130 · 2023-05-30

Described herein are systems and methods for cache replacement mechanisms for speculative execution. For example, some systems include, in an integrated circuit, a buffer comprising entries that are each configured to store a cache line of data and a tag that includes an indication of a status of the cache line stored in the entry. The integrated circuit is configured to: responsive to a cache miss caused by a load instruction that is speculatively executed by a processor pipeline, load a cache line of data corresponding to the cache miss into a first entry of the buffer and update the tag of the first entry to indicate the status is speculative; responsive to the load instruction being retired by the processor pipeline, update the tag to indicate the status is validated; and, responsive to the load instruction being flushed from the processor pipeline, update the tag to indicate the status is cancelled.
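
A behavioral Python sketch of the status-tagged buffer, assuming one entry per in-flight load; the three Status values mirror the states named in the abstract, while SpeculativeBuffer and its method names are invented.

from enum import Enum, auto

class Status(Enum):
    SPECULATIVE = auto()   # filled by a speculatively executed load
    VALIDATED = auto()     # the load retired
    CANCELLED = auto()     # the load was flushed

class SpeculativeBuffer:
    def __init__(self):
        self.entries = {}                       # load id -> [line, status tag]

    def on_cache_miss(self, load_id, line):
        self.entries[load_id] = [line, Status.SPECULATIVE]

    def on_retire(self, load_id):
        self.entries[load_id][1] = Status.VALIDATED

    def on_flush(self, load_id):
        self.entries[load_id][1] = Status.CANCELLED

buf = SpeculativeBuffer()
buf.on_cache_miss(load_id=7, line=b"\x00" * 64)
buf.on_flush(load_id=7)                         # misspeculation detected
assert buf.entries[7][1] is Status.CANCELLED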

Prefetch disable of memory requests targeting data lacking locality

A system and method for efficiently processing memory requests are described. A processing unit includes at least a processor core, a cache, and a non-cache storage buffer capable of storing data prevented from being stored in the cache. While processing a memory request targeting the non-cache storage buffer, the processor core inspects a flag stored in a tag of the memory request. When the flag so specifies, the processor core prevents data prefetching into one or more of the non-cache storage buffer and the cache using the target address of the memory request during processing of this instance of the memory request. While processing a prefetch hint instruction, the processor core determines from the tag whether to prevent prefetching.
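
A minimal Python sketch of the flag check, assuming the flag occupies one bit of the request tag; NO_PREFETCH, Prefetcher, and handle_request are hypothetical names, not the patent's interfaces.

class Prefetcher:
    def __init__(self):
        self.trained = []

    def train(self, address):
        self.trained.append(address)

NO_PREFETCH = 1 << 0   # hypothetical flag bit carried in the request tag

def handle_request(tag, target_address, prefetcher):
    if tag & NO_PREFETCH:
        return                         # data lacks locality: do not train
    prefetcher.train(target_address)   # normal requests feed the prefetcher

pf = Prefetcher()
handle_request(tag=NO_PREFETCH, target_address=0x2000, prefetcher=pf)
assert pf.trained == []                # prefetching suppressed for this request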

Translation bandwidth optimized prefetching strategy through multiple translation lookaside buffers

A computer system includes a processor and a prefetch engine. The processor is configured to generate a demand access stream. The prefetch engine is configured to generate a first prefetch request and a second prefetch request based on the demand access stream, to output the first prefetch request to a first translation lookaside buffer (TLB), and to output the second prefetch request to a second TLB that is different from the first TLB. The processor performs a first TLB lookup in the first TLB based on one of the demand access stream or the first prefetch request, and performs a second TLB lookup in the second TLB based on the second prefetch request.
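
A rough Python model of splitting translation traffic across two TLBs; the next-line prefetch engine and the exact routing policy shown here are assumptions for illustration, not the patent's strategy.

class TLB:
    def __init__(self):
        self.lookups = []

    def lookup(self, vaddr):
        self.lookups.append(vaddr)   # record the translation request

def prefetch_engine(addr, line=64):
    # Hypothetical engine generating two prefetch requests per demand access.
    return addr + line, addr + 2 * line

def translate(demand_stream, tlb1, tlb2):
    for addr in demand_stream:
        first_pf, second_pf = prefetch_engine(addr)
        tlb1.lookup(addr)            # demand and first prefetch use the first TLB
        tlb1.lookup(first_pf)
        tlb2.lookup(second_pf)       # second prefetch goes to the second TLB

tlb1, tlb2 = TLB(), TLB()
translate([0x1000, 0x2000], tlb1, tlb2)
assert tlb2.lookups == [0x1080, 0x2080]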

CACHE DATA DETERMINING METHOD AND APPARATUS
20170371807 · 2017-12-28

The present embodiments provide a cache data determining method and apparatus, and pertain to the field of computer technologies. The method includes: acquiring a data identifier of read cache miss data; selecting, based on the acquired data identifier, a data identifier of to-be-determined data; recording data identifiers by groups; collecting statistics on the quantities of occurrence times, in each group, of the data identifiers; and selecting target to-be-determined data according to the quantities of occurrence times, and determining the target to-be-determined data as cache miss data to be written into a cache memory.
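
A compact Python sketch of the group-based selection, assuming a fixed group size and an occurrence threshold; both parameters and the function name are illustrative, not values from the patent.

from collections import Counter

def select_cache_candidates(miss_ids, group_size, threshold):
    # Record miss identifiers by groups, count occurrences within each
    # group, and admit identifiers that occur often enough.
    admitted = set()
    for start in range(0, len(miss_ids), group_size):
        counts = Counter(miss_ids[start:start + group_size])
        admitted.update(i for i, n in counts.items() if n >= threshold)
    return admitted

misses = ["a", "b", "a", "a", "c", "b", "b", "b"]
assert select_cache_candidates(misses, group_size=4, threshold=3) == {"a", "b"}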

REGION AWARE DELTA PREFETCHER

An apparatus includes memory circuitry including a first data structure and prefetch circuitry that is coupled to the memory circuitry. The prefetch circuitry is to store, in the first data structure, a first subregion entry corresponding to a first subregion of a memory region allocated to a program. The first subregion entry is to include a plurality of delta values. A first delta value of the plurality of delta values represents a first distance between two cache lines associated with consecutive memory accesses within a second subregion of the memory region. The prefetch circuitry is further to detect a first memory access of a first cache line in the first subregion, identify prefetch candidates based on the first cache line and the plurality of delta values, and issue at least one prefetch request based on at least two of the prefetch candidates to be prefetched into a cache.
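
A minimal Python sketch of generating prefetch candidates from a subregion entry's stored deltas; the delta values, the threshold of two candidates, and the function names are assumptions based only on the abstract.

def prefetch_candidates(accessed_line, deltas):
    # Each stored delta is a distance, in cache lines, between two
    # consecutive accesses previously observed in another subregion.
    return [accessed_line + d for d in deltas]

def issue_prefetch(accessed_line, deltas, min_candidates=2):
    candidates = prefetch_candidates(accessed_line, deltas)
    if len(candidates) >= min_candidates:
        return candidates          # one request covering two or more candidates
    return []

# A subregion entry holding deltas +1 and +3; an access to line 100
# yields prefetch candidates 101 and 103.
assert issue_prefetch(accessed_line=100, deltas=[1, 3]) == [101, 103]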