G06F2212/602

LBNs prefetching per CPU

The present disclosure generally relates to prefetching data from one or more CPUs prior to the data being requested by a host device. The data is prefetched from memory and stored in cache. If a host device requests data that is not already in cache, a determination is made regarding whether the data is scheduled to be written into cache. If the data is not in cache and is not scheduled to be written into cache, the data is retrieved from memory and delivered to the host device. If the data is scheduled to be written into cache, or is currently being written into cache, the request is delayed, or scheduled so that the data is retrieved once it is in cache. If the data is already in cache, the data is delivered to the host device from cache.
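
The three request outcomes described in this abstract (already cached, scheduled for caching, neither) can be illustrated with a minimal Python sketch. The class and method names (PrefetchCache, schedule_prefetch, complete_prefetch) and the use of dictionaries for memory and cache are assumptions made for illustration; they do not come from the disclosure.

```python
from enum import Enum


class Source(Enum):
    CACHE = "cache"        # data was already prefetched
    MEMORY = "memory"      # data read directly from memory
    DEFERRED = "deferred"  # read waits until the prefetch lands in cache


class PrefetchCache:
    """Hypothetical model of the lookup flow; not the patented design."""

    def __init__(self, memory):
        self.memory = memory    # backing store: LBN -> data
        self.cache = {}         # LBNs already prefetched
        self.scheduled = set()  # LBNs queued for, or being written into, cache
        self.waiting = {}       # LBN -> number of deferred host reads

    def schedule_prefetch(self, lbn):
        if lbn not in self.cache:
            self.scheduled.add(lbn)

    def complete_prefetch(self, lbn):
        """Finish a prefetch and return the data owed to deferred reads."""
        self.scheduled.discard(lbn)
        self.cache[lbn] = self.memory[lbn]
        return [self.cache[lbn]] * self.waiting.pop(lbn, 0)

    def read(self, lbn):
        if lbn in self.cache:
            return Source.CACHE, self.cache[lbn]
        if lbn in self.scheduled:
            # Do not go to memory; wait for the pending cache write.
            self.waiting[lbn] = self.waiting.get(lbn, 0) + 1
            return Source.DEFERRED, None
        return Source.MEMORY, self.memory[lbn]
```

In this sketch a deferred read is only counted; a real controller would presumably queue the request itself and complete it when the cache write finishes.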

INFORMATION PROCESSING DEVICE AND INFORMATION PROCESSING METHOD
20220318146 · 2022-10-06

A device including: a processor executing a program; a first cache memory; a second cache memory belonging to a memory hierarchy lower than that of the first cache memory; a determination unit that determines, based on first information indicating a virtual address of information accessed in the second cache memory when the program is executed, second information indicating a virtual address of target information to be prefetched; and a prefetch unit that prefetches the target information and stores the prefetched target information in the second cache memory, wherein the second cache memory includes a conversion unit that converts, by using correspondence information indicating a correspondence relationship between the physical address of the target information and the virtual address of the target information, the second information into third information indicating a physical address of the target information, and the prefetch unit prefetches the target information using the third information.

Method of cache prefetching that increases the hit rate of a next faster cache

The size of a cache is modestly increased so that a short pointer to a predicted next memory address in the same cache is added to each cache line in the cache. In response to a cache hit, the predicted next memory address identified by the short pointer in the cache line of the hit along with an associated entry are pushed to a next faster cache when a valid short pointer to the predicted next memory address is present in the cache line of the hit.
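
A rough Python sketch of the pointer-driven push described above follows. It assumes the short pointer can be modeled as the address of another line in the same cache and that the faster cache is a plain dictionary; PointerCache, fill, and access are hypothetical names, and the way the pointer is learned is left out.

```python
class PointerCache:
    """Hypothetical slower cache whose lines carry a next-line pointer."""

    def __init__(self):
        # address -> [data, pointer to predicted next address, or None]
        self.lines = {}

    def fill(self, addr, data, next_ptr=None):
        self.lines[addr] = [data, next_ptr]

    def access(self, addr, faster_cache):
        line = self.lines.get(addr)
        if line is None:
            return None  # miss: nothing is pushed
        data, next_ptr = line
        # On a hit, push the predicted next entry to the faster cache,
        # but only when the pointer is valid and resolves in this cache.
        if next_ptr is not None and next_ptr in self.lines:
            faster_cache[next_ptr] = self.lines[next_ptr][0]
        return data
```

For example, after fill(0x100, "A", next_ptr=0x240) and fill(0x240, "B"), a hit on 0x100 places the entry for 0x240 in the faster cache before it is requested.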

Memory controller and method of operating the same

Provided herein is a memory controller for controlling a memory device. The memory controller includes a workload detector configured to determine a change in workload based on reception of a changed request from a host or a change in a clock received from an external device; a device performance controller configured to determine, if the workload is determined to have changed, read performance based on a ratio of the size of data output to the host to the size of data requested by the host in every preset period, and configured to output a read-look-ahead (RLA) command to the memory device based on the determined read performance; a buffer memory configured to store data read from the memory device in response to the RLA command; and a memory size controller configured to control a size of the buffer memory. The RLA command instructs the memory device to output data that is frequently requested by the host.
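
The read-performance ratio named in this abstract can be written out as a short Python sketch. Only the ratio itself (data output to the host divided by data requested by the host in one period) comes from the abstract; the threshold value, the rule that a low ratio triggers an RLA command, and the handling of an idle period are assumptions for illustration.

```python
def read_performance(bytes_output, bytes_requested):
    """Ratio of data delivered to data requested during one preset period."""
    if bytes_requested == 0:
        return 1.0  # assumption: an idle period counts as fully served
    return bytes_output / bytes_requested


def should_issue_rla(bytes_output, bytes_requested, threshold=0.8):
    """Assumed policy: issue read-look-ahead when the ratio falls below a threshold."""
    return read_performance(bytes_output, bytes_requested) < threshold
```

With these assumptions, a period in which 512 bytes were delivered against 1024 requested gives a ratio of 0.5 and would trigger read-look-ahead.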

Translation bandwidth optimized prefetching strategy through multiple translation lookaside buffers

A computer system includes a processor and a prefetch engine. The processor is configured to generate a demand access stream. The prefetch engine is configured to generate a first prefetch request and a second prefetch request based on the demand access stream, to output the first prefetch request to a first translation lookaside buffer (TLB), and to output the second prefetch request to a second TLB that is different from the first TLB. The processor performs a first TLB lookup in the first TLB based on one of the demand access stream or the first prefetch request, and performs a second TLB lookup in the second TLB based on the second prefetch request.

NEXT LINE PREFETCHERS EMPLOYING INITIAL HIGH PREFETCH PREDICTION CONFIDENCE STATES FOR THROTTLING NEXT LINE PREFETCHES IN A PROCESSOR-BASED SYSTEM
20170371790 · 2017-12-28

Next line prefetchers employing initial high prefetch prediction confidence states for throttling next line prefetches in a processor-based system are disclosed. A next line prefetcher prefetches a next memory line into cache memory in response to a read operation. To mitigate prefetch mispredictions, the next line prefetcher is throttled to cease prefetching after the prefetch prediction confidence state becomes a no next line prefetch state, indicating a number of incorrect predictions. Instead of the initial prefetch prediction confidence state being set to the no next line prefetch state, which would require confidence to be built up in response to correct predictions before performing a next line prefetch, the initial prefetch prediction confidence state is set to a next line prefetch state to allow next line prefetching. Thus, the next line prefetcher starts prefetching next lines without first requiring correct predictions to be “built up” in the prefetch prediction confidence state. CPU performance may be increased because prefetching begins sooner rather than waiting for correct predictions to occur.
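
The throttling behavior can be approximated with a saturating misprediction counter that starts in the state that permits prefetching. The following Python sketch is an assumption-laden illustration: the counter encoding, the no_prefetch_level parameter, and the choice to keep scoring predictions while throttled are not taken from the abstract.

```python
class NextLinePrefetcher:
    """Hypothetical confidence-throttled next line prefetcher."""

    def __init__(self, no_prefetch_level=2):
        self.no_prefetch_level = no_prefetch_level
        # Initial state allows prefetching: no warm-up of correct
        # predictions is needed before the first prefetch.
        self.mispredictions = 0
        self.predicted = None  # line predicted after the previous read

    def on_read(self, line_addr):
        """Return the line address to prefetch, or None when throttled."""
        # Score the previous prediction, even if it was not prefetched,
        # so that a throttled prefetcher can recover.
        if self.predicted is not None:
            if line_addr == self.predicted:
                self.mispredictions = max(0, self.mispredictions - 1)
            else:
                self.mispredictions = min(
                    self.no_prefetch_level, self.mispredictions + 1
                )
        self.predicted = line_addr + 1
        if self.mispredictions < self.no_prefetch_level:
            return self.predicted
        return None
```

The first read already yields a prefetch; only after no_prefetch_level consecutive mispredictions does on_read return None.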

Reducing Memory Access Latencies During Ray Traversal
20170372448 · 2017-12-28

While prefetching data for a second fiber, a hierarchical data structure is traversed using a first fiber after deferring traversal for the second fiber. Context is then switched to the second fiber, and the hierarchical data structure is traversed using the second fiber while prefetching data for another fiber.

MEMORY CONTROLLER THAT FORCES PREFETCHES IN RESPONSE TO A PRESENT ROW ADDRESS CHANGE TIMING CONSTRAINT
20170371791 · 2017-12-28

An apparatus having a memory controller is described. The memory controller includes prefetch circuitry to prefetch, from a memory, data having a same row address in response to the memory controller's servicing of its request stream being stalled because of a timing constraint that prevents a change in row address. The memory controller also includes a cache to cache the prefetched data. The memory controller also includes circuitry to compare addresses of read requests in the memory controller's request stream against respective addresses of the prefetched data in the cache and to service those of the requests in the memory controller's request stream having a matching address with corresponding ones of the prefetched data in the cache.

SETTING CACHE ENTRY AGE BASED ON HINTS FROM ANOTHER CACHE LEVEL

A processor replaces data at a first cache based on hints from a second cache, wherein the hints indicate information about the data that is not available to the first cache directly. When data at an entry is transferred from the first cache to the second cache, the first cache can provide an age hint to the second cache to indicate that the data should be assigned a higher or lower initial age relative to a nominal initial age. The second cache assigns the entry for the data an initial age based on the age hint and, when replacing data, selects data for replacement based on the age of each entry.
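
A small Python sketch can show how an age hint might influence replacement in the second cache. The nominal age of 2, the signed integer hint, and the policy of evicting the entry with the highest age are assumptions; the abstract states only that the initial age is based on the hint and that replacement is based on age. Aging of entries over time is omitted.

```python
class AgedCache:
    """Hypothetical second-level cache that honors age hints on insert."""

    NOMINAL_AGE = 2  # assumed default initial age

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}  # tag -> [data, age]

    def insert(self, tag, data, age_hint=0):
        """Insert data transferred from the first cache; return the evicted tag."""
        victim = None
        if tag not in self.entries and len(self.entries) >= self.capacity:
            # Assumed policy: replace the entry with the highest age.
            victim = max(self.entries, key=lambda t: self.entries[t][1])
            del self.entries[victim]
        # A positive hint makes the entry look older (evicted sooner),
        # a negative hint makes it look younger (kept longer).
        self.entries[tag] = [data, max(0, self.NOMINAL_AGE + age_hint)]
        return victim
```

In a two-entry cache, inserting "a" with hint +1 and "b" with hint -1 and then inserting "c" evicts "a", because the hint gave it the higher initial age.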

Generation and use of memory access instruction order encodings

Apparatus and methods are disclosed for controlling execution of memory access instructions in a block-based processor architecture using a hardware structure that indicates a relative ordering of memory access instructions in an instruction block. In one example of the disclosed technology, a method of executing an instruction block having a plurality of memory load and/or memory store instructions includes selecting a next memory load or memory store instruction to execute based on dependencies encoded within the block and on a store vector that stores data indicating which memory load and memory store instructions in the instruction block have executed. The store vector can be masked using a store mask. The store mask can be generated when decoding the instruction block or copied from an instruction block header. Based on the encoded dependencies and the masked store vector, the next instruction can issue when its dependencies are available.
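
One possible reading of the store mask and store vector is sketched below in Python using integer bitmasks. It assumes each memory access instruction has an ordering identifier (called lsid here), that bit i of the store mask marks identifier i as a store, and that bit i of the store vector marks identifier i as executed. These encodings and the function name can_issue are assumptions, and the register dependencies encoded in the block are not modeled.

```python
def can_issue(lsid, store_mask, store_vector):
    """Return True when no earlier store in the block is still pending.

    lsid         -- ordering identifier of the candidate instruction (assumed)
    store_mask   -- bit i set if identifier i is a store (assumed encoding)
    store_vector -- bit i set if identifier i has executed (assumed encoding)
    """
    earlier = (1 << lsid) - 1  # bits for every identifier ordered before lsid
    pending_stores = store_mask & ~store_vector & earlier
    return pending_stores == 0
```

With store_mask = 0b0101 and an empty store vector, identifier 1 cannot issue until the store with identifier 0 has executed and set its bit in the store vector.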