Patent classifications
G06F12/128
Multi-Level System Memory With Near Memory Scrubbing Based On Predicted Far Memory Idle Time
An apparatus is described that includes a memory controller to interface to a multi-level system memory. The memory controller includes least recently used (LRU) circuitry to keep track of least recently used cache lines kept in a higher level of the multi-level system memory. The memory controller also includes idle time predictor circuitry to predict idle times of a lower level of the multi-level system memory. The memory controller is to write one or more lesser used cache lines from the higher level of the multi-level system memory to the lower level of the multi-level system memory in response to the idle time predictor circuitry indicating that an observed idle time of the lower level of the multi-level system memory is expected to be long enough to accommodate the write of the one or more lesser used cache lines from the higher level of the multi-level system memory to the lower level of the multi-level system memory.
Caching of metadata for deduplicated LUNs
Efficient processing of user data read requests in a deduplicated data storage system places the metadata for most frequently requested data in data structures and locations in the system hierarchy where the metadata will be most rapidly available. The total amount of such metadata makes storing all of the metadata in high speed memory expensive, and the system and method described uses both the temporal and the spatial characteristics of the user system activity in any epoch to adjust the contents of metadata cache so as to respond to the dynamics of a multi user or multi-application environment where the storage system is not made aware of the time changing mix of operations except by observation of the individual requests. A history record is used to promote metadata from the slow memory to the fast memory, and a process selection may be adjusted based on the address-space activity.
Data cache with hybrid writeback and writethrough
Described is a data cache implementing hybrid writebacks and writethroughs. A processing system includes a memory, a memory controller, and a processor. The processor includes a data cache including cache lines, a write buffer, and a store queue. The store queue writes data to a hit cache line and an allocated entry in the write buffer when the hit cache line is initially in at least a shared coherence state, resulting in the hit cache line being in a shared coherence state with data and the allocated entry being in a modified coherence state with data. The write buffer requests and the memory controller upgrades the hit cache line to a modified coherence state with data based on tracked coherence states. The write buffer retires the data upon upgrade. The data cache writebacks the data to memory for a defined event.
Data cache with hybrid writeback and writethrough
Described is a data cache implementing hybrid writebacks and writethroughs. A processing system includes a memory, a memory controller, and a processor. The processor includes a data cache including cache lines, a write buffer, and a store queue. The store queue writes data to a hit cache line and an allocated entry in the write buffer when the hit cache line is initially in at least a shared coherence state, resulting in the hit cache line being in a shared coherence state with data and the allocated entry being in a modified coherence state with data. The write buffer requests and the memory controller upgrades the hit cache line to a modified coherence state with data based on tracked coherence states. The write buffer retires the data upon upgrade. The data cache writebacks the data to memory for a defined event.
READ CACHE MANAGEMENT
Implementations disclosed herein provide for a storage system including an on-disk read cache and a variety of read cache management techniques. According to one implementation, a storage device controller time-sequentially reads a series of non-contiguous data blocks storing a data sequence in a read cache of a magnetic disk, the data sequence identified by a requested sequence of logical block addresses (LBAs). The controller determines that read requests for the data sequence satisfy at least one predetermined access frequency criterion and, responsive to the determination, the controller re-writes data of the data sequence to a series of contiguous data blocks in the read cache.
METHOD FOR DEBUGGING STATIC MEMORY CORRUPTION
An indication is received. The indication is of an address in a first page in virtual memory used by an application with a static memory corruption. A loadable kernel module will monitor the address. Access to the first page in virtual memory is changed from read/write access to read only access. A second page in virtual memory is created with read/write access. Whether a page fault occurs on the first page in virtual memory during the execution of the application with the static memory corruption is determined.
GRAPHICS PROCESSORS AND GRAPHICS PROCESSING UNITS HAVING DOT PRODUCT ACCUMULATE INSTRUCTION FOR HYBRID FLOATING POINT FORMAT
Described herein is a graphics processing unit (GPU) configured to receive an instruction having multiple operands, where the instruction is a single instruction multiple data (SIMD) instruction configured to use a bfloat16 (BF16) number format and the BF16 number format is a sixteen-bit floating point format having an eight-bit exponent. The GPU can process the instruction using the multiple operands, where to process the instruction includes to perform a multiply operation, perform an addition to a result of the multiply operation, and apply a rectified linear unit function to a result of the addition.
GRAPHICS PROCESSORS AND GRAPHICS PROCESSING UNITS HAVING DOT PRODUCT ACCUMULATE INSTRUCTION FOR HYBRID FLOATING POINT FORMAT
Described herein is a graphics processing unit (GPU) configured to receive an instruction having multiple operands, where the instruction is a single instruction multiple data (SIMD) instruction configured to use a bfloat16 (BF16) number format and the BF16 number format is a sixteen-bit floating point format having an eight-bit exponent. The GPU can process the instruction using the multiple operands, where to process the instruction includes to perform a multiply operation, perform an addition to a result of the multiply operation, and apply a rectified linear unit function to a result of the addition.
Create page locality in cache controller cache allocation
Integrated circuits are provided which create page locality in cache controllers that allocate entries to set-associative cache, which includes data storage for a plurality of Sets of Ways. A plurality of cache controllers may be interleaved with a processor and device(s), and allocate to any pages in the cache. A cache controller may select a Way from a Set to which to allocate new entries in the set-associative cache and bias selection of the Way according to a plurality of upper address bits (or other function). These bits may be identical at the cache controller during sequential memory transactions. A processor may determine the bias centrally, and inform the cache controllers of the selected Set and Way. Other functions, algorithms or approaches may be chosen to influence bias of Way selection, such as based on analysis of metadata belonging to cache controllers used for making Way allocation selections.
Create page locality in cache controller cache allocation
Integrated circuits are provided which create page locality in cache controllers that allocate entries to set-associative cache, which includes data storage for a plurality of Sets of Ways. A plurality of cache controllers may be interleaved with a processor and device(s), and allocate to any pages in the cache. A cache controller may select a Way from a Set to which to allocate new entries in the set-associative cache and bias selection of the Way according to a plurality of upper address bits (or other function). These bits may be identical at the cache controller during sequential memory transactions. A processor may determine the bias centrally, and inform the cache controllers of the selected Set and Way. Other functions, algorithms or approaches may be chosen to influence bias of Way selection, such as based on analysis of metadata belonging to cache controllers used for making Way allocation selections.