G06F2212/22

Considering a frequency of access to groups of tracks and density of the groups to select groups of tracks to destage

Provided are a computer program product, system, and method for considering a frequency of access to groups of tracks and density of the groups to select groups of tracks to destage. A density for one of a plurality of groups of tracks is incremented in response to determining at least one of the following: that the group is not ready to destage, and that one of the tracks in the group in the cache has transitioned to being ready to destage. A group frequency is determined, indicating how frequently tracks in the group are modified. At least one of the density and the group frequency is used for each of the groups to determine whether to destage the group. The tracks in the group in the cache are destaged to the storage in response to determining to destage the group.

METHOD AND APPARATUS FOR ACCESSING DATA STORED IN A STORAGE SYSTEM THAT INCLUDES BOTH A FINAL LEVEL OF CACHE AND A MAIN MEMORY
20180293167 · 2018-10-11

A data access system including a processor and a storage system including a main memory and a cache module. The cache module includes a FLC controller and a cache. The cache is configured as a FLC to be accessed prior to accessing the main memory. The processor is coupled to levels of cache separate from the FLC. The processor generates, in response to data required by the processor not being in the levels of cache, a physical address corresponding to a physical location in the storage system. The FLC controller generates a virtual address based on the physical address. The virtual address corresponds to a physical location within the FLC or the main memory. The cache module causes, in response to the virtual address not corresponding to the physical location within the FLC, the data required by the processor to be retrieved from the main memory.
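
The lookup path described in this abstract can be sketched as follows. This is a minimal illustration, not the claimed design: the class name `FLCController`, the round-robin replacement, the 64-byte line size, and the use of an FLC slot number as the "virtual address" are all assumptions made for the example.

```python
class FLCController:
    """Hypothetical final-level cache (FLC) controller sketch."""

    def __init__(self, main_memory, flc_lines=4, line_size=64):
        self.main_memory = main_memory  # dict: physical line number -> data
        self.flc_lines = flc_lines      # number of slots in the FLC (assumed)
        self.line_size = line_size      # bytes per line (assumed)
        self.mapping = {}               # physical line number -> FLC slot
        self.flc = {}                   # FLC slot -> data
        self.next_slot = 0              # round-robin replacement pointer (assumed)

    def translate(self, physical_address):
        """Map a physical address to an FLC slot, or None if not cached."""
        return self.mapping.get(physical_address // self.line_size)

    def read(self, physical_address):
        """Return (data, source); source is 'flc' or 'main_memory'."""
        line = physical_address // self.line_size
        slot = self.translate(physical_address)
        if slot is not None:
            return self.flc[slot], "flc"

        # Miss: retrieve the data from main memory and install it in the FLC.
        data = self.main_memory[line]
        slot = self.next_slot
        self.next_slot = (self.next_slot + 1) % self.flc_lines
        # Remove the mapping of whichever line previously occupied this slot.
        for old_line, old_slot in list(self.mapping.items()):
            if old_slot == slot:
                del self.mapping[old_line]
        self.mapping[line] = slot
        self.flc[slot] = data
        return data, "main_memory"


memory = {0: "a", 1: "b", 2: "c"}
controller = FLCController(memory, flc_lines=2, line_size=64)
first = controller.read(0)    # miss, filled from main memory
second = controller.read(10)  # same line, now served from the FLC
```

The processor-side cache levels are omitted; the sketch starts at the point where the processor has already missed in them and issued a physical address.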

Universal cache management system

Techniques for universal cache management are described. In an example embodiment, a plurality of caches are allocated, in volatile memory of a computing device, to a plurality of data-processing instances, where each one of the plurality of caches is exclusively allocated to a separate one of the plurality of data-processing instances. A common cache is allocated in the volatile memory of the computing device, where the common cache is shared by the plurality of data-processing instances. Each instance of the plurality of data-processing instances is configured to: identify a data block in the particular cache allocated to that instance, where the data block has not been changed since the data block was last persistently written to one or more storage devices; cause the data block to be stored in the common cache; and remove the data block from the particular cache. Data blocks in the common cache are maintained without being persistently written to the one or more storage devices.
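
A rough sketch of the per-instance behavior described above: each instance moves blocks that are unchanged since their last persistent write from its private cache into the shared cache. The class name `Instance`, the dictionary-based caches, and the `dirty` flag are hypothetical choices for illustration only.

```python
class Instance:
    """Hypothetical data-processing instance with an exclusively allocated cache."""

    def __init__(self, name, common_cache):
        self.name = name
        self.private_cache = {}           # block_id -> (data, dirty flag)
        self.common_cache = common_cache  # dict shared by all instances

    def write(self, block_id, data):
        # A modified block is dirty until it is persistently written.
        self.private_cache[block_id] = (data, True)

    def persist(self, block_id, storage):
        # Persistently write the block, after which it counts as clean.
        data, _ = self.private_cache[block_id]
        storage[block_id] = data
        self.private_cache[block_id] = (data, False)

    def spill_clean_blocks(self):
        """Move clean blocks to the common cache and drop them locally."""
        moved = []
        for block_id, (data, dirty) in list(self.private_cache.items()):
            if not dirty:
                self.common_cache[block_id] = data
                del self.private_cache[block_id]
                moved.append(block_id)
        return moved


storage = {}
common = {}
instance_a = Instance("a", common)
instance_a.write(1, "x")
instance_a.write(2, "y")
instance_a.persist(1, storage)
moved = instance_a.spill_clean_blocks()
```

Blocks placed in `common` are never written back to `storage` by this sketch, matching the statement that the common cache holds blocks without persisting them.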

Considering a density of tracks to destage in groups of tracks to select groups of tracks to destage

Provided are a computer program product, system, and method for considering a density of tracks to destage in groups of tracks to select groups of tracks to destage. Groups of tracks in the cache are scanned to determine whether they are ready to destage. A determination is made as to whether the tracks in one of the groups are ready to destage in response to scanning the tracks in the group. A density for the group is increased in response to determining that the group is not ready to destage. The group is destaged in response to determining that the density of the group exceeds a density threshold.
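
The scan described in this abstract can be sketched as below. The sketch follows the abstract literally and fills the gaps with assumptions: a group is treated as ready only when every track in it is ready, the density is reset to zero once a group is selected, and the function name `scan_groups` and the `ready` field are hypothetical.

```python
def scan_groups(groups, density, threshold):
    """One scan pass over the groups; returns the group ids selected for destage.

    groups:   dict of group id -> list of track dicts with a 'ready' flag
    density:  dict of group id -> current density, updated in place
    """
    selected = []
    for group_id, tracks in groups.items():
        group_ready = all(track["ready"] for track in tracks)
        if not group_ready:
            # The group was scanned but is not ready: increase its density.
            density[group_id] = density.get(group_id, 0) + 1
        if density.get(group_id, 0) > threshold:
            selected.append(group_id)
            density[group_id] = 0  # assumed reset after selection
    return selected


groups = {"g1": [{"ready": False}], "g2": [{"ready": True}]}
density = {}
pass_1 = scan_groups(groups, density, threshold=2)
pass_2 = scan_groups(groups, density, threshold=2)
pass_3 = scan_groups(groups, density, threshold=2)
```

With a threshold of 2, the group that keeps failing the readiness check is selected on the third pass, when its density first exceeds the threshold.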

Dataflow accelerator architecture for general matrix-matrix multiplication and tensor computation in deep learning

A general matrix-matrix multiplication (GEMM) dataflow accelerator circuit is disclosed that includes a smart 3D stacking DRAM architecture. The accelerator circuit includes a memory bank, a peripheral lookup table stored in the memory bank, and a first vector buffer to store a first vector that is used as a row address into the lookup table. The circuit includes a second vector buffer to store a second vector that is used as a column address into the lookup table, and lookup table buffers to receive and store lookup table entries from the lookup table. The circuit further includes adders to sum a first product and a second product, and an output buffer to store the sum. The lookup table buffers determine a product of the first vector and the second vector without performing a multiply operation. The embodiments include a hierarchical lookup architecture to reduce latency. Accumulation results are propagated in a systolic manner.
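
The lookup-based product can be illustrated in software as follows. This is only a sketch of the idea: operands are assumed to be 4-bit unsigned integers, the table is precomputed once (using ordinary multiplication at build time), and the names `LUT`, `lut_dot`, and `lut_gemm` are hypothetical. The hierarchical lookup and systolic propagation are not modeled.

```python
BITS = 4          # assumed operand width
SIZE = 1 << BITS  # number of rows and columns in the table

# Precomputed table: LUT[a][b] holds a * b for every pair of 4-bit operands.
LUT = [[a * b for b in range(SIZE)] for a in range(SIZE)]


def lut_dot(row_vector, column_vector):
    """Dot product using table lookups and additions only."""
    total = 0
    for a, b in zip(row_vector, column_vector):
        total += LUT[a][b]  # a is the row address, b is the column address
    return total


def lut_gemm(matrix_a, matrix_b):
    """Matrix product built from lookup-based dot products."""
    columns = list(zip(*matrix_b))
    return [[lut_dot(row, col) for col in columns] for row in matrix_a]


result = lut_gemm([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

At lookup time the only arithmetic is addition, which mirrors the adders and output buffer named in the abstract.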

METHOD AND SYSTEM FOR MANAGING BUFFER DEVICE IN STORAGE SYSTEM
20180173629 · 2018-06-21

A method and system for managing a buffer device in a storage system. The method comprises determining a first priority for a first queue included in the buffer device, the first queue comprising at least one data page associated with a first storage device in the storage system; in at least one round, in response to the first priority not satisfying a first predetermined condition, updating the first priority according to a first updating rule, the first updating rule making the updated first priority closer to the first predetermined condition than the first priority; and in response to the first priority satisfying the first predetermined condition, flushing data in a data page in the first queue to the first storage device.
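
The round-based flushing described above can be sketched as follows. The abstract does not specify the priority encoding, so this example assumes the priority is a countdown, the predetermined condition is "priority equals zero", and the updating rule is "decrement by one"; the names `FlushQueue` and `run_round` are hypothetical.

```python
class FlushQueue:
    """Hypothetical queue of buffered data pages bound to one storage device."""

    def __init__(self, device, initial_priority):
        self.device = device
        self.initial_priority = initial_priority
        self.priority = initial_priority
        self.pages = []


def run_round(queue, storage):
    """Run one round; returns True if a page was flushed in this round."""
    if queue.priority > 0:
        # Condition not satisfied: move the priority one step closer to it.
        queue.priority -= 1
        return False
    flushed = False
    if queue.pages:
        # Condition satisfied: flush one page to the queue's storage device.
        storage.setdefault(queue.device, []).append(queue.pages.pop(0))
        flushed = True
    queue.priority = queue.initial_priority  # assumed reset after a flush round
    return flushed


storage = {}
queue = FlushQueue("dev0", initial_priority=2)
queue.pages = ["p0", "p1"]
rounds = [run_round(queue, storage) for _ in range(3)]
```

A larger initial priority makes a queue wait more rounds before each flush, which is one way to let a slower device be flushed less often than a faster one.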

MEMORY SYSTEM

A memory system includes a volatile first storing unit, a nonvolatile second storing unit in which data is managed in a predetermined unit, and a controller that writes data requested by a host apparatus in the second storing unit via the first storing unit and reads out data requested by the host apparatus from the second storing unit to the first storing unit and transfers the data to the host apparatus. The controller includes a management table for managing the number of failure areas in a predetermined unit that occur in the second storing unit and switches, according to the number of failure areas, an operation mode in writing data in the second storing unit from the host apparatus.

Nonvolatile memory module having DRAM used as cache, computing system having the same, and operating method thereof

A nonvolatile memory module includes at least one nonvolatile memory, at least one nonvolatile memory controller configured to control the nonvolatile memory, at least one dynamic random access memory (DRAM) used as a cache of the at least one nonvolatile memory, data buffers configured to store data exchanged between the at least one DRAM and an external device, and a memory module control device configured to control the nonvolatile memory controller, the at least one DRAM, and the data buffers. The at least one DRAM stores a tag corresponding to cache data and compares the stored tag with input tag information to determine whether to output the cache data.
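
The tag comparison in the last sentence can be sketched with a direct-mapped cache model. The direct-mapped organization, the address split, and the names `DramCache`, `fill`, and `lookup` are assumptions for illustration; the abstract only states that a stored tag is compared with input tag information.

```python
class DramCache:
    """Hypothetical direct-mapped DRAM cache in front of nonvolatile memory."""

    def __init__(self, num_sets):
        self.num_sets = num_sets
        self.lines = [None] * num_sets  # each entry is (tag, data) or None

    def split(self, address):
        """Split an address into (tag, index)."""
        return address // self.num_sets, address % self.num_sets

    def fill(self, address, data):
        """Store data together with the tag derived from its address."""
        tag, index = self.split(address)
        self.lines[index] = (tag, data)

    def lookup(self, address):
        """Return cached data if the stored tag matches the input tag, else None."""
        tag, index = self.split(address)
        entry = self.lines[index]
        if entry is not None and entry[0] == tag:
            return entry[1]
        return None


cache = DramCache(num_sets=4)
cache.fill(5, "a")
hit = cache.lookup(5)   # same index, same tag
miss = cache.lookup(9)  # same index, different tag
```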

Accelerated address indirection table lookup for wear-leveled non-volatile memory

Embodiments are generally directed to accelerated address indirection table lookup for wear-leveled non-volatile memory. An embodiment of a memory device includes nonvolatile memory; a memory controller; and address indirection logic to provide address indirection for the nonvolatile memory, the address indirection logic to maintain an address indirection table (AIT) in the nonvolatile memory, the AIT including a plurality of levels, and copy at least a portion of the AIT to a second memory, the second memory having less latency than the first memory.

Spilling small cache entries to a solid state device
09946657 · 2018-04-17

Systems for managing a multi-level cache in high-performance computing. A method is practiced over a multi-tier caching subsystem that comprises a first cache tier of random access memory, and a second cache tier that comprises a block-oriented device. The solid-state drive device is a block-oriented device comprising a plurality of blocks having a minimum block size. Cache entries are initially stored in the first cache, including cache entries that are smaller than the minimum block size of the block-oriented device. During cache operations such as first tier eviction, a plurality of smaller entries are packed into blocks of the minimum block size before being spilled into the second tier. If an entry in the packed block is accessed again, the entire packed block is brought into the first tier. A key structure is maintained to track individual invalidated entries in a packed block without invalidating other entries in the packed block.
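
The packing and invalidation behavior can be sketched as below. The 16-byte block size, the first-fit packing order, and the names `pack_entries` and `SecondTier` are assumptions for the example; a Python set stands in for the key structure that tracks invalidated entries.

```python
BLOCK_SIZE = 16  # assumed minimum block size of the block-oriented device


def pack_entries(entries, block_size=BLOCK_SIZE):
    """Pack (key, bytes) entries, in order, into blocks of at most block_size bytes."""
    blocks, current, used = [], [], 0
    for key, value in entries:
        if len(value) > block_size:
            raise ValueError("entry larger than one block")
        if used + len(value) > block_size:
            blocks.append(current)
            current, used = [], 0
        current.append((key, value))
        used += len(value)
    if current:
        blocks.append(current)
    return blocks


class SecondTier:
    """Hypothetical second cache tier holding packed blocks."""

    def __init__(self):
        self.blocks = []        # list of packed blocks
        self.key_to_block = {}  # key -> index of the block that holds it
        self.invalid = set()    # keys invalidated individually

    def spill(self, entries):
        for block in pack_entries(entries):
            index = len(self.blocks)
            self.blocks.append(block)
            for key, _ in block:
                self.key_to_block[key] = index

    def invalidate(self, key):
        # Only this key is marked; other entries in the same block stay valid.
        self.invalid.add(key)

    def fetch_block(self, key):
        """Return all valid entries of the block containing key, or None."""
        if key in self.invalid or key not in self.key_to_block:
            return None
        block = self.blocks[self.key_to_block[key]]
        return {k: v for k, v in block if k not in self.invalid}


tier = SecondTier()
tier.spill([("a", b"12345678"), ("b", b"1234"), ("c", b"12345678")])
whole_block = tier.fetch_block("b")  # brings back "a" along with "b"
tier.invalidate("a")
after_invalidate = tier.fetch_block("b")
```

Fetching by any key returns the whole packed block, which models bringing the entire block back into the first tier on re-access.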