G06F2212/22

METHOD AND SYSTEM OF TRAINING DEEP LEARNING MODEL, DEVICE, AND MEDIUM

The present application provides a method of training a deep learning model. A specific implementation of the method includes: determining, according to first training data for a current training round, a first target parameter required to be written into a target memory in a first network parameter required by an embedding of the first training data, wherein the target memory is a memory contained in a target processor; determining a remaining storage slot in the target memory according to a first mapping relationship between a storage slot of the target memory and a network parameter; and writing, in response to the remaining storage slot meeting a storage requirement of the first target parameter, the first target parameter into the target memory so that a computing core contained in the target processor adjusts the first network parameter according to the first training data.
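
The slot check described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function name `plan_writes`, the list-based slot table, and the one-slot-per-parameter rule are all assumptions made for illustration.

```python
from typing import Dict, List, Optional


def plan_writes(slot_to_param: List[Optional[int]],
                needed: List[int]) -> Optional[Dict[int, int]]:
    """Assign free slots to parameters that are needed but not yet resident.

    slot_to_param is the (hypothetical) mapping: index = storage slot,
    value = id of the network parameter held there, or None if the slot is free.
    Returns {param_id: slot}, or None when the free slots are insufficient.
    """
    resident = {p for p in slot_to_param if p is not None}
    # Target parameters: needed this round, deduplicated, not already in memory.
    targets = [p for p in dict.fromkeys(needed) if p not in resident]
    # Remaining slots are read directly from the mapping.
    free = [s for s, p in enumerate(slot_to_param) if p is None]
    if len(free) < len(targets):
        return None  # storage requirement not met; nothing is written
    plan = dict(zip(targets, free))
    for param, slot in plan.items():
        slot_to_param[slot] = param  # record the write in the mapping
    return plan


slots = [7, None, 3, None]
print(plan_writes(slots, [3, 9, 9, 5]))  # parameters 9 and 5 get the free slots
print(plan_writes(slots, [11]))          # no slot left
```

What happens when the requirement is not met (for example, evicting other parameters) is outside this sketch.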

CONSIDERING A DENSITY OF TRACKS TO DESTAGE IN GROUPS OF TRACKS TO SELECT GROUPS OF TRACKS TO DESTAGE
20180095888 · 2018-04-05

Provided are a computer program product, system, and method for considering a density of tracks to destage in groups of tracks to select groups of tracks to destage. Groups of tracks in the cache are scanned to determine whether they are ready to destage. A determination is made as to whether the tracks in one of the groups are ready to destage in response to scanning the tracks in the group. A density for the group is increased in response to determining that the group is not ready to destage. The group is destaged in response to determining that the density of the group exceeds a density threshold.
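
A rough sketch of the density rule as this abstract states it follows. Several details are assumptions: a group counts as ready only when every track in it is flagged ready, the density rises by one per scan, and it is reset after a destage. The name `scan_groups` is hypothetical.

```python
from typing import Dict, List


def scan_groups(groups: Dict[str, List[bool]],
                density: Dict[str, int],
                threshold: int) -> List[str]:
    """Run one scan over all groups and return the ids of groups destaged.

    groups maps a group id to per-track "ready to destage" flags.
    density is updated in place and carries over between scans.
    """
    destaged = []
    for gid, track_ready in groups.items():
        if not all(track_ready):
            # Group is not ready: increase its density.
            density[gid] = density.get(gid, 0) + 1
        if density.get(gid, 0) > threshold:
            destaged.append(gid)
            density[gid] = 0  # assumption: start over after a destage
    return destaged


groups = {"g0": [True, False], "g1": [True, True]}
density: Dict[str, int] = {}
for scan in range(3):
    print(scan, scan_groups(groups, density, threshold=2))
```

With a threshold of 2, group `g0` is selected on the third scan, once its density has passed the threshold.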

Method and apparatus for accessing data stored in a storage system that includes both a final level of cache and a main memory
09928172 · 2018-03-27

A data access system including a processor and a storage system including a main memory and a cache module. The cache module includes a FLC controller and a cache. The cache is configured as a FLC to be accessed prior to accessing the main memory. The processor is coupled to levels of cache separate from the FLC. The processor generates, in response to data required by the processor not being in the levels of cache, a physical address corresponding to a physical location in the storage system. The FLC controller generates a virtual address based on the physical address. The virtual address corresponds to a physical location within the FLC or the main memory. The cache module causes, in response to the virtual address not corresponding to the physical location within the FLC, the data required by the processor to be retrieved from the main memory.

Data storage management in a memory device

The disclosure is related to systems and methods of managing data storage in a memory device. In a particular embodiment, a method is disclosed that includes receiving, in a data storage device, at least one data packet that has a size that is different from an allocated storage capacity of at least one physical destination location on a data storage medium in the data storage device for the at least one data packet. The method also includes storing the at least one received data packet in a non-volatile cache memory prior to transferring the at least one received data packet to the at least one physical destination location.

Dataflow accelerator architecture for general matrix-matrix multiplication and tensor computation in deep learning

A general matrix-matrix multiplication (GEMM) dataflow accelerator circuit is disclosed that includes a smart 3D stacking DRAM architecture. The accelerator circuit includes a memory bank, a peripheral lookup table stored in the memory bank, and a first vector buffer to store a first vector that is used as a row address into the lookup table. The circuit includes a second vector buffer to store a second vector that is used as a column address into the lookup table, and lookup table buffers to receive and store lookup table entries from the lookup table. The circuit further includes adders to sum a first product and a second product, and an output buffer to store the sum. The lookup table buffers determine a product of the first vector and the second vector without performing a multiply operation. The embodiments include a hierarchical lookup architecture to reduce latency. Accumulation results are propagated in a systolic manner.
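
The lookup-based product can be illustrated in software. This sketch assumes small unsigned operands (4 bits by default), which the abstract does not specify. The names `build_lut` and `lut_dot` are hypothetical, and the sketch does not model the hierarchical lookup or the systolic propagation.

```python
from typing import List


def build_lut(bits: int = 4) -> List[List[int]]:
    """Precompute table[a][b] == a * b using additions only."""
    n = 1 << bits
    table = []
    for a in range(n):
        row, acc = [], 0
        for _ in range(n):
            row.append(acc)  # acc holds a * b for the current column b
            acc += a
        table.append(row)
    return table


def lut_dot(table: List[List[int]], xs: List[int], ys: List[int]) -> int:
    """Dot product: the first vector selects rows, the second selects columns."""
    total = 0
    for a, b in zip(xs, ys):
        total += table[a][b]  # product fetched by lookup, then summed
    return total


lut = build_lut(4)
print(lut_dot(lut, [1, 2, 3], [4, 5, 6]))  # 1*4 + 2*5 + 3*6 = 32
```

Only the adders do arithmetic at run time; each product is a table read addressed by one element from each vector.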

UNIVERSAL CACHE MANAGEMENT SYSTEM
20170308470 · 2017-10-26

Techniques for universal cache management are described. In an example embodiment, a plurality of caches are allocated, in volatile memory of a computing device, to a plurality of data-processing instances, where each one of the plurality of caches is exclusively allocated to a separate one of the plurality of data-processing instances. A common cache is allocated in the volatile memory of the computing device, where the common cache is shared by the plurality of data-processing instances. Each instance of the plurality of data-processing instances is configured to: identify a data block in the particular cache allocated to that instance, where the data block has not been changed since the data block was last persistently written to one or more storage devices; cause the data block to be stored in the common cache; and remove the data block from the particular cache. Data blocks in the common cache are maintained without being persistently written to the one or more storage devices.
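
The three steps attributed to each instance (identify a clean block, store it in the common cache, remove it from the private cache) can be sketched as below. The class `Instance`, its method names, and the use of plain dictionaries as caches are assumptions made for illustration; `checkpoint` merely stands in for a persistent write to storage.

```python
from typing import Any, Dict, List, Tuple


class Instance:
    """A data-processing instance with a private cache and a shared common cache."""

    def __init__(self, common: Dict[int, Any]) -> None:
        self.private: Dict[int, Tuple[Any, bool]] = {}  # block_id -> (data, dirty)
        self.common = common                            # shared by all instances

    def write(self, block_id: int, data: Any) -> None:
        self.private[block_id] = (data, True)   # changed since last persistent write

    def checkpoint(self, block_id: int) -> None:
        data, _ = self.private[block_id]
        self.private[block_id] = (data, False)  # stand-in for writing to storage

    def evict_clean(self) -> List[int]:
        """Move every clean block from the private cache to the common cache."""
        moved = []
        for block_id, (data, dirty) in list(self.private.items()):
            if not dirty:
                self.common[block_id] = data    # store in the common cache
                del self.private[block_id]      # remove from the private cache
                moved.append(block_id)
        return moved

    def read(self, block_id: int) -> Any:
        if block_id in self.private:
            return self.private[block_id][0]
        return self.common.get(block_id)


common: Dict[int, Any] = {}
a, b = Instance(common), Instance(common)
a.write(1, "x")
a.write(2, "y")
a.checkpoint(1)
print(a.evict_clean())  # only block 1 is clean
print(b.read(1))        # another instance can read it from the common cache
```

Dirty blocks stay in the private cache, so nothing in the common cache needs to be written back to storage.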

System and method for dram-less SSD data protection during a power failure event

A solid-state drive (SSD) may not include a dynamic random access memory (DRAM) but rather may utilize a host memory buffer of system random access memory (RAM). During a power failure, data on dirty cache lines may be lost. A power protection caching policy may be implemented where an SSD controller is capable of accepting a flush cache signal, which may be a signal to a redefined pin of the SSD or a command, from a controller of the information handling system. The controller may utilize a slope detect mechanism and/or a power good detect mechanism to detect a power failure and, if a power failure is detected, to issue a flush cache signal to the SSD controller to cause a flush of all dirty cache lines from the host memory buffer before the power failure results in inoperability of circuitry associated with the dirty cache lines.

Method for dynamically establishing translation layer of solid state disk

A method for dynamically establishing a translation layer of a solid state disk (SSD). When an SSD is activated, the storage mode of the logical to physical (L2P) table is dynamically selected according to the state in the buffer memory of the SSD and the comparison between the capacity of the buffer memory and that of the L2P table. The establishing position of a flash translation layer (FTL) is suitably adjusted according to the selected storage mode such that the lifespan of the SSD can be prolonged.

Universal cache management system

Techniques for universal cache management are described. In an example embodiment, a plurality of caches are allocated, in volatile memory of a computing device, to a plurality of data-processing instances, where each one of the plurality of caches is exclusively allocated to a separate one of the plurality of data-processing instances. A common cache is allocated in the volatile memory of the computing device, where the common cache is shared by the plurality of data-processing instances. Each instance of the plurality of data-processing instances is configured to: identify a data block in the particular cache allocated to that instance, where the data block has not been changed since the data block was last persistently written to one or more storage devices; cause the data block to be stored in the common cache; and remove the data block from the particular cache. Data blocks in the common cache are maintained without being persistently written to the one or more storage devices.

METHOD AND APPARATUS FOR ACCESSING DATA STORED IN A STORAGE SYSTEM THAT INCLUDES BOTH A FINAL LEVEL OF CACHE AND A MAIN MEMORY
20170177481 · 2017-06-22

A data access system including a processor and a storage system including a main memory and a cache module. The cache module includes a FLC controller and a cache. The cache is configured as a FLC to be accessed prior to accessing the main memory. The processor is coupled to levels of cache separate from the FLC. The processor generates, in response to data required by the processor not being in the levels of cache, a physical address corresponding to a physical location in the storage system. The FLC controller generates a virtual address based on the physical address. The virtual address corresponds to a physical location within the FLC or the main memory. The cache module causes, in response to the virtual address not corresponding to the physical location within the FLC, the data required by the processor to be retrieved from the main memory.