Patent classifications
G06F12/0886
LATENCY REDUCTION IN SPI FLASH MEMORY DEVICES
A method can include: receiving, in a memory device, a read request from a host device that is coupled to the memory device by an interface; decoding an address of the read request that is received from the interface; decoding a command of the read request to determine whether the read request is for an aligned address operation; maintaining the decoded address without modification when the read request is determined as being for the aligned address operation regardless of an actual alignment of the decoded address; and executing the read request as the aligned address operation on the memory device by using the decoded address.
Heuristics for selecting subsegments for entry in and entry out operations in an error cache system with coarse and fine grain segments
A memory device comprises a memory bank comprising a plurality of addressable memory cells, wherein the memory bank is divided into a plurality of segments. Further, the device comprises a cache memory operable for storing a second plurality of data words, wherein each data word of the second plurality of data words is either awaiting write verification associated with the memory bank or is to be re-written into the memory bank. The cache memory is divided into a plurality of primary segments, wherein each primary segment of the cache memory is direct mapped to a corresponding segment of the plurality of segments, wherein each primary segment is sub-divided into a plurality of secondary segments, and wherein each of the plurality of secondary segments comprises at least one counter for tracking a number of entries stored therein.
Heuristics for selecting subsegments for entry in and entry out operations in an error cache system with coarse and fine grain segments
A memory device comprises a memory bank comprising a plurality of addressable memory cells, wherein the memory bank is divided into a plurality of segments. Further, the device comprises a cache memory operable for storing a second plurality of data words, wherein each data word of the second plurality of data words is either awaiting write verification associated with the memory bank or is to be re-written into the memory bank. The cache memory is divided into a plurality of primary segments, wherein each primary segment of the cache memory is direct mapped to a corresponding segment of the plurality of segments, wherein each primary segment is sub-divided into a plurality of secondary segments, and wherein each of the plurality of secondary segments comprises at least one counter for tracking a number of entries stored therein.
Look-up table initialize
A digital data processor includes an instruction memory storing instructions specifying a data processing operation and a data operand field, an instruction decoder coupled to the instruction memory for recalling instructions from the instruction memory and determining the operation and the data operand, and an operational unit coupled to a data register file and to an instruction decoder to perform a data processing operation upon an operand corresponding to an instruction decoded by the instruction decoder and storing results of the data processing operation. The operational unit is configured to perform a table write in response to a look up table initialization instruction by duplicating at least one data element from a source data register to create duplicated data elements, and writing the duplicated data elements to a specified location in a specified number of at least one table and a corresponding location in at least one other table.
SYSTEMS AND METHODS FOR READING AND WRITING SPARSE DATA IN A NEURAL NETWORK ACCELERATOR
Disclosed herein includes a system, a method, and a device for reading and writing sparse data in a neural network accelerator. A mask identifying byte positions within a data word having non-zero values in memory can be accessed. Each bit of the mask can have a first value or a second value, the first value indicating that a byte of the data word corresponds to a non-zero byte value, the second value indicating that the byte of the data word corresponds to a zero byte value. The data word can be modified to have non-zero byte values stored at an end of a first side of the data word in the memory, and any zero byte values stored in a remainder of the data word. The modified data word can be written to the memory via at least a first slice of a plurality of slices that is configured to access the first side of the data word in the memory.
Latency reduction in SPI flash memory devices
A method can include: receiving, in a memory device, a read request from a host device that is coupled to the memory device by an interface; decoding an address of the read request that is received from the interface; decoding a command of the read request to determine whether the read request is for an aligned address operation; maintaining the decoded address without modification when the read request is determined as being for the aligned address operation regardless of an actual alignment of the decoded address; and executing the read request as the aligned address operation on the memory device by using the decoded address.
COMPRESSION AWARE PREFETCH
Methods, devices, and systems for prefetching data. First data is loaded from a first memory location. The first data in cached in a cache memory. Other data is prefetched to the cache memory based on a compression of the first data and a compression of the other data. In some implementations, the compression of the first data and the compression of the other data are determined based on metadata associated with the first data and metadata associated with the other data. In some implementations, the other data is prefetched to the cache memory based on a total of a compressed size of the first data and a compressed size of the other data being less than a threshold size. In some implementations, the other data is not prefetched to the cache memory based on the other data being uncompressed.
Adaptive cache
Described apparatuses and methods form adaptive cache lines having a configurable capacity from hardware cache lines having a fixed capacity. The adaptive cache lines can be formed in accordance with a programmable cache-line parameter. The programmable cache-line parameter can specify a capacity for the adaptive cache lines. The adaptive cache lines may be formed by combining respective groups of fixed-capacity hardware cache lines. The quantity of fixed-capacity hardware cache lines included in respective adaptive cache lines may be based on the programmable cache-line parameter. The programmable cache-line parameter can be selected in accordance with characteristics of the cache workload.
Writing store data of multiple store operations into a cache line in a single cycle
A load-store unit (LSU) of a processor core determines whether or not a second store operation specifies an adjacent update to that specified by a first store operation. The LSU additionally determines whether the total store data length of the first and second store operations exceeds a maximum size. Based on determining the second store operation specifies an adjacent update and the total store data length does not exceed the maximum size, the LSU merges the first and second store operations and writes merged store data into a same write block of a cache. Based on determining that the total store data length exceeds the maximum size, the LSU splits the second store operation into first and second portions, merges the first portion with the first store operation, and writes store data of the partially merged store operation into the write block.
Logical-to-physical mapping using a flag to indicate whether a mapping entry points to [[for]] sequentially stored data
Methods, systems, and devices for compressed logical-to-physical mapping for sequentially stored data are described. A memory device may use a hierarchical set of logical-to-physical mapping tables for mapping logical block address generated by a host device to physical addresses of the memory device. The memory device may determine whether all of the entries of a terminal logical-to-physical mapping table are consecutive physical addresses. In response to determining that all of the entries contain consecutive physical addresses, the memory device may store a starting physical address of the consecutive physical addresses as an entry in a higher-level table along with a flag indicating that the entry points directly to data in the memory device rather than pointing to a terminal logical-to-physical mapping table. The memory device may, for subsequent reads of data stored in one or more of the consecutive physical addresses, bypass the terminal table to read the data.