Patent classifications
G06F12/0848
METHOD AND APPARATUS FOR CONTROLLING CACHE LINE STORAGE IN CACHE MEMORY
A method and apparatus physically partitions clean and dirty cache lines into separate memory partitions, such as one or more banks, so that during low power operation, a cache memory controller reduces power consumption of the cache memory containing the clean only data. The cache memory controller controls refresh operation so that data refresh does not occur for clean data only banks or the refresh rate is reduced for clean data only banks. Partitions that store dirty data can also store clean data, however other partitions are designated for storing only clean data so that the partitions can have their refresh rate reduced or refresh stopped for periods of time. When multiple DRAM dies or packages are employed, the partition can occur on a die or package level as opposed to a bank level within a die.
Processing method and accelerating device
The present disclosure provides a processing device including: a coarse-grained pruning unit configured to perform coarse-grained pruning on a weight of a neural network to obtain a pruned weight, an operation unit configured to train the neural network according to the pruned weight. The coarse-grained pruning unit is specifically configured to select M weights from the weights of the neural network through a sliding window, and when the M weights meet a preset condition, all or part of the M weights may be set to 0. The processing device can reduce the memory access while reducing the amount of computation, thereby obtaining an acceleration ratio and reducing energy consumption.
Techniques to improve translation lookaside buffer reach by leveraging idle resources
Techniques are disclosed for processing address translations. The techniques include detecting a first miss for a first address translation request for a first address translation in a first translation lookaside buffer, in response to the first miss, fetching the first address translation into the first translation lookaside buffer and evicting a second address translation from the translation lookaside buffer into an instruction cache or local data share memory, detecting a second miss for a second address translation request referencing the second address translation, in the first translation lookaside buffer, and in response to the second miss, fetching the second address translation from the instruction cache or the local data share memory.
Memory data access apparatus and method thereof
The present disclosure provides a memory data access apparatus and method thereof. The memory data access apparatus includes a cache memory and a processing unit. The processing unit is configured to: execute a memory read instruction, wherein the memory read instruction includes a memory address; determine that access of the memory address in the cache memory is missed; determine that the memory address is within a memory address range, wherein the memory address range corresponds to a data access amount; and read data blocks corresponding to the data access amount from the memory address of a memory.
Accumulators corresponding to bins in memory
In some examples, a system includes a processing entity and a memory to store data arranged in a plurality of bins associated with respective key values of a key. The system includes a cache to store cached data elements for respective accumulators that are updatable to represent occurrences of the respective key values of the key, where each accumulator corresponds to a different bin of the plurality of bins, and each cached data element has a range that is less than a range of a corresponding bin of the plurality of bins. Responsive to a value of a given cached data element as updated by a given accumulator satisfying a criterion, the processing entity is to cause an aggregation of the value of the given cached data element with a bin value in a respective bin.
Multi-level system memory with near memory capable of storing compressed cache lines
A method is described. The method includes receiving a read or write request for a cache line. The method includes directing the request to a set of logical super lines based on the cache line's system memory address. The method includes associating the request with a cache line of the set of logical super lines. The method includes, if the request is a write request: compressing the cache line to form a compressed cache line, breaking the cache line down into smaller data units and storing the smaller data units into a memory side cache. The method includes, if the request is a read request: reading smaller data units of the compressed cache line from the memory side cache and decompressing the cache line.
Demoting data elements from cache using ghost cache statistics
A method for demoting data elements from a cache is disclosed. The method maintains a heterogeneous cache comprising a higher performance portion and a lower performance portion. The method maintains, within the lower performance portion, a ghost cache containing statistics for data elements that are currently contained in the heterogeneous cache, and data elements that have been demoted from the heterogeneous cache within a specified time interval. The method maintains, for the ghost cache, multiple LRU lists that designate an order in which data elements are demoted from the lower performance portion. The method utilizes the statistics to determine in which LRU lists the data elements are referenced. A corresponding system and computer program product are also disclosed.
Static power reduction in caches using deterministic naps
Disclosed embodiments relate to a dNap architecture that accurately transitions cache lines to full power state before an access to them. This ensures that there are no additional delays due to waking up drowsy lines. Only cache lines that are determined by the DMC to be accessed in the immediate future are fully powered while others are put in drowsy mode. As a result, we are able to significantly reduce leakage power with no cache performance degradation and minimal hardware overhead, especially at higher associativities. Up to 92% static/Leakage power savings are accomplished with minimal hardware overhead and no performance tradeoff.
Dynamic processor cache to avoid speculative vulnerability
A processor including a logic unit configured to execute multiple instructions being one of a speculative instruction or an architectural instruction is provided. The processor also includes a split cache comprising multiple lines, each line including a data accessed by an instruction and copied into the split cache from a memory address, wherein a line is identified as a speculative line for the speculative instruction, and as an architectural line for the architectural instruction. The processor includes a cache manager configured to select a number of speculative lines allocated in the split cache. The cache manager prevents an architectural line from being replaced by a speculative line based on a number of speculative lines allotted in the split cache, and manages the number of speculative lines to be allocated in the split cache based on the number of speculative lines relative to a number of architectural lines.
METHOD AND APPARATUS FOR CONTROLLING CACHE LINE STORAGE IN CACHE MEMORY
A method and apparatus physically partitions clean and dirty cache lines into separate memory partitions, such as one or more banks, so that during low power operation, a cache memory controller reduces power consumption of the cache memory containing the clean only data. The cache memory controller controls refresh operation so that data refresh does not occur for clean data only banks or the refresh rate is reduced for clean data only banks. Partitions that store dirty data can also store clean data, however other partitions are designated for storing only clean data so that the partitions can have their refresh rate reduced or refresh stopped for periods of time. When multiple DRAM dies or packages are employed, the partition can occur on a die or package level as opposed to a bank level within a die.