Patent classifications
G06F12/128
Scalable System on a Chip
An integrated circuit (IC) including a plurality of processor cores, a plurality of graphics processing units, a plurality of peripheral circuits, and a plurality of memory controllers is configured to support scaling of the system using a unified memory architecture. For example, the IC may include an interconnect fabric configured to provide communication between the one or more memory controller circuits and the processor cores, graphics processing units, and peripheral devices; and an off-chip interconnect coupled to the interconnect fabric and configured to couple the interconnect fabric to a corresponding interconnect fabric on another instance of the integrated circuit, wherein the interconnect fabric and the off-chip interconnect provide an interface that transparently connects the one or more memory controller circuits, the processor cores, graphics processing units, and peripheral devices in either a single instance of the integrated circuit or two or more instances of the integrated circuit.
System cache optimizations for deep learning compute engines
In an example, an apparatus comprises a plurality of compute engines; and logic, at least partially including hardware logic, to detect a cache line conflict in a last-level cache (LLC) communicatively coupled to the plurality of compute engines; and implement context-based eviction policy to determine a cache way in the cache to evict in order to resolve the cache line conflict. Other embodiments are also disclosed and claimed.
System cache optimizations for deep learning compute engines
In an example, an apparatus comprises a plurality of compute engines; and logic, at least partially including hardware logic, to detect a cache line conflict in a last-level cache (LLC) communicatively coupled to the plurality of compute engines; and implement context-based eviction policy to determine a cache way in the cache to evict in order to resolve the cache line conflict. Other embodiments are also disclosed and claimed.
QUALITY-OF-SERVICE INFORMATION FOR A MULTI-MEMORY SYSTEM
Methods, systems, and devices for quality-of-service information for a multi-memory system are described. An interface controller may receive a first command from a host device during a set of clock cycles. The first command may be received over a command bus that includes a pin, such as a command select pin configured for double data rate signaling. The interface controller may decode the first command based on a state of the command select pin during at least one clock cycle of the set of clock cycles. And the interface controller may determine quality-of-service information for a second command based on decoding the first command and on information, such as a plurality of bits, included in the first command.
NVMe CONTROLLER MEMORY MANAGER
Embodiments of the present disclosure generally relate to an NVMe storage device having a controller memory manager and a method of accessing an NVMe storage device having a controller memory manager. In one embodiment, a storage device comprises a non-volatile memory, a volatile memory, and a controller memory manager. The controller memory manager is operable to store one or more NVMe data structures within the non-volatile memory and the volatile memory.
NVMe CONTROLLER MEMORY MANAGER
Embodiments of the present disclosure generally relate to an NVMe storage device having a controller memory manager and a method of accessing an NVMe storage device having a controller memory manager. In one embodiment, a storage device comprises a non-volatile memory, a volatile memory, and a controller memory manager. The controller memory manager is operable to store one or more NVMe data structures within the non-volatile memory and the volatile memory.
CACHE ARCHITECTURES FOR MEMORY DEVICES
Methods, systems, and devices for cache architectures for memory devices are described. For example, a memory device may include a main array having a first set of memory cells, a cache having a second set of memory cells, and a cache delay register configured to store an indication of cache addresses associated with recently performed access operations. In some examples, the cache delay register may be operated as a first-in-first-out (FIFO) register of cache addresses, where a cache address associated with a performed access operation may be added to the beginning of the FIFO register, and a cache address at the end of the FIFO register may be purged. Information associated with access operations on the main array may be maintained in the cache, and accessed directly (e.g., without another accessing of the main array), at least as long as the cache address is present in the cache delay register.
CACHE RESIZING BASED ON PROCESSOR WORKLOAD
A processor sets the size of a processor cache based on an identified workload executing at the processor. The cache size is set in response to the processor exiting a low-power mode. By setting the size of the cache based on the workload, the processor is able to tailor the size of the cache to the characteristics of a particular workload while also reducing, for at least some workloads, the overhead associated with entering or exiting the low-power mode.
METHODS AND APPARATUS FOR ALLOCATION IN A VICTIM CACHE SYSTEM
Methods, apparatus, systems and articles of manufacture are disclosed for allocation in a victim cache system. An example apparatus includes a first cache storage, a second cache storage, a cache controller coupled to the first cache storage and the second cache storage and operable to receive a memory operation that specifies an address, determine, based on the address, that the memory operation evicts a first set of data from the first cache storage, determine that the first set of data is unmodified relative to an extended memory, and cause the first set of data to be stored in the second cache storage.
METHODS AND APPARATUS FOR ALLOCATION IN A VICTIM CACHE SYSTEM
Methods, apparatus, systems and articles of manufacture are disclosed for allocation in a victim cache system. An example apparatus includes a first cache storage, a second cache storage, a cache controller coupled to the first cache storage and the second cache storage and operable to receive a memory operation that specifies an address, determine, based on the address, that the memory operation evicts a first set of data from the first cache storage, determine that the first set of data is unmodified relative to an extended memory, and cause the first set of data to be stored in the second cache storage.