G06F2212/6046

APPARATUS AND METHOD FOR IMPLEMENTING A MULTI-LEVEL MEMORY HIERARCHY HAVING DIFFERENT OPERATING MODES
20180341588 · 2018-11-29

A system and method are described for integrating a memory and storage hierarchy including a non-volatile memory tier within a computer system. In one embodiment, PCMS memory devices are used as one tier in the hierarchy, sometimes referred to as far memory. Higher performance memory devices such as DRAM are placed in front of the far memory and are used to mask some of the performance limitations of the far memory. These higher performance memory devices are referred to as near memory. In one embodiment, the near memory is configured to operate in a plurality of different modes of operation including (but not limited to) a first mode in which the near memory operates as a memory cache for the far memory and a second mode in which the near memory is allocated a first address range of a system address space with the far memory being allocated a second address range of the system address space, wherein the first range and second range represent the entire system address space.
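The two operating modes can be illustrated with a minimal Python sketch. It is not taken from the patent: the names `Mode`, `address_space_size`, and `resolve` are hypothetical, and placing the near-memory range first in the second mode is an assumption made only for illustration.

```python
from enum import Enum


class Mode(Enum):
    CACHE = 1  # first mode: near memory acts as a cache for far memory
    FLAT = 2   # second mode: near and far memory each own an address range


def address_space_size(mode, near_size, far_size):
    """Size of the system address space visible in the given mode."""
    # In cache mode the near memory is not separately addressable.
    return far_size if mode is Mode.CACHE else near_size + far_size


def resolve(addr, mode, near_size, far_size):
    """Map a system address to (tier, offset within that tier)."""
    if not 0 <= addr < address_space_size(mode, near_size, far_size):
        raise ValueError("address outside the system address space")
    if mode is Mode.CACHE:
        return ("far", addr)  # the backing location is always far memory
    if addr < near_size:      # assumed layout: near range first
        return ("near", addr)
    return ("far", addr - near_size)
```

In the second mode the two ranges together cover every valid address, which matches the statement that the first and second ranges represent the entire system address space.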

Cache management system and method

A method, computer program product, and computing system for determining a queue depth and a flush rate for each of a plurality of pending data queues associated with a cache system, thus defining a queue depth/flush rate pair for each of the plurality of pending data queues. A predicted drain time is determined for each of the plurality of pending data queues based, at least in part, upon the queue depth/flush rate pair, thus defining a plurality of predicted drain times that are respectively associated with the plurality of pending data queues.
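As a rough illustration, the Python sketch below derives a predicted drain time from each queue depth/flush rate pair. The abstract only says the prediction is based at least in part on that pair; the simple ratio used here, the units, and the names `PendingQueue`, `predicted_drain_time`, and `drain_times` are assumptions.

```python
from dataclasses import dataclass


@dataclass
class PendingQueue:
    name: str
    depth: int         # entries waiting to be flushed (assumed unit)
    flush_rate: float  # entries flushed per second (assumed unit)


def predicted_drain_time(queue):
    """Seconds needed to empty the queue at its current flush rate."""
    if queue.depth == 0:
        return 0.0
    if queue.flush_rate <= 0:
        return float("inf")  # a stalled, non-empty queue never drains
    return queue.depth / queue.flush_rate


def drain_times(queues):
    """One predicted drain time per pending data queue, keyed by name."""
    return {q.name: predicted_drain_time(q) for q in queues}
```

A caller could then compare the resulting times across queues, for example to find the queue expected to take longest to drain.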

Data bit inversion tracking in cache memory to reduce data bits written for write operations

Data bit inversion tracking in cache memory to reduce data bits written for write operations is disclosed. In one aspect, a cache memory including a cache controller and a cache array is provided. The cache array includes one or more cache entries, each of which includes a cache data field and a bit change track field. The cache controller compares a current cache data word to a new data word to be written and stores a bit track change word representing the difference (i.e., inverted bits) between the current cache data word and the new data word in the bit change track field. By using the bit track change word stored in the bit change track field to determine whether fewer bit writes are required to write data in an inverted or a non-inverted form, power consumption can be reduced for write operations through reduced bit write operations.
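The inverted versus non-inverted decision can be sketched in Python as follows. This is an illustrative model, not the patented circuit: the function name `plan_write`, the 8-bit default width, and the choice to ignore the cost of updating an inversion flag are all assumptions.

```python
def plan_write(stored, new, width=8):
    """Choose the form of `new` that flips the fewest stored bits.

    Returns (value_to_store, inverted_flag, bits_written).
    """
    mask = (1 << width) - 1
    # XOR marks every bit that differs between the stored and new words;
    # this plays the role of the bit change track word.
    change = (stored ^ new) & mask
    plain_flips = bin(change).count("1")
    # Storing the inverted word flips exactly the bits that did NOT differ.
    inverted_flips = width - plain_flips
    if inverted_flips < plain_flips:
        return (~new) & mask, True, inverted_flips
    return new & mask, False, plain_flips
```

For example, overwriting `0b00001111` with `0b11110001` would flip seven bits as written but only one bit if the inverted word `0b00001110` is stored together with an inversion flag.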

SYSTEM CACHE OPTIMIZATIONS FOR DEEP LEARNING COMPUTE ENGINES
20180307624 · 2018-10-25

In an example, an apparatus comprises a plurality of compute engines; and logic, at least partially including hardware logic, to detect a cache line conflict in a last-level cache (LLC) communicatively coupled to the plurality of compute engines; and implement a context-based eviction policy to determine a cache way in the cache to evict in order to resolve the cache line conflict. Other embodiments are also disclosed and claimed.

USING AN ACCESS INCREMENT NUMBER TO CONTROL A DURATION DURING WHICH TRACKS REMAIN IN CACHE

Provided are a computer program product, system, and method for using an access increment number to control a duration during which tracks remain in cache. Tracks from a storage that are held in the cache are indicated in a cache list. For each of the tracks indicated in the cache list, an access value is updated when one of the tracks is accessed in the cache. An access to a track in the cache indicated in the cache list is received. A determination is made of an access increment number for the accessed track, wherein the access increment number is greater than one. The access value for the accessed track is incremented by the determined access increment number in response to the track being accessed in the cache. The access value for one of the tracks is used to determine whether to demote the track from the cache.
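A minimal Python sketch of the idea follows. The abstract does not say how the access value drives demotion, so the second-chance scan below (decrement a non-zero value and requeue the track, demote a track whose value is zero) is an assumption, as are the names `TrackCache`, `add`, `access`, and `demote_one`.

```python
from collections import OrderedDict


class TrackCache:
    def __init__(self):
        # Cache list: least recently used track first; value = access value.
        self.cache_list = OrderedDict()

    def add(self, track):
        self.cache_list[track] = 0

    def access(self, track, increment=1):
        """Record an access; an increment above 1 keeps the track longer."""
        self.cache_list[track] += increment
        self.cache_list.move_to_end(track)  # now the most recently used

    def demote_one(self):
        """Demote and return one track, or None if the cache is empty."""
        while self.cache_list:
            track, value = next(iter(self.cache_list.items()))
            if value > 0:
                # Assumed policy: spend one unit of access value and requeue.
                self.cache_list[track] = value - 1
                self.cache_list.move_to_end(track)
            else:
                del self.cache_list[track]
                return track
        return None
```

Under this assumed policy, a track accessed once with an increment of 3 survives more demotion scans than a track accessed once with an increment of 1, which is how the increment controls the time a track stays in cache.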

Dynamically-adjusted host memory buffer
10102135 · 2018-10-16

A host memory buffer is dynamically adjusted based on performance. As memory pages are accessed, one or more counts of the memory pages are maintained. If the counts indicate that some of the memory pages are identical, then a portion of host system memory allocated to buffer cache may be reduced or decremented in response to repetitive access. However, if the counts indicate that different memory pages are accessed, then the host system memory allocated to the buffer cache may be increased or incremented.
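The adjustment rule can be sketched in Python under stated assumptions. The abstract gives no thresholds, so the 50% repeat ratio, the step size, the size bounds, and the function name `adjust_buffer` are hypothetical choices made for illustration.

```python
from collections import Counter


def adjust_buffer(size_pages, accesses, step=1, min_size=1, max_size=1024):
    """Return a new buffer size (in pages) based on per-page access counts."""
    if not accesses:
        return size_pages  # nothing observed, leave the buffer alone
    counts = Counter(accesses)  # page id -> number of accesses
    # Accesses that hit a page already seen in this window.
    repeated = sum(n - 1 for n in counts.values())
    if repeated / len(accesses) >= 0.5:  # assumed threshold: repetitive access
        return max(min_size, size_pages - step)
    return min(max_size, size_pages + step)  # mostly different pages
```

For instance, a window of accesses to pages `[1, 1, 1, 2]` would shrink an 8-page buffer to 7 pages, while `[1, 2, 3, 4]` would grow it to 9.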

Method and system for efficient communication and command system for deferred operation
10095627 · 2018-10-09

A method and system for efficiently executing a delegate of a program by a processor coupled to an external memory. A payload including state data or command data is bound with a program delegate. The payload is mapped with the delegate via a payload identifier. The payload is pushed to a repository buffer in the external memory. The payload is flushed by reading the payload identifier and loading the payload from the repository buffer. The delegate is executed using the loaded payload.

System, Apparatus And Method For Overriding Of Non-Locality-Based Instruction Handling

In one embodiment, a processor includes: a core including a decode unit to decode a memory access instruction having a no-locality hint to indicate that data associated with the memory access instruction has at least one of non-spatial locality and non-temporal locality; and a locality controller to determine whether to override the no-locality hint based at least in part on one or more performance monitoring values. Other embodiments are described and claimed.

DYNAMIC FILL POLICY FOR A SHARED CACHE
20180285261 · 2018-10-04

Technologies are provided in embodiments to dynamically fill a shared cache. At least some embodiments include determining that data requested in a first request for the data by a first processing device is not stored in a cache shared by the first processing device and a second processing device, where a dynamic fill policy is applicable to the first request. Embodiments further include determining to deallocate, based at least in part on a threshold, an entry in a buffer, the entry containing information corresponding to the first request for the data. Embodiments also include sending a second request for the data to a system memory, and sending the data from the system memory to the first processing device. In more specific embodiments, the data from the system memory is not written to the cache based, at least in part, on the determination to deallocate the entry.