G06F2212/6046

METHOD AND SYSTEM FOR EFFICIENT COMMUNICATION AND COMMAND SYSTEM FOR DEFERRED OPERATION
20180011794 · 2018-01-11 ·

A method and system for efficiently executing a delegate of a program by a processor coupled to an external memory. A payload including state data or command data is bound with a program delegate. The payload is mapped with the delegate via the payload identifier. The payload is pushed to a repository buffer in the external memory. The payload is flushed by reading the payload identifier and loading the payload from the repository buffer. The delegate is executed using the loaded payload.

Oldest operation wait time indication input into set-dueling
11561895 · 2023-01-24 · ·

Systems, apparatuses, and methods for dynamically adjusting cache policies to reduce execution core wait time are disclosed. A processor includes a cache subsystem. The cache subsystem includes one or more cache levels and one or more cache controllers. A cache controller partitions a cache level into two test portions and a remainder portion. The cache controller applies a first policy to the first test portion and applies a second policy to the second test portion. The cache controller determines the amount of time the execution core spends waiting on accesses to the first and second test portions. If the measured wait time is less for the first test portion than for the second test portion, then the cache controller applies the first policy to the remainder portion. Otherwise, the cache controller applies the second policy to the remainder portion.

Cache management method using object-oriented manner and associated microcontroller
11693782 · 2023-07-04 · ·

The present invention provides a microcontroller, wherein the microcontroller includes a processor, a first memory and a cache controller. The first memory includes at least a working space. The cache controller is coupled to the first memory, and is arranged for managing the working space of the first memory, and dynamically loading at least one object from a second memory to the working space of the first memory in an object-oriented manner.

LATERAL PERSISTENCE DIRECTORY STATES

Aspects of the invention include defining one or more processor units having a plurality of caches, each processor unit comprising a processor having at least one cache, and wherein each of the one or more processor units are coupled together by an interconnect fabric, for each of the plurality of caches, arranging a plurality of cache lines into one or more congruence classes, each congruence class comprises a chronology vector, arranging each cache in the plurality of caches into a cluster of caches based on a plurality of scope domains, determining a first cache line to evict based on the chronology vector, and determining a target cache for installing the first cache line based on a scope of the first cache line and a saturation metric associated with the target cache, wherein the scope of the first cache line is determined based on lateral persistence tag bits.

System cache optimizations for deep learning compute engines
11586558 · 2023-02-21 · ·

In an example, an apparatus comprises a plurality of compute engines; and logic, at least partially including hardware logic, to detect a cache line conflict in a last-level cache (LLC) communicatively coupled to the plurality of compute engines; and implement context-based eviction policy to determine a cache way in the cache to evict in order to resolve the cache line conflict. Other embodiments are also disclosed and claimed.

Write cache for neural network inference circuit
11586910 · 2023-02-21 · ·

Some embodiments provide a neural network inference circuit (NNIC) for executing a neural network that includes computation nodes at multiple layers. The NNIC includes multiple value computation circuits for computing output values of computation nodes. The NNIC includes a set of memories for storing the output values of computation nodes for use as input values to computation nodes in subsequent layers of the neural network. The NNIC includes a set of write control circuits for writing the computed output values to the set of memories. Upon receiving a set of computed output values, a write control circuit (i) temporarily stores the set of computed output values in a cache when adding the set of computed output values to the cache does not cause the cache to fill up and (ii) writes data in the cache to the set of memories when the cache fills up.

Managing memory allocation between input/output adapter caches

A first cache of a first IOA is detected storing an amount of data that satisfies a memory shortage threshold. A request for extra memory for the first IOA is transmitted. The request is sent in response to detecting that the first cache stores the amount of data that satisfies the memory shortage threshold. The request is transmitted to a plurality of IOAs of a computer system. A second cache of a second IOA is detected storing an amount of data that satisfies a memory dissemination threshold. Memory of the second cache is allocated to the first cache. The memory is allocated in response to the request and the amount of data in the second cache satisfying the memory dissemination threshold.

Dynamic cache allocation policy adaptation in a data processing apparatus
09836403 · 2017-12-05 · ·

A data processing apparatus and method of processing data are disclosed according to which a processor unit is configured to issue write access requests for memory which are buffered and handled by a memory access buffer. A cache unit is configured, in dependence on an allocation policy defined for the cache unit, to cache accessed data items. Memory transactions are constrained to be carried out so that all of a predetermined range of memory addresses within which one or more memory addresses specified by the buffered write access requests lie must be written by the corresponding write operation. If the buffered write access requests do not comprise all memory addresses within at least two predetermined ranges of memory addresses, and the cache unit is configured to operate with a no-write allocate policy, the data processing apparatus is configured to cause the cache unit to subsequently operate with a write allocate policy.

USING AN ACCESS INCREMENT NUMBER TO CONTROL A DURATION DURING WHICH TRACKS REMAIN IN CACHE

Provided are a computer program product, system, and method for using an access increment number to control a duration during which tracks remain in cache. Tracks in a storage in the cache are indicated in a cache list. For each of the tracks indicated in the cache list, an access value is updated when one of the tracks is accessed in the cache. An access to a track in the cache indicated in the cache list is received. A determination is made as to whether an access increment number for the accessed track, wherein the access increment number is greater than one. The access value for the accessed track is incremented by the determined access increment number in response to the track being accessed in the cache. The access value for one of the tracks is used to determine whether to initiate to demote the track from the cache.

Dynamically-Adjusted Host Memory Buffer
20170293562 · 2017-10-12 ·

Host memory buffer is dynamically adjusted based on performance. As memory pages are accessed, one or more counts of the memory pages are maintained. If the counts indicate some of the memory pages are identical, then a portion of host system memory allocated to buffer cache may be reduced or decremented in response to repetitive access. However, if the counts indicate different memory pages are accessed, then the host system memory allocated to the buffer cache may be increased or incremented.