G06F2212/6042

APPARATUSES, SYSTEMS, AND METHODS FOR CONTROLLING CACHE ALLOCATIONS IN A CONFIGURABLE COMBINED PRIVATE AND SHARED CACHE IN A PROCESSOR-BASED SYSTEM

Apparatuses, systems, and methods for controlling cache allocations in a configurable combined private and shared cache in a processor-based system are disclosed. The processor-based system is configured to receive a cache allocation request to allocate a line in a shared cache structure; the request may further include a client identification (ID). The cache allocation request and the client ID can be compared to a sub-non-uniform memory access (sub-NUMA) bit mask and a client allocation bit mask to generate a cache allocation vector. The sub-NUMA bit mask may have been programmed to indicate that processing cores associated with a sub-NUMA region are available, whereas processing cores associated with other sub-NUMA regions are not available, and the client allocation bit mask may have been programmed to indicate which processing cores are available. The sub-NUMA bit mask and the client allocation bit mask can be combined to create a cache allocation vector that indicates which processing cores may service the cache allocation request to allocate a line.
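The mask combination described in this abstract can be illustrated with a minimal Python sketch. The abstract only says the two masks are "combined"; the bitwise AND used here, the 8-core layout, the mask values, and the function names are all assumptions made for illustration, not details from the patent.

```python
from typing import List

def cache_allocation_vector(sub_numa_mask: int, client_mask: int) -> int:
    # Assumption: "combined" means bitwise AND, so a core stays eligible
    # only when both masks mark it as available.
    return sub_numa_mask & client_mask

def eligible_cores(vector: int, num_cores: int) -> List[int]:
    # Expand the vector into the list of core indices whose bit is set.
    return [core for core in range(num_cores) if (vector >> core) & 1]

# Hypothetical 8-core system; bit i corresponds to processing core i.
SUB_NUMA_MASK = 0b00001111                 # cores 0-3 form the requester's sub-NUMA region
CLIENT_ALLOCATION_MASKS = {7: 0b01100110}  # client ID 7 may allocate on cores 1, 2, 5, 6

vector = cache_allocation_vector(SUB_NUMA_MASK, CLIENT_ALLOCATION_MASKS[7])
print(eligible_cores(vector, 8))  # [1, 2]
```

With these example values, only cores 1 and 2 are marked available in both masks, so only they remain candidates to service the allocation.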

Distributed numeric sequence generation
11526927 · 2022-12-13

Various embodiments of a distributed numeric sequence generation system and method are described. In particular, some embodiments provide high-scale, high-availability, low-cost and low-maintenance numeric sequence generation in a non-Relational Database Management System (“non-RDBMS”) environment by sacrificing monotonicity. The distributed numeric sequence generation system comprises a plurality of hosts, wherein individual hosts implement a cache for caching a plurality of numeric sequences. A host can access master numeric sequence data at a separate system to obtain values for numeric sequences to store in its cache. A host can receive a request from a client for values of a numeric sequence, and provide to the client the values for the numeric sequence from its cache. Some embodiments of the distributed numeric sequence generation system and method are also equipped to vend recyclable and bounded numeric sequences.
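The host-side caching in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions: the class names, the block-reservation interface, and the in-process `MasterSequence` stand-in for the separate system are hypothetical, and recyclable or bounded sequences are not modeled.

```python
import threading

class MasterSequence:
    """Hypothetical stand-in for the separate system holding master sequence data."""

    def __init__(self):
        self._next = {}
        self._lock = threading.Lock()

    def reserve(self, name, count):
        # Hand out a contiguous block of values that no other host will receive.
        with self._lock:
            start = self._next.get(name, 0)
            self._next[name] = start + count
            return range(start, start + count)

class Host:
    """Caches reserved blocks and serves client requests from the cache."""

    def __init__(self, master, block_size=100):
        self._master = master
        self._block_size = block_size  # must be positive
        self._cache = {}               # sequence name -> iterator over the reserved block

    def next_value(self, name):
        while True:
            block = self._cache.get(name)
            value = next(block, None) if block is not None else None
            if value is not None:
                return value
            # Cache empty or exhausted: refill from the master data.
            self._cache[name] = iter(self._master.reserve(name, self._block_size))

master = MasterSequence()
host_a, host_b = Host(master, block_size=3), Host(master, block_size=3)
values = [host_a.next_value("order_id"),
          host_b.next_value("order_id"),
          host_a.next_value("order_id")]
print(values)  # [0, 3, 1]
```

The printed order shows the trade-off the abstract names: values stay unique across hosts, but the order in which clients see them is not monotonic.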

Techniques to support a holistic view of cache class of service for a processor cache

Techniques provide a holistic view of cache class of service (CLOS), including an allocation of processor cache resources to a plurality of CLOS. The allocation of processor cache resources includes allocation of cache ways for an n-way set-associative cache. Examples include monitoring usage of the plurality of CLOS to determine processor cache resource usage and reporting that usage.
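As a rough illustration of way allocation and usage reporting, the sketch below assigns each CLOS a bit mask of cache ways and counts occupied lines per CLOS. The 8-way cache, the mask values, and the function names are assumptions for this example; the abstract does not specify how allocation or monitoring is implemented.

```python
from collections import Counter

NUM_WAYS = 8  # hypothetical 8-way set-associative cache

# Hypothetical allocation: bit i set means the CLOS may fill cache way i.
CLOS_WAY_MASKS = {0: 0b00001111, 1: 0b00110000, 2: 0b11000000}

def ways_for(clos):
    # List the cache ways allocated to the given CLOS.
    mask = CLOS_WAY_MASKS[clos]
    return [way for way in range(NUM_WAYS) if (mask >> way) & 1]

def report_usage(line_owners):
    # line_owners holds one CLOS id per occupied cache line;
    # the report gives the number of occupied lines per CLOS.
    counts = Counter(line_owners)
    return {clos: counts[clos] for clos in CLOS_WAY_MASKS}

print(ways_for(1))                    # [4, 5]
print(report_usage([0, 0, 1, 2, 0]))  # {0: 3, 1: 1, 2: 1}
```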

Spiking neural network accelerator using external memory
11593623 · 2023-02-28

System configurations and techniques for implementation of a neural network in neuromorphic hardware with use of external memory resources are described herein. In an example, a system for processing spiking neural network operations includes: a plurality of neural processor clusters to maintain neurons of the neural network, with the clusters including circuitry to determine respective states of the neurons and internal memory to store the respective states of the neurons; and a plurality of axon processors to process synapse data of synapses in the neural network, with the processors including circuitry to retrieve synapse data of respective synapses from external memory, evaluate the synapse data based on a received spike message, and propagate another spike message to another neuron based on the synapse data. Further details for use and access of the external memory and processing configurations for such neural network operations are also disclosed.

TIERED CACHING OF DATA IN A STORAGE SYSTEM
20230055389 · 2023-02-23

A first read request for data stored at a non-volatile memory is received by a primary storage controller. The data is programmed from the non-volatile memory to a first cache of the primary storage controller, the first cache to store the data over a first time range. A second read request is received for the data. In response to receiving the second read request for the data, the data is programmed to a second cache to store the data over a second time range that is greater than the first time range. A notification is transmitted to a secondary storage controller, the notification including information associated with the programming of the data to the second cache.
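The read path in this abstract can be sketched as a two-tier cache with different retention times. Everything concrete here is an assumption for illustration: the class and parameter names, the time-to-live values, the dictionary standing in for non-volatile memory, and the callback standing in for the notification to the secondary storage controller. Time is passed in explicitly so the example is deterministic.

```python
class PrimaryStorageController:
    def __init__(self, non_volatile, notify_secondary, first_ttl=10.0, second_ttl=600.0):
        self.non_volatile = non_volatile          # dict standing in for non-volatile memory
        self.notify_secondary = notify_secondary  # callable standing in for the notification path
        self.first_ttl = first_ttl                # first (shorter) time range
        self.second_ttl = second_ttl              # second (longer) time range
        self.first_cache = {}                     # key -> (data, expires_at)
        self.second_cache = {}                    # key -> (data, expires_at)

    @staticmethod
    def _lookup(cache, key, now):
        # Return the entry only if it exists and has not expired.
        entry = cache.get(key)
        if entry is not None and entry[1] > now:
            return entry
        return None

    def read(self, key, now):
        hit = self._lookup(self.second_cache, key, now)
        if hit is not None:
            return hit[0]
        hit = self._lookup(self.first_cache, key, now)
        if hit is not None:
            # Second read: store the data in the longer-lived cache and
            # tell the secondary controller about it.
            expires_at = now + self.second_ttl
            self.second_cache[key] = (hit[0], expires_at)
            self.notify_secondary({"key": key, "expires_at": expires_at})
            return hit[0]
        # First read: load from non-volatile memory into the short-lived cache.
        data = self.non_volatile[key]
        self.first_cache[key] = (data, now + self.first_ttl)
        return data

notifications = []
controller = PrimaryStorageController({"blk0": b"payload"}, notifications.append)
controller.read("blk0", now=0.0)  # first read fills the first cache
controller.read("blk0", now=1.0)  # second read promotes and notifies
print(notifications)  # [{'key': 'blk0', 'expires_at': 601.0}]
```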

System cache optimizations for deep learning compute engines
11586558 · 2023-02-21

In an example, an apparatus comprises a plurality of compute engines; and logic, at least partially including hardware logic, to detect a cache line conflict in a last-level cache (LLC) communicatively coupled to the plurality of compute engines; and implement a context-based eviction policy to determine a cache way in the cache to evict in order to resolve the cache line conflict. Other embodiments are also disclosed and claimed.

METHODS AND APPARATUS FOR ALLOCATION IN A VICTIM CACHE SYSTEM

Methods, apparatus, systems and articles of manufacture are disclosed for allocation in a victim cache system. An example apparatus includes a first cache storage, a second cache storage, and a cache controller coupled to the first cache storage and the second cache storage and operable to receive a memory operation that specifies an address, determine, based on the address, that the memory operation evicts a first set of data from the first cache storage, determine that the first set of data is unmodified relative to an extended memory, and cause the first set of data to be stored in the second cache storage.
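A minimal sketch of the allocation path described above, assuming a direct-mapped first cache and an unbounded second (victim) storage. The class name, the set-index calculation, and the write-back of modified lines to extended memory are assumptions for illustration; the abstract only covers the case where the evicted data is unmodified.

```python
class VictimCacheSystem:
    def __init__(self, extended_memory, num_sets=4):
        self.memory = extended_memory  # dict standing in for extended memory
        self.num_sets = num_sets
        self.first = {}                # set index -> (address, data, modified)
        self.second = {}               # address -> data (victim storage)

    def access(self, address, write_data=None):
        """Read the address, or write it when write_data is given."""
        index = address % self.num_sets
        line = self.first.get(index)
        if line is not None and line[0] == address:
            # Hit in the first cache storage.
            if write_data is not None:
                line = (address, write_data, True)
                self.first[index] = line
            return line[1]
        if line is not None:
            # The operation evicts the line currently held in this set.
            old_address, old_data, modified = line
            if modified:
                # Assumption: modified lines are written back instead.
                self.memory[old_address] = old_data
            else:
                # Unmodified relative to extended memory: keep it in the victim storage.
                self.second[old_address] = old_data
        cached = self.second.pop(address, None)
        if write_data is not None:
            self.first[index] = (address, write_data, True)
            return write_data
        data = cached if cached is not None else self.memory.get(address)
        self.first[index] = (address, data, False)
        return data

system = VictimCacheSystem({0: "a", 4: "b"})
system.access(0)      # loads address 0 into set 0 (unmodified)
system.access(4)      # maps to the same set, evicting address 0
print(system.second)  # {0: 'a'}
```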

Tiered caching of data in a storage system

A first read request for data stored at a non-volatile memory is received by a primary storage controller. The data is programmed from the non-volatile memory to a first cache of the primary storage controller, the first cache to store the data over a first time range. A second read request is received for the data. In response to receiving the second read request for the data, the data is programmed to a second cache to store the data over a second time range that is greater than the first time range. A notification is transmitted to a secondary storage controller, the notification including information associated with the programming of the data to the second cache.

Methods and apparatus for implementing cache policies in a graphics processing unit

A method of processing a workload in a graphics processing unit (GPU) may include detecting a work item of the workload in the GPU, determining a cache policy for the work item, and operating at least a portion of a cache memory hierarchy in the GPU for at least a portion of the work item based on the cache policy. The work item may be detected based on information received from an application and/or monitoring one or more performance counters by a driver and/or hardware detection logic. The method may further include monitoring one or more performance counters, wherein the cache policy for the work item may be determined and/or changed based on the one or more performance counters. The cache policy for the work item may be selected based on a runtime learning model.

Secure fast reboot of a virtual machine

A system for managing a virtual machine is provided. The system includes a processor configured to initiate a session for accessing a virtual machine by accessing an operating system image from a system disk and monitor read and write requests generated during the session. The processor is further configured to write any requested information to at least one of a memory cache and a write back cache located separately from the system disk and read the operating system image content from at least one of the system disk and a host cache operably coupled between the system disk and the processor. Upon completion of the session, the processor is configured to clear the memory cache, clear the write back cache, and reboot the virtual machine using the operating system image stored on the system disk or stored in the host cache.