COMPUTATION PROCESSING APPARATUS AND METHOD OF PROCESSING COMPUTATION
20230169009 · 2023-06-01 ·

A computation processing apparatus capable of executing threads includes: a cache including ways that each include storage areas identified by index addresses; and a processor coupled to the cache and configured to: determine a cache hit; hold a way number and an index address that identify a storage area holding target data of an atomic instruction executed by any one of the threads; determine a conflict between instructions in a case where the pair of the way number and the index address matches a pair of a way number and an index address that identify a storage area holding target data of a memory access instruction executed by another one of the threads; and suppress input and output of the target data of the memory access instruction to and from the cache when determining the conflict.
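The conflict check described above can be sketched as a comparison of (way number, index address) pairs; the class, method names, and data structures below are illustrative assumptions, not taken from the patent.

```python
class AtomicConflictDetector:
    """Illustrative sketch: track the (way, index) pair held by an in-flight
    atomic instruction and flag conflicting accesses from other threads."""

    def __init__(self):
        self.held = {}  # thread_id -> (way_number, index_address)

    def begin_atomic(self, thread_id, way, index):
        # Record the storage area targeted by the atomic instruction.
        self.held[thread_id] = (way, index)

    def end_atomic(self, thread_id):
        self.held.pop(thread_id, None)

    def conflicts(self, thread_id, way, index):
        # A memory access conflicts if another thread's atomic instruction
        # targets the same (way, index) storage area; on a conflict, the
        # access's cache input/output would be suppressed.
        return any(pair == (way, index)
                   for tid, pair in self.held.items() if tid != thread_id)

d = AtomicConflictDetector()
d.begin_atomic(thread_id=0, way=2, index=0x40)
print(d.conflicts(thread_id=1, way=2, index=0x40))  # True: suppress the access
print(d.conflicts(thread_id=1, way=3, index=0x40))  # False: different way
```

Note that both the way number and the index address must match; a matching index in a different way identifies a different storage area and raises no conflict.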

SYSTEM AND METHODS FOR PROVIDING FAST CACHEABLE ACCESS TO A KEY-VALUE DEVICE THROUGH A FILESYSTEM INTERFACE
20170242867 · 2017-08-24 ·

A system and method for leveraging a native operating system (130) page cache (315) when using non-block system storage devices (120) is disclosed. A computer (105) may include a processor (110), memory (115), and a non-block system storage device (120). A file system (135) may be stored in memory (115) and running on the processor (110), which may include a page cache (315). A key-value file system (KVFS) (145) may reside between the file system (135) and the storage device (120) and may map received file system commands (310) to key-value system commands (330) that may be executed by the storage device (120). Results of the key-value system commands (330) may be returned to the file system (135), permitting the operating system (130) to cache data in the page cache (315).
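The command-mapping step performed by the KVFS layer can be sketched as a simple translation table; the command names and the use of the file path as the key are assumptions for illustration, as the abstract does not specify the exact mapping.

```python
def kvfs_map(fs_command):
    """Illustrative KVFS translation: map a file system command to a
    key-value system command (the table below is assumed, not from the patent)."""
    table = {"read": "GET", "write": "PUT", "unlink": "DELETE"}
    op, path = fs_command
    return (table[op], path)  # the file path serves as the key

print(kvfs_map(("read", "/data/log.txt")))   # ('GET', '/data/log.txt')
print(kvfs_map(("unlink", "/tmp/scratch")))  # ('DELETE', '/tmp/scratch')
```

Because results flow back through the ordinary file system, the operating system's page cache can retain them as it would for any block device.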

Memory nest efficiency with cache demand generation

Embodiments of the disclosure relate to optimizing a memory nest for a workload. Aspects include an operating system determining the cache/memory footprint of each work unit of the workload and assigning a time slice to each work unit of the workload based on the cache/memory footprint of each work unit. Aspects further include executing the workload on a processor by providing each work unit access to the processor for the time slice assigned to each work unit.
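One hypothetical policy consistent with this abstract is to make each work unit's time slice proportional to its measured cache/memory footprint; the proportional rule and the parameter names below are assumptions, not the patent's specific method.

```python
def assign_time_slices(footprints, budget_ms=10.0):
    """Illustrative policy: split a scheduling budget among work units in
    proportion to each unit's cache/memory footprint."""
    total = sum(footprints.values())
    return {unit: budget_ms * fp / total for unit, fp in footprints.items()}

# Work unit A has half of the total footprint, so it gets half of the budget.
slices = assign_time_slices({"A": 64, "B": 32, "C": 32})
print(slices["A"])  # 5.0
print(slices["B"])  # 2.5
```

The processor then runs each work unit for its assigned slice, so units that populate more of the memory nest keep it warm for longer before being switched out.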

READ PREDICTION DURING A SYSTEM BOOT PROCEDURE
20210373907 · 2021-12-02 ·

Methods, systems, and devices for read prediction during a system boot procedure are described. A memory device may identify a command for a boot procedure and transfer data stored in a memory array to a cache of the memory device. In some cases, the memory device may prefetch data used during the boot procedure and thereby improve the latency of the boot procedure. When the memory device receives a command that requests data stored in the memory array as part of the boot procedure, the memory device may identify a cache hit based on prefetching the requested data before the command is received. In such cases, the memory device may retrieve the prefetched data from the cache.
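The prefetch-then-hit behavior can be modeled in a few lines; the class name, address scheme, and prediction input are illustrative assumptions, since the abstract does not describe how the predicted addresses are obtained.

```python
class BootCache:
    """Illustrative sketch: addresses predicted for the boot procedure are
    prefetched from the memory array into a cache, so later read commands
    hit without touching the array."""

    def __init__(self, memory_array, predicted_addresses):
        self.memory = memory_array
        # Prefetch predicted data before any read command arrives.
        self.cache = {a: memory_array[a] for a in predicted_addresses}
        self.hits = 0

    def read(self, address):
        if address in self.cache:       # cache hit thanks to prefetching
            self.hits += 1
            return self.cache[address]
        return self.memory[address]     # miss: fall back to the memory array

mem = {0: b"bootloader", 1: b"kernel", 2: b"initrd"}
dev = BootCache(mem, predicted_addresses=[0, 1])
dev.read(0); dev.read(1); dev.read(2)
print(dev.hits)  # 2: the two prefetched addresses were served from the cache
```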

DYNAMIC LAST LEVEL CACHE ALLOCATION FOR CLOUD REAL-TIME WORKLOADS
20220197700 · 2022-06-23 ·

A system includes a memory, a processor in communication with the memory, and an operating system (“OS”) executing on the processor. The processor belongs to a processor socket. The OS is configured to pin a workload of a plurality of workloads to the processor belonging to the processor socket. Each respective processor belonging to the processor socket shares a common last-level cache (“LLC”). The OS is also configured to measure an LLC occupancy for the workload, reserve the LLC occupancy for the workload thereby isolating the workload from other respective workloads of the plurality of workloads sharing the processor socket, and maintain isolation by monitoring the LLC occupancy for the workload.
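The reserve-and-monitor loop can be sketched as simple bookkeeping; a real system would enforce the reservation with a hardware mechanism such as cache allocation technology, which the abstract does not name, so everything below is an illustrative model.

```python
class LLCAllocator:
    """Illustrative model of reserving last-level cache occupancy for a
    pinned workload and monitoring it to maintain isolation."""

    def __init__(self, llc_size_mb):
        self.free = llc_size_mb
        self.reserved = {}

    def reserve(self, workload, measured_occupancy_mb):
        # Reservation isolates the workload from others sharing the socket.
        if measured_occupancy_mb > self.free:
            raise ValueError("not enough LLC left to isolate this workload")
        self.free -= measured_occupancy_mb
        self.reserved[workload] = measured_occupancy_mb

    def over_budget(self, workload, current_occupancy_mb):
        # Monitoring step: detect when a workload exceeds its reservation.
        return current_occupancy_mb > self.reserved[workload]

llc = LLCAllocator(llc_size_mb=32)
llc.reserve("rt-task", measured_occupancy_mb=8)
print(llc.free)                        # 24 MB remain for other workloads
print(llc.over_budget("rt-task", 10))  # True: occupancy drifted past the reservation
```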

Cache filter
11366762 · 2022-06-21 ·

The present disclosure includes apparatuses and methods related to a memory system including a filter. An example apparatus can include a filter to store a number of flags, wherein each of the number of flags corresponds to a cache entry and identifies a portion of the memory device where data of the corresponding cache entry is stored.
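The filter amounts to a per-entry flag array mapping each cache entry to a portion of the memory device; the class below is an illustrative sketch with assumed names, not the patent's implementation.

```python
class CacheFilter:
    """Illustrative sketch: one flag per cache entry, each flag identifying
    the portion of the memory device that backs that entry."""

    def __init__(self, num_entries):
        self.flags = [None] * num_entries   # one flag per cache entry

    def set_flag(self, entry, portion):
        # e.g. portion 0 = one region of the device, portion 1 = another
        self.flags[entry] = portion

    def portion_of(self, entry):
        return self.flags[entry]

f = CacheFilter(num_entries=4)
f.set_flag(0, portion=1)
print(f.portion_of(0))  # 1: entry 0's data lives in portion 1 of the device
```

With such a filter, the memory system can route a writeback or fill for a given cache entry to the correct device region without consulting the device itself.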

Memory device implementing multiple port read
11367480 · 2022-06-21 ·

A memory device provides for a multiple-port read operation, and includes an array of bitcells and a control circuit. Each bitcell of the array includes a write wordline port and a first read wordline port. The control circuit provides an output to the write wordline port, and includes as inputs a write select port and a second read wordline port. In a write mode, the control circuit couples the write select port to the output and disables the second read port. In a read mode, the control circuit couples the second read wordline port to the output and disables the write select port, thereby enabling a multiple-port read operation to the array of bitcells.
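The control circuit's mode-dependent selection can be modeled behaviorally as a multiplexer; the function below is an illustrative software model of that selection logic, not a circuit description.

```python
def control_output(mode, write_select, second_read_wordline):
    """Behavioral model of the control circuit: in write mode the write
    select port drives the output and the second read port is disabled;
    in read mode the roles are reversed (signal names follow the abstract)."""
    if mode == "write":
        return write_select          # second read wordline ignored (disabled)
    if mode == "read":
        return second_read_wordline  # write select ignored (disabled)
    raise ValueError("mode must be 'read' or 'write'")

print(control_output("write", write_select=1, second_read_wordline=0))  # 1
print(control_output("read",  write_select=1, second_read_wordline=0))  # 0
```

Because the second read wordline reuses the same output wire as the write select, each bitcell gains a second read port in read mode without a dedicated extra port.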

Multi-core interconnection bus, inter-core communication method, and multi-core processor

The present invention discloses a multi-core interconnection bus that includes a request transceiver module, a request execution module, and a snoop and caching module. The request transceiver module is adapted to receive a data request, including a request address, from a processor core and forward the data request to the snoop and caching module through the request execution module. The snoop and caching module is adapted to look up cache data validity information for the request address, acquire data from a shared cache, and sequentially return the cache data validity information and the acquired data to the request execution module. The request execution module is adapted to determine, based on the cache data validity information, a target processor core whose local cache stores valid data, forward the data request to the target processor core, and receive the returned data; it then determines response data from the data returned by the target processor core and the data returned by the snoop and caching module, and returns the response data, through the request transceiver module, to the processor core that initiated the data request. The present invention also discloses a corresponding inter-core communication method and a multi-core processor.
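The request-resolution flow can be sketched as: consult the validity information, prefer a core whose local cache holds valid data, and otherwise fall back to the shared cache. The dictionaries and tie-breaking rule below are illustrative assumptions.

```python
def handle_request(addr, validity, local_caches, shared_cache):
    """Illustrative sketch of the bus flow: pick the target processor core
    whose local cache holds valid data for the address; if none does, use
    the data acquired from the shared cache."""
    valid_cores = [core for core, ok in validity.get(addr, {}).items() if ok]
    if valid_cores:
        target = valid_cores[0]            # target processor core
        return local_caches[target][addr]  # data returned by that core
    return shared_cache[addr]              # fall back to the shared cache

validity = {0x100: {"core0": False, "core1": True}}
local = {"core1": {0x100: "fresh"}}
shared = {0x100: "stale", 0x200: "shared-only"}
print(handle_request(0x100, validity, local, shared))  # 'fresh'
print(handle_request(0x200, validity, local, shared))  # 'shared-only'
```

Choosing the response in this order means a stale shared-cache copy is never returned when some core's local cache holds the valid line.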

Virtual media performance improvement

An information handling system may include a host system and a management controller configured to provide out-of-band management of the information handling system. The management controller may be configured to: receive, from a management console, a request to establish virtual media for the host system; cause the requested virtual media to be mounted as a drive accessible to the host system; receive read requests from the host system for data associated with the mounted drive; and cache data from the virtual media in a local cache such that at least some of the read requests from the host system are serviceable via the local cache instead of via a network request to the management console.
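The read path can be sketched as a cache in front of the network fetch; the fetch callable below stands in for a request to the management console, and all names are illustrative.

```python
class VirtualMediaCache:
    """Illustrative sketch of the management controller's read path: serve
    repeated reads from a local cache so only cache misses trigger a
    network request to the management console."""

    def __init__(self, fetch_from_console):
        self.fetch = fetch_from_console  # stands in for the network request
        self.cache = {}
        self.network_requests = 0

    def read(self, block):
        if block not in self.cache:
            self.network_requests += 1       # only misses go over the network
            self.cache[block] = self.fetch(block)
        return self.cache[block]

vm = VirtualMediaCache(lambda blk: f"data-{blk}")
vm.read(7); vm.read(7); vm.read(7)
print(vm.network_requests)  # 1: two of the three reads were served locally
```

Host boot sequences tend to re-read the same virtual media blocks, which is why even a small local cache can remove most of the console round trips.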

High-throughput software-defined convolutional interleavers and de-interleavers
11722154 · 2023-08-08 ·

High-throughput software-defined convolutional interleavers and de-interleavers are provided herein. In some examples, a method for generating convolutionally interleaved samples on a general-purpose processor with cache is provided. Memory is represented as a three-dimensional array, indexed by block number, row, and column. Input samples may be written to the cache according to an indexing scheme. Output samples may be generated every MN samples by reading out the samples from the cache in a transposed and vectorized order.
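The memory layout alone can be sketched as follows: a blocks × M × N array written in row-major order, with one block read back in transposed, vectorized order every M·N samples. The values of M and N are illustrative, and the per-row delays of a true convolutional interleaver are omitted, so this shows only the indexing scheme, not the full method.

```python
# Illustrative memory layout: a BLOCKS x M x N array, written row-major.
# M, N, and the sample values are assumptions for demonstration.
BLOCKS, M, N = 2, 3, 4
mem = [[[b * M * N + r * N + c for c in range(N)]
        for r in range(M)] for b in range(BLOCKS)]

def read_block_transposed(block):
    # Every M*N input samples, one block is read out column-by-column
    # (the transpose) and flattened into an output vector.
    return [mem[block][r][c] for c in range(N) for r in range(M)]

print(read_block_transposed(0))
# [0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11]
```

Keeping each block contiguous means the transposed readout walks the cache with a small, fixed stride, which is what makes the software implementation high-throughput on a cached general-purpose processor.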