G06F2212/304

ARCHITECTURAL SUPPORT FOR PERSISTENT APPLICATIONS

Illustrative embodiments are directed to methods, apparatus and computer program products for caching at least a fraction of data stored in a non-volatile memory in a mirror region of a dynamic random access memory. A memory controller hub of a processor chip coupled to both the non-volatile memory and the dynamic random access memory is configured to, when an update to the dynamic random access memory is cached in the mirror region of the dynamic random access memory, use the memory controller hub to write the update directly to the mirror region of the dynamic random access memory and concurrently mirror the update to the non-volatile memory to provide coherent persistent durability of the update. When a read from the dynamic random access memory is cached in the mirror region of the dynamic random access memory, embodiments can use the memory controller hub to serve the read directly from the mirror region of the dynamic random access memory to optimize read operations of persistent objects.

Configurable cache for multi-endpoint heterogeneous coherent system

A device includes a memory bank. The memory bank includes data portions of a first way group. The data portions of the first way group include a data portion of a first way of the first way group and a data portion of a second way of the first way group. The memory bank further includes data portions of a second way group. The device further includes a configuration register and a controller configured to individually allocate, based on one or more settings in the configuration register, the first way and the second way to one of an addressable memory space and a data cache.

MULTI-PROCESSOR BRIDGE WITH CACHE ALLOCATE AWARENESS
20210349821 · 2021-11-11 ·

Techniques for loading data, comprising receiving a memory management command to perform a memory management operation to load data into the cache memory before execution of an instruction that requests the data, formatting the memory management command into one or more instruction for a cache controller associated with the cache memory, and outputting an instruction to the cache controller to load the data into the cache memory based on the memory management command.

VIRTUAL NETWORK PRE-ARBITRATION FOR DEADLOCK AVOIDANCE AND ENHANCED PERFORMANCE
20230325078 · 2023-10-12 ·

A device includes a data path, a first interface configured to receive a first memory access request from a first peripheral device, and a second interface configured to receive a second memory access request from a second peripheral device. The device further includes an arbiter circuit configured to, in a first clock cycle, a pre-arbitration winner between a first memory access request and a second memory access request based on a first number of credits allocated to a first destination device and a second number of credits allocated to a second destination device. The arbiter circuit is further configured to, in a second clock cycle select a final arbitration winner from among the pre-arbitration winner and a subsequent memory access request based on a comparison of a priority of the pre-arbitration winner and a priority of the subsequent memory access request.

Multicore shared cache operation engine

Techniques including receiving configuration information for a trigger control channel of the one or more trigger control channels, the configuration information defining a first one or more triggering events, receiving a first memory management command, store the first memory management command, detecting a first one or more triggering events, and triggering the stored first memory management command based on the detected first one or more triggering events.

Delayed snoop for improved multi-process false sharing parallel thread performance

Techniques for maintaining cache coherency comprising storing data blocks associated with a main process in a cache line of a main cache memory, storing a first local copy of the data blocks in a first local cache memory of a first processor, storing a second local copy of the set of data blocks in a second local cache memory of a second processor executing a first child process of the main process to generate first output data, writing the first output data to the first data block of the first local copy as a write through, writing the first output data to the first data block of the main cache memory as a part of the write through, transmitting an invalidate request to the second local cache memory, marking the second local copy of the set of data blocks as delayed, and transmitting an acknowledgment to the invalidate request.

CONFIGURABLE CACHE FOR COHERENT SYSTEM
20230384931 · 2023-11-30 ·

A device includes a memory bank. The memory bank includes data portions of a first way group. The data portions of the first way group include a data portion of a first way of the first way group and a data portion of a second way of the first way group. The memory bank further includes data portions of a second way group. The device further includes a configuration register and a controller configured to individually allocate, based on one or more settings in the configuration register, the first way and the second way to one of an addressable memory space and a data cache.

CREDIT AWARE CENTRAL ARBITRATION FOR MULTI-ENDPOINT, MULTI-CORE SYSTEM
20220374356 · 2022-11-24 ·

A device includes a data path, a first interface configured to receive a first memory access request from a first peripheral device, and a second interface configured to receive a second memory access request from a second peripheral device. The device further includes an arbiter circuit configured to determine a first destination device connected to the data path and associated with the first memory access request and a first credit threshold corresponding to the first memory access request. The arbiter circuit is further configured to determine a second destination device connected to the data path and associated with the second memory access request and a second credit threshold corresponding to the second memory access request. The arbiter circuit is configured to arbitrate access to the data path by the first memory access request and the second memory access request based on the first credit threshold and the second credit threshold.

MULTICORE, MULTIBANK, FULLY CONCURRENT COHERENCE CONTROLLER

A system includes a multi-core shared memory controller (MSMC). The MSMC includes a snoop filter bank, a cache tag bank, and a memory bank. The cache tag bank is connected to both the snoop filter bank and the memory bank. The MSMC further includes a first coherent slave interface connected to a data path that is connected to the snoop filter bank. The MSMC further includes a second coherent slave interface connected to the data path that is connected to the snoop filter bank. The MSMC further includes an external memory master interface connected to the cache tag bank and the memory bank. The system further includes a first processor package connected to the first coherent slave interface and a second processor package connected to the second coherent slave interface. The system further includes an external memory device connected to the external memory master interface.

DISTRIBUTED ERROR DETECTION AND CORRECTION WITH HAMMING CODE HANDOFF
20220283942 · 2022-09-08 ·

A device includes a data path, a first interface connected to the data path and configured to receive a request from a processor package to write a data value to a memory address, and a controller connected to the data path and configured to receive the request to write the data value to the memory address and to calculate a Hamming code of the data value. The controller is configured to transmit the data value and the Hamming code on the data path. The device includes an external memory interleave connected to the data path. The external memory interleave is configured to receive the data value and calculate a test Hamming code of the data value and to determine whether to send the data value to an external memory interface to be written to the memory address based on a comparison of the Hamming code and the test Hamming code.