Patent classifications
G06F12/0851
Cache array macro micro-masking
A computer-implemented method for memory macro disablement in a cache memory includes identifying a defective portion of a memory macro of a cache memory bank. The method includes iteratively testing each line of the memory macro, the testing including attempting at least one write operation at each line of the memory macro. The method further includes determining that an error occurred during the testing. The method further includes, in response to determining the memory macro as being defective, disabling write operations for a portion of the cache memory bank that includes the memory macro by generating a logical mask that includes at least bits comprising a compartment bit, and read address bits.
MULTICORE SHARED CACHE OPERATION ENGINE
Techniques including receiving configuration information for a trigger control channel of the one or more trigger control channels, the configuration information defining a first one or more triggering events, receiving a first memory management command, store the first memory management command, detecting a first one or more triggering events, and triggering the stored first memory management command based on the detected first one or more triggering events.
DELAYED SNOOP FOR IMPROVED MULTI-PROCESS FALSE SHARING PARALLEL THREAD PERFORMANCE
Techniques for maintaining cache coherency comprising storing data blocks associated with a main process in a cache line of a main cache memory, storing a first local copy of the data blocks in a first local cache memory of a first processor, storing a second local copy of the set of data blocks in a second local cache memory of a second processor executing a first child process of the main process to generate first output data, writing the first output data to the first data block of the first local copy as a write through, writing the first output data to the first data block of the main cache memory as a part of the write through, transmitting an invalidate request to the second local cache memory, marking the second local copy of the set of data blocks as delayed, and transmitting an acknowledgment to the invalidate request.
Distributed error detection and correction with hamming code handoff
A device includes a data path, a first interface connected to the data path and configured to receive a request from a processor package to write a data value to a memory address, and a controller connected to the data path and configured to receive the request to write the data value to the memory address and to calculate a Hamming code of the data value. The controller is configured to transmit the data value and the Hamming code on the data path. The device includes an external memory interleave connected to the data path. The external memory interleave is configured to receive the data value and calculate a test Hamming code of the data value and to determine whether to send the data value to an external memory interface to be written to the memory address based on a comparison of the Hamming code and the test Hamming code.
Multi-processor, multi-domain, multi-protocol, cache coherent, speculation aware shared memory and interconnect
A device includes an interconnect and a plurality of devices connected to the interconnect. The plurality of devices includes a first interface connected to the interconnect and a second interface connected to the interconnect. The plurality of devices further includes a first memory bank connected to the interconnect and a second memory bank connected to the interconnect. The plurality of devices further includes an external memory interface connected to the interconnect and a controller configured to establish virtual channels among the plurality of devices connected to the interconnect.
DYNAMIC MEMORY ADDRESS ENCODING
Described herein is a memory architecture that is configured to dynamically determine an address encoding to use to encode multi-dimensional data such as multi-coordinate data in a manner that provides a coordinate bias corresponding to a current memory access pattern. The address encoding may be dynamically generated in response to receiving a memory access request or may be selected from a set of preconfigured address encodings. The dynamically generated or selected address encoding may apply an interleaving technique to bit representations of coordinate values to obtain an encoded memory address. The interleaving technique may interleave a greater number of bits from the bit representation corresponding to the coordinate direction in which a coordinate bias is desired than from bit representations corresponding to other coordinate directions.
Configurable cache for multi-endpoint heterogeneous coherent system
A device includes a memory bank. The memory bank includes data portions of a first way group. The data portions of the first way group include a data portion of a first way of the first way group and a data portion of a second way of the first way group. The memory bank further includes data portions of a second way group. The device further includes a configuration register and a controller configured to individually allocate, based on one or more settings in the configuration register, the first way and the second way to one of an addressable memory space and a data cache.
Delayed snoop for improved multi-process false sharing parallel thread performance
Techniques for maintaining cache coherency comprising storing data blocks associated with a main process in a cache line of a main cache memory, storing a first local copy of the data blocks in a first local cache memory of a first processor, storing a second local copy of the set of data blocks in a second local cache memory of a second processor executing a first child process of the main process to generate first output data, writing the first output data to the first data block of the first local copy as a write through, writing the first output data to the first data block of the main cache memory as a part of the write through, transmitting an invalidate request to the second local cache memory, marking the second local copy of the set of data blocks as delayed, and transmitting an acknowledgment to the invalidate request.
TECHNIQUES FOR STORING DATA AND TAGS IN DIFFERENT MEMORY ARRAYS
A memory controller includes logic circuitry to generate a first data address identifying a location in a first external memory array for storing first data, a first tag address identifying a location in a second external memory array for storing a first tag, a second data address identifying a location in the second external memory array for storing second data, and a second tag address identifying a location in the first external memory array for storing a second tag. The memory controller includes an interface that transfers the first data address and the first tag address for a first set of memory operations in the first and the second external memory arrays. The interface transfers the second data address and the second tag address for a second set of memory operations in the first and the second external memory arrays.
Reverse order queue updates by virtual devices
A system includes a memory including a ring buffer having a plurality of slots, a processor in communication with the memory, a guest operating system, and a hypervisor. The hypervisor is configured to detect a request associated with a memory entry, retrieve up to a predetermined quantity of memory entries in the ring buffer from an original slot to an end slot, and test a respective descriptor of each successive slot from the original slot through the end slot while the respective descriptor of each successive slot in the ring buffer remains unchanged. Additionally, the hypervisor is configured to execute the request associated with the memory entries and respective valid descriptors. The hypervisor is also configured to walk the ring buffer backwards from the end slot to the original slot while clearing the valid descriptors.