Patent classifications
G06F12/1027
Resiliency and performance for cluster memory
Disclosed are various embodiments for improving resiliency and performance of clustered memory. A computing device can acquire a chunk of byte-addressable memory from a cluster memory host. The computing device can then identify an active set of allocated memory pages and an inactive set of allocated memory pages for a process executing on the computing device. Next, the computing device can store the active set of allocated memory pages for the process in the memory of the computing device. Finally, the computing device can store the inactive set of allocated memory pages for the process in the chunk of byte-addressable memory of the cluster memory host.
Using request class and reuse recording in one cache for insertion policies of another cache
Systems and methods are disclosed for maintaining insertion policies of a lower-level cache. Techniques are described for selecting, based on metadata of an evicted data block received from an upper-level cache, an insertion policy out of the insertion policies. Then, determining, based on the selected insertion policy, whether to insert the data block into the lower-level cache. If it is determined to insert, the data block is inserted into the lower-level cache according to the selected insertion policy. Techniques for dynamically updating the insertion policies of the lower-level cache are also disclosed.
Method and apparatus for using a storage system as main memory
A data access system including a processor, multiple cache modules for the main memory, and a storage drive. The cache modules include a FLC controller and a main memory cache. The multiple cache modules function as main memory. The processor sends read/write requests (with physical address) to the cache module. The cache module includes two or more stages with each stage including a FLC controller and DRAM (with associated controller). If the first stage FLC module does not include the physical address, the request is forwarded to a second stage FLC module. If the second stage FLC module does not include the physical address, the request is forwarded to the storage drive, a partition reserved for main memory. The first stage FLC module has high speed, lower power operation while the second stage FLC is a low-cost implementation. Multiple FLC modules may connect to the processor in parallel.
Memory Management Unit for Multi-Threaded Architecture
An exemplary multi-threaded memory management system comprises a memory management unit (MMU) configured with a plurality of physical address (PA) output ports individually dedicated to a respective plurality of threads, wherein the MMU is configured to adjust scheduling of the plurality of threads based on the status of an item requested from a cache. The MMU may be configured to translate a virtual address (VA) input from an individual thread to a PA output on the respective PA output port. The cache may be a translation look-aside buffer. The item requested from the cache may be in transient status when a response is expected or valid status when the response is received. The MMU may signal a thread scheduler to run a thread when a requested item's status becomes valid, permitting stalling individual threads without blocking other threads that continue running using the PA output port dedicated to each thread.
Memory Management Unit for Multi-Threaded Architecture
An exemplary multi-threaded memory management system comprises a memory management unit (MMU) configured with a plurality of physical address (PA) output ports individually dedicated to a respective plurality of threads, wherein the MMU is configured to adjust scheduling of the plurality of threads based on the status of an item requested from a cache. The MMU may be configured to translate a virtual address (VA) input from an individual thread to a PA output on the respective PA output port. The cache may be a translation look-aside buffer. The item requested from the cache may be in transient status when a response is expected or valid status when the response is received. The MMU may signal a thread scheduler to run a thread when a requested item's status becomes valid, permitting stalling individual threads without blocking other threads that continue running using the PA output port dedicated to each thread.
Flexible on-die fabric interface
An interface for coupling an agent to a fabric supports a set of coherent interconnect protocols and includes a global channel to communicate control signals to support the interface, a request channel to communicate messages associated with requests to other agents on the fabric, a response channel to communicate responses to other agents on the fabric, and a data channel to couple to communicate messages associated with data transfers to other agents on the fabric, where the data transfers include payload data.
ERROR CONTAINMENT FOR ENABLING LOCAL CHECKPOINT AND RECOVERY
Various embodiments include a parallel processing computer system that detects memory errors as a memory client loads data from memory and disables the memory client from storing data to memory, thereby reducing the likelihood that the memory error propagates to other memory clients. The memory client initiates a stall sequence, while other memory clients continue to execute instructions and the memory continues to service memory load and store operations. When a memory error is detected, a specific bit pattern is stored in conjunction with the data associated with the memory error. When the data is copied from one memory to another memory, the specific bit pattern is also copied, in order to identify the data as having a memory error.
ERROR CONTAINMENT FOR ENABLING LOCAL CHECKPOINT AND RECOVERY
Various embodiments include a parallel processing computer system that detects memory errors as a memory client loads data from memory and disables the memory client from storing data to memory, thereby reducing the likelihood that the memory error propagates to other memory clients. The memory client initiates a stall sequence, while other memory clients continue to execute instructions and the memory continues to service memory load and store operations. When a memory error is detected, a specific bit pattern is stored in conjunction with the data associated with the memory error. When the data is copied from one memory to another memory, the specific bit pattern is also copied, in order to identify the data as having a memory error.
Method, system, and apparatus for supporting multiple address spaces to facilitate data movement
Methods, systems, and apparatuses provide support for multiple address spaces in order to facilitate data movement. One apparatus includes an input/output memory management unit (IOMMU) comprising: a plurality of memory-mapped input/output (MMIO) registers that map memory address spaces belonging to the IOMMU and at least a second IOMMU; and hardware control logic operative to: synchronize the plurality of MMIO registers of the at least the second IOMMU; receive, from a peripheral component endpoint coupled to the IOMMU, a direct memory access (DMA) request, the DMA request to a memory address space belonging to the at least the second IOMMU; access the plurality of MMIO registers of the IOMMU based on context data of the DMA request; and access, from the IOMMU, a function assigned to the memory address space belonging to the at least the second IOMMU based on the accessed plurality of MMIO registers.
Secure processor and a program for a secure processor
The instruction code including an instruction code stored in the area where the encrypted instruction code is stored in a non-rewritable format is authenticated using a specific key which is specific to the core where the instruction code is executed or an authenticated key by a specific key to perform an encryption processing for the input and output data between the core and the outside.