G06F12/1072

CONCURRENT PROCESSING OF MEMORY MAPPING INVALIDATION REQUESTS
20220414016 · 2022-12-29 ·

A translation lookaside buffer (TLB) receives mapping invalidation requests from one or more sources, such as one or more processing units of a processing system. The TLB includes one or more invalidation processing pipelines, wherein each invalidation processing pipeline includes multiple processing stages arranged in sequence, so that a given stage executes its processing operations concurrently with the other stages of the pipeline executing their own.
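
As a rough illustration of the stage-overlap idea, the C sketch below models a three-stage invalidation pipeline in software; the stage names (decode, match, invalidate) and the request format are assumptions for the example, not details from the publication. On every simulated cycle each occupied stage advances, so several invalidation requests are in flight at once.

```c
/* Minimal sketch (not the patented design): a 3-stage invalidation
 * pipeline modeled in software. Each simulated cycle, every stage
 * operates on a different request, so requests overlap in flight. */
#include <stdio.h>
#include <stdbool.h>

enum stage { DECODE = 0, MATCH = 1, INVALIDATE = 2, NSTAGES = 3 };

struct inval_req {
    unsigned vaddr;   /* virtual address whose mapping is invalidated */
    bool     valid;   /* slot holds a live request */
};

int main(void) {
    struct inval_req pipe[NSTAGES] = {0};
    unsigned pending[] = {0x1000, 0x2000, 0x3000};
    size_t next = 0, done = 0, total = 3;

    for (unsigned cycle = 0; done < total; cycle++) {
        /* retire the request leaving the last stage */
        if (pipe[INVALIDATE].valid) {
            printf("cycle %u: invalidated 0x%x\n", cycle, pipe[INVALIDATE].vaddr);
            done++;
        }
        /* advance every earlier stage in the same cycle */
        for (int s = NSTAGES - 1; s > 0; s--)
            pipe[s] = pipe[s - 1];
        /* accept a new request into the first stage */
        pipe[DECODE].valid = false;
        if (next < total)
            pipe[DECODE] = (struct inval_req){pending[next++], true};
    }
    return 0;
}
```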

FAULT BUFFER FOR TRACKING PAGE FAULTS IN UNIFIED VIRTUAL MEMORY SYSTEM

A system for managing virtual memory. The system includes a first processing unit configured to execute a first operation that references a first virtual memory address. The system also includes a first memory management unit (MMU) associated with the first processing unit and configured to generate a first page fault upon determining that a first page table that is stored in a first memory unit associated with the first processing unit does not include a mapping corresponding to the first virtual memory address. The system further includes a first copy engine associated with the first processing unit. The first copy engine is configured to read a first command queue to determine a first mapping that corresponds to the first virtual memory address and is included in a first page state directory. The first copy engine is also configured to update the first page table to include the first mapping.
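
The fault-handling flow described above can be sketched in a few lines of C. The flat page table, the page-state-directory array, and helper names such as copy_engine_service are illustrative assumptions; the sketch only shows the sequence: the MMU misses, a command is queued, and the copy engine resolves the mapping from the page state directory and patches the page table.

```c
/* Minimal sketch, assuming a flat page table and a tiny "page state
 * directory" keyed by virtual page number. All names are illustrative. */
#include <stdio.h>
#include <stdint.h>

#define NPAGES 16
static uint64_t page_table[NPAGES];          /* 0 = no mapping */
static uint64_t page_state_dir[NPAGES] = {   /* authoritative mappings */
    [3] = 0xCAFE000, [7] = 0xBEEF000
};

/* one entry in the copy engine's command queue */
struct cmd { uint64_t vpn; };

/* MMU lookup: "fault" by returning 0 when the mapping is absent */
static uint64_t mmu_translate(uint64_t vpn) { return page_table[vpn]; }

/* copy engine: consume queued commands, resolve each via the page
 * state directory, then update the page table with the mapping */
static void copy_engine_service(struct cmd *q, int n) {
    for (int i = 0; i < n; i++) {
        uint64_t vpn = q[i].vpn;
        page_table[vpn] = page_state_dir[vpn];
        printf("copy engine: vpn %llu -> pa 0x%llx\n",
               (unsigned long long)vpn,
               (unsigned long long)page_table[vpn]);
    }
}

int main(void) {
    struct cmd queue[1];
    uint64_t vpn = 3;
    if (mmu_translate(vpn) == 0) {           /* page fault */
        queue[0].vpn = vpn;                  /* enqueue for the engine */
        copy_engine_service(queue, 1);
    }
    printf("retry: vpn %llu -> pa 0x%llx\n",
           (unsigned long long)vpn,
           (unsigned long long)mmu_translate(vpn));
    return 0;
}
```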

ARTIFICIAL INTELLIGENCE CHIP AND DATA OPERATION METHOD

An artificial intelligence chip and a data operation method are provided. The artificial intelligence chip receives a command carrying first data and address information, and includes a chip memory, a computing processor, a base address register, and an extended address processor. The base address register is configured to access an extended address space in the chip memory. The extended address processor receives the command and determines an operation mode for the first data according to the address information. When the address information points to a first section of the extended address space, the extended address processor performs a first operation on the first data. When the address information points to a section of the extended address space other than the first section, the extended address processor notifies the computing processor of the operation mode, and the computing processor performs a second operation on the first data.
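
A minimal sketch of the routing decision, assuming a 64 KiB extended address space whose first 4 KiB is the "first section"; the sizes, base address, and handler names are invented for the example.

```c
/* Minimal sketch: route an operation on incoming data by which
 * section of the extended address space the command targets. */
#include <stdio.h>
#include <stdint.h>

#define EXT_BASE       0x80000000u  /* programmed into the base address register */
#define EXT_SIZE       0x10000u     /* assumed 64 KiB extended address space */
#define FIRST_SECT_END 0x1000u      /* offsets below this hit the first section */

static void ext_proc_first_op(uint32_t data)      { printf("ext proc op on %u\n", data); }
static void compute_proc_second_op(uint32_t data) { printf("compute proc op on %u\n", data); }

/* extended address processor: choose the operation mode from the
 * command's address information */
static void handle_command(uint32_t addr, uint32_t data) {
    uint32_t off = addr - EXT_BASE;
    if (off >= EXT_SIZE) { printf("outside extended space\n"); return; }
    if (off < FIRST_SECT_END)
        ext_proc_first_op(data);        /* first operation, handled locally */
    else
        compute_proc_second_op(data);   /* notify the computing processor */
}

int main(void) {
    handle_command(EXT_BASE + 0x0100, 42);  /* first section */
    handle_command(EXT_BASE + 0x2000, 42);  /* other section */
    return 0;
}
```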

INDEPENDENTLY CONTROLLED DMA AND CPU ACCESS TO A SHARED MEMORY REGION

An embodiment of an integrated circuit comprises circuitry to share page tables associated with a page between a processor memory management unit (MMU) and an input/output memory management unit (IOMMU), store a page table entry in the memory associated with the page, and separately control access to the page from a processor and from a direct memory access (DMA) request based on one or more fields of the stored page table entry. Other embodiments are disclosed and claimed.
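
One way to picture the independently controlled access is a page table entry with separate permission bits for the processor and DMA paths. The bit layout below is an assumption for illustration; the patent does not specify field encodings.

```c
/* Minimal sketch: one page table entry shared by the CPU MMU and the
 * IOMMU, with separate permission bits so processor access and DMA
 * access are controlled independently. */
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>

#define PTE_CPU_RW  (1u << 0)   /* processor may read/write the page */
#define PTE_DMA_RW  (1u << 1)   /* device DMA may read/write the page */

static bool cpu_access_ok(uint64_t pte) { return pte & PTE_CPU_RW; }
static bool dma_access_ok(uint64_t pte) { return pte & PTE_DMA_RW; }

int main(void) {
    uint64_t pte = PTE_CPU_RW;             /* CPU allowed, DMA blocked */
    printf("cpu: %s, dma: %s\n",
           cpu_access_ok(pte) ? "ok" : "fault",
           dma_access_ok(pte) ? "ok" : "fault");
    pte |= PTE_DMA_RW;                     /* now grant DMA as well */
    printf("cpu: %s, dma: %s\n",
           cpu_access_ok(pte) ? "ok" : "fault",
           dma_access_ok(pte) ? "ok" : "fault");
    return 0;
}
```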

USER-SPACE REMOTE MEMORY PAGING
20220398199 · 2022-12-15 ·

Techniques for implementing user-space remote memory paging are provided. In one set of embodiments, these techniques include a user-space remote memory paging (RMP) runtime that can: (1) pre-allocate one or more regions of remote memory for use by an application; (2) at a time of receiving/intercepting a memory allocation function call invoked by the application, map the virtual memory address range of the allocated local memory to a portion of the pre-allocated remote memory; (3) at a time of detecting a page fault directed to a page that is mapped to remote memory, retrieve the page via Remote Direct Memory Access (RDMA) from its remote memory location and store the retrieved page in a local main memory cache; and (4) on a periodic basis, identify pages in the local main memory cache that are candidates for eviction and write out the identified pages via RDMA to their mapped remote memory locations if they have been modified.
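
A compressed sketch of steps (1) through (4), with the RDMA transport stubbed out as plain memcpy calls (rdma_read/rdma_write here are placeholders, not a real verbs API); the direct-mapped two-slot cache is likewise an illustrative simplification.

```c
/* Minimal sketch of the RMP flow: pre-allocated remote region, a small
 * local main-memory cache, fault-in on access, periodic dirty writeback. */
#include <stdio.h>
#include <string.h>
#include <stdbool.h>

#define PAGE_SZ     4096
#define CACHE_SLOTS 2

static char remote_mem[8][PAGE_SZ];          /* (1) pre-allocated remote region */

struct slot { int rpage; bool dirty; bool used; char data[PAGE_SZ]; };
static struct slot cache[CACHE_SLOTS];       /* local main-memory cache */

/* placeholder transport: a real runtime would issue RDMA verbs here */
static void rdma_read(int rpage, char *dst)  { memcpy(dst, remote_mem[rpage], PAGE_SZ); }
static void rdma_write(int rpage, char *src) { memcpy(remote_mem[rpage], src, PAGE_SZ); }

/* (3) fault handler: fetch the remote page into a cache slot */
static struct slot *fault_in(int rpage) {
    struct slot *s = &cache[rpage % CACHE_SLOTS];
    if (s->used && s->dirty)
        rdma_write(s->rpage, s->data);       /* write back the evictee */
    rdma_read(rpage, s->data);
    s->rpage = rpage; s->dirty = false; s->used = true;
    return s;
}

/* (4) periodic eviction pass: flush pages modified since fault-in */
static void evict_pass(void) {
    for (int i = 0; i < CACHE_SLOTS; i++)
        if (cache[i].used && cache[i].dirty) {
            rdma_write(cache[i].rpage, cache[i].data);
            cache[i].dirty = false;
        }
}

int main(void) {
    struct slot *s = fault_in(5);            /* (2)+(3): page 5 now local */
    s->data[0] = 'x'; s->dirty = true;       /* application writes the page */
    evict_pass();                            /* (4): change reaches remote */
    printf("remote page 5 byte 0: %c\n", remote_mem[5][0]);
    return 0;
}
```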

Application-transparent near-memory processing architecture with memory channel network

A system includes a printed circuit board (PCB) on which are disposed memory components and a processor, the processor coupled between the memory components and a host memory controller. The processor comprises a memory channel network (MCN) memory controller to handle memory requests associated with the memory components; a local buffer; and a core coupled to the MCN memory controller and the local buffer. The core executes an operating system (OS) running a network software layer and a distributed computing framework, as well as an MCN driver that receives a network packet from the network software layer, stores the network packet in the local buffer, and asserts a transmit polling field of the local buffer to signal to the host memory controller that the network packet is available for transmission to a host computing device.
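
The driver's transmit handshake can be sketched as below; the field names (tx_poll, len, data) and buffer size are assumptions, but the sequence follows the abstract: store the packet in the local buffer, then assert the polling field that the host memory controller watches.

```c
/* Minimal sketch of the MCN driver transmit path: copy the packet
 * into the local buffer and raise a polling flag the host checks. */
#include <stdio.h>
#include <string.h>
#include <stdint.h>

#define BUF_SZ 256

struct local_buffer {
    volatile uint8_t tx_poll;     /* asserted when a packet is ready */
    uint16_t len;
    uint8_t  data[BUF_SZ];
};

static struct local_buffer lbuf;

/* called by the network software layer with an outbound packet */
static void mcn_driver_xmit(const uint8_t *pkt, uint16_t len) {
    memcpy(lbuf.data, pkt, len);  /* store packet in the local buffer */
    lbuf.len = len;
    lbuf.tx_poll = 1;             /* signal the host memory controller */
}

/* host side: poll the flag, then drain the buffer */
static void host_poll(void) {
    if (lbuf.tx_poll) {
        printf("host: %u-byte packet available\n", lbuf.len);
        lbuf.tx_poll = 0;
    }
}

int main(void) {
    const uint8_t pkt[] = "hello";
    mcn_driver_xmit(pkt, sizeof pkt);
    host_poll();
    return 0;
}
```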

LIVE-MIGRATION OF PINNED DIRECT MEMORY ACCESS PAGES TO SUPPORT MEMORY HOT-REMOVE
20220374354 · 2022-11-24 ·

A system on chip (SoC) coupled to a memory can perform a hot-remove operation in a computer system. In a hot-remove operation, software (e.g., operating system) and hardware (e.g., memory controller and interconnect circuitry) components migrate memory content from one region to another target region in the memory. A peripheral device can have direct memory access (DMA) to a page in the region of memory that is being hot-removed. The interconnect circuitry can migrate the page to the target region while maintaining the peripheral device's direct access to the memory. Interconnect circuitry uses hardware mirroring in response to a write command to a memory address in the region being hot-removed. With hardware mirroring, the data is stored in two locations; the first location is the memory address in the region being moved, and the second location is a memory address in the target region.
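
The mirroring rule reduces to a simple write path: while migration is in flight, a store to the region being hot-removed is duplicated at the corresponding offset in the target region. The sketch below assumes a flat region layout and a single migration flag, both invented for the example.

```c
/* Minimal sketch of hardware mirroring during hot-remove: a write to
 * the region being moved lands in both the old and target locations. */
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>

#define REGION_SZ 16
static uint8_t old_region[REGION_SZ];
static uint8_t new_region[REGION_SZ];
static bool migrating = true;     /* hot-remove in progress */

/* interconnect write path: mirror while migration is in flight */
static void mem_write(uint32_t off, uint8_t val) {
    old_region[off] = val;        /* first location: region being moved */
    if (migrating)
        new_region[off] = val;    /* second location: target region */
}

int main(void) {
    mem_write(4, 0xAB);           /* e.g., a DMA write from the device */
    printf("old=0x%02X new=0x%02X\n", old_region[4], new_region[4]);
    return 0;
}
```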

Pooled memory address translation
11507528 · 2022-11-22 ·

A shared memory controller receives, from a computing node, a request associated with a memory transaction involving a particular line in a memory pool. The request includes a node address according to an address map of the computing node. An address translation structure is used to translate this node address into a corresponding second address according to a global address map for the memory pool, and the shared memory controller determines that a particular one of a plurality of shared memory controllers is associated with the second address in the global address map and causes that shared memory controller to handle the request.
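
A toy version of the two-step translation, assuming a linear per-node base offset for the address translation structure and simple interleaving across controllers; both schemes are invented for the example.

```c
/* Minimal sketch: translate a node-local address to a global pool
 * address, then pick the shared memory controller that owns it. */
#include <stdio.h>
#include <stdint.h>

#define NSMC 4                            /* shared memory controllers */
#define SMC_STRIDE 0x40000000ull          /* assumed 1 GiB per controller */

/* per-node entry in the address translation structure */
static const uint64_t node_base[] = { 0x0, 0x20000000, 0x80000000 };

static uint64_t to_global(int node, uint64_t node_addr) {
    return node_base[node] + node_addr;   /* node address -> second address */
}

static int owning_smc(uint64_t global_addr) {
    return (int)((global_addr / SMC_STRIDE) % NSMC);
}

int main(void) {
    uint64_t ga = to_global(2, 0x1000);   /* node 2, line at 0x1000 */
    printf("global 0x%llx handled by SMC %d\n",
           (unsigned long long)ga, owning_smc(ga));
    return 0;
}
```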
