G06F12/1063

Remapping techniques for message signaled interrupts
11550745 · 2023-01-10 · ·

Techniques are disclosed relating to address mapping for message signaled interrupts. In some embodiments, an apparatus includes interrupt control circuitry configured to process, from multiple client circuits, message signaled interrupts that include addresses in an interrupt controller address space. First and second interface controller circuitry may control respective peripheral interfaces for multiple devices. Remap control circuitry may be configured to access a first table based on at least a portion of virtual addresses of a first message signaled interrupt from the first interface controller circuit and generate a first address in the interrupt controller address space based on an accessed entry in the first table and access a second table based on at least a portion of virtual addresses of a second message signaled interrupt from the second interface controller circuit and generate a second address in the interrupt controller address space based on an accessed entry in the second table.

Method and Apparatus for Shared Virtual Memory to Manage Data Coherency in a Heterogeneous Processing System
20180011792 · 2018-01-11 · ·

One embodiment provides for a heterogeneous computing device comprising a first processor coupled with a second processor, wherein one or more of the first or second processor includes graphics processing logic; wherein each of the first processor and the second processor includes first logic to perform virtual to physical memory address translation; and wherein the first logic includes cache coherency state for a block of memory associated with a virtual memory address.

VIRTUALIZED CACHES
20230004494 · 2023-01-05 ·

Systems and methods are disclosed for virtualized caches. For example, an integrated circuit (e.g., a processor) for executing instructions includes a virtually indexed physically tagged first-level (L1) cache configured to output to an outer memory system one or more bits of a virtual index of a cache access as one or more bits of a requestor identifier. For example, the L1 cache may be configured to operate as multiple logical L1 caches with a cache way of a size less than or equal to a virtual memory page size. For example, the integrated circuit may include an L2 cache of the outer memory system that is configured to receive the requestor identifier and implement a cache coherency protocol to disambiguate an L1 synonym occurring in multiple portions of the virtually indexed physically tagged L1 cache associated with different requestor identifier values.

MULTICAST AND REFLECTIVE MEMORY BEHAVIOR FOR MEMORY MODEL CONSISTENCY
20230229599 · 2023-07-20 ·

In various examples, a memory model may support multicasting where a single request for a memory access operation may be propagated to multiple physical addresses associated with multiple processing elements (e.g., corresponding to respective local memory). Thus, the request may cause data to be read from and/or written to memory for each of the processing elements. In some examples, a memory model exposes multicasting to processes. This may include providing for separate multicast and unicast instructions or shared instructions with one or more parameters (e.g., indicating a virtual address) being used to indicate multicasting or unicasting. Additionally or alternatively, whether a request(s) is processed using multicasting or unicasting may be opaque to a process and/or application or may otherwise be determined by the system. One or more constraints may be imposed on processing requests using multicasting to maintain a coherent memory interface.

Performing speculative address translation in processor-based devices

Performing speculative address translation in processor-based devices is disclosed herein. In one exemplary embodiment, a processor-based device provides a processing element (PE) that defines a speculative translation instruction such as an enqueue instruction for offloading operations to a peripheral device. The speculative translation instruction references a plurality of bytes including one or more virtual memory addresses. After receiving the speculative translation instruction, an instruction decode stage of an execution pipeline circuit of the PE transmits a request for address translation of the virtual memory address to a memory management unit (MMU) of the PE. The MMU then performs speculative address translation of the virtual memory address into a corresponding translated memory address. In some embodiments, any address translation errors encountered are raised to an appropriate exception level, and may be raised synchronously or asynchronously with respect to an operation performed when the speculative translation instruction is executed.

Address hashing in a multiple memory controller system

In an embodiment, a system may support programmable hashing of address bits at a plurality of levels of granularity to map memory addresses to memory controllers and ultimately at least to memory devices. The hashing may be programmed to distribute pages of memory across the memory controllers, and consecutive blocks of the page may be mapped to physically distant memory controllers. In an embodiment, address bits may be dropped from each level of granularity, forming a compacted pipe address to save power within the memory controller. In an embodiment, a memory folding scheme may be employed to reduce the number of active memory devices and/or memory controllers in the system when the full complement of memory is not needed.

Cache memory that supports tagless addressing
11537531 · 2022-12-27 · ·

The disclosed embodiments relate to a computer system with a cache memory that supports tagless addressing. During operation, the system receives a request to perform a memory access, wherein the request includes a virtual address. In response to the request, the system performs an address-translation operation, which translates the virtual address into both a physical address and a cache address. Next, the system uses the physical address to access one or more levels of physically addressed cache memory, wherein accessing a given level of physically addressed cache memory involves performing a tag-checking operation based on the physical address. If the access to the one or more levels of physically addressed cache memory fails to hit on a cache line for the memory access, the system uses the cache address to directly index a cache memory, wherein directly indexing the cache memory does not involve performing a tag-checking operation and eliminates the tag storage overhead.

Handling memory requests

A converter module is described which handles memory requests issued by a cache (e.g. an on-chip cache), where these memory requests include memory addresses defined within a virtual memory space. The converter module receives these requests, issues each request with a transaction identifier and uses that identifier to track the status of the memory request. The converter module sends requests for address translation to a memory management unit and where there the translation is not available in the memory management unit receives further memory requests from the memory management unit. The memory requests are issued to a memory via a bus and the transaction identifier for a request is freed once the response has been received from the memory. When issuing memory requests onto the bus, memory requests received from the memory management unit may be prioritized over those received from the cache.

USER-SPACE REMOTE MEMORY PAGING
20220398199 · 2022-12-15 ·

Techniques for implementing user-space remote memory paging are provided. In one set of embodiments, these techniques include a user-space remote memory paging (RMP) runtime that can: (1) pre-allocate one or more regions of remote memory for use by an application; (2) at a time of receiving/intercepting a memory allocation function call invoked by the application, map the virtual memory address range of the allocated local memory to a portion of the pre-allocated remote memory; (3) at a time of detecting a page fault directed to a page that is mapped to remote memory, retrieve the page via Remote Direct Memory Access (RDMA) from its remote memory location and store the retrieved page in a local main memory cache; and (4) on a periodic basis, identify pages in the local main memory cache that are candidates for eviction and write out the identified pages via RDMA to their mapped remote memory locations if they have been modified.

Secure address translation services using bundle access control

Embodiments are directed to providing a secure address translation service. An embodiment of a system includes a memory device to store memory data in a plurality of physical pages shared by a plurality of devices, a first table to map each page of memory to an associated bundle identifier (ID) that identifies one or more devices having access to a page of memory, a second table to map each bundle ID to page access permissions that define access to one or more pages associated with a bundle ID and a translation agent to receive requests from the plurality of devices to perform memory operations on the memory and determine page access permissions for requests received from the plurality of devices using the first table and the second table.