G06F2212/684

Translation lookaside buffer

Embodiments disclosed pertain to apparatuses, systems, and methods for Translation Lookaside Buffers (TLBs) that support virtualization and multi-threading. Disclosed embodiments pertain to a TLB that includes a content addressable memory (CAM) with variable page size entries and a set associative memory with fixed page size entries. The CAM may include: a first set of logically contiguous entry locations, wherein the first set comprises a plurality of subsets, and each subset comprises logically contiguous entry locations for exclusive use of a corresponding virtual processing element (VPE); and a second set of logically contiguous entry locations, distinct from the first set, where the entry locations in the second set may be shared among available VPEs. The set associative memory may comprise a third set of logically contiguous entry locations shared among the available VPEs distinct from the first and second set of entry locations.
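The CAM partitioning described in this abstract can be illustrated with a minimal sketch. It assumes equally sized per-VPE subsets laid out first, followed by the shared set; the function names and sizes are hypothetical and are not taken from the patent.

```python
def partition_cam(num_vpes, private_per_vpe, shared_entries):
    """Split CAM entry indices into per-VPE private ranges and one shared range.

    Assumption: every VPE gets the same number of private entries and the
    shared set immediately follows the private subsets.
    """
    private = {}
    base = 0
    for vpe in range(num_vpes):
        # Each subset is a run of logically contiguous entry locations.
        private[vpe] = range(base, base + private_per_vpe)
        base += private_per_vpe
    shared = range(base, base + shared_entries)
    return private, shared


def may_use(vpe, index, private, shared):
    """A VPE may use its own private entries or any shared entry."""
    return index in private[vpe] or index in shared


private, shared = partition_cam(num_vpes=2, private_per_vpe=4, shared_entries=8)
```

With these example sizes, VPE 0 owns entries 0 to 3, VPE 1 owns entries 4 to 7, and entries 8 to 15 are shared.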

Page table walker with page table entry (PTE) physical address prediction
11494300 · 2022-11-08

Methods and apparatus provide virtual to physical address translations and a hardware page table walker with a region-based page table prefetch operation that produces virtual memory region tracking information that includes at least: data representing a virtual base address of a virtual memory region and a physical address of a first page table entry (PTE) corresponding to a virtual page within the virtual memory region. The hardware page table walker, in response to a TLB miss indication, prefetches a physical address of a second page table entry, which provides a final physical address for the missed TLB entry, using the virtual memory region tracking information. In some implementations, the prefetching of the physical PTE address is done in parallel with earlier levels of a page walk operation.
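One way the tracking information could yield a predicted PTE address is sketched below. The sketch assumes 4 KiB pages, 8-byte PTEs, a 2 MiB region covered by a single last-level table, and that the tracked first PTE belongs to the page at the region base; none of these parameters are stated in the abstract.

```python
PAGE_SHIFT = 12          # assumed 4 KiB pages
PTE_SIZE = 8             # assumed 8-byte page table entries
REGION_SIZE = 2 << 20    # assumed 2 MiB region (one last-level table)


def predict_pte_pa(region_base_va, first_pte_pa, miss_va):
    """Predict the physical address of the PTE for miss_va.

    Assumption: first_pte_pa is the PTE of the page at region_base_va, and the
    PTEs for the region sit contiguously in one last-level page table.
    Returns None when miss_va is outside the tracked region.
    """
    offset = miss_va - region_base_va
    if not 0 <= offset < REGION_SIZE:
        return None
    page_index = offset >> PAGE_SHIFT
    return first_pte_pa + page_index * PTE_SIZE
```

For a region based at 0x40000000 whose first PTE is at 0x1000, a miss on 0x40003abc falls in the fourth page, so the predicted PTE address is 0x1018.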

METHOD AND APPARATUS FOR TRANSLATION LOOKASIDE BUFFER WITH MULTIPLE COMPRESSED ENCODINGS

Methods and apparatus obtain one or more system page table entries that represent virtual system (e.g., memory) page to physical system page translations. A number of the obtained system page table entries that can be encoded in each of a plurality of translation lookaside buffer (TLB) entry encoding formats are determined. The method and apparatus may select one of the TLB entry encoding formats that encode a number of the obtained system page table entries. The method and apparatus may encode a number of obtained system page table entries in the TLB entry encoding format selected into a compressed encoding format TLB entry. The method and apparatus may associate the compressed encoding format TLB entry with an encoding format indication of the encoding format selected. The method and apparatus may decode a compressed encoding format TLB entry based on a determined TLB entry encoding format.
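The format-selection step can be sketched as follows. The two formats ("contiguous" and "delta"), their limits, and the rule of picking the format that covers the most entries are illustrative assumptions; the abstract does not name the formats or the selection criterion.

```python
def count_contiguous(pfns, limit=8):
    """Hypothetical format: base frame plus a run length of consecutive frames."""
    n = 1
    while n < min(len(pfns), limit) and pfns[n] == pfns[0] + n:
        n += 1
    return n


def count_delta(pfns, limit=4, delta_bits=4):
    """Hypothetical format: base frame plus small unsigned per-entry deltas."""
    n = 1
    while n < min(len(pfns), limit) and 0 <= pfns[n] - pfns[0] < (1 << delta_bits):
        n += 1
    return n


FORMATS = {"contiguous": count_contiguous, "delta": count_delta}


def select_format(pfns):
    """Return (format name, entries encoded) for a non-empty list of frame numbers.

    Assumption: the format that encodes the most entries wins; on a tie the
    first format listed in FORMATS is chosen.
    """
    counts = {name: count(pfns) for name, count in FORMATS.items()}
    best = max(counts, key=counts.get)
    return best, counts[best]
```

The returned format name plays the role of the encoding format indication that would be stored alongside the compressed TLB entry and consulted when decoding it.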

READ-IF-HIT-PRE-POPA REQUEST
20220058121 · 2022-02-24

Requester circuitry 4 issues an access request specifying a target physical address (PA) and a target physical address space (PAS) identifier identifying a target PAS. Prior to a point of physical aliasing (PoPA), a pre-PoPA memory system component 24, 8 treats aliasing PAs from different PASs which actually correspond to the same memory system resource as if they correspond to different memory system resources. A post-PoPA memory system component 6 treats the aliasing PAs as referring to the same memory system resource. When the target PA and target PAS of a read-if-hit-pre-PoPA request hit in a pre-PoPA cache 24, a data response is returned to the requester circuitry 4. If the read-if-hit-pre-PoPA request misses in the pre-PoPA cache 24, a no-data response is returned. The read-if-hit-pre-PoPA request is safe to issue speculatively while waiting for security checks to be performed, improving performance.
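The hit and miss behaviour of the request can be modelled with a small sketch. It assumes a pre-PoPA cache keyed by the pair (PAS, PA) and a miss that performs no fill; the class and method names are hypothetical.

```python
class PrePoPACache:
    """Toy pre-PoPA cache: lines are tagged with both PAS and PA.

    Because the PAS is part of the key, aliasing PAs from different PASs are
    treated as different resources, as the abstract describes.
    """

    def __init__(self):
        self._lines = {}

    def fill(self, pas, pa, data):
        self._lines[(pas, pa)] = data

    def read_if_hit(self, pas, pa):
        """Return a data response on a hit, otherwise a no-data response.

        Assumption: a miss neither allocates a line nor forwards the request
        past the PoPA.
        """
        key = (pas, pa)
        if key in self._lines:
            return ("data", self._lines[key])
        return ("no-data", None)
```

A lookup with the same PA but a different PAS identifier misses, so the requester receives only a no-data response for it.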

OPERATION OF A MULTI-SLICE PROCESSOR IMPLEMENTING A UNIFIED PAGE WALK CACHE

Operation of a multi-slice processor that includes a plurality of execution slices, a plurality of load/store slices, and one or more page walk caches, where operation includes: receiving, at a load/store slice, an instruction to be issued; determining, at the load/store slice, a process type indicating a source of the instruction to be a host process or a guest process; and determining, in accordance with an allocation policy and in dependence upon the process type, an allocation of an entry of the page walk cache, wherein the page walk cache comprises one or more entries for both host processes and guest processes.

OBJECT TAGGED MEMORY MONITORING METHOD AND PROCESSING APPARATUS

Described are a method and processing apparatus to tag and track objects related to memory allocation calls. An application or software adds a tag to a memory allocation call to enable object level tracking. An entry is made into an object tracking table, which stores the tag and a variety of statistics related to the object and associated memory devices. The object statistics may be queried by the application to tune power/performance characteristics either by the application making runtime placement decisions, or by off-line code tuning based on a previous run. The application may add a tag to a memory allocation call to specify the type of memory characteristics requested based on the object statistics.
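A minimal software model of the object tracking table might look like the sketch below. The counters kept per tag (bytes, allocation count, access count) and all names are assumptions for illustration; the abstract does not list the specific statistics.

```python
from collections import defaultdict


class ObjectTracker:
    """Toy object tracking table keyed by the tag passed with each allocation."""

    def __init__(self):
        # Hypothetical statistics; a real table could track per-device data.
        self.table = defaultdict(lambda: {"bytes": 0, "allocs": 0, "accesses": 0})

    def tagged_alloc(self, tag, size):
        """Allocate a buffer and record the allocation under its tag."""
        stats = self.table[tag]
        stats["bytes"] += size
        stats["allocs"] += 1
        return bytearray(size)

    def record_access(self, tag, count=1):
        self.table[tag]["accesses"] += count

    def query(self, tag):
        """Return a copy of the statistics so callers cannot mutate the table."""
        return dict(self.table[tag])
```

An application could call `query` at run time to decide where to place the next allocation that carries the same tag.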

OPERATION OF A MULTI-SLICE PROCESSOR IMPLEMENTING EXCEPTION HANDLING IN A NESTED TRANSLATION ENVIRONMENT
20170308425 · 2017-10-26

Operation of a multi-slice processor that includes a plurality of execution slices, a plurality of load/store slices, and one or more translation caches, where operation includes: determining, at the load/store slice, a real address from a cache hit in the translation cache for an effective address for an instruction received at a load/store slice; determining, at the load/store slice, an error condition corresponding to an access of the real address; determining, at the load/store slice, a process type indicating a source of the instruction to be a guest process; and responsive to determining the error condition, initiating, in dependence upon the process type indicating a source of the instruction to be a guest process, an effective address translation corresponding to a cache miss in the translation cache for the effective address for the instruction.

FAULTING ADDRESS PREDICTION FOR PREFETCH TARGET ADDRESS

An apparatus comprises memory management circuitry to perform a translation table walk for a target address of a memory access request and to signal a fault in response to the translation table walk identifying a fault condition for the target address; prefetch circuitry to generate a prefetch request to request prefetching of information associated with a prefetch target address to a cache; and faulting address prediction circuitry to predict whether the memory management circuitry would identify the fault condition for the prefetch target address if the translation table walk were performed by the memory management circuitry for the prefetch target address. In response to a prediction that the fault condition would be identified for the prefetch target address, the prefetch circuitry suppresses the prefetch request and the memory management circuitry prevents the translation table walk being performed for the prefetch target address of the prefetch request.
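The suppression behaviour can be sketched as follows. The predictor here simply remembers pages that faulted before, which is an assumed policy; the abstract does not say how the prediction is made, and all names are hypothetical.

```python
PAGE_SHIFT = 12  # assumed 4 KiB pages


class FaultPredictor:
    """Toy predictor: a page that faulted before is predicted to fault again."""

    def __init__(self):
        self._faulting_pages = set()

    def record_fault(self, addr):
        self._faulting_pages.add(addr >> PAGE_SHIFT)

    def predicts_fault(self, addr):
        return (addr >> PAGE_SHIFT) in self._faulting_pages


def issue_prefetch(predictor, target, walk):
    """Issue a prefetch unless a fault is predicted for the target address.

    `walk` stands in for the translation table walk; it is not called at all
    when the prefetch is suppressed.
    """
    if predictor.predicts_fault(target):
        return None
    return walk(target)
```

Skipping the walk as well as the prefetch is the point of the scheme: a walk that is expected to end in a fault is avoided entirely.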

Microcontroller for memory management unit

One embodiment of the present invention includes a microcontroller coupled to a memory management unit (MMU). The MMU is coupled to a page table included in a physical memory, and the microcontroller is configured to perform one or more virtual memory operations associated with the physical memory and the page table. In operation, the microcontroller receives a page fault generated by the MMU in response to an invalid memory access via a virtual memory address. To remedy such a page fault, the microcontroller performs actions to map the virtual memory address to an appropriate location in the physical memory. By contrast, in prior-art systems, a fault handler would typically remedy the page fault. Advantageously, because the microcontroller executes these tasks locally with respect to the MMU and the physical memory, latency associated with remedying page faults may be decreased. Consequently, overall system performance may be increased.

Command-driven translation pre-fetch for memory management units

Methods and systems for pre-fetching address translations in a memory management unit (MMU) of a device are disclosed. In an embodiment, the MMU receives a pre-fetch command from an upstream component of the device, the pre-fetch command including an address of an instruction, pre-fetches a translation of the instruction from a translation table in a memory of the device, and stores the translation of the instruction in a translation cache associated with the MMU.
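The command flow can be modelled with a short sketch. It assumes a flat translation table mapping virtual page numbers to physical frame numbers and 4 KiB pages; the class, its methods, and the table layout are hypothetical.

```python
PAGE_SHIFT = 12                      # assumed 4 KiB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1


class MMU:
    """Toy MMU with a translation cache that can be warmed by a pre-fetch command."""

    def __init__(self, translation_table):
        self.translation_table = translation_table  # vpn -> pfn, held in memory
        self.translation_cache = {}

    def prefetch(self, address):
        """Handle a pre-fetch command: look up the translation and cache it."""
        vpn = address >> PAGE_SHIFT
        pfn = self.translation_table.get(vpn)
        if pfn is not None:
            self.translation_cache[vpn] = pfn

    def translate(self, address):
        """Return (cache_hit, physical_address), filling the cache on a miss."""
        vpn = address >> PAGE_SHIFT
        hit = vpn in self.translation_cache
        if not hit:
            self.translation_cache[vpn] = self.translation_table[vpn]
        pfn = self.translation_cache[vpn]
        return hit, (pfn << PAGE_SHIFT) | (address & PAGE_MASK)
```

An upstream component that knows which address it will touch next can call `prefetch` first, so the later `translate` call hits in the translation cache.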