G06F2212/654

Address translation structures to provide separate translations for instruction fetches and data accesses

An address translation capability in which information is obtained from an address translation structure to be used to translate a first address to a second address, the first address being of a first address type and the second address being of a second address type. The address translation structure includes a first set of information to translate the first address to one address of the second address type and a second set of information to translate the first address to another address of the second address type. To obtain the information, the first set of information or the second set of information is selected as the information to be used to translate the first address to the second address, based on an attribute of the first address.
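
The selection step can be illustrated with a minimal sketch. It assumes the attribute of the first address is whether the access is an instruction fetch or a data access (as the title suggests), and that pages are 4 KiB; `SplitEntry`, `translate`, and the table layout are hypothetical names chosen for illustration, not taken from the patent.

```python
from dataclasses import dataclass

PAGE_SHIFT = 12                      # assumed 4 KiB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1


@dataclass
class SplitEntry:
    """Hypothetical entry holding two sets of translation information."""
    instr_frame: int  # frame number used for instruction fetches
    data_frame: int   # frame number used for data accesses


def translate(table: dict, vaddr: int, is_fetch: bool) -> int:
    """Pick one of the two sets based on the access attribute, then translate."""
    entry = table[vaddr >> PAGE_SHIFT]
    frame = entry.instr_frame if is_fetch else entry.data_frame
    return (frame << PAGE_SHIFT) | (vaddr & PAGE_MASK)


# The same virtual address resolves to two different physical addresses.
table = {0x10: SplitEntry(instr_frame=0x200, data_frame=0x300)}
fetch_pa = translate(table, 0x10ABC, is_fetch=True)
data_pa = translate(table, 0x10ABC, is_fetch=False)
```

Here a single entry serves both access types, so the choice between the two sets happens at lookup time rather than by keeping two separate tables.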

Page table walker with page table entry (PTE) physical address prediction
11494300 · 2022-11-08

Methods and apparatus provide virtual to physical address translations and a hardware page table walker with a region-based page table prefetch operation that produces virtual memory region tracking information that includes at least: data representing a virtual base address of a virtual memory region and a physical address of a first page table entry (PTE) corresponding to a virtual page within the virtual memory region. The hardware page table walker, in response to a translation lookaside buffer (TLB) miss indication, prefetches a physical address of a second page table entry that provides a final physical address for the missed TLB entry, using the virtual memory region tracking information. In some implementations, the prefetching of the physical PTE address is done in parallel with earlier levels of a page walk operation.
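
One way to picture the prediction is the sketch below. It rests on several assumptions that the abstract does not state: a region is 2 MiB and is covered by one contiguous last-level table, PTEs are 8 bytes, and the tracked PTE address is normalized to the first page of the region. `RegionTracker` and its methods are hypothetical names.

```python
PAGE_SHIFT = 12     # assumed 4 KiB pages
PTE_SIZE = 8        # assumed 8-byte page table entries
REGION_SHIFT = 21   # assumed 2 MiB regions, one leaf table per region


class RegionTracker:
    """Hypothetical tracking table: region base VA -> PA of the region's first PTE."""

    def __init__(self):
        self.regions = {}

    def record(self, vaddr, pte_pa):
        """After a completed walk, remember where this region's PTEs start."""
        base = (vaddr >> REGION_SHIFT) << REGION_SHIFT
        index = (vaddr - base) >> PAGE_SHIFT
        self.regions[base] = pte_pa - index * PTE_SIZE

    def predict_pte_pa(self, vaddr):
        """On a TLB miss, predict the leaf PTE address without walking upper levels."""
        base = (vaddr >> REGION_SHIFT) << REGION_SHIFT
        first_pte = self.regions.get(base)
        if first_pte is None:
            return None  # untracked region: fall back to a full walk
        return first_pte + ((vaddr - base) >> PAGE_SHIFT) * PTE_SIZE


tracker = RegionTracker()
tracker.record(0x40200000 + 3 * 4096, 0x9000 + 3 * PTE_SIZE)
predicted = tracker.predict_pte_pa(0x40200000 + 5 * 4096)
untracked = tracker.predict_pte_pa(0x80000000)
```

In hardware the predicted address would be fetched while the upper levels of the walk are still in flight; the sketch only shows the address arithmetic.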

Testing address translation cache

A method, apparatus and product for utilizing address translation structures for testing address translation cache. The method comprises: obtaining a first address translation structure that comprises multiple levels, including a first top level which connects a sub-structure of the first address translation structure using pointers thereto; determining, based on the first address translation structure, a second address translation structure, wherein the second address translation structure comprises a second top level that is determined based on the first top level, wherein the second top level connects the sub-structure of the first address translation structure; executing a test so as to verify operation of an address translation cache of a target processor at least by: adding a plurality of cache lines to the address translation cache, wherein said adding is based on the address translation structures; and verifying the operation of the address translation cache using one or more memory access operations.

Speculative addressing using a virtual address-to-physical address page crossing buffer

A method includes receiving an instruction to be executed by a processor. The method further includes performing a lookup in a page crossing buffer that includes one or more entries to determine whether the instruction has an entry in the page crossing buffer. Each of the entries includes a physical address. The method further includes, when the page crossing buffer has an entry for the instruction, retrieving a particular physical address from that entry.
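
The lookup can be sketched as follows. The abstract covers only the hit path; indexing the buffer by instruction address, the miss path, and the fill policy are assumptions added here for completeness, and `PageCrossingBuffer`, `resolve`, and `hypothetical_walk` are illustrative names.

```python
class PageCrossingBuffer:
    """Hypothetical buffer: instruction address -> physical address of the crossed-into page."""

    def __init__(self):
        self.entries = {}

    def lookup(self, pc):
        return self.entries.get(pc)

    def fill(self, pc, paddr):
        self.entries[pc] = paddr


def resolve(pcb, pc, next_vaddr, slow_translate):
    """Return (physical address, hit flag) for an access that crosses a page."""
    paddr = pcb.lookup(pc)
    if paddr is not None:
        return paddr, True           # hit: use the stored physical address
    paddr = slow_translate(next_vaddr)  # assumed miss path: normal translation
    pcb.fill(pc, paddr)
    return paddr, False


def hypothetical_walk(vaddr):
    # Stand-in for a real translation: maps every page to frame 0x55.
    return (0x55 << 12) | (vaddr & 0xFFF)


pcb = PageCrossingBuffer()
first = resolve(pcb, pc=0x1FFC, next_vaddr=0x2000, slow_translate=hypothetical_walk)
second = resolve(pcb, pc=0x1FFC, next_vaddr=0x2000, slow_translate=hypothetical_walk)
```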

FAULTING ADDRESS PREDICTION FOR PREFETCH TARGET ADDRESS

An apparatus comprises: memory management circuitry to perform a translation table walk for a target address of a memory access request and to signal a fault in response to the translation table walk identifying a fault condition for the target address; prefetch circuitry to generate a prefetch request to request prefetching of information associated with a prefetch target address to a cache; and faulting address prediction circuitry to predict whether the memory management circuitry would identify the fault condition for the prefetch target address if the translation table walk were performed for the prefetch target address. In response to a prediction that the fault condition would be identified for the prefetch target address, the prefetch circuitry suppresses the prefetch request and the memory management circuitry prevents the translation table walk from being performed for the prefetch target address of the prefetch request.
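
A minimal sketch of the suppression logic follows. The abstract does not say how the prediction is made; this example assumes the predictor simply remembers pages that faulted before, at an assumed 4 KiB granularity. `FaultPredictor`, `issue_prefetch`, and `walk_log` are hypothetical.

```python
PAGE_SHIFT = 12  # assumed 4 KiB pages


class FaultPredictor:
    """Hypothetical predictor: remembers pages whose table walk previously faulted."""

    def __init__(self):
        self.faulting_pages = set()

    def record_fault(self, addr):
        self.faulting_pages.add(addr >> PAGE_SHIFT)

    def predicts_fault(self, addr):
        return (addr >> PAGE_SHIFT) in self.faulting_pages


def issue_prefetch(predictor, target, walk_log):
    """Suppress the prefetch, and skip its table walk, when a fault is predicted."""
    if predictor.predicts_fault(target):
        return False              # no prefetch request, no walk
    walk_log.append(target)       # stand-in for performing the table walk
    return True


predictor = FaultPredictor()
predictor.record_fault(0x7000)
walk_log = []
suppressed = issue_prefetch(predictor, 0x7040, walk_log)
issued = issue_prefetch(predictor, 0x8000, walk_log)
```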

Command-driven translation pre-fetch for memory management units

Methods and systems for pre-fetching address translations in a memory management unit (MMU) of a device are disclosed. In an embodiment, the MMU receives a pre-fetch command from an upstream component of the device, the pre-fetch command including an address of an instruction, pre-fetches a translation of the instruction from a translation table in a memory of the device, and stores the translation of the instruction in a translation cache associated with the MMU.
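
The flow can be sketched as below, under the assumptions that translations are per 4 KiB page and that the translation table is a simple page-to-frame mapping. The `Mmu` class, its method names, and the `table_reads` counter are hypothetical and exist only to show that a later demand access is served from the translation cache.

```python
PAGE_SHIFT = 12                      # assumed 4 KiB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1


class Mmu:
    """Hypothetical MMU with a translation cache filled by pre-fetch commands."""

    def __init__(self, translation_table):
        self.translation_table = translation_table  # page -> frame, held in memory
        self.translation_cache = {}
        self.table_reads = 0                         # counts accesses to the table

    def _fetch(self, page):
        self.table_reads += 1
        return self.translation_table[page]

    def prefetch(self, instr_addr):
        """Handle a pre-fetch command sent by an upstream component."""
        page = instr_addr >> PAGE_SHIFT
        if page not in self.translation_cache:
            self.translation_cache[page] = self._fetch(page)

    def translate(self, addr):
        """Demand translation; hits the cache if a pre-fetch already ran."""
        page = addr >> PAGE_SHIFT
        if page not in self.translation_cache:
            self.translation_cache[page] = self._fetch(page)
        return (self.translation_cache[page] << PAGE_SHIFT) | (addr & PAGE_MASK)


mmu = Mmu({0x40: 0x99})
mmu.prefetch(0x40010)
pa = mmu.translate(0x40010)
```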

Graphics surface addressing

Techniques are disclosed relating to memory allocation for graphics surfaces. In some embodiments, graphics processing circuitry is configured to access a graphics surface based on an address in a surface space assigned to the graphics surface. In some embodiments, first translation circuitry is configured to translate address information for the surface space to address information in the virtual space based on one or more of the translation entries. In some embodiments, the graphics processing circuitry is configured to provide an address for the access to the graphics surface based on translation by the first translation circuitry and second translation circuitry configured to translate the address in the virtual space to an address in a physical space of a memory configured to store the graphics surface. The disclosed techniques may allow sparse allocation of large graphics surfaces, in various embodiments.
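
The two translation stages can be sketched as two chained page-granular lookups. The page size, the dictionary-based tables, and the function names are assumptions; returning `None` for an unmapped surface page is one hypothetical way to model sparse allocation.

```python
PAGE_SHIFT = 12                      # assumed 4 KiB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1


def translate_page(mapping, addr):
    """Translate one address through a page-granular mapping, or return None."""
    page = mapping.get(addr >> PAGE_SHIFT)
    if page is None:
        return None
    return (page << PAGE_SHIFT) | (addr & PAGE_MASK)


def surface_to_physical(surface_map, virtual_map, surface_addr):
    """First stage: surface space -> virtual space; second stage: virtual -> physical."""
    vaddr = translate_page(surface_map, surface_addr)
    if vaddr is None:
        return None                  # this part of the surface was never allocated
    return translate_page(virtual_map, vaddr)


# Only surface pages 0 and 7 are backed; the pages in between consume no memory.
surface_map = {0: 0x500, 7: 0x501}
virtual_map = {0x500: 0x80, 0x501: 0x93}
mapped = surface_to_physical(surface_map, virtual_map, 0x7010)
unmapped = surface_to_physical(surface_map, virtual_map, 0x3000)
```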

Cache replacement based on traversal tracking
11429535 · 2022-08-30

Techniques are disclosed relating to controlling cache replacement. In some embodiments, search control circuitry is configured to perform multiple searches of a data structure (e.g., page table walks) where searches traverse multiple links between elements of the data structure. In some embodiments, a traversal cache caches traversal information that is usable by searches to skip one or more links traversed by one or more prior searches. In some embodiments, tracking control circuitry stores tracking information in a first entry, where the tracking information indicates a location in the traversal cache at which prior traversal information for a first search is stored. In some embodiments, replacement control circuitry selects, based on the tracking information in the first entry of the tracking control circuitry, an entry in the traversal cache for new traversal information generated by the first search (which may include selecting the first entry to override a default replacement policy).
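
The replacement decision can be illustrated with a small sketch. It assumes a round-robin default policy and models the tracking information as a single slot index remembered by the search; `TraversalCache` and its interface are hypothetical, and real designs would track more state.

```python
class TraversalCache:
    """Hypothetical traversal cache with a round-robin default replacement policy."""

    def __init__(self, size):
        self.slots = [None] * size   # each slot holds (key, traversal_info)
        self.next_victim = 0

    def lookup(self, key):
        """Return (slot index, info) for a cached partial traversal, else (None, None)."""
        for index, slot in enumerate(self.slots):
            if slot is not None and slot[0] == key:
                return index, slot[1]
        return None, None

    def insert(self, key, info, tracked_slot=None):
        """Overwrite the tracked slot if one is given, else use the default victim."""
        if tracked_slot is not None:
            index = tracked_slot
        else:
            index = self.next_victim
            self.next_victim = (self.next_victim + 1) % len(self.slots)
        self.slots[index] = (key, info)
        return index


cache = TraversalCache(3)
cache.insert("A/level1", 0x1000)          # lands in slot 0
cache.insert("B/level1", 0x2000)          # lands in slot 1
tracked, _ = cache.lookup("A/level1")     # the search records which slot it used
used = cache.insert("A/level2", 0x1800, tracked_slot=tracked)
```

Without the tracking information the new entry would have gone to slot 2; with it, the deeper result for the same search overwrites its own older entry and leaves the unrelated entry in slot 1 alone.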

TRANSLATION BANDWIDTH OPTIMIZED PREFETCHING STRATEGY THROUGH MULTIPLE TRANSLATION LOOKASIDE BUFFERS

A computer system includes a processor and a prefetch engine. The processor is configured to generate a demand access stream. The prefetch engine is configured to generate a first prefetch request and a second prefetch request based on the demand access stream, to output the first prefetch request to a first translation lookaside buffer (TLB), and to output the second prefetch request to a second TLB that is different from the first TLB. The processor performs a first TLB lookup in the first TLB based on one of the demand access stream or the first prefetch request, and performs a second TLB lookup in the second TLB based on the second prefetch request.

POWER OPTIMIZED PREFETCHING IN SET-ASSOCIATIVE TRANSLATION LOOKASIDE BUFFER STRUCTURE

A computer system includes a processor and a prefetch engine. The processor is configured to generate a demand access stream. The prefetch engine is configured to initiate a first prefetch request based on the demand access stream and perform a first prefetch that includes performing a translation lookaside buffer (TLB) lookup on a TLB structure in response to the first prefetch request. The processor determines a TLB entry in response to performing the TLB lookup and performs at least one second prefetch based on the TLB entry without performing a subsequent TLB lookup on the TLB structure.
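
A minimal sketch of reusing one TLB entry across several prefetches follows. It assumes 4 KiB pages, a 64-byte prefetch stride, and that the stream stops at the page boundary; the `Tlb` class, the `lookups` counter, and `prefetch_stream` are hypothetical names used only to make the saved lookups visible.

```python
PAGE_SHIFT = 12                      # assumed 4 KiB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1


class Tlb:
    """Hypothetical TLB that counts lookups as a stand-in for lookup power."""

    def __init__(self, entries):
        self.entries = entries       # page -> frame
        self.lookups = 0

    def lookup(self, page):
        self.lookups += 1
        return self.entries[page]


def prefetch_stream(tlb, start_vaddr, count, stride=64):
    """Look up the TLB once, then derive later prefetch addresses from that entry."""
    page = start_vaddr >> PAGE_SHIFT
    frame = tlb.lookup(page)         # the only TLB lookup for this stream
    targets = []
    for i in range(count):
        vaddr = start_vaddr + i * stride
        if (vaddr >> PAGE_SHIFT) != page:
            break                    # assumed: stop when the stream leaves the page
        targets.append((frame << PAGE_SHIFT) | (vaddr & PAGE_MASK))
    return targets


tlb = Tlb({0x10: 0x80})
targets = prefetch_stream(tlb, 0x10000, 4)
```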