G06F2212/682

INTERRUPTIBLE TRANSLATION ENTRY INVALIDATION IN A MULTITHREADED DATA PROCESSING SYSTEM

A processor core among the plurality of processor cores initiates invalidation of translation entries buffered in the plurality of processor cores by executing a translation invalidation instruction in an initiating hardware thread. The processor core also executes, in the initiating hardware thread, a synchronization instruction following the translation invalidation instruction in program order that determines completion of invalidation, at all of the plurality of processor cores, of the translation entries specified by the translation invalidation instruction and draining of any memory referent instructions whose target addresses have been translated by reference to the translation entries. A register is updated to a state based on a result of the determination. The processor core branches execution to re-execute the synchronization instruction based on the state of the register indicating that the translation entries are not invalidated at all of the plurality of processor cores.
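The retry behaviour described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions: the `Core` class, the `delay` counter, and the function names `translation_invalidate`, `sync_check` and `invalidate_everywhere` are hypothetical and do not come from the patent, and real hardware would use instructions and a condition register rather than Python objects.

```python
class Core:
    """Hypothetical model of one core's buffered translation entries."""

    def __init__(self, entries, delay):
        self.entries = set(entries)
        self.delay = delay      # polls left before invalidation and draining finish
        self.target = None      # entry currently being invalidated, if any

    def begin_invalidate(self, entry):
        self.target = entry

    def poll(self):
        """Advance one step; return True once the target is invalidated and drained."""
        if self.target is None:
            return True
        if self.delay > 0:
            self.delay -= 1
            return False
        self.entries.discard(self.target)
        self.target = None
        return True


def translation_invalidate(cores, entry):
    # Stand-in for the translation invalidation instruction: start work on every core.
    for core in cores:
        core.begin_invalidate(entry)


def sync_check(cores):
    # Stand-in for the synchronization instruction: the returned value plays the
    # role of the register state (1 = complete on all cores, 0 = not yet).
    results = [core.poll() for core in cores]
    return 1 if all(results) else 0


def invalidate_everywhere(cores, entry):
    translation_invalidate(cores, entry)
    attempts = 0
    while True:
        attempts += 1
        status_register = sync_check(cores)
        if status_register == 1:
            return attempts
        # Not complete: branch back and re-execute the synchronization step.
        # Between iterations the initiating thread could take an interrupt.


cores = [Core({"A", "B"}, delay=0), Core({"A"}, delay=2)]
attempts = invalidate_everywhere(cores, "A")
```

Because completion is reported through a register value and a branch rather than by stalling, the initiating thread gets a point between attempts at which it can be interrupted.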

Secure processor and a program for a secure processor
10685145 · 2020-06-16

Instruction code, including instruction code stored in a non-rewritable format in the area that holds the encrypted instruction code, is authenticated using either a key specific to the core on which the instruction code is executed or a key that has itself been authenticated by that core-specific key, in order to perform encryption processing on the data input and output between the core and the outside.

TRANSLATION ENTRY INVALIDATION IN A MULTITHREADED DATA PROCESSING SYSTEM
20200183843 · 2020-06-11

A multiprocessor data processing system includes a processor core having a translation structure for buffering a plurality of translation entries. In response to receipt of a translation invalidation request, the processor core determines from the translation invalidation request that the translation invalidation request does not require draining of memory referent instructions for which address translation has been performed by reference to a translation entry to be invalidated. Based on the determination, the processor core invalidates the translation entry in the translation structure and confirms completion of invalidation of the translation entry without regard to draining from the processor core of memory access requests for which address translation was performed by reference to the translation entry.
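A minimal sketch of the decision described above, assuming the request carries an explicit `requires_drain` flag. The `InvalidationRequest` and `Core` classes and the `in_flight` map are hypothetical illustrations, not structures named in the patent.

```python
from dataclasses import dataclass


@dataclass
class InvalidationRequest:
    entry: str            # translation entry to invalidate
    requires_drain: bool  # assumed flag carried in the request


class Core:
    def __init__(self, entries, in_flight):
        self.entries = set(entries)
        # Maps each outstanding memory access to the entry that translated it.
        self.in_flight = dict(in_flight)

    def handle(self, req):
        # The entry itself is always invalidated.
        self.entries.discard(req.entry)
        if req.requires_drain:
            # Model draining: accesses translated through the entry must
            # finish before completion is confirmed.
            self.in_flight = {
                access: entry
                for access, entry in self.in_flight.items()
                if entry != req.entry
            }
        # Without the drain requirement, completion is confirmed immediately,
        # regardless of accesses still in flight.
        return "complete"


fast = Core({"X", "Y"}, {"ld1": "X", "st2": "Y"})
fast_result = fast.handle(InvalidationRequest("X", requires_drain=False))

slow = Core({"X", "Y"}, {"ld1": "X", "st2": "Y"})
slow_result = slow.handle(InvalidationRequest("X", requires_drain=True))
```

In the no-drain case the access `ld1` is still outstanding when completion is reported, which is the latency saving the abstract describes.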

Reducing translation latency within a memory management unit using external caching structures

Reducing translation latency within a memory management unit (MMU) using external caching structures including requesting, by the MMU on a node, page table entry (PTE) data and coherent ownership of the PTE data from a page table in memory; receiving, by the MMU, the PTE data, a source flag, and an indication that the MMU has coherent ownership of the PTE data, wherein the source flag identifies a source location of the PTE data; performing a lateral cast out to a local high-level cache on the node in response to determining that the source flag indicates that the source location of the PTE data is external to the node; and directing at least one subsequent request for the PTE data to the local high-level cache.
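The flow in this abstract can be sketched roughly as follows. The `Mmu` class, the representation of the source flag as a node identifier, and the dictionary used as the local high-level cache are all assumptions made for illustration; coherent ownership is not modelled.

```python
class Mmu:
    def __init__(self, node_id, page_table, local_cache):
        self.node_id = node_id
        # Hypothetical page table: vaddr -> (pte_data, node that supplied it)
        self.page_table = page_table
        # Stand-in for the local high-level cache on this node.
        self.local_cache = local_cache

    def fetch_pte(self, vaddr):
        """Return (pte_data, where_it_was_served_from)."""
        if vaddr in self.local_cache:
            # Subsequent requests are directed to the local cache.
            return self.local_cache[vaddr], "local_cache"
        pte, source_node = self.page_table[vaddr]
        if source_node != self.node_id:
            # Source flag says the data came from outside this node:
            # perform the lateral cast-out into the local cache.
            self.local_cache[vaddr] = pte
        return pte, "page_table"


page_table = {0x1000: ("pte_a", 1), 0x2000: ("pte_b", 0)}
mmu = Mmu(node_id=0, page_table=page_table, local_cache={})
first_remote = mmu.fetch_pte(0x1000)
second_remote = mmu.fetch_pte(0x1000)
first_local = mmu.fetch_pte(0x2000)
second_local = mmu.fetch_pte(0x2000)
```

Only the externally sourced entry is cast out, so the second request for it avoids the off-node access while node-local data is left where it is.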

Method and apparatus for an efficient TLB lookup

The present disclosure relates to a method of operating a translation lookaside buffer (TLB) arrangement for a processor supporting virtual addressing, wherein multiple translation engines are used to perform translations on request of one of a plurality of dedicated processor units. The method comprises: maintaining, by a cache unit, a dependency matrix for the engines that tracks, for each processing unit, whether an engine is assigned to that processing unit for a table walk. The cache unit may block a processing unit from allocating an engine to a translation request when the engine is already assigned to the processing unit in the dependency matrix.
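One way to picture the dependency matrix is as a table of flags indexed by processing unit and engine. The sketch below assumes that layout; the `DependencyMatrix` class and its `try_allocate` and `release` methods are hypothetical names, not taken from the disclosure.

```python
class DependencyMatrix:
    def __init__(self, num_units, num_engines):
        # assigned[unit][engine] is True while that engine is performing
        # a table walk on behalf of that processing unit.
        self.assigned = [[False] * num_engines for _ in range(num_units)]

    def try_allocate(self, unit, engine):
        """Assign the engine to the unit, or block if it is already assigned."""
        if self.assigned[unit][engine]:
            return False  # blocked: engine already assigned to this unit
        self.assigned[unit][engine] = True
        return True

    def release(self, unit, engine):
        # Called when the table walk finishes.
        self.assigned[unit][engine] = False


matrix = DependencyMatrix(num_units=2, num_engines=2)
first = matrix.try_allocate(unit=0, engine=1)
blocked = matrix.try_allocate(unit=0, engine=1)
matrix.release(unit=0, engine=1)
after_release = matrix.try_allocate(unit=0, engine=1)
```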

Auxiliary processor resources
10642752 · 2020-05-05

Apparatuses, systems and methods associated with microprocessor segment registers are disclosed herein. More particularly, the present disclosure relates to providing an auxiliary segment register(s) and/or auxiliary segment descriptor table(s), and various ways for their use, for example, providing new instructions for their access, or remapping existing processor resources. A machine might provide isolated execution regions and/or protected memory by associating or exclusively reserving some or all of the auxiliary segment register(s)/table(s) with a specific task, program, instruction sequence, etc. In some embodiments, such as in Internet of Things (IoT) or wearable devices, auxiliary resources may be employed to isolate mutually-distrustful code regions to facilitate engaging unknown devices. Other embodiments are also described and/or claimed.

Data processing system having a coherency interconnect
10628329 · 2020-04-21

A processing system includes a first processor configured to issue a first request in a first format, an adapter configured to receive the first request in the first format and send the first request in a second format, and a memory coherency interconnect configured to receive the first request in the second format and determine whether the first request in the second format is for a translation lookaside buffer (TLB) operation or a non-TLB operation based on information in the first request in the second format. When the first request in the second format is for a TLB operation, the interconnect routes the first request in the second format to a TLB global ordering point (GOP). When the first request in the second format is not for a TLB operation, the interconnect routes the first request in the second format to a non-TLB GOP.
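The adapter and routing steps can be sketched as two small functions. Everything concrete here is an assumption for illustration: the opcode names, the dictionary-based request formats, the `is_tlb_op` field, and the string labels for the two global ordering points are not specified in the abstract.

```python
# Hypothetical set of first-format opcodes that denote TLB operations.
TLB_OPCODES = {"tlb_invalidate", "tlb_sync"}


def adapt(first_format_request):
    """Adapter: translate a first-format request into the second format."""
    return {
        "op": first_format_request["opcode"],
        "addr": first_format_request["address"],
        # Information the interconnect later uses to classify the request.
        "is_tlb_op": first_format_request["opcode"] in TLB_OPCODES,
    }


def route(second_format_request):
    """Interconnect: choose the global ordering point (GOP) for a request."""
    if second_format_request["is_tlb_op"]:
        return "tlb_gop"
    return "non_tlb_gop"


tlb_route = route(adapt({"opcode": "tlb_invalidate", "address": 0x4000}))
mem_route = route(adapt({"opcode": "read", "address": 0x8000}))
```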

Hypervisor direct memory access

This disclosure generally relates to hypervisor memory virtualization. Techniques disclosed herein improve peripheral component interconnect express (PCI-e) device interoperability with a virtual machine. As an example, when a direct-memory access request is received from a PCI-e device but the target memory is currently unmapped, an indication may be provided to a memory paging processor so as to page-in the memory, such that the PCI-e device may continue to function normally. In some examples, the access request may be buffered and replayed once the memory is paged-in, or the access request may be retried, among other examples.
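The buffer-and-replay variant mentioned above might look like the following sketch. The `Hypervisor` class, its page-granular bookkeeping, and the `dma_access` and `page_in` method names are hypothetical; a real implementation would involve the IOMMU and a paging component rather than Python sets.

```python
class Hypervisor:
    def __init__(self, mapped_pages):
        self.mapped = set(mapped_pages)
        self.buffered = []   # access requests waiting for a page-in
        self.completed = []  # access requests that have been serviced

    def dma_access(self, page):
        """Handle a direct-memory access request from a device."""
        if page in self.mapped:
            self.completed.append(page)
            return "done"
        # Target memory is unmapped: hold the request until it is paged in.
        self.buffered.append(page)
        return "buffered"

    def page_in(self, page):
        """Map the page, then replay buffered requests that can now proceed."""
        self.mapped.add(page)
        still_waiting = []
        for request in self.buffered:
            if request in self.mapped:
                self.completed.append(request)
            else:
                still_waiting.append(request)
        self.buffered = still_waiting


hv = Hypervisor(mapped_pages={1})
r1 = hv.dma_access(1)
r2 = hv.dma_access(2)
hv.page_in(2)
```

From the device's point of view the second request is simply delayed rather than failed, which is what lets it continue to function normally.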

METHOD AND APPARATUS FOR AN EFFICIENT TLB LOOKUP

The present disclosure relates to a method of operating a translation lookaside buffer (TLB) arrangement for a processor supporting virtual addressing, wherein multiple translation engines are used to perform translations on request of one of a plurality of dedicated processor units. The method comprises: maintaining, by a cache unit, a dependency matrix for the engines that tracks, for each processing unit, whether an engine is assigned to that processing unit for a table walk. The cache unit may block a processing unit from allocating an engine to a translation request when the engine is already assigned to the processing unit in the dependency matrix.