G06F2212/682

Shared accelerator memory systems and methods

The present disclosure is directed to systems and methods sharing memory circuitry between processor memory circuitry and accelerator memory circuitry in each of a plurality of peer-to-peer connected accelerator units. Each of the accelerator units includes physical-to-virtual address translation circuitry and migration circuitry. The physical-to-virtual address translation circuitry in each accelerator unit includes pages for each of at least some of the plurality of accelerator units. The migration circuitry causes the transfer of data between the processor memory circuitry and the accelerator memory circuitry in each of the plurality of accelerator circuits. The migration circuitry migrates and evicts data to/from accelerator memory circuitry based on statistical information associated with accesses to at least one of: processor memory circuitry or accelerator memory circuitry in one or more peer accelerator circuits. Thus, the processor memory circuitry and accelerator memory circuitry may be dynamically allocated to advantageously minimize system latency attributable to data access operations.
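
As a rough illustration of statistics-driven migration, the sketch below counts accesses per page, migrates a page into a fixed-capacity accelerator memory once it crosses a threshold, and evicts the least-accessed resident page when space runs out. The class name, the threshold, and the eviction policy are assumptions made for illustration; the abstract does not specify them.

```python
from collections import Counter


class MigrationSketch:
    """Hypothetical model: pages stay in processor memory until they are hot."""

    def __init__(self, capacity, hot_threshold):
        self.capacity = capacity            # pages that fit in accelerator memory
        self.hot_threshold = hot_threshold  # accesses before a page migrates
        self.access_counts = Counter()      # per-page access statistics
        self.resident = set()               # pages currently in accelerator memory

    def access(self, page):
        """Record one access; return where this access was served from."""
        self.access_counts[page] += 1
        if page in self.resident:
            return "accelerator"
        if self.access_counts[page] >= self.hot_threshold:
            if len(self.resident) >= self.capacity:
                # Assumed policy: evict the resident page with the fewest accesses.
                victim = min(self.resident, key=lambda p: self.access_counts[p])
                self.resident.remove(victim)
            self.resident.add(page)         # migrate for subsequent accesses
        return "processor"
```

With `capacity=1` and `hot_threshold=2`, the second access to a page triggers its migration, and the third access is served from accelerator memory.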

Interruptible translation entry invalidation in a multithreaded data processing system

A processor core among the plurality of processor cores initiates invalidation of translation entries buffered in the plurality of processor cores by executing a translation invalidation instruction in an initiating hardware thread. The processor core also executes, in the initiating hardware thread, a synchronization instruction following the translation invalidation instruction in program order that determines completion of invalidation, at all of the plurality of processor cores, of the translation entries specified by the translation invalidation instruction and draining of any memory referent instructions whose target addresses have been translated by reference to the translation entries. A register is updated to a state based on a result of the determination. The processor core branches execution to re-execute the synchronization instruction based on the state of the register indicating that the translation entries are not invalidated at all of the plurality of processor cores.
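
The retry pattern described above can be sketched as a loop that re-checks completion until every core has processed the invalidation. This is a simplified software model, not the hardware mechanism: `Core`, `invalidate`, `sync_done`, and `invalidate_and_wait` are hypothetical names, the boolean returned by `sync_done` stands in for the register state, and draining of memory referent instructions is not modeled.

```python
class Core:
    """Hypothetical core with a small translation buffer."""

    def __init__(self):
        self.tlb = {}         # virtual page -> physical page
        self.pending = set()  # invalidations received but not yet applied

    def step(self):
        """Apply one pending invalidation, if any."""
        if self.pending:
            self.tlb.pop(self.pending.pop(), None)


def invalidate(cores, vpage):
    """Stand-in for the translation invalidation instruction (broadcast)."""
    for core in cores:
        core.pending.add(vpage)


def sync_done(cores):
    """Stand-in for the synchronization instruction: True when all cores are done."""
    return all(not core.pending for core in cores)


def invalidate_and_wait(cores, vpage):
    """Issue the invalidation, then re-execute the sync check until it succeeds."""
    invalidate(cores, vpage)
    retries = 0
    while not sync_done(cores):      # branch back based on the "register" state
        for core in cores:
            core.step()              # other cores make progress in the meantime
        retries += 1
    return retries
```

Because the initiating thread returns to ordinary control flow between checks, a real implementation could take an interrupt between retries; the sketch only counts them.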

PACKET PROCESSING DEVICE, PACKET PROCESSING METHOD, AND RECORDING MEDIUM
20200327067 · 2020-10-15

To achieve a packet processing device that can process packets at high speed, the device includes a bus that transfers a communication packet and a plurality of processors that execute at least one task, including either of a first task and a second task. The first task performs processing when a first task identifier given to the first task coincides with a second task identifier added to the communication packet received from the bus. The second task performs the processing for a communication packet to which no second task identifier has been added. The processing executes first processing based on the packet identifier, then adds to the communication packet the second task identifier indicating the different first task that executes second processing subsequent to the first processing, and transmits the communication packet to the bus.
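
A minimal sketch of the tag-and-forward flow, assuming a single shared queue as the bus and dictionaries as packets. The function name `run_bus`, the `task_id` and `log` fields, and the identifier `"stage2"` are hypothetical; the abstract does not define a packet layout.

```python
from collections import deque


def run_bus(packets, first_task_id="stage2"):
    """Process packets until each has passed through both processing steps."""
    bus = deque(dict(p) for p in packets)   # copy so callers' packets are untouched
    done = []
    while bus:
        pkt = bus.popleft()
        tag = pkt.get("task_id")
        if tag is None:
            # Second task: handles untagged packets, runs the first processing,
            # tags the packet for the follow-up task, and returns it to the bus.
            pkt.setdefault("log", []).append("first processing")
            pkt["task_id"] = first_task_id
            bus.append(pkt)
        elif tag == first_task_id:
            # First task: its identifier matches the tag carried by the packet.
            pkt.setdefault("log", []).append("second processing")
            done.append(pkt)
        else:
            raise ValueError(f"no task registered for identifier {tag!r}")
    return done
```

Carrying the next task's identifier in the packet lets any processor pick it up from the bus without a central scheduler deciding where it goes.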

IMPLEMENTING FINE GRAIN DATA COHERENCY OF A SHARED MEMORY REGION
20200327048 · 2020-10-15

The disclosure provides an approach for implementing fine grain data coherency of a memory region shared by an application within a virtual machine and a compute accelerator. The approach includes locating within a compute kernel a data write instruction to the shared memory region, and modifying the compute kernel to add a conditional halting point after the data write instruction. The approach further includes configuring a virtualization system on which the virtual machine runs to set a value of a halt variable to true at an interval or in response to an occurrence of an event, wherein setting the halt variable to true causes the compute kernel to suspend execution at the conditional halting point.
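
The halting-point idea can be illustrated with a Python generator standing in for the compute kernel: after each write to the shared region it checks a halt variable and suspends if the variable is true. The names `HaltFlag` and `compute_kernel`, and the doubling computation, are assumptions for illustration only; a real compute kernel would run on the accelerator.

```python
class HaltFlag:
    """Hypothetical halt variable set by the virtualization system."""

    def __init__(self):
        self.value = False


def compute_kernel(shared, data, halt):
    """Write results into `shared`, suspending after a write when halt is set."""
    for i, x in enumerate(data):
        shared[i] = x * 2      # data write to the shared memory region
        if halt.value:         # halting point inserted after the write
            yield i            # suspend; the region is consistent up to index i
```

A caller playing the role of the virtualization system would set `halt.value = True`, call `next(kernel)` to run until the kernel suspends, inspect the shared region, then clear the flag and resume the generator to let the kernel finish.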

SHARED ACCELERATOR MEMORY SYSTEMS AND METHODS

The present disclosure is directed to systems and methods sharing memory circuitry between processor memory circuitry and accelerator memory circuitry in each of a plurality of peer-to-peer connected accelerator units. Each of the accelerator units includes physical-to-virtual address translation circuitry and migration circuitry. The physical-to-virtual address translation circuitry in each accelerator unit includes pages for each of at least some of the plurality of accelerator units. The migration circuitry causes the transfer of data between the processor memory circuitry and the accelerator memory circuitry in each of the plurality of accelerator circuits. The migration circuitry migrates and evicts data to/from accelerator memory circuitry based on statistical information associated with accesses to at least one of: processor memory circuitry or accelerator memory circuitry in one or more peer accelerator circuits. Thus, the processor memory circuitry and accelerator memory circuitry may be dynamically allocated to advantageously minimize system latency attributable to data access operations.

Translation of virtual addresses to physical addresses using translation lookaside buffer information

A memory management unit (MMU) is disclosed. The MMU is configured to: receive a translation request from a processing system, wherein the translation request specifies a virtual address to be translated; search a page table stored in a physical memory system for a page table entry that specifies the virtual address; receive a translation lookaside buffer invalidation (TLBI) signal from the processing system, wherein the TLBI signal specifies the virtual address; and, in response to receiving the TLBI signal specifying the virtual address, invalidate a translation lookaside buffer (TLB) entry in a TLB, wherein the invalidated TLB entry specifies the virtual address, and restart the search of the page table for the page table entry that specifies the virtual address.
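
The restart behaviour can be sketched as a linear page-table search that polls for a TLBI signal at each step. The flat list of `(virtual, physical)` pairs, the per-step `tlbi_events` sequence, and the function name `translate` are simplifying assumptions; a real MMU walks a multi-level table in hardware.

```python
def translate(vaddr, page_table, tlb, tlbi_events):
    """Search page_table for vaddr, restarting if a matching TLBI arrives.

    tlbi_events yields, per search step, the virtual address named by a TLBI
    signal received at that step, or None if no signal arrived.
    Returns (physical address or None, number of restarts).
    """
    events = iter(tlbi_events)
    restarts = 0
    i = 0
    while i < len(page_table):
        signal = next(events, None)    # None once the event sequence is exhausted
        if signal == vaddr:
            tlb.pop(vaddr, None)       # invalidate the matching TLB entry
            i = 0                      # restart the page table search
            restarts += 1
            continue
        entry_vaddr, paddr = page_table[i]
        if entry_vaddr == vaddr:
            return paddr, restarts
        i += 1
    return None, restarts
```

Restarting the search guards against returning a page table entry that was read before the invalidation and may since have changed.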

SECURE PROCESSOR AND A PROGRAM FOR A SECURE PROCESSOR
20200265169 · 2020-08-20

An instruction code, including instruction code stored in a non-rewritable format in the area where encrypted instruction code is stored, is authenticated using either a specific key that is specific to the core on which the instruction code is executed or a key authenticated by that specific key, and encryption processing is performed on data input and output between the core and the outside.

Translation entry invalidation in a multithreaded data processing system

A multiprocessor data processing system includes a processor core having a translation structure for buffering a plurality of translation entries. In response to receipt of a translation invalidation request, the processor core determines from the translation invalidation request that the translation invalidation request does not require draining of memory referent instructions for which address translation has been performed by reference to a translation entry to be invalidated. Based on the determination, the processor core invalidates the translation entry in the translation structure and confirms completion of invalidation of the translation entry without regard to draining from the processor core of memory access requests for which address translation was performed by reference to the translation entry.
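
A small sketch of the two invalidation paths, assuming the request carries an explicit `drain_required` flag. The class `CoreSketch`, the flag, and the list of in-flight accesses are hypothetical modelling choices; the abstract does not say how the request encodes the distinction.

```python
class CoreSketch:
    """Hypothetical core tracking translations and in-flight memory accesses."""

    def __init__(self):
        self.tlb = {}        # virtual page -> physical page
        self.in_flight = []  # virtual pages of accesses already translated

    def handle_invalidation(self, vpage, drain_required):
        """Invalidate vpage; drain dependent accesses only when required."""
        self.tlb.pop(vpage, None)
        if drain_required:
            # Modelled as instantly completing every access that used the entry.
            self.in_flight = [v for v in self.in_flight if v != vpage]
        # When draining is not required, completion is confirmed immediately,
        # regardless of accesses still in flight.
        return "complete"
```

Skipping the drain lets the core acknowledge the request sooner, at the cost of allowing already-translated accesses to finish with the old translation.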

Suspending translation look-aside buffer purge execution in a multi-processor environment

A method for operating translation look-aside buffers (TLBs) in a multiprocessor system. A purge request is received for purging one or more entries in the TLB. When the thread does not require access to the entries to be purged, execution of the purge request at the TLB may start. When an address translation request is rejected due to the TLB purge, a suspension time window may be set. During the suspension time window, execution of the purge is suspended and address translation requests of the thread are executed. After the suspension window has ended, the purge execution may be resumed. When the thread requires access to the entries to be purged, it may be blocked to prevent it from sending address translation requests to the TLB; once execution of the purge request ends, the thread may be unblocked and the address translation requests may be executed.
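
The suspension window can be sketched as a step-based simulation in which the purge removes one entry per step and a rejected translation request pauses it for a fixed number of steps. The function `run_purge`, the `arrivals` mapping, and the one-entry-per-step timing are assumptions; only the case where the thread does not need the purged entries is modelled, and requests arriving after the loop ends are ignored.

```python
from collections import deque


def run_purge(tlb, to_purge, arrivals, window=2):
    """Simulate a purge that is suspended when a translation request is rejected.

    arrivals maps a step number to the virtual page requested at that step.
    Returns (served requests as (step, vpage, translation), total steps).
    """
    assert window >= 1, "a zero-length window would never service the request"
    pending_purge = deque(to_purge)
    waiting = deque()       # rejected requests awaiting retry
    served = []
    window_left = 0
    step = 0
    while pending_purge or waiting:
        if step in arrivals:
            waiting.append(arrivals[step])
        if window_left > 0 or not pending_purge:
            # Purge suspended or finished: service one waiting request.
            if waiting:
                vpage = waiting.popleft()
                served.append((step, vpage, tlb.get(vpage)))
            window_left = max(0, window_left - 1)
        elif waiting:
            # Request rejected while the purge runs: open a suspension window.
            window_left = window
        else:
            tlb.pop(pending_purge.popleft(), None)   # purge one entry
        step += 1
    return served, step
```

The window bounds how long a translation request can be held up behind a long purge without abandoning the purge itself.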

DATA LOCATION IDENTIFICATION

A method, computer program product, and a computer system are disclosed for processing information in a processor that in one or more embodiments includes issuing from one of a plurality of processors an address translation invalidation instruction with a return marker, wherein the address translation invalidation instruction is to invalidate one or more address translation entries in one or more storage locations in the computer system; broadcasting the address translation invalidation instruction to one or more storage locations of one or more of the other processors; invalidating an address translation entry located in a storage location of the one or more storage locations; and returning information on the invalidated address translation entry to the issuing processor.
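
A minimal sketch of an invalidation that reports back what it removed, assuming each processor's storage location is a dictionary and the return marker is a boolean argument. The function name `broadcast_invalidate` and the report format are hypothetical, and skipping the issuer's own storage is a modelling choice not stated in the abstract.

```python
def broadcast_invalidate(issuer, processors, vpage, return_marker=True):
    """Invalidate vpage in the other processors' translation storage.

    processors maps a processor id to its {virtual page: physical page} store.
    With return_marker set, each invalidated entry is reported to the issuer.
    """
    report = []
    for pid, store in processors.items():
        if pid == issuer:
            continue                      # broadcast targets the other processors
        if vpage in store:
            paddr = store.pop(vpage)      # invalidate the entry
            if return_marker:
                report.append({"processor": pid, "vpage": vpage, "paddr": paddr})
    return report
```

The returned list tells the issuing processor which storage locations actually held the translation, which is the data location information the title refers to.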