Patent classifications
G06F2212/682
Access frequency caching hardware structure
An access frequency caching hardware structure has entries each storing an access frequency counter indicative of a frequency of accesses to a corresponding page of a memory address space. Access frequency tracking circuitry is responsive to a given memory access request requesting access to a target page, to determine whether the access frequency caching hardware structure already includes a corresponding entry which is valid and corresponds to the target page. When the structure includes the corresponding entry, a corresponding access frequency counter specified by the corresponding entry is incremented. In response to a counter writeback event associated with a selected access frequency counter corresponding to a selected page, an update is made to a global access frequency counter corresponding to the selected page within a global access frequency tracking data structure stored in the memory system.
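The abstract above can be pictured as a small counter cache in front of an in-memory global table. The following is a toy Python model of that idea; the eviction policy, the `writeback_threshold` trigger, and all names are illustrative assumptions, since the abstract only says that a "counter writeback event" folds a cached counter into the global structure.

```python
class AccessFrequencyCache:
    """Toy model of an access-frequency caching structure (names illustrative)."""

    def __init__(self, num_entries, writeback_threshold):
        self.entries = {}                 # page -> cached access frequency counter
        self.num_entries = num_entries    # capacity of the hardware structure
        self.threshold = writeback_threshold
        self.global_counts = {}           # models the in-memory global tracking structure

    def access(self, page):
        if page in self.entries:          # valid corresponding entry: bump its counter
            self.entries[page] += 1
            if self.entries[page] >= self.threshold:
                self.writeback(page)      # one possible counter-writeback event
        else:
            if len(self.entries) >= self.num_entries:
                victim = next(iter(self.entries))
                self.writeback(victim)    # eviction also folds the count into memory
                del self.entries[victim]
            self.entries[page] = 1

    def writeback(self, page):
        # update the global access frequency counter for the selected page
        self.global_counts[page] = self.global_counts.get(page, 0) + self.entries[page]
        self.entries[page] = 0            # cached counter restarts after writeback


cache = AccessFrequencyCache(num_entries=2, writeback_threshold=4)
for _ in range(4):
    cache.access(0x1000)
```

After four accesses the cached counter reaches the (assumed) threshold, so the count of 4 lands in the global table and the cached counter resets.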
METHOD AND APPARATUS FOR REDUCING TLB SHOOTDOWN OVERHEADS IN ACCELERATOR-BASED SYSTEMS
A method and apparatus for reducing TLB shootdown operation overheads in accelerator-based computing systems is described. The disclosed method and apparatus may also be used in the areas of near-memory and in-memory computing, where near-memory or in-memory compute units may need to share a host CPU's virtual address space. Metadata is associated with page table entries (PTEs) and mechanisms use the metadata to limit the number of processing elements that participate in a TLB shootdown operation.
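One way to picture "metadata that limits the participants" is a sharer bitmask attached to each PTE, so that shootdown interrupts go only to processing elements that may actually cache the translation. The bitmask encoding below is an illustrative assumption, not the patent's specific metadata format:

```python
def tlb_shootdown(pte, cores, send_ipi):
    """Send shootdown IPIs only to cores recorded in the PTE's sharer metadata.

    `pte['sharers']` is a bitmask: bit i set means core i may cache this
    translation (assumed encoding). Returns the cores that participated.
    """
    targets = [c for c in cores if pte['sharers'] & (1 << c)]
    for c in targets:
        send_ipi(c)          # only the recorded sharers join the shootdown
    pte['sharers'] = 0       # translation is now invalid everywhere
    return targets


ipis = []
pte = {'sharers': 0b0101}    # cores 0 and 2 may have cached this translation
notified = tlb_shootdown(pte, range(4), ipis.append)
```

Cores 1 and 3 never receive an interrupt, which is the overhead reduction the abstract describes.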
Synchronizing updates of page table status indicators and performing bulk operations
A synchronization capability synchronizes updates to page tables by forcing updates in cached entries to be made visible in memory (i.e., in in-memory page table entries). A synchronization instruction is used that ensures that, once the instruction has completed, updates to the cached entries that occurred prior to the synchronization instruction are visible in memory. Synchronization may be used to facilitate memory management operations, such as bulk operations used to change a large section of memory to read-only, operations to manage a free list of memory pages, and/or operations associated with terminating processes.
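The role of the synchronization step in a bulk operation can be sketched as follows. This toy model (all names and the dirty-bit focus are illustrative assumptions) caches PTE status updates, and `sync()` models the synchronization instruction that makes them visible in the in-memory PTEs before the bulk read-only change relies on them:

```python
class PageTable:
    """Toy model: hardware-cached PTE status updates made visible by a sync step."""

    def __init__(self, npages):
        self.mem_ptes = [{'dirty': False, 'writable': True} for _ in range(npages)]
        self.cached_updates = {}   # page -> pending status bits (cached entries)

    def hw_store(self, page):
        # hardware records the dirty bit in a cached entry, not yet in memory
        self.cached_updates[page] = {'dirty': True}

    def sync(self):
        # models the synchronization instruction: after it completes, all
        # earlier cached updates are visible in the in-memory PTEs
        for page, bits in self.cached_updates.items():
            self.mem_ptes[page].update(bits)
        self.cached_updates.clear()

    def bulk_make_read_only(self, pages):
        self.sync()                # ensure dirty bits are visible before the change
        dirty = [p for p in pages if self.mem_ptes[p]['dirty']]
        for p in pages:
            self.mem_ptes[p]['writable'] = False
        return dirty               # pages the caller must still write back


pt = PageTable(4)
pt.hw_store(2)                     # dirty bit pending in a cached entry only
dirty = pt.bulk_make_read_only([0, 1, 2, 3])
```

Without the `sync()` call, the bulk operation would see page 2 as clean and could lose its modified data.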
SYSTEM AND METHOD FOR MEMORY SYNCHRONIZATION OF A MULTI-CORE SYSTEM
A system for memory synchronization of a multi-core system is provided, the system comprising: an assigning module which is configured to assign at least one memory partition to at least one core of the multi-core system; a mapping module which is configured to provide information for translation lookaside buffer shootdown for the multi-core system leveraged by sending an interrupt to the at least one core of the multi-core system, if a page table entry associated with the memory partition assigned to the at least one core is modified; and an interface module which is configured to provide an interface to the assigning module from user-space.
VIRTUAL MEMORY PAGE MAPPING OVERLAYS
In some embodiments, a memory overlay system comprises a translation lookaside buffer (TLB) that includes an entry that specifies a virtual address range that is a subset of a virtual address range specified by another entry. In response to an indication from the TLB that both of the entries are TLB hits for the same memory operation, a selection circuit is configured to select, based on one or more selection criteria, one of the two entries. The selection circuit may then cause the selected TLB entry including the corresponding physical address information and memory attributes to be provided to a memory interface.
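The double-hit selection can be sketched as below. The patent leaves the selection criteria open; this model assumes one plausible criterion, namely that the entry covering the narrower virtual address range (the overlay) wins:

```python
def translate(tlb_entries, vaddr):
    """Resolve a virtual address against a TLB that allows overlapping entries.

    On a multi-hit, select the entry with the narrower VA range (an assumed
    selection criterion) and return its physical address and attributes.
    """
    hits = [e for e in tlb_entries if e['base'] <= vaddr < e['base'] + e['size']]
    if not hits:
        return None                                  # TLB miss
    chosen = min(hits, key=lambda e: e['size'])      # narrowest range wins
    return chosen['paddr'] + (vaddr - chosen['base']), chosen['attrs']


entries = [
    {'base': 0x10000, 'size': 0x10000, 'paddr': 0x80000, 'attrs': 'RW'},
    # overlay entry: its VA range is a subset of the entry above
    {'base': 0x12000, 'size': 0x1000,  'paddr': 0xA0000, 'attrs': 'RO'},
]
paddr, attrs = translate(entries, 0x12345)   # hits both entries
```

An address inside the overlay resolves through the overlay's mapping and attributes, while addresses outside it fall back to the wider entry.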
Implementing fine grain data coherency of a shared memory region
The disclosure provides an approach for implementing fine grain data coherency of a memory region shared by an application within a virtual machine and a compute accelerator. The approach includes locating within a compute kernel a data write instruction to the shared memory region, and modifying the compute kernel to add a conditional halting point after the data write instruction. The approach further includes configuring a virtualization system on which the virtual machine runs to set a value of a halt variable to true at an interval or in response to an occurrence of an event, wherein setting the halt variable to true causes the compute kernel to suspend execution at the conditional halting point.
Multi-core cache coherency built-in test
A system and method for verifying cache coherency in a safety-critical avionics processing environment includes a multi-core processor (MCP) having multiple cores, each core having at least an L1 data cache. The MCP may include a shared L2 cache. The MCP may designate one core as primary and the remainder as secondary. The primary core and secondary cores create valid TLB mappings to a data page in system memory and lock L1 cache lines in their data caches. The primary core locks an L2 cache line in the shared cache and updates its locked L1 cache line. When notified of the update, the secondary cores check the test pattern received from the primary core with the updated test pattern in their own L1 cache lines. If the patterns match, the test passes; the MCP may continue the testing process by updating the primary and secondary statuses of each core.
DATA PROCESSING SYSTEM HAVING A COHERENCY INTERCONNECT
A processing system includes a first processor configured to issue a first request in a first format, an adapter configured to receive the first request in the first format and send the first request in a second format, and a memory coherency interconnect configured to receive the first request in the second format and determine whether the first request in the second format is for a translation lookaside buffer (TLB) operation or a non-TLB operation based on information in the first request in the second format. When the first request in the second format is for a TLB operation, the interconnect routes the first request in the second format to a TLB global ordering point (GOP). When the first request in the second format is not for a TLB operation, the interconnect routes the first request in the second format to a non-TLB GOP.
Verifying selective purging of entries from translation look-aside buffers
An aspect includes selective purging of entries from translation look-aside buffers (TLBs). A method includes building multiple logical systems in a computing environment, the multiple logical systems including at least two level-two guests. TLB entries are created in a TLB for the level-two guests by executing fetch and store instructions. A subset of the TLB entries is purged in response to a selective TLB purge instruction, the subset including TLB entries created for a first one of the level-two guests. Subsequent to the purging, it is verified that the subset of the TLB entries was purged from the TLB, and it is determined whether a second one of the level-two guests is operational, the determining including executing at least one instruction that accesses a TLB entry of the second one of the level-two guests. Test results are generated based on the verifying and the determining. The test results are output.
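The core of the selective purge can be modeled as filtering a TLB by a guest tag. The guest-id tagging below is an illustrative assumption about how entries are associated with level-two guests:

```python
def selective_purge(tlb, guest_id):
    """Invalidate only the TLB entries tagged with the given level-two guest.

    Entries for other guests survive, so those guests remain operational
    without re-walking their page tables. Returns the purged subset.
    """
    purged = [e for e in tlb if e['guest'] == guest_id]
    tlb[:] = [e for e in tlb if e['guest'] != guest_id]   # in-place purge
    return purged


tlb = [
    {'guest': 1, 'vpn': 0x10, 'pfn': 0x99},
    {'guest': 2, 'vpn': 0x20, 'pfn': 0xAA},
    {'guest': 1, 'vpn': 0x30, 'pfn': 0xBB},
]
purged = selective_purge(tlb, guest_id=1)
```

Verification in the abstract's sense then amounts to checking both halves: the first guest's entries are gone, and the second guest's entry is still usable.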
PROXY IDENTIFIER FOR DATA ACCESS OPERATION
An apparatus comprises processing circuitry to process data access operations specifying a virtual address of data to be loaded from or stored to a data store, and proxy identifier determining circuitry to determine a proxy identifier for a data access operation to be processed by the data access circuitry, the proxy identifier having fewer bits than a physical address corresponding to the virtual address specified by the data access operation. The processing circuitry comprises at least one buffer to buffer information (including the proxy identifier) associated with one or more pending data access operations awaiting processing. Address translation circuitry determines the physical address corresponding to the virtual address specified for a data access operation after that data access operation has progressed beyond said at least one buffer.
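The proxy-identifier idea can be sketched as a small allocator: each pending access in the buffer carries a short id rather than a full physical address, and translation happens only when the access drains. The id width and all names here are illustrative assumptions:

```python
class ProxyIdAllocator:
    """Toy model: tag pending accesses with a short proxy id (assumed 3 bits)
    instead of a full physical address; translate only after the buffer."""

    def __init__(self, bits=3):
        self.bits = bits
        self.next_id = 0
        self.live = {}            # proxy id -> virtual address of pending access

    def issue(self, vaddr):
        pid = self.next_id        # far fewer bits than a physical address
        self.next_id = (self.next_id + 1) % (1 << self.bits)
        self.live[pid] = vaddr
        return pid

    def drain(self, pid, translate):
        # address translation occurs after the access leaves the buffer
        vaddr = self.live.pop(pid)
        return translate(vaddr)


# hypothetical identity-style translation for a single 4 KiB page
translate = lambda va: 0x100000 + (va & 0xFFF)
alloc = ProxyIdAllocator(bits=3)
pid = alloc.issue(0x7FFF0123)     # buffered access carries only the 3-bit id
paddr = alloc.drain(pid, translate)
```

The payoff the abstract describes is that buffer entries store the narrow proxy id, deferring the wide physical address until after the buffering stage.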