Patent classifications
G06F2212/683
Address translation data invalidation
A data processing system (2) including one or more translation buffers (16, 18, 20) storing address translation data executes translation buffer invalidation instructions (TLBI) within respective address translation contexts (VMID, ASID, X). Translation buffer invalidation signals generated as a consequence of executing these instructions are broadcast to respective translation buffers and include signals that specify the address translation context of the instruction that was executed. The address translation context specified within the signals is used to gate whether or not the translation buffers that receive the signals, and that are potential targets for the invalidation, are flushed. The address translation context data provided within the translation buffer invalidation signals may also be used to control whether or not local memory transactions for a local transactional memory access are aborted upon receipt of the signals.
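The gating described in this abstract can be illustrated with a minimal sketch. It assumes, for simplicity, that each translation buffer serves a single context at a time and that a context is just a (VMID, ASID) pair; the class and function names are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Context:
    """Hypothetical address translation context: a (VMID, ASID) pair."""
    vmid: int
    asid: int


@dataclass
class TranslationBuffer:
    context: Context                              # context this buffer currently serves (assumption)
    entries: dict = field(default_factory=dict)   # virtual page -> physical page

    def on_invalidate(self, signal_context: Context) -> bool:
        """Flush only when the broadcast signal carries a matching context."""
        if signal_context != self.context:
            return False          # signal is gated off: nothing is flushed
        self.entries.clear()
        return True


def broadcast_tlbi(buffers, signal_context):
    """Deliver one invalidation signal to every buffer; report which ones flushed."""
    return [tb.on_invalidate(signal_context) for tb in buffers]
```

In this sketch a buffer serving a different VMID keeps its entries even though it receives the broadcast, which is the saving the abstract points to.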
Reducing translation lookaside buffer searches for splintered pages
Systems, apparatuses, and methods for performing efficient translation lookaside buffer (TLB) invalidation operations for splintered pages are described. When a TLB receives an invalidation request for a specified translation context, and the invalidation request maps to an entry with a relatively large page size, the TLB does not know whether multiple translation entries are stored in the TLB for smaller splintered pages of that large page. The TLB therefore tracks, with a sticky bit per translation context, whether splintered pages have been installed for that context. If a TLB invalidate (TLBI) request is received and splintered pages have not been installed, no searches are needed for splintered pages. To refresh the sticky bits, whenever a full TLB search is performed, the TLB rescans for splintered pages for other translation contexts. If no splintered pages are found for a context, its sticky bit can be cleared, which reduces the number of full TLBI searches.
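A small software model may help show how a per-context sticky bit avoids full searches. This is a sketch under stated assumptions: entries carry an explicit `splintered` flag, the sticky bits are modeled as a set of contexts, and the refresh covers every context seen during the scan. None of these names or structures are taken from the patent.

```python
class SplinterTrackingTLB:
    """Toy TLB that searches for splintered pages only when a sticky bit is set."""

    def __init__(self):
        # (context, base, size) -> (physical base, is_splintered)
        self.entries = {}
        self.sticky = set()       # contexts that may hold splintered pages
        self.full_searches = 0    # counts the expensive full scans

    def install(self, context, base, size, phys, splintered=False):
        self.entries[(context, base, size)] = (phys, splintered)
        if splintered:
            self.sticky.add(context)

    def invalidate(self, context, base, size):
        # Always drop the exact large-page entry, if present.
        self.entries.pop((context, base, size), None)
        if context not in self.sticky:
            return                # no splintered pages installed: skip the scan
        self.full_searches += 1
        still_splintered = set()
        for (ctx, b, s), (_, splintered) in list(self.entries.items()):
            if ctx == context and base <= b < base + size:
                del self.entries[(ctx, b, s)]     # entry inside the invalidated range
            elif splintered:
                still_splintered.add(ctx)         # this context keeps its sticky bit
        # Refresh: contexts with no remaining splintered entries lose their bit.
        self.sticky = still_splintered
```

After the one full scan clears a context's sticky bit, later invalidations for that context return without scanning.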
Virtualized-in-hardware input output memory management
Aspects relate to Input/Output (IO) Memory Management Units (MMUs) that include hardware structures for implementing virtualization. Some implementations allow guests to set up and maintain device IO tables within memory regions to which those guests have been given permissions by a hypervisor. Some implementations provide hardware page table walking capability within the IOMMU, while other implementations provide static tables. Such static tables may be maintained by a hypervisor on behalf of guests. Some implementations reduce the frequency of interrupts or invocations of the hypervisor by allowing transactions to be set up by guests, within their assigned device IO regions, without hypervisor involvement. Devices may communicate with the IOMMU to set up the requested memory transaction, and its completion may be signaled to the guest without hypervisor involvement. Various other aspects will be evident from the disclosure.
TRANSLATION LOOKASIDE BUFFER INVALIDATION
A type of translation lookaside buffer (TLB) invalidation instruction is described which specifically targets a first type of TLB. The first type of TLB stores combined stage-1-and-2 entries, which depend on both stage 1 translation data and stage 2 translation data, and is configured to ignore a TLB invalidation command which invalidates based on a first set of one or more invalidation conditions including an address-based invalidation condition depending on matching of an intermediate address. A second type of TLB, other than the first type, ignores the invalidation command triggered by the first type of TLB invalidation instruction. This approach helps to limit the performance impact of stage 2 invalidations in systems supporting a combined stage-1-and-2 TLB which cannot invalidate by intermediate address.
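The division of labor between the two TLB types can be sketched as a pair of command handlers. Everything here is illustrative: the command names are invented, and the assumption that the targeted command drops all combined entries for a VMID is a placeholder, since the abstract does not say which entries the targeted instruction removes.

```python
from enum import Enum, auto


class Command(Enum):
    BY_INTERMEDIATE_ADDRESS = auto()  # stage 2 invalidation matched on an intermediate address
    COMBINED_ONLY = auto()            # command triggered by the targeted instruction type


class Stage2TLB:
    """Second type of TLB: can match entries by intermediate address."""

    def __init__(self):
        self.entries = {}  # (vmid, intermediate page) -> physical page

    def handle(self, command, vmid, ipa_page=None):
        if command is Command.COMBINED_ONLY:
            return 0       # ignores the command aimed at combined TLBs
        removed = self.entries.pop((vmid, ipa_page), None)
        return 0 if removed is None else 1


class CombinedTLB:
    """First type of TLB: combined stage-1-and-2 entries, no intermediate address stored."""

    def __init__(self):
        self.entries = {}  # (vmid, virtual page) -> physical page

    def handle(self, command, vmid, ipa_page=None):
        if command is Command.BY_INTERMEDIATE_ADDRESS:
            return 0       # cannot match by intermediate address, so it ignores this
        # Assumption: the targeted command drops every entry for the VMID.
        doomed = [key for key in self.entries if key[0] == vmid]
        for key in doomed:
            del self.entries[key]
        return len(doomed)
```

Each `handle` call returns the number of entries removed, so a caller can see that exactly one TLB type reacts to each command.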
Poison mechanisms for deferred invalidates
An apparatus includes multiple processors including respective cache memories, the cache memories configured to cache cache-entries for use by the processors. At least one processor among the processors includes cache management logic that is configured to (i) receive, from one or more of the other processors, cache-invalidation commands that request invalidation of specified cache-entries in the cache memory of the processor, (ii) mark the specified cache-entries as intended for invalidation but defer actual invalidation of the specified cache-entries, and (iii) upon detecting a synchronization event associated with the cache-invalidation commands, invalidate the cache-entries that were marked as intended for invalidation.
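Steps (i) to (iii) can be modeled in a few lines. This is a sketch only: the cache is a plain dictionary, the "poison" marks are a set of addresses, and the method names are hypothetical.

```python
class DeferredInvalidateCache:
    """Toy cache that marks entries on invalidation commands and drops them at a sync event."""

    def __init__(self, lines):
        self.lines = dict(lines)   # address -> cached data
        self.poisoned = set()      # addresses marked as intended for invalidation

    def on_invalidate_command(self, addr):
        # (i) + (ii): receive the command and mark the entry, but keep it in the cache.
        if addr in self.lines:
            self.poisoned.add(addr)

    def on_sync_event(self):
        # (iii): the synchronization event triggers the actual invalidation.
        for addr in self.poisoned:
            del self.lines[addr]
        invalidated = len(self.poisoned)
        self.poisoned.clear()
        return invalidated
```

Between the command and the synchronization event the marked entries are still present, which is what "deferred" means in this model.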
Fine-grained access memory controller
Systems and methods are provided to perform fine-grained memory accesses using a memory controller. The memory controller can access elements stored in memory across multiple dimensions of a matrix. The memory controller can perform accesses to non-contiguous memory locations by skipping zero or more elements across any dimension of the matrix.
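The access pattern described here, skipping zero or more elements along any dimension, amounts to generating strided addresses. The sketch below assumes a row-major layout and invents the parameter names `start`, `count`, and `skip`; it computes flat element offsets rather than modeling an actual memory controller.

```python
from itertools import product


def strided_offsets(shape, start, count, skip):
    """Return flat row-major offsets of the selected elements.

    skip[d] is the number of elements skipped between consecutive
    accesses along dimension d (0 means contiguous in that dimension).
    """
    # Row-major strides: the last dimension is contiguous.
    strides = [1] * len(shape)
    for d in range(len(shape) - 2, -1, -1):
        strides[d] = strides[d + 1] * shape[d + 1]

    # Indices visited along each dimension.
    index_ranges = [
        [start[d] + i * (skip[d] + 1) for i in range(count[d])]
        for d in range(len(shape))
    ]

    offsets = []
    for idx in product(*index_ranges):
        if any(i >= n for i, n in zip(idx, shape)):
            raise IndexError(f"index {idx} is outside shape {shape}")
        offsets.append(sum(i * s for i, s in zip(idx, strides)))
    return offsets
```

For a 4x4 matrix, `strided_offsets((4, 4), (0, 1), (2, 2), (1, 0))` selects rows 0 and 2 and columns 1 and 2, which are non-contiguous locations in memory.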
VIRTUALIZED-IN-HARDWARE INPUT OUTPUT MEMORY MANAGEMENT
Aspects relate to Input/Output (IO) Memory Management Units (MMUs) that include hardware structures for implementing virtualization. Some implementations allow guests to set up and maintain device IO tables within memory regions to which those guests have been given permissions by a hypervisor. Some implementations provide hardware page table walking capability within the IOMMU, while other implementations provide static tables. Such static tables may be maintained by a hypervisor on behalf of guests. Some implementations reduce the frequency of interrupts or invocations of the hypervisor by allowing transactions to be set up by guests, within their assigned device IO regions, without hypervisor involvement. Devices may communicate with the IOMMU to set up the requested memory transaction, and its completion may be signaled to the guest without hypervisor involvement. Various other aspects will be evident from the disclosure.
Cache system with a primary cache and an overflow cache that use different indexing schemes
A cache memory system including a primary cache and an overflow cache that are searched together using a search address. The overflow cache operates as an eviction array for the primary cache. The primary cache is addressed using bits of the search address, and the overflow cache is addressed by a hash index generated by a hash function applied to bits of the search address. The hash function operates to distribute victims evicted from the primary cache to different sets of the overflow cache to improve overall cache utilization. A hash generator may be included to perform the hash function. A hash table may be included to store hash indexes of valid entries in the primary cache. The cache memory system may be used to implement a translation lookaside buffer for a microprocessor.
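The two indexing schemes can be contrasted in a toy model. It assumes direct-mapped arrays, a primary index taken from the low bits of the address, and an arbitrary XOR-fold as the hash; the real hash function and array organization are not given in the abstract, and the class name is hypothetical.

```python
class PrimaryWithOverflow:
    """Toy primary cache plus overflow (eviction) cache, searched together."""

    def __init__(self, primary_sets=4, overflow_sets=4):
        self.primary = [None] * primary_sets     # slot -> (addr, value)
        self.overflow = [None] * overflow_sets   # slot -> (addr, value)

    def _primary_index(self, addr):
        return addr % len(self.primary)          # plain bits of the search address

    def _overflow_index(self, addr):
        # Hypothetical hash: fold higher bits onto the low bits.
        return (addr ^ (addr >> 2)) % len(self.overflow)

    def insert(self, addr, value):
        i = self._primary_index(addr)
        victim = self.primary[i]
        if victim is not None and victim[0] != addr:
            # The evicted entry moves to the overflow cache at its hashed index.
            self.overflow[self._overflow_index(victim[0])] = victim
        self.primary[i] = (addr, value)
        # Drop any stale copy of this address left in the overflow cache.
        j = self._overflow_index(addr)
        if self.overflow[j] is not None and self.overflow[j][0] == addr:
            self.overflow[j] = None

    def lookup(self, addr):
        candidates = (
            self.primary[self._primary_index(addr)],
            self.overflow[self._overflow_index(addr)],
        )
        for slot in candidates:
            if slot is not None and slot[0] == addr:
                return slot[1]
        return None
```

Addresses 0, 4, and 8 all collide in primary set 0, but the hash sends the victims 0 and 4 to different overflow sets, so all three remain retrievable.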
HARDWARE TRANSLATION REQUEST RETRY MECHANISM
A processing system includes a hardware translation lookaside buffer (TLB) retry loop that retries virtual memory address to physical memory address translation requests from a software client independent of a command from the software client. In response to a retry response notification at the TLB, a controller of the TLB waits for a programmable delay period and then retries the request without involvement from the software client. After a retry results in a hit at the TLB, the controller notifies the software client of the hit. Alternatively, if a retry results in an error at the TLB, the controller notifies the software client of the error and the software client initiates error handling.
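The retry loop can be approximated in software as follows. This is a sketch, not the hardware design: `tlb_lookup` is a hypothetical callable returning a `(status, payload)` pair, `delay_s` stands in for the programmable delay, and the `max_retries` bound is an added assumption that the abstract does not mention.

```python
import time


def translate_with_retry(tlb_lookup, vaddr, delay_s=0.0, max_retries=8):
    """Retry a translation on 'retry' responses without involving the client.

    Returns ("hit", physical_address) or ("error", detail); the client is
    notified only of these final outcomes.
    """
    for _ in range(max_retries + 1):
        status, payload = tlb_lookup(vaddr)
        if status == "retry":
            time.sleep(delay_s)      # wait for the programmable delay period
            continue                 # then retry on the client's behalf
        return status, payload       # "hit" or "error" goes back to the client
    # Assumption: give up after a bounded number of retries.
    return "error", "retry limit exceeded"
```

The client issues one request and sees one response, however many retries happened in between.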
MANAGING VIRTUAL-ADDRESS CACHES FOR MULTIPLE MEMORY PAGE SIZES
A translation lookaside buffer stores information indicating respective page sizes for different translations. A virtual-address cache module manages entries, where each entry stores a memory block in association with a virtual address and a code representing at least one page size of a memory page on which the memory block is located. The managing includes: receiving a translation lookaside buffer invalidation instruction for invalidating at least one translation lookaside buffer entry in the translation lookaside buffer, where the translation lookaside buffer invalidation instruction includes at least one invalid virtual address; comparing selected bits of the invalid virtual address with selected bits of each of a plurality of virtual addresses associated with respective entries in the virtual-address cache module, based on the codes; and invalidating one or more entries in the virtual-address cache module based on the comparing.
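The "comparing selected bits ... based on the codes" step can be sketched as a page-number comparison whose width depends on each entry's page-size code. The code-to-size table below is hypothetical (4 KiB, 2 MiB, 1 GiB), as are the function names and the tuple layout of an entry.

```python
# Hypothetical page-size codes: code -> number of page-offset bits.
PAGE_OFFSET_BITS = {0: 12, 1: 21, 2: 30}


def on_same_page(invalid_va, entry_va, page_size_code):
    """Compare only the address bits above the page offset for this entry's page size."""
    shift = PAGE_OFFSET_BITS[page_size_code]
    return (invalid_va >> shift) == (entry_va >> shift)


def invalidate_virtual_cache(entries, invalid_va):
    """Return the entries that survive a TLB invalidation for invalid_va.

    Each entry is a (virtual_address, page_size_code, memory_block) tuple.
    """
    return [
        entry for entry in entries
        if not on_same_page(invalid_va, entry[0], entry[1])
    ]
```

An entry on a 2 MiB page is dropped when the invalidated address falls anywhere in that page, while an entry on a neighboring 4 KiB page is kept.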