Patent classifications
G06F2212/655
Input/output memory map unit and northbridge
A system including a gasket communicatively coupled between a unified northbridge (UNB) having a cache coherent interconnect (CCI) interface and a processor having an Advanced eXtensible Interface (AXI) coherency extension (ACE). The gasket is configured to translate requests from the processor that include ACE commands into equivalent CCI commands, wherein each request from the processor maps onto a specific CCI request type. The gasket is further configured to translate ACE tags into CCI tags. The gasket is further configured to translate CCI encoded probes from a system resource interface (SRI) into equivalent ACE snoop transactions. The gasket is further configured to translate the memory map to inter-operate with a UNB/coherent HyperTransport (cHT) environment. The gasket is further configured to receive a barrier transaction that is used to provide ordering for transactions.
Method and system for sharing driver pages
Method for sharing driver pages, including, on a computer system having a first driver and performing system services, instantiating Containers; code and data of the driver are loaded sequentially into pages arranged in a virtual memory, and instantiating a next driver upon a request from one of the Containers for system services by: loading physical pages of the next driver into physical memory and allocated virtual memory pages for the next driver; associating the next driver with the first driver and comparing pages of the next driver and the first driver, generating a set of identical physical pages; mapping virtual pages of the next driver to identical physical pages of the first driver while atomically protecting identical physical pages; non-identical pages of the next driver from physical memory remain mapped to virtual pages of the next driver; releasing physical memory occupied by identical physical pages of the next driver.
FILTERING OF REDUNDENTLY SCHEDULED WRITE PASSES
Improving access to a cache by a processing unit. One or more previous requests to access data from a cache are stored. A current request to access data from the cache is retrieved. A determination is made whether the current request is seeking the same data from the cache as at least one of the one or more previous requests. A further determination is made whether the at least one of the one or more previous requests seeking the same data was successful in arbitrating access to a processing unit when seeking access. A next cache write access is suppressed if the at least one of previous requests seeking the same data was successful in arbitrating access to the processing unit.
PREDICTING PAGE MIGRATION GRANULARITY FOR HETEROGENEOUS MEMORY SYSTEMS
Systems, apparatuses, and methods for predicting page migration granularities for phases of an application executing on a non-uniform memory access (NUMA) system architecture are disclosed herein. A system with a plurality of processing units and memory devices executes a software application. The system identifies a plurality of phases of the application based on one or more characteristics (e.g., memory access pattern) of the application. The system predicts which page migration granularity will maximize performance for each phase of the application. The system performs a page migration at a first page migration granularity during a first phase of the application based on a first prediction. The system performs a page migration at a second page migration granularity during a second phase of the application based on a second prediction, wherein the second page migration granularity is different from the first page migration granularity.
PROVIDING HARDWARE-BASED TRANSLATION LOOKASIDE BUFFER (TLB) CONFLICT RESOLUTION IN PROCESSOR-BASED SYSTEMS
Providing hardware-based translation lookaside buffer (TLB) conflict resolution in processor-based systems is disclosed. In this regard, in one aspect, a memory system provides a memory management unit (MMU) and multiple hierarchical page tables, each comprising multiple page table entries comprising corresponding translation preference indicators. The memory system further includes a TLB comprising multiple TLB entries each configured to cache a page table entry. The MMU determines whether a TLB conflict exists between a first TLB entry caching a first page table entry comprising a translation preference indicator that is set and a second TLB entry caching a second page table entry comprising a translation preference indicator that is not set. If so, the MMU selects the first TLB entry for use in a virtual-to-physical address translation operation, based on the translation preference indicator of the first page table entry cached by the first TLB entry being set.
INPUT/OUTPUT MEMORY MAP UNIT AND NORTHBRIDGE
A system including a gasket communicatively coupled between a unified northbridge (UNB) having a cache coherent interconnect (CCI) interface and a processor having an Advanced eXtensible Interface (AXI) coherency extension (ACE). The gasket is configured to translate requests from the processor that include ACE commands into equivalent CCI commands, wherein each request from the processor maps onto a specific CCI request type. The gasket is further configured to translate ACE tags into CCI tags. The gasket is further configured to translate CCI encoded probes from a system resource interface (SRI) into equivalent ACE snoop transactions. The gasket is further configured to translate the memory map to inter-operate with a UNB/coherent HyperTransport (cHT) environment. The gasket is further configured to receive a barrier transaction that is used to provide ordering for transactions.
SELECTIVELY PROCESSING DATA ASSOCIATED WITH A WORKLOAD
A computer-implemented method includes pseudo-invalidating a first Dynamic Address Translation (DAT) table of a DAT structure associated with a workload. A page fault occurring during translation of a virtual memory address of data required by the workload is detected. Responsive to the page fault, the DAT structure is traversed. The DAT structure includes one or more DAT tables, and each DAT entry in each of the one or more DAT tables is associated with an in-use bit indicating whether the DAT entry is in use. Traversing the DAT structure includes pseudo-invalidating each of one or more DAT entries in the DAT structure that are involved in translating the virtual memory address for which the page fault occurred; and identifying a first page frame referenced by the virtual memory address for which the page fault occurred. The data in the first page frame is processed responsive to the page fault.
Filtering of redundently scheduled write passes
Improving access to a cache by a processing unit. One or more previous requests to access data from a cache are stored. A current request to access data from the cache is retrieved. A determination is made whether the current request is seeking the same data from the cache as at least one of the one or more previous requests. A further determination is made whether the at least one of the one or more previous requests seeking the same data was successful in arbitrating access to a processing unit when seeking access. A next cache write access is suppressed if the at least one of previous requests seeking the same data was successful in arbitrating access to the processing unit.
Input/output memory map unit and northbridge
The present invention provides for page table access and dirty bit management in hardware via a new atomic test[0] and OR and Mask. The present invention also provides for a gasket that enables ACE to CCI translations. This gasket further provides request translation between ACE and CCI, deadlock avoidance for victim and probe collision, ARM barrier handling, and power management interactions. The present invention also provides a solution for ARM victim/probe collision handling which deadlocks the unified northbridge. These solutions includes a dedicated writeback virtual channel, probes for IO requests using 4-hop protocol, and a WrBack Reorder Ability in MCT where victims update older requests with data as they pass the requests.
Multi-threaded translation and transaction re-ordering for memory management units
Systems and methods relate to performing address translations in a multithreaded memory management unit (MMU). Two or more address translation requests can be received by the multithreaded MMU and processed in parallel to retrieve address translations to addresses of a system memory. If the address translations are present in a translation cache of the multithreaded MMU, the address translations can be received from the translation cache and scheduled for access of the system memory using the translated addresses. If there is a miss in the translation cache, two or more address translation requests can be scheduled in two or more translation table walks in parallel.