Patent classifications
G06F12/1063
Virtualized cache implementation method and physical machine
A virtualized cache implementation solution, where a memory of a virtual machine stores cache metadata. The cache metadata includes a one-to-one mapping relationship between virtual addresses and first physical addresses. After an operation request that is delivered by the virtual machine and that includes a first virtual address is obtained, when the cache metadata includes a target first physical address corresponding to the first virtual address, a target second physical address corresponding to the target first physical address is searched for based on preconfigured correspondences between the first physical addresses and second physical addresses, and data is read or written from or to a location indicated by the target second physical address.
Memory manager having an address translation function, data processing structure including the same, and method for generating address translation information
A memory manager includes an internal memory and a hash function circuit. The internal memory includes a V2H (virtual address to hash function) table and an exception mapping table. The V2H table stores at least one virtual address group and a type information on a hash function mapped to the virtual address group. The exception mapping table stores at least one exception virtual address not translated into a physical address by the hash function in the virtual address group and a physical address mapped to the exception virtual address. The has function circuit checks, when a virtual address is provided from a host, type information on a hash function mapped to a virtual address group including the virtual address, by referring to the V2H table included in the internal memory. The has function translates the virtual address into a physical address by using the hash function corresponding to the type information.
Store-to-load forwarding using physical address proxies to identify candidate set of store queue entries
A microprocessor includes a physically-indexed-and-tagged second-level set-associative cache. Each cache entry is uniquely identified by a set index and way number. Each store queue (SQ) entry holds store data for writing to a store physical address and a store physical address proxy (PAP) for the store physical line address. The store PAP specifies the set index and way number of the cache entry allocated to the store physical line address. A load unit obtains a load PAP for a load physical line address that specifies the set index and way number of the cache entry allocated to the load physical line address. The SQ compares the load PAP with each valid store PAP for use in identifying a candidate set of SQ entries whose store data overlaps requested load data and selects an entry from the candidate set from which to forward the store data to the load instruction.
SPATIAL CACHE
A cache includes a p-by-q array of memory units; a row addressing unit; and a column addressing unit. Each memory unit has an m-by-n array of memory cells. The column addressing unit has, for each memory unit, m n-to-one multiplexers, one associated with each of the m rows of the memory unit, wherein each n-to-one multiplexer has an input coupled to each of the n memory cells associated with the row associated with that multiplexer. The row addressing unit has, for each memory unit, n m-to-one multiplexers, one associated with each of the n columns of the memory unit, wherein each m-to-one multiplexer has an input coupled to each of the m memory cells associated with the column associated with that multiplexer. The row addressing unit and column addressing unit support reading and/or writing of the array of memory units, e.g. using virtual or physical addresses.
Instruction cache coherence
A data processing apparatus is provided, which includes a cache to store operations produced by decoding instructions fetched from memory. The cache is indexed by virtual addresses of the instructions in the memory. Receiving circuitry receives an incoming invalidation request that references a physical address in the memory. Invalidation circuitry invalidates entries in the cache where the virtual address corresponds with the physical address. Coherency is thereby achieved when using a cache that is indexed using virtual addresses.
VIRTUALIZATION OF PROCESS ADDRESS SPACE IDENTIFIERS FOR SCALABLE VIRTUALIZATION OF INPUT/OUTPUT DEVICES
Implementations of the disclosure provide a processing device comprising an address translation circuit to intercept a work request from an I/O device. The work request comprises a first ASID to map to a work queue. A second ASID of a host is allocated for the first ASID based on the work queue. The second ASID is allocated to at least one of: an ASID register for a dedicated work queue (DWQ) or an ASID translation table for a shared work queue (SWQ). Responsive to receiving a work submission from the SVM client to the I/O device, the first ASID of the application container is translated to the second ASID of the host machine for submission to the I/O device using at least one of: the ASID register for the DWQ or the ASID translation table for the SWQ based on the work queue associated with the I/O device.
Address Fine-Tuning Acceleration System
An address fine-tuning acceleration system in the technical field of address fine-tuning is disclosed. The system includes a scheduling unit, a high-order physical register block, a shared mapping unit, an address checking unit, a low-order physical register block, an immediate value detection unit, a physical memory address fine-tuning detector, a new address generation unit, a reservation station, an execution and virtual-physical memory address conversion unit, and a submission unit.
AUTOMATED TRANSLATION LOOKASIDE BUFFER SET REBALANCING
A translation lookaside buffer (TLB) having a fixed sub-TLB and a configurable sub-TLB and methods of using the TLB are provided. The TLB includes a fixed sub-TLB and a configurable sub-TLB. The fixed sub-TLB, during runtime, may store a first plurality of TLB entries corresponding to a first page size set. The configurable sub-TLB, during runtime, is configurable to store a second plurality of TLB entries of a second page size set. The second page size set includes at least a first page size of the first page size set and includes at least a second page size not of the first page size set.
Memory interface between physical and virtual address spaces
A memory interface for interfacing between a memory bus addressable using a physical address space and a cache memory addressable using a virtual address space, the memory interface comprising: a memory management unit configured to maintain a mapping from the virtual address space to the physical address space; and a coherency manager comprising a reverse translation module configured to maintain a mapping from the physical address space to the virtual address space; wherein the memory interface is configured to: receive a memory read request from the cache memory, the memory read request being addressed in the virtual address space; translate the memory read request, at the memory management unit, to a translated memory read request addressed in the physical address space for transmission on the memory bus; receive a snoop request from the memory bus, the snoop request being addressed in the physical address space; and translate the snoop request, at the coherency manager, to a translated snoop request addressed in the virtual address space for processing in connection with the cache memory.
Method and apparatus for an efficient TLB lookup
The present disclosure relates to a method of operating a translation lookaside buffer (TLB) arrangement for a processor supporting virtual addressing, wherein multiple translation engines are used to perform translations on request of one of a plurality of dedicated processor units. The method comprises: maintaining by a cache unit a dependency matrix for the engines to track for each processing unit if an engine is assigned to the each processing unit for a table walk. The cache unit may block a processing unit from allocating an engine to a translation request when the engine is already assigned to the processing unit in the dependency matrix.