Patent classifications
G06F2212/681
APPARATUS AND METHOD FOR EFFICIENT PROCESS-BASED COMPARTMENTALIZATION
An apparatus and method for efficient process-based compartmentalization. For example, one embodiment of a processor comprises: execution circuitry to execute instructions and process data; memory management circuitry coupled to the execution circuitry, the memory management circuitry to manage access to a system memory by a plurality of related processes using one or more process-specific translation structures and one or more shared translation structures to be shared by the related processes; and one or more control registers to store a process-specific base address pointer associated with a first process of the plurality of related processes and to store a shared base address pointer to identify the shared translation structures; wherein the memory management circuitry is to use the process-specific base address pointer in combination with a first linear address provided by the first process to walk the process-specific translation structures to identify any permissions and/or physical address associated with the first linear address, wherein if permissions are identified, the memory management circuitry is to use the permissions in place of any permissions specified in the shared translation structures.
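For illustration, here is a minimal C model of the permission-resolution rule described above: a walk rooted at the process-specific base pointer yields permissions that, when present, are used in place of whatever the shared structures specify. The direct-indexed toy tables, field names, and permission encoding are assumptions made for this sketch, not details from the patent.

```c
#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

/* One leaf translation entry; the field layout is an illustrative assumption. */
typedef struct {
    bool    present;
    bool    has_perms;   /* entry carries explicit permissions           */
    uint8_t perms;       /* bit0 = read, bit1 = write, bit2 = execute    */
} xlat_entry_t;

/* Toy "translation structures": one entry per 4 KiB page, indexed directly. */
static xlat_entry_t proc_tables[16];    /* rooted at the per-process pointer */
static xlat_entry_t shared_tables[16];  /* rooted at the shared pointer      */

static uint8_t resolve_perms(uint64_t linear_addr)
{
    size_t idx = (linear_addr >> 12) & 0xF;
    xlat_entry_t p = proc_tables[idx];
    xlat_entry_t s = shared_tables[idx];

    /* Permissions found in the process-specific walk override shared ones. */
    if (p.present && p.has_perms)
        return p.perms;
    return s.present ? s.perms : 0;
}

int main(void)
{
    shared_tables[1] = (xlat_entry_t){ .present = true, .has_perms = true, .perms = 0x7 };
    proc_tables[1]   = (xlat_entry_t){ .present = true, .has_perms = true, .perms = 0x1 };
    printf("effective perms: %#x\n", resolve_perms(0x1000)); /* 0x1: process-specific read-only wins */
    return 0;
}
```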
Complex I/O value prediction for multiple values with physical or virtual addresses
An apparatus, and corresponding method, for input/output (I/O) value determination generates an I/O instruction for an I/O device, the I/O device including a state machine with state transition logic. The apparatus comprises a controller that includes a simplified state machine with a reduced version of the state transition logic of the state machine of the I/O device. The controller is configured to improve instruction execution performance of a processor core by employing the simplified state machine to predict at least one state value corresponding to at least one true state value of the I/O device to be affected by the I/O instruction.
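As a rough software analogue, the sketch below keeps a reduced copy of a device's state-transition logic and applies the pending I/O operation to it to predict the device's next state, so a core could proceed without waiting for the device round trip. The two-state machine and operation names are invented for illustration.

```c
#include <stdio.h>

typedef enum { DEV_IDLE, DEV_BUSY } dev_state_t;
typedef enum { IO_START, IO_STOP } io_op_t;

/* Reduced version of the device's transition logic kept by the controller. */
static dev_state_t predict_next(dev_state_t cur, io_op_t op)
{
    switch (op) {
    case IO_START: return DEV_BUSY;
    case IO_STOP:  return DEV_IDLE;
    }
    return cur;
}

int main(void)
{
    dev_state_t predicted = predict_next(DEV_IDLE, IO_START);
    /* The core can continue with the predicted value; the real device state
     * is checked later and a mispredict would be corrected. */
    printf("predicted state: %s\n", predicted == DEV_BUSY ? "BUSY" : "IDLE");
    return 0;
}
```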
Cryptographic computing engine for memory load and store units of a microarchitecture pipeline
A processor comprises a first register to store an encoded pointer to a memory location. First context information is stored in first bits of the encoded pointer and a slice of a linear address of the memory location is stored in second bits of the encoded pointer. The processor also includes circuitry to execute a memory access instruction to obtain a physical address of the memory location, access encrypted data at the memory location, derive a first tweak based at least in part on the encoded pointer, and generate a keystream based on the first tweak and a key. The circuitry is further to execute the memory access instruction to store state information associated with the memory access instruction in a first buffer, and to decrypt the encrypted data based on the keystream. The keystream is to be generated at least partly in parallel with accessing the encrypted data.
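A toy C model of the flow, assuming an invented 16-bit context / 48-bit address split and a simple mixing function in place of the real cipher: the encoded pointer serves as the tweak, the keystream can be generated while the ciphertext is being fetched, and decryption reduces to a final XOR.

```c
#include <stdint.h>
#include <stdio.h>

#define CTX_BITS 16

/* Pack context bits and a linear-address slice into one encoded pointer. */
static uint64_t make_encoded_ptr(uint16_t ctx, uint64_t la_slice)
{
    return ((uint64_t)ctx << (64 - CTX_BITS)) |
           (la_slice & ((1ULL << (64 - CTX_BITS)) - 1));
}

/* Stand-in keystream generator: in hardware this would be a block cipher
 * keyed by 'key' and tweaked by the encoded pointer. */
static uint64_t keystream(uint64_t tweak, uint64_t key)
{
    uint64_t x = tweak ^ key;
    x ^= x >> 33; x *= 0xff51afd7ed558ccdULL; x ^= x >> 33;  /* simple mixer */
    return x;
}

int main(void)
{
    uint64_t key = 0x0123456789abcdefULL;
    uint64_t ptr = make_encoded_ptr(0x42, 0x7f0000001000ULL);
    uint64_t plaintext  = 0xdeadbeefcafef00dULL;
    uint64_t ciphertext = plaintext ^ keystream(ptr, key);   /* stored form  */
    /* Keystream generation can overlap the memory access; decryption is a
     * final XOR once the ciphertext arrives. */
    uint64_t decrypted  = ciphertext ^ keystream(ptr, key);
    printf("decrypted: %#llx\n", (unsigned long long)decrypted);
    return 0;
}
```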
Zero latency prefetching in caches
This invention involves a cache system in a digital data processing apparatus including: a central processing unit core; a level one instruction cache; and a level two cache. The cache lines in the level two cache are twice the size of the cache lines in the level one instruction cache. The central processing unit core requests additional program instructions when needed via a request address. Upon a miss in the level one instruction cache that causes a hit in the upper half of a level two cache line, the level two cache supplies the upper half of that cache line to the level one instruction cache. On a following level two cache memory cycle, the level two cache supplies the lower half of the same cache line to the level one instruction cache. This technique thus prefetches the lower half of the level two cache line while employing fewer resources than an ordinary prefetch.
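A behavioral sketch, assuming 64-byte L1I lines and 128-byte L2 lines: on an L1I miss that hits one half of an L2 line, the requested half is filled immediately and the other half of the same L2 line is queued for the following L2 cycle as a cheap prefetch. The line sizes and the symmetric handling of the lower-half case are assumptions for this sketch.

```c
#include <stdint.h>
#include <stdio.h>

#define L1_LINE  64u              /* assumed L1I line size in bytes  */
#define L2_LINE  (2 * L1_LINE)    /* L2 lines are twice as large     */

/* Model one L1I miss that hits in L2: fill the requested half now and queue
 * the other half of the same L2 line for the following L2 cycle. */
static void l1i_miss(uint64_t req_addr, uint64_t *fill_now, uint64_t *fill_next)
{
    uint64_t l2_line_base = req_addr & ~(uint64_t)(L2_LINE - 1);
    int hit_upper_half    = (req_addr & L1_LINE) != 0;

    if (hit_upper_half) {
        *fill_now  = l2_line_base + L1_LINE;  /* upper half supplied first   */
        *fill_next = l2_line_base;            /* lower half prefetched next  */
    } else {
        *fill_now  = l2_line_base;
        *fill_next = l2_line_base + L1_LINE;
    }
}

int main(void)
{
    uint64_t now, next;
    l1i_miss(0x10C0, &now, &next);            /* address in an upper half    */
    printf("fill now: %#llx, prefetch next cycle: %#llx\n",
           (unsigned long long)now, (unsigned long long)next);
    return 0;
}
```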
Translating virtual addresses in a virtual memory based system
Translating virtual addresses to second addresses by a memory controller that is local to one or more memory devices but is not local to the processor, to a buffer storing a plurality of Page Table Entries, or to a Page Walk Cache storing a plurality of page directory entries. The method includes, by the memory controller: receiving a page directory base and a plurality of memory offsets from the processor; reading a first level page directory entry using the page directory base and a first level memory offset; combining a second level memory offset and the first level page directory entry; reading a second level page directory entry using the first level page directory entry and the second level memory offset; sending to the processor the first level page directory entry or the second level page directory entry; and sending a page table entry to the processor.
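A minimal C model of the near-memory walk, assuming two levels and a flat array standing in for controller-local DRAM: the controller receives the page directory base and per-level offsets, performs both reads locally, and returns the intermediate entry and the final page table entry to the processor. The entry encoding and the two-level depth are illustrative assumptions.

```c
#include <stdint.h>
#include <stdio.h>

#define ENTRIES 512u

static uint64_t dram[4 * ENTRIES];     /* memory local to the controller */

/* Walk performed entirely at the memory side. */
static uint64_t near_memory_walk(uint64_t pgd_base,
                                 uint64_t l1_off, uint64_t l2_off,
                                 uint64_t *l1_entry_out)
{
    uint64_t l1_entry = dram[pgd_base + l1_off];     /* first level read  */
    uint64_t l2_entry = dram[l1_entry + l2_off];     /* second level read */
    *l1_entry_out = l1_entry;                        /* sent back to CPU  */
    return l2_entry;                                 /* page table entry  */
}

int main(void)
{
    dram[0 + 3]       = ENTRIES;    /* L1 entry points at the next table  */
    dram[ENTRIES + 7] = 0xABCD000;  /* L2 entry: the PTE itself           */

    uint64_t l1;
    uint64_t pte = near_memory_walk(0, 3, 7, &l1);
    printf("L1 entry %#llx, PTE %#llx\n",
           (unsigned long long)l1, (unsigned long long)pte);
    return 0;
}
```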
TRANSLATION LOOKASIDE BUFFER INVALIDATION
A type of translation lookaside buffer (TLB) invalidation instruction is described that specifically targets a first type of TLB, which stores combined stage-1-and-2 entries that depend on both stage 1 and stage 2 translation data, and which is configured to ignore a TLB invalidation command that invalidates based on a first set of one or more invalidation conditions, including an address-based invalidation condition depending on matching of an intermediate address. A second type of TLB, other than the first type, ignores the invalidation command triggered by the first type of TLB invalidation instruction. This approach helps to limit the performance impact of stage 2 invalidations in systems supporting a combined stage-1-and-2 TLB that cannot invalidate by intermediate address.
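A sketch of the dispatch rule in C, with invented names for the two TLB kinds and the two command kinds: a combined stage-1-and-2 TLB ignores the invalidation that matches by intermediate address and instead honors the dedicated instruction, while other TLBs do the opposite.

```c
#include <stdbool.h>
#include <stdio.h>

typedef enum { TLB_COMBINED_S1_S2, TLB_OTHER } tlb_kind_t;
typedef enum { INVAL_BY_IPA, INVAL_COMBINED_S1_S2 } inval_cmd_t;

/* Decide whether a given TLB kind acts on a given invalidation command. */
static bool tlb_acts_on(tlb_kind_t tlb, inval_cmd_t cmd)
{
    if (tlb == TLB_COMBINED_S1_S2)
        return cmd == INVAL_COMBINED_S1_S2;   /* ignores by-IPA invalidates */
    return cmd == INVAL_BY_IPA;               /* ignores the dedicated cmd  */
}

int main(void)
{
    printf("combined TLB on by-IPA cmd: %d\n",
           tlb_acts_on(TLB_COMBINED_S1_S2, INVAL_BY_IPA));          /* 0 */
    printf("combined TLB on dedicated cmd: %d\n",
           tlb_acts_on(TLB_COMBINED_S1_S2, INVAL_COMBINED_S1_S2));  /* 1 */
    return 0;
}
```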
Poison mechanisms for deferred invalidates
An apparatus includes multiple processors with respective cache memories, the cache memories configured to cache cache-entries for use by the processors. At least one processor among the processors includes cache management logic that is configured to (i) receive, from one or more of the other processors, cache-invalidation commands that request invalidation of specified cache-entries in the cache memory of the processor, (ii) mark the specified cache-entries as intended for invalidation but defer their actual invalidation, and (iii) upon detecting a synchronization event associated with the cache-invalidation commands, invalidate the cache-entries that were marked as intended for invalidation.
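A minimal model of the defer-then-flush behavior, assuming a small fixed-size cache and invented field names: an invalidation command only poisons (marks) the matching entries, and the real invalidation happens when a synchronization event is observed.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define CACHE_ENTRIES 8

typedef struct {
    uint64_t tag;
    bool     valid;
    bool     poisoned;   /* marked for invalidation, not yet invalidated */
} cache_entry_t;

static cache_entry_t cache[CACHE_ENTRIES];

static void on_invalidate_cmd(uint64_t tag)
{
    for (int i = 0; i < CACHE_ENTRIES; i++)
        if (cache[i].valid && cache[i].tag == tag)
            cache[i].poisoned = true;       /* defer the real invalidation */
}

static void on_sync_event(void)
{
    for (int i = 0; i < CACHE_ENTRIES; i++)
        if (cache[i].poisoned) {
            cache[i].valid = false;         /* now actually invalidate     */
            cache[i].poisoned = false;
        }
}

int main(void)
{
    cache[0] = (cache_entry_t){ .tag = 0x40, .valid = true };
    on_invalidate_cmd(0x40);
    printf("before sync: valid=%d poisoned=%d\n", cache[0].valid, cache[0].poisoned);
    on_sync_event();
    printf("after sync:  valid=%d poisoned=%d\n", cache[0].valid, cache[0].poisoned);
    return 0;
}
```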
System-on-chip performing address translation and operating method thereof
An operating method of a system-on-chip includes outputting a prefetch command in response to an update of mapping information for a first read target address, the update occurring in a first translation lookaside buffer that stores first mapping information of a second address with respect to a first address. In response to the prefetch command, second mapping information of a third address is stored in a second translation lookaside buffer with respect to at least some second addresses of an address block that includes a second read target address.
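A rough software model of the chained prefetch, assuming a four-page prefetch block and a trivial stand-in for translation: when the first TLB takes a new mapping for a read target, the prefetch command fills the second TLB with translations for a block of neighboring addresses. Block size, table shapes, and the translation function are assumptions.

```c
#include <stdint.h>
#include <stdio.h>

#define BLOCK_PAGES 4          /* assumed prefetch block size (pages) */

static uint64_t tlb2[BLOCK_PAGES];   /* second TLB: prefetched mappings */

/* Stand-in page-table lookup: real hardware walks translation tables. */
static uint64_t translate(uint64_t addr) { return addr + 0x100000; }

/* Called when the first TLB is updated with a mapping for 'read_target'. */
static void on_tlb1_update(uint64_t read_target)
{
    uint64_t block_base = read_target & ~((uint64_t)BLOCK_PAGES * 0x1000 - 1);
    for (int i = 0; i < BLOCK_PAGES; i++)           /* acts as the prefetch command */
        tlb2[i] = translate(block_base + (uint64_t)i * 0x1000);
}

int main(void)
{
    on_tlb1_update(0x3000);
    printf("tlb2[3] = %#llx\n", (unsigned long long)tlb2[3]);
    return 0;
}
```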
Fine-grained access memory controller
Systems and methods are provided to perform fine-grained memory accesses using a memory controller. The memory controller can access elements stored in memory across multiple dimensions of a matrix. The memory controller can perform accesses to non-contiguous memory locations by skipping zero or more elements across any dimension of the matrix.
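A small sketch of the access pattern such a controller could generate, with example dimensions and skip values: element offsets are produced across two dimensions of a matrix, skipping zero or more elements in each dimension so non-contiguous locations are visited without touching the elements in between.

```c
#include <stddef.h>
#include <stdio.h>

/* Emit the byte offsets a fine-grained access would touch. */
static void gen_accesses(size_t rows, size_t cols, size_t elem_size,
                         size_t row_skip, size_t col_skip)
{
    for (size_t r = 0; r < rows; r += 1 + row_skip)
        for (size_t c = 0; c < cols; c += 1 + col_skip)
            printf("access offset %zu\n", (r * cols + c) * elem_size);
}

int main(void)
{
    /* 4x8 matrix of 4-byte elements, skipping 1 row and 3 columns
     * between consecutive accesses. */
    gen_accesses(4, 8, 4, 1, 3);
    return 0;
}
```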