Patent classifications
G06F12/0833
Operating system deactivation of storage block write protection absent quiescing of processors
Operating system deactivation of write protection for a storage block is provided absent quiescing of processors in a multi-processor computing environment. The process includes receiving an address translation protection exception interrupt resulting from an attempted write access by a processor to a storage block, and determining by the operating system whether write protection for the storage block is active. Based on write protection for the storage block not being active, the operating system issues an instruction to clear or modify translation lookaside buffer entries of the processor associated with the storage block, absent waiting for an action by another processor of multiple processors of the computing environment, to facilitate write access to the storage block proceeding at the processor.
Scalable System on a Chip
An integrated circuit (IC) including a plurality of processor cores, a plurality of graphics processing units, a plurality of peripheral circuits, and a plurality of memory controllers is configured to support scaling of the system using a unified memory architecture. For example, the IC may include an interconnect fabric configured to provide communication between the one or more memory controller circuits and the processor cores, graphics processing units, and peripheral devices; and an off-chip interconnect coupled to the interconnect fabric and configured to couple the interconnect fabric to a corresponding interconnect fabric on another instance of the integrated circuit, wherein the interconnect fabric and the off-chip interconnect provide an interface that transparently connects the one or more memory controller circuits, the processor cores, graphics processing units, and peripheral devices in either a single instance of the integrated circuit or two or more instances of the integrated circuit.
Reducing cache transfer overhead in a system
A method and a system detects a cache line as a potential or confirmed hot cache line based on receiving an intervention of a processor associated with a fetch of the cache line. The method and system include suppressing an action of operations associated with the hot cache line. A related method and system detect an intervention and, in response, communicates an intervention notification to another processor. An alternative method and system detect a hot data object associated with an intervention event of an application. The method and system can suppress actions of operations associated with the hot data object. An alternative method and system can detect and communicate an intervention associated with a data object.
ARITHMETIC PROCESSING DEVICE AND ARITHMETIC PROCESSING METHOD
An arithmetic processing device including: request issuing units configured to issue an access request to a storage; and banks each of which includes: a first cache area including first entries; a second cache area including second entries; a control unit; and a determination unit that determines a cache hit or a cache miss for each of the banks, wherein the control unit performs: in response that the access requests simultaneously received from the request issuing units make the cache miss, storing the data, which is read from the storage device respectively by the access requests, in one of the first entries and one of the second entries; and in response that the access requests simultaneously received from the request issuing units make the cache hit in the first and second cache areas, outputting the data retained in the first and second entries, to each of issuers of the access requests.
Computer memory expansion device and method of operation
A memory expansion device operable with a host computer system (host) comprises a non-volatile memory (NVM) subsystem, cache memory, and control logic configurable to receive a submission from the host including a read command and specifying a payload in the NVM subsystem and demand data in the payload. The control logic is configured to request ownership of a set of cache lines corresponding to the payload, to indicate completion of the submission after acquiring ownership of the cache lines, and to load the payload to the cache memory. The set of cache lines correspond to a set of cache lines in a coherent destination memory space accessible by the host. The control logic is further configured to, after indicating completion of the submission and in response to a request from the host to read demand data in the payload, return the demand data after determining that the demand data is in the cache memory.
PHYSICAL ADDRESS PROXY REUSE MANAGEMENT
Each load/store queue entry holds a load/store physical address proxy (PAP) for use as a proxy for a load/store physical memory line address (PMLA). The load/store PAP comprises a set index and a way that uniquely identifies an L2 cache entry holding a memory line at the load/store PMLA when an L1 cache provides the load/store PAP during the load/store instruction execution. The microprocessor removes a line at a removal PMLA from an L2 entry, forms a removal PAP as a proxy for the removal PMLA that comprises a set index and a way, snoops the load/store queue with the removal PAP to determine whether the removal PAP is being used as a proxy for the removal PMLA, fills the removed entry with a line at a fill PMLA, and prevents the removal PAP from being used as a proxy for the removal PMLA and the fill PMLA concurrently.
MICROPROCESSOR THAT PREVENTS SAME ADDRESS LOAD-LOAD ORDERING VIOLATIONS USING PHYSICAL ADDRESS PROXIES
A L2 set associative cache that is inclusive of an L1 cache. Each entry of a load queue holds a load physical address proxy (PAP) for a load physical memory line address (PMLA) rather than the load PMLA itself. The load PAP comprises the set index and the way that uniquely identifies the L2 entry that holds a memory line specified by the load PMLA. Each load queue entry indicates whether the load instruction has completed execution. The microprocessor removes a memory line at a removal PMLA from an L2 entry and forms a removal PAP as a proxy for the removal PMLA. The removal PAP comprises a set index and a way that uniquely identifies the removed entry. The microprocessor snoops the load queue with the removal PAP to determine whether the removal PAP matches one or more load PAPs in one or more load queue entries associated with one or more load instructions that have completed execution and, if so, signals an abort request.
UNFORWARDABLE LOAD INSTRUCTION RE-EXECUTION ELIGIBILITY BASED ON CACHE UPDATE BY IDENTIFIED STORE INSTRUCTION
A microprocessor includes a cache memory, a store queue, and a load/store unit. Each entry of the store queue holds store data associated with a store instruction. The load/store unit, during execution of a load instruction, makes a determination that an entry of the store queue holds store data that includes some but not all bytes of load data requested by the load instruction, cancels execution of the load instruction in response to the determination, and writes to an entry of a structure from which the load instruction is subsequently issuable for re-execution an identifier of a store instruction that is older in program order than the load instruction and an indication that the load instruction is not eligible to re-execute until the identified older store instruction updates the cache memory with store data.
STORE-TO-LOAD FORWARDING CORRECTNESS CHECKS USING PHYSICAL ADDRESS PROXIES STORED IN LOAD QUEUE ENTRIES
A microprocessor includes a load/store unit that performs store-to-load forwarding, a PIPT L2 set-associative cache, a store queue having store entries, and a load queue having load entries. Each L2 entry is uniquely identified by a set index and a way. Each store/load entry holds, for an associated store/load instruction, a store/load physical address proxy (PAP) for a store/load physical memory line address (PMLA). The store/load PAP specifies the set index and the way of the L2 entry into which a cache line specified by the store/load PMLA is allocated. Each load entry also holds associated load instruction store-to-load forwarding information. The load/store unit compares the store PAP with the load PAP of each valid load entry whose associated load instruction is younger in program order than the store instruction and uses the comparison and associated forwarding information to check store-to-load forwarding correctness with respect to each younger load instruction.
USING PHYSICAL ADDRESS PROXIES TO HANDLE SYNONYMS WHEN WRITING STORE DATA TO A VIRTUALLY-INDEXED CACHE
A microprocessor includes a virtually-indexed L1 data cache that has an allocation policy that permits multiple synonyms to be co-resident. Each L2 entry is uniquely identified by a set index and a way number. A store unit, during a store instruction execution, receives a store physical address proxy (PAP) for a store physical memory line address (PMLA) from an L1 entry hit upon by a store virtual address, and writes the store PAP to a store queue entry. The store PAP comprises the set index and the way number of an L2 entry that holds a line specified by the store PMLA. The store unit, during the store commit, reads the store PAP from the store queue, looks up the store PAP in the L1 to detect synonyms, writes the store data to one or more of the detected synonyms, and evicts the non-written detected synonyms.