G06F9/30189

INSTRUCTION EXECUTION METHOD AND INSTRUCTION EXECUTION DEVICE
20230161594 · 2023-05-25 ·

An instruction configuration and execution method includes the following steps. A target instruction is received through an instruction cache. The target instruction is decoded by an instruction translator. It is determined whether the target instruction has the authority to read or write the model specific register in an unprivileged state. It is determined whether the model specific register index of the specific instruction corresponds to a specific model specific register, so as to order the microprocessor to perform an instruction serialization operation.

INSTRUCTION EXECUTION METHOD AND INSTRUCTION EXECUTION DEVICE
20230161593 · 2023-05-25 ·

An instruction execution method for a microprocessor is provided. The microprocessor includes a model specific register (MSR). And, the instruction execution method includes the following steps. A target instruction is received using an instruction cache. The target instruction is decoded using an instruction translator to determine whether the target instruction is a specific instruction is a specific instruction. When the target instruction is the specific instruction, a model specific register index of the target instruction is obtained to directly read or write the model specific register.

Unified register file for supporting speculative architectural states
11467839 · 2022-10-11 · ·

A method for supporting architecture speculation in an out of order processor is disclosed. The method comprises fetching two threads into the processor, wherein a first thread executes in a speculative state and a second thread executes in a non-speculative state. The method also comprises enabling a speculative scope for an execution of the first thread and a non-speculative scope for an execution of the second thread in an architecture of the processor, wherein the speculative scope and the non-speculative scope can both be fetched into the architecture and be present concurrently.

VIRTUAL MODE EXECUTION MANAGER

Disclosed herein is hardware for easing the process of changing the execution mode of a virtual machine and its associated resources. By adopting the hardware, it is possible to trigger a change in the execution mode in an automatic way, without software intervention, and without interfering with the execution of other virtual machines. In addition, in case an error has occurred for a virtual machine and it is detected, the hardware can be used to disable the resources associated with that virtual machine and generate notification of the completion this operation to other hardware, which will complete the reset of the virtual machine. By adopting the hardware, the execution mode change is simplified and offers configurability and flexibility for a system running multiple virtual machines.

APPARATUSES, METHODS, AND SYSTEMS FOR INSTRUCTIONS TO REQUEST A HISTORY RESET OF A PROCESSOR CORE

Systems, methods, and apparatuses relating to instructions to reset software thread runtime property histories in a hardware processor are described. In one embodiment, a hardware processor includes a hardware guide scheduler comprising a plurality of software thread runtime property histories; a decoder to decode a single instruction into a decoded single instruction, the single instruction having a field that identifies a model-specific register; and an execution circuit to execute the decoded single instruction to check that an enable bit of the model-specific register is set, and when the enable bit is set, to reset the plurality of software thread runtime property histories of the hardware guide scheduler.

System for executing new instructions and method for executing new instructions

A method for executing new instructions includes receiving an instruction, and determining whether the received instruction is a new instruction according to an operation code of the received instruction. When the received instruction is a new instruction, the basic decoding information of the received instruction is stored in a private register. And, the system for executing the new instructions enters a system management mode, and simulates the execution of the received instruction according to the basic decoding information stored in the private register in the system management mode; wherein the basic decoding information includes the operation code.

Mechanism for reducing coherence directory controller overhead for near-memory compute elements

A parallel processing (PP) level coherence directory, also referred to as a Processing In-Memory Probe Filter (PimPF), is added to a coherence directory controller. When the coherence directory controller receives a broadcast PIM command from a host, or a PIM command that is directed to multiple memory banks in parallel, the PimPF accelerates processing of the PIM command by maintaining a directory for cache coherence that is separate from existing system level directories in the coherence directory controller. The PimPF maintains a directory according to address signatures that define the memory addresses affected by a broadcast PIM command. Two implementations are described: a lightweight implementation that accelerates PIM loads into registers, and a heavyweight implementation that accelerates both PIM loads into registers and PIM stores into memory.

PACKED DATA ELEMENT PREDICATION PROCESSORS, METHODS, SYSTEMS, AND INSTRUCTIONS
20230108016 · 2023-04-06 ·

A processor includes a first mode where the processor is not to use packed data operation masking, and a second mode where the processor is to use packed data operation masking. A decode unit to decode an unmasked packed data instruction for a given packed data operation in the first mode, and to decode a masked packed data instruction for a masked version of the given packed data operation in the second mode. The instructions have a same instruction length. The masked instruction has bit(s) to specify a mask. Execution unit(s) are coupled with the decode unit. The execution unit(s), in response to the decode unit decoding the unmasked instruction in the first mode, to perform the given packed data operation. The execution unit(s), in response to the decode unit decoding the masked instruction in the second mode, to perform the masked version of the given packed data operation.

EXECUTING MULTIPLE PROGRAMS SIMULTANEOUSLY ON A PROCESSOR CORE

Systems and methods are disclosed for allocating resources to contexts in block-based processor architectures. In one example of the disclosed technology, a processor is configured to spatially allocate resources between multiple contexts being executed by the processor, including caches, functional units, and register files. In a second example of the disclosed technology, a processor is configured to temporally allocate resources between multiple contexts, for example, on a clock cycle basis, including caches, register files, and branch predictors. Each context is guaranteed access to its allocated resources to avoid starvation from contexts competing for resources of the processor. A results buffer can be used for folding larger instruction blocks into portions that can be mapped to smaller-sized instruction windows. The results buffer stores operand results that can be passed to subsequent portions of an instruction block.

ALLOCATION OF SPARE CACHE RESERVED DURING NON-SPECULATIVE EXECUTION AND SPECULATIVE EXECUTION
20230103438 · 2023-04-06 ·

A cache system, having cache sets, a connection to a line identifying an execution type, a connection to a line identifying a status of speculative execution, and a logic circuit that can: allocate a first subset of cache sets when the execution type is a first type indicating non-speculative execution, allocate a second subset when the execution type changes from the first type to a second type indicating speculative execution, and reserve a cache set when the execution type is the second type. When the execution type changes from the second to the first type and the status of speculative execution indicates that a result of speculative execution is to be accepted, the logic circuit can reconfigure the second subset when the execution type is the first type; and allocate the at least one cache set when the execution type changes from the first to the second type.