G06F9/3861

Method for replenishing a thread queue with a target instruction of a jump instruction
11579885 · 2023-02-14

Methods and electronic circuits for executing instructions in a central processing unit (CPU) are provided. One of the methods includes forming an instruction block by sequentially fetching, from a current thread queue, one or more instructions including one jump instruction, wherein the jump instruction is the last instruction in the instruction block; transmitting the instruction block to a CPU execution unit for execution; replenishing the current thread queue with at least one instruction to form a thread queue to be executed; determining a target instruction of the jump instruction according to an execution result of the CPU execution unit; determining whether the target instruction is contained in the thread queue to be executed; and if not, flushing the thread queue to be executed, obtaining the target instruction and adding the target instruction to the thread queue to be executed.
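The steps of this method can be illustrated with a minimal Python sketch. It is not the patented circuit: the names `run_one_block`, `is_jump`, `prefetched`, and `execute` are hypothetical, instructions are represented only by their addresses, and replenishing is shown before the jump is resolved even though hardware would overlap the two.

```python
from collections import deque


def run_one_block(queue, is_jump, prefetched, execute):
    """Process one instruction block from `queue` (a deque of addresses).

    is_jump(addr)  -> True if the instruction at addr is a jump (hypothetical)
    prefetched     -> addresses used to replenish the queue (hypothetical)
    execute(block) -> target address of the block's jump; stands in for the
                      result reported by the CPU execution unit
    """
    # Form the block: fetch sequentially until the jump, which ends the block.
    block = []
    while queue:
        addr = queue.popleft()
        block.append(addr)
        if is_jump(addr):
            break

    # Replenish the current queue; it is now the "thread queue to be executed".
    queue.extend(prefetched)

    # The jump target is only known from the execution result.
    target = execute(block)

    # If the target is not already queued, flush and restart from the target.
    flushed = target not in queue
    if flushed:
        queue.clear()
        queue.append(target)
    return block, target, flushed
```

When the target is already in the replenished queue, nothing is discarded; only a target outside the queue costs a flush.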

Security enhancement in hierarchical protection domains

Methods and systems for allowing software components that operate at a specific exception level (e.g., EL-3 to EL-1) to repeatedly or continuously observe or evaluate the integrity of software components operating at a lower exception level (e.g., EL-2 to EL-0) to ensure that the software components have not been corrupted or compromised (e.g., subjected to malware or cyberattacks) include a computing device that identifies, by a component operating at a higher exception level (“HEL component”), at least one of a current vector base address (VBA), an exception raising instruction (ERI) address, or a control and system register value associated with a component operating at a lower exception level (“LEL component”). The computing device may perform a responsive action in response to determining that the current VBA, the ERI address, or the control and system register value does not match the corresponding reference data.
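The comparison against reference data can be sketched in Python as follows. The names `LelSnapshot`, `check_lel`, and `on_mismatch` are assumptions made for illustration; how a real HEL component reads these values, and what responsive action it takes, are not specified in the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LelSnapshot:
    """Values the HEL component observes for an LEL component (hypothetical)."""
    vba: int               # current vector base address
    eri_address: int       # exception raising instruction address
    control_regs: tuple    # control and system register values


def check_lel(current, reference, on_mismatch):
    """Compare a snapshot with reference data; call on_mismatch if any differ."""
    fields = ("vba", "eri_address", "control_regs")
    mismatched = [f for f in fields
                  if getattr(current, f) != getattr(reference, f)]
    if mismatched:
        # The responsive action is left to the caller (assumption).
        on_mismatch(mismatched)
    return mismatched
```

Calling `check_lel` repeatedly, for example on every exception taken to the higher level, would approximate the repeated or continuous observation the abstract describes.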

INTERMODAL CALLING BRANCH INSTRUCTION
20230010863 · 2023-01-12

Processing circuitry has a handler mode and a thread mode. In response to an exception condition, a switch to handler mode is made. In response to an intermodal calling branch instruction specifying a branch target address when the processing circuitry is in the handler mode, an instruction decoder controls the processing circuitry to save a function return address to a function return address storage location; switch a current mode of the processing circuitry to the thread mode; and branch to an instruction identified by the branch target address. This can be useful for deprivileging of exceptions.

Writeback Hazard Elimination
20230011446 · 2023-01-12

A processor includes a processing pipeline, a plurality of result-storage elements, and writeback logic. The processing pipeline is configured to process program operations and to write, to a result storage, up to a predefined maximal number of results of the processed program operations per clock cycle. The result-storage elements are configured to store respective ones of the results. The writeback logic is configured to (i) detect a writeback conflict event in which the processing pipeline produces simultaneous results that exceed the predefined maximal number of results, for writing to the result storage, in a same clock cycle, (ii) in response to detecting the writeback conflict event, to temporarily store at least a given result, from among the simultaneous results, in a given result-storage element, and (iii) to subsequently write the temporarily-stored given result from the given result-storage element to the result storage.
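A rough Python model of the writeback logic is given below. The function name `writeback_cycle` and the policy of draining held results before new ones are assumptions; the abstract only requires that excess results be held temporarily and written later.

```python
from collections import deque


def writeback_cycle(results, pending, max_writes):
    """Model one clock cycle of writeback.

    results    -> results produced by the pipeline this cycle
    pending    -> deque modelling the result-storage elements (held results)
    max_writes -> predefined maximal number of writes per cycle
    Returns the results actually written to the result storage this cycle.
    """
    # Assumed policy: results held from earlier cycles are written first.
    candidates = list(pending) + list(results)
    pending.clear()

    # A writeback conflict exists when there are more candidates than ports.
    written = candidates[:max_writes]

    # Temporarily store the excess; it is retried on a later cycle.
    pending.extend(candidates[max_writes:])
    return written
```

For example, with `max_writes=2`, three simultaneous results lead to two writes in the current cycle and one held result that is written in the next.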

Handling an input/output store instruction

An input/output store instruction is handled. A data processing system includes a system nest coupled to at least one input/output bus by an input/output bus controller. The data processing system further includes at least a data processing unit including a core, system firmware and an asynchronous core-nest interface. The data processing unit is coupled to the system nest via an aggregation buffer. The system nest is configured to asynchronously load from and/or store data to at least one external device which is coupled to the at least one input/output bus. The data processing unit is configured to complete the input/output store instruction before an execution of the input/output store instruction in the system nest is completed. The asynchronous core-nest interface includes an input/output status array with multiple input/output status buffers. The system firmware includes a retry buffer and the core includes an analysis and retry logic.

ADVANCED PROCESSOR ARCHITECTURE
20180004530 · 2018-01-04

The invention relates to a method for processing instructions out-of-order on a processor comprising an arrangement of execution units. The inventive method comprises: 1) looking up operand sources in a Register Positioning Table (RPT) and setting operand input references of the instruction to be issued accordingly; 2) checking for an Execution Unit (EXU) available for receiving a new instruction; and 3) issuing the instruction to the available Execution Unit and entering a reference of the result register addressed by the instruction to be issued to the Execution Unit into the RPT.
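The three steps can be sketched in Python under simplifying assumptions. The function `issue`, the `(dest, srcs)` instruction tuple, the `busy` list, and the fallback to a `"regfile"` reference for registers with no RPT entry are all hypothetical choices made for illustration.

```python
def issue(instr, rpt, busy):
    """Try to issue one instruction.

    instr -> (dest_register, [source_registers])
    rpt   -> dict mapping a register name to the reference of its producer
    busy  -> list of booleans, one per execution unit (True = occupied)
    Returns (exu_index, operand_refs), or None if no unit is available.
    """
    dest, srcs = instr

    # 1) Look up operand sources in the RPT and set the input references.
    #    Registers without an entry are read from the register file (assumed).
    operand_refs = [rpt.get(r, ("regfile", r)) for r in srcs]

    # 2) Check for an execution unit available to receive the instruction.
    free = next((i for i, b in enumerate(busy) if not b), None)
    if free is None:
        return None

    # 3) Issue to that unit and record it in the RPT as the producer of dest.
    busy[free] = True
    rpt[dest] = ("exu", free)
    return free, operand_refs
```

A later instruction that reads `dest` then picks up the reference to the producing unit rather than the register file, which is what lets dependent instructions be chained between execution units.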

SEQUENTIAL MONITORING AND MANAGEMENT OF CODE SEGMENTS FOR RUN-TIME PARALLELIZATION

A processor includes an instruction pipeline and control circuitry. The instruction pipeline is configured to process instructions of program code. The control circuitry is configured to monitor the processed instructions at run-time, to construct an invocation data structure comprising multiple entries, wherein each entry (i) specifies an initial instruction that is a target of a branch instruction, (ii) specifies a portion of the program code that follows one or more possible flow-control traces beginning from the initial instruction, and (iii) specifies, for each possible flow-control trace specified in the entry, a next entry that is to be processed following processing of that possible flow-control trace, and to configure the instruction pipeline to process segments of the program code, by continually traversing the entries of the invocation data structure.

SPLIT CONTROL STACK AND DATA STACK PLATFORM

In one example, a method includes allocating separate portions of memory for a control stack and a data stack. The method also includes, upon detecting a call instruction, storing a first return address in the control stack and a second return address in the data stack; and upon detecting a return instruction, popping the first return address from the control stack and the second return address from the data stack and raising an exception if the two return addresses do not match. Otherwise, the return instruction returns the first return address. Additionally, the method includes executing an exception handler in response to the return instruction detecting an exception, wherein the exception handler is to pop one or more return addresses from the control stack until the return address on a top of the control stack matches the return address on a top of the data stack.
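The call, return, and exception-handler behaviour can be sketched in Python as below. The class and method names are hypothetical. The sketch compares the two stack tops before popping so that the handler still sees both stacks intact; that ordering is an assumption the abstract does not settle.

```python
class ReturnMismatch(Exception):
    """Raised when the control-stack and data-stack return addresses differ."""


class SplitStack:
    def __init__(self):
        self.control = []   # control stack: return addresses only
        self.data = []      # data stack: modelled here as return addresses only

    def call(self, return_addr):
        # On a call, store the return address in both stacks.
        self.control.append(return_addr)
        self.data.append(return_addr)

    def ret(self):
        # On a return, the two addresses must agree (compared before popping).
        if self.control[-1] != self.data[-1]:
            raise ReturnMismatch(hex(self.control[-1]))
        self.data.pop()
        return self.control.pop()

    def unwind(self):
        # Exception handler: pop the control stack until its top matches
        # the top of the data stack. Returns the dropped addresses.
        dropped = []
        while self.control and self.data and self.control[-1] != self.data[-1]:
            dropped.append(self.control.pop())
        return dropped
```

One case where the handler resynchronizes the stacks is a non-local exit that discards several data-stack frames at once: the control stack still holds the stale return addresses, and `unwind` drops them until the tops agree.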

Method for migrating CPU state from an inoperable core to a spare core

An apparatus is disclosed in which the apparatus may include a plurality of cores, including a first core, a second core and a third core, and circuitry coupled to the first core. The first core may be configured to process a plurality of instructions. The circuitry may be configured to detect that the first core stopped committing a subset of the plurality of instructions, and to send an indication to the second core that the first core stopped committing the subset. The second core may be configured to disable the first core from further processing instructions of the subset responsive to receiving the indication, and to copy data from the first core to the third core responsive to disabling the first core. The third core may be configured to resume processing the subset dependent upon the data.

Enabling removal and reconstruction of flag operations in a processor

In one embodiment, a processor includes fetch logic to fetch instructions, decode logic to decode the fetched instructions, and execution logic to execute at least some of the instructions. The decode logic may determine whether a flag portion of a first instruction to be folded is to be performed, and if not, accumulate a first immediate value of the first instruction with a folded immediate value obtained from an entry of an immediate buffer.