Patent classifications
G06F9/30058
PACKING CONDITIONAL BRANCH OPERATIONS
Disclosed in some examples are systems, methods, devices, and machine-readable media that use improved dynamic programming algorithms to pack conditional branch instructions. Conditional code branches may be modeled as directed acyclic graphs (DAGs), which have a topological ordering. These DAGs may be used to construct a dynamic programming table from which a partial mapping of one path onto the other path is found.
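As a rough illustration of the dynamic programming table mentioned above, the sketch below maps one branch path onto the other with a longest-common-subsequence table. This is a stand-in technique, not the algorithm claimed in the patent: the abstract does not specify the scoring or the graph encoding, so each path is assumed to be an already topologically ordered list of opcode names, and `map_paths` is a hypothetical helper.

```python
def map_paths(then_path, else_path):
    """Return index pairs (i, j) pairing then_path[i] with else_path[j].

    Assumption: two instructions can be packed together when their
    opcode names are equal. The real matching criterion is not given
    in the abstract.
    """
    n, m = len(then_path), len(else_path)
    # table[i][j] = size of the best mapping between the suffixes
    # then_path[i:] and else_path[j:].
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if then_path[i] == else_path[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])

    # Walk the table from the start to recover one optimal partial mapping.
    pairs = []
    i = j = 0
    while i < n and j < m:
        if then_path[i] == else_path[j]:
            pairs.append((i, j))
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return pairs


then_path = ["load", "mul", "add", "store"]
else_path = ["load", "add", "sub", "store"]
mapping = map_paths(then_path, else_path)
```

The mapping is partial: unmatched instructions (`mul` and `sub` here) would still have to be emitted on their own path.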
Variable-length instruction buffer management
A vector processor is disclosed including a variety of variable-length instructions. Computer-implemented methods are disclosed for efficiently carrying out a variety of operations in a time-conscious, memory-efficient, and power-efficient manner. Methods for more efficiently managing a buffer by controlling the threshold based on the length of delay line instructions are disclosed. Methods for disposing multi-type and multi-size operations in hardware are disclosed. Methods for condensing look-up tables are disclosed. Methods for in-line alteration of variables are disclosed.
INTERMODAL CALLING BRANCH INSTRUCTION
Processing circuitry has a handler mode and a thread mode. In response to an exception condition, a switch to handler mode is made. In response to an intermodal calling branch instruction specifying a branch target address when the processing circuitry is in the handler mode, an instruction decoder controls the processing circuitry to save a function return address to a function return address storage location; switch a current mode of the processing circuitry to the thread mode; and branch to an instruction identified by the branch target address. This can be useful for deprivileging of exceptions.
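The three steps performed by the intermodal calling branch can be sketched as a small state model. Everything here is illustrative: the `Core` class, the fixed 4-byte instruction size, modeling the return address storage location as a single field, and raising an error outside handler mode are all assumptions, since the abstract does not describe them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Core:
    mode: str = "thread"               # "thread" or "handler"
    pc: int = 0
    return_addr: Optional[int] = None  # assumed storage location

    def take_exception(self, handler_addr: int) -> None:
        # An exception condition switches the core to handler mode.
        self.mode = "handler"
        self.pc = handler_addr

    def intermodal_call(self, target: int, insn_size: int = 4) -> None:
        # Assumption: the instruction is only meaningful in handler mode.
        if self.mode != "handler":
            raise RuntimeError("intermodal call outside handler mode")
        self.return_addr = self.pc + insn_size  # 1. save return address
        self.mode = "thread"                    # 2. switch to thread mode
        self.pc = target                        # 3. branch to the target


core = Core()
core.take_exception(0x1000)
core.intermodal_call(0x2000)
```

After the call, the exception's remaining work runs in thread mode at `0x2000`, which is the deprivileging effect the abstract refers to.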
SEQUENTIAL MONITORING AND MANAGEMENT OF CODE SEGMENTS FOR RUN-TIME PARALLELIZATION
A processor includes an instruction pipeline and control circuitry. The instruction pipeline is configured to process instructions of program code. The control circuitry is configured to monitor the processed instructions at run-time, to construct an invocation data structure comprising multiple entries, wherein each entry (i) specifies an initial instruction that is a target of a branch instruction, (ii) specifies a portion of the program code that follows one or more possible flow-control traces beginning from the initial instruction, and (iii) specifies, for each possible flow-control trace specified in the entry, a next entry that is to be processed following processing of that possible flow-control trace, and to configure the instruction pipeline to process segments of the program code, by continually traversing the entries of the invocation data structure.
Control registers to store thread identifiers for threaded loop execution in a self-scheduling reconfigurable computing fabric
Representative apparatus, method, and system embodiments are disclosed for configurable computing. A representative system includes an interconnection network; a processor; and a plurality of configurable circuit clusters. Each configurable circuit cluster includes a plurality of configurable circuits arranged in an array; a synchronous network coupled to each configurable circuit of the array; and an asynchronous packet network coupled to each configurable circuit of the array. A representative configurable circuit includes a configurable computation circuit and a configuration memory having a first, instruction memory storing a plurality of data path configuration instructions to configure a data path of the configurable computation circuit; and a second, instruction and instruction index memory storing a plurality of spoke instructions and data path configuration instruction indices for selection of a master synchronous input, a current data path configuration instruction, and a next data path configuration instruction for a next configurable computation circuit.
Optimized branching using safe static keys
Systems and methods for managing optimized branching in executable instructions are disclosed. In one implementation, a processing device may identify, in a sequence of executable instructions, a branching instruction associated with a safe static key, the branching instruction specifying a first target location. The processing device may determine whether a value of the safe static key is initialized. Responsive to determining that the value of the safe static key is initialized, the processing device may further replace the branching instruction with an unconditional branching instruction specifying the first target location. Responsive to determining that the value of the safe static key is uninitialized, the processing device may replace the branching instruction with a conditional branching instruction specifying the first target location.
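A minimal sketch of the replacement rule described above, assuming a toy instruction representation. The `Branch` class, the `"static_key"`, `"jmp"`, and `"jcc"` labels, and the convention that an uninitialized key is stored as `None` are hypothetical and not taken from the patent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Branch:
    kind: str                  # "static_key", "jmp" (unconditional) or "jcc" (conditional)
    target: int                # the first target location
    key: Optional[str] = None  # name of the associated safe static key


def patch_branches(code, keys):
    """Rewrite every static-key branch according to the key's state."""
    patched = []
    for insn in code:
        if isinstance(insn, Branch) and insn.kind == "static_key":
            # Assumption: a key counts as initialized once it holds a value.
            initialized = keys.get(insn.key) is not None
            kind = "jmp" if initialized else "jcc"
            patched.append(Branch(kind, insn.target, insn.key))
        else:
            patched.append(insn)
    return patched


code = [Branch("static_key", 0x40, "tracing"), "nop", Branch("static_key", 0x80, "debug")]
keys = {"tracing": True, "debug": None}
patched = patch_branches(code, keys)
```

The target location is kept in both cases; only the kind of branch changes.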
Inferring future value for speculative branch resolution
Aspects of the invention include determining a first instruction in a processing pipeline, wherein the first instruction includes a compare instruction; determining a second instruction in the processing pipeline, wherein the second instruction includes a conditional branch instruction relying on the compare instruction; determining a predicted result of the compare instruction; and completing the conditional branch instruction using the predicted result prior to executing the compare instruction.
STATEFUL MICROCODE BRANCHING
Stateful microbranch instructions, including: generating, based on an instruction, a first one or more microinstructions including a stateful microbranch instruction, wherein the stateful microbranch instruction includes an address of a next instruction after the instruction, a branch target address, and one or more microcode attributes; and executing the first one or more microinstructions.
Moving entries between multiple levels of a branch predictor based on a performance loss resulting from fewer than a pre-set number of instructions being stored in an instruction cache register
An instruction processing device and an instruction processing method are provided. The instruction processing device includes: a first-level branch target buffer, configured to store entries of a first plurality of branch instructions; a second-level branch target buffer, configured to store entries of a second plurality of branch instructions, wherein the entries in the first-level branch target buffer are accessed faster than the entries in the second-level branch target buffer; an instruction fetch unit coupled to the first-level branch target buffer and the second-level branch target buffer, the instruction fetch unit including circuitry configured to add, for a first branch instruction, one or more entries corresponding to the first branch instruction into the first-level branch target buffer when the one or more entries corresponding to the first branch instruction are identified in the second-level branch target buffer; and an execution unit including circuitry configured to execute the first branch instruction.
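The promotion from the second-level to the first-level branch target buffer can be sketched as follows. This is a simplified software model under stated assumptions: the `TwoLevelBTB` class is hypothetical, the first level is given a small fixed capacity with least-recently-used eviction, and the performance-loss criterion named in the title is not modeled.

```python
from collections import OrderedDict


class TwoLevelBTB:
    def __init__(self, l1_capacity=2):
        self.l1 = OrderedDict()  # small, fast level (assumed LRU order)
        self.l2 = {}             # larger, slower level
        self.l1_capacity = l1_capacity

    def install(self, pc, target):
        # New branch entries are assumed to start in the second level.
        self.l2[pc] = target

    def lookup(self, pc):
        """Return (target, level) where level is "L1", "L2", or "miss"."""
        if pc in self.l1:
            self.l1.move_to_end(pc)  # mark as most recently used
            return self.l1[pc], "L1"
        if pc in self.l2:
            # Entry identified in the second level: add it to the first.
            if len(self.l1) >= self.l1_capacity:
                self.l1.popitem(last=False)  # evict least recently used
            self.l1[pc] = self.l2[pc]
            return self.l1[pc], "L2"
        return None, "miss"


btb = TwoLevelBTB(l1_capacity=1)
btb.install(0x100, 0x200)
first = btb.lookup(0x100)   # served from L2, then promoted
second = btb.lookup(0x100)  # now served from L1
```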
Method and apparatus to process SHA-2 secure hashing algorithm
A processor includes an instruction decoder to receive a first instruction to process a secure hash algorithm 2 (SHA-2) hash algorithm, the first instruction having a first operand associated with a first storage location to store a SHA-2 state and a second operand associated with a second storage location to store a plurality of messages and round constants. The processor further includes an execution unit coupled to the instruction decoder to perform one or more iterations of the SHA-2 hash algorithm on the SHA-2 state specified by the first operand and the plurality of messages and round constants specified by the second operand, in response to the first instruction.
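To make the operand split concrete, the sketch below models the instruction as a function `sha256_rounds(state, msgs_and_consts)`: the first argument plays the role of the SHA-2 state operand, the second the messages and round constants. The function names, the use of SHA-256 specifically, and passing message words and constants as pairs are assumptions for illustration; the abstract does not fix an encoding. The round logic itself follows the published SHA-256 definition, and the helper `sha256_one_block` exists only to check the rounds against Python's `hashlib`.

```python
import hashlib
import math

MASK = 0xFFFFFFFF


def rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK


def sha256_rounds(state, msgs_and_consts):
    """Run one SHA-256 round per (message word, round constant) pair."""
    a, b, c, d, e, f, g, h = state
    for w, k in msgs_and_consts:
        s1 = rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)
        ch = (e & f) ^ (~e & g)
        t1 = (h + s1 + ch + k + w) & MASK
        s0 = rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22)
        maj = (a & b) ^ (a & c) ^ (b & c)
        t2 = (s0 + maj) & MASK
        a, b, c, d, e, f, g, h = (t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g
    return (a, b, c, d, e, f, g, h)


def icbrt(n):
    """Integer cube root by binary search (exact, no floating point)."""
    lo, hi = 0, 1 << 40
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo


# First 64 primes; constants are the fractional bits of their roots.
PRIMES = [p for p in range(2, 312) if all(p % q for q in range(2, math.isqrt(p) + 1))]
K = [icbrt(p << 96) & MASK for p in PRIMES]
H0 = [math.isqrt(p << 64) & MASK for p in PRIMES[:8]]


def sha256_one_block(msg):
    """Hash a message short enough to fit one padded 64-byte block."""
    assert len(msg) <= 55
    block = msg + b"\x80" + b"\x00" * (55 - len(msg)) + (8 * len(msg)).to_bytes(8, "big")
    w = [int.from_bytes(block[i:i + 4], "big") for i in range(0, 64, 4)]
    for t in range(16, 64):  # message schedule expansion
        s0 = rotr(w[t - 15], 7) ^ rotr(w[t - 15], 18) ^ (w[t - 15] >> 3)
        s1 = rotr(w[t - 2], 17) ^ rotr(w[t - 2], 19) ^ (w[t - 2] >> 10)
        w.append((w[t - 16] + s0 + w[t - 7] + s1) & MASK)
    out = sha256_rounds(H0, zip(w, K))
    return b"".join(((x + y) & MASK).to_bytes(4, "big") for x, y in zip(H0, out)).hex()


digest = sha256_one_block(b"abc")
matches = digest == hashlib.sha256(b"abc").hexdigest()
```

A hardware instruction would typically process only a few rounds per invocation; calling `sha256_rounds` repeatedly on slices of the pairs, feeding each result back in as the state, gives the same final state as one call over all 64.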