G06F9/30058

PACKING CONDITIONAL BRANCH OPERATIONS
20230214219 · 2023-07-06

Disclosed in some examples are systems, methods, devices, and machine-readable media that use improved dynamic programming algorithms to pack conditional branch instructions. Conditional code branches may be modeled as directed acyclic graphs (DAGs), which have a topological ordering. These DAGs may be used to construct a dynamic programming table that yields a partial mapping of one path onto the other path.
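
As a rough illustration of the dynamic programming idea, and not the algorithm claimed in the application, the sketch below assumes each branch path has already been flattened into a topologically ordered list of operation names. It fills a longest-common-subsequence table and walks it to recover a partial mapping of one path onto the other. The function name `partial_mapping` and the operation names are hypothetical.

```python
def partial_mapping(path_a, path_b):
    """Return index pairs (i, j) where path_a[i] is mapped onto path_b[j].

    Assumption: two operations can share a slot only if their names match.
    """
    n, m = len(path_a), len(path_b)
    # table[i][j] = size of the best mapping between path_a[i:] and path_b[j:]
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if path_a[i] == path_b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])

    # Walk the table from the start to recover one optimal mapping.
    pairs, i, j = [], 0, 0
    while i < n and j < m:
        if path_a[i] == path_b[j]:
            pairs.append((i, j))
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return pairs


then_path = ["load", "mul", "add", "store"]
else_path = ["load", "add", "sub", "store"]
mapping = partial_mapping(then_path, else_path)
```

Operations left out of the mapping (here `mul` and `sub`) would still need separate, predicated handling; the sketch only identifies which operations line up.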

INFERRING FUTURE VALUE FOR SPECULATIVE BRANCH RESOLUTION IN A MICROPROCESSOR

A system, processor, programming product and/or method including: an instruction dispatch unit configured to dispatch instructions of a compare immediate-conditional branch instruction sequence; and a compare register having at least one entry to hold information in a plurality of fields. Operations include: writing information from a first instruction of the compare immediate-conditional branch instruction sequence into one or more of the plurality of fields in an entry in the compare register; writing an immediate field and the ITAG of a compare immediate instruction into the entry in the compare register; writing, in response to dispatching a conditional branch instruction, an inferred compare result value into the entry in the compare register; comparing a computed compare result value to the inferred compare result value stored in the entry in the compare register; and not executing the compare immediate instruction or the conditional branch instruction.

Control flow mechanism for execution of graphics processor instructions using active channel packing

An apparatus to facilitate control flow in a graphics processing system is disclosed. The apparatus includes a plurality of execution units to execute single instruction, multiple data (SIMD) instructions, and flow control logic to detect a diverging control flow in a plurality of SIMD channels and reduce the execution of the control flow to a subset of the SIMD channels.
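
To make the idea of reducing execution to a subset of channels concrete, here is a minimal software sketch. It assumes an execution mask with one flag per SIMD channel and is not a description of the hardware in the patent. The names `pack_active_channels` and `run_diverged` are hypothetical.

```python
def pack_active_channels(values, active_mask):
    """Gather the lanes whose mask flag is set into a dense list."""
    lanes = [lane for lane, on in enumerate(active_mask) if on]
    packed = [values[lane] for lane in lanes]
    return lanes, packed


def run_diverged(values, active_mask, op):
    """Apply op only to the packed active lanes, then scatter results back."""
    lanes, packed = pack_active_channels(values, active_mask)
    results = list(values)  # inactive lanes keep their old values
    for lane, value in zip(lanes, packed):
        results[lane] = op(value)
    return results


values = [1, 2, 3, 4, 5, 6, 7, 8]
mask = [1, 0, 0, 1, 0, 0, 0, 1]  # only lanes 0, 3 and 7 took the branch
lanes, packed = pack_active_channels(values, mask)
out = run_diverged(values, mask, lambda v: v * 10)
```

In this example the branch body runs on three packed values instead of all eight channels.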

Backpressure control using a stop signal for a multi-threaded, self-scheduling reconfigurable computing fabric
11531543 · 2022-12-20

Representative apparatus, method, and system embodiments are disclosed for configurable computing. A representative system includes an interconnection network; a processor; and a plurality of configurable circuit clusters. Each configurable circuit cluster includes a plurality of configurable circuits arranged in an array; a synchronous network coupled to each configurable circuit of the array; and an asynchronous packet network coupled to each configurable circuit of the array. A representative configurable circuit includes a configurable computation circuit and a configuration memory having a first, instruction memory storing a plurality of data path configuration instructions to configure a data path of the configurable computation circuit; and a second, instruction and instruction index memory storing a plurality of spoke instructions and data path configuration instruction indices for selection of a master synchronous input, a current data path configuration instruction, and a next data path configuration instruction for a next configurable computation circuit.

CENTRALIZED CONTROL OF EXECUTION OF QUANTUM PROGRAM

Embodiments are provided for centralized control of execution of a quantum program. In some embodiments, a system can include a processor that executes computer-executable components stored in memory. The computer-executable components include a synchronization component that causes multiple controller devices remotely located relative to the system to be synchronized with one another and the system. The computer-executable components also include an ingestion component that accesses measurement data resulting from one or more measurements at respective qubit devices. The computer-executable components further include a composition component that generates, using the measurement data, one or more control messages for respective second controller devices of the multiple controller devices.

TECHNIQUES FOR EFFICIENTLY SYNCHRONIZING MULTIPLE PROGRAM THREADS

Various embodiments include a parallel processing computer system that enables parallel instances of a program to synchronize at disparate addresses in memory. When the parallel program instances need to exchange data, the program instances synchronize based on a mask that identifies the program instances that are synchronizing. As each program instance reaches the point of synchronization, the program instance blocks and waits for all other program instances to reach the point of synchronization. When all program instances have reached the point of synchronization, at least one program instance executes a synchronous operation to exchange data. The program instances then continue execution at respective and disparate return addresses.
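
The mask-based synchronization described above can be sketched as a single-threaded simulation. This is an illustrative model under stated assumptions, not the patented mechanism: the mask is taken to be a bitmask with one bit per program instance, and the class `MaskedSync` with its methods `arrive` and `release` is hypothetical.

```python
class MaskedSync:
    """Tracks which program instances have reached the synchronization point."""

    def __init__(self, mask):
        self.mask = mask        # bit i set => instance i participates
        self.arrived = 0        # bit i set => instance i is blocked and waiting
        self.return_addr = {}   # instance -> address at which it resumes

    def arrive(self, instance, return_addr):
        """Record an arrival; return True once every participant has arrived."""
        if not (self.mask >> instance) & 1:
            raise ValueError("instance is not part of this synchronization")
        self.arrived |= 1 << instance
        self.return_addr[instance] = return_addr
        return self.arrived == self.mask

    def release(self, exchange):
        """Run the data exchange once, then hand back each return address."""
        if self.arrived != self.mask:
            raise RuntimeError("not all instances have arrived")
        exchange()
        resumed = dict(self.return_addr)
        self.arrived = 0
        self.return_addr = {}
        return resumed


sync = MaskedSync(0b1011)            # instances 0, 1 and 3 synchronize
first = sync.arrive(0, 0x100)
second = sync.arrive(1, 0x200)
last = sync.arrive(3, 0x300)
log = []
resumed = sync.release(lambda: log.append("exchange"))
```

Each instance supplies its own return address, so the instances resume at different locations after the shared exchange.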

Devices and methods for efficient execution of rules using pre-compiled directed acyclic graphs

In one aspect, a computer implemented method for translating and executing rules using a directed acyclic graph is provided. The method includes transforming a ruleset into a directed acyclic graph. The directed acyclic graph includes a plurality of nodes and a plurality of branches. The method further includes identifying similarities across the plurality of branches. The method further includes grouping branches of the directed acyclic graph based on the identified similarities. The method further includes creating a modified directed acyclic graph based on the grouping. The method further includes selecting and using a method of processing a group of the modified directed acyclic graph based on an aspect of the group.

Controlling accesses to a branch prediction unit for sequences of fetch groups

An electronic device is described that handles control transfer instructions (CTIs) when executing instructions in program code. The electronic device has a processor that includes a branch prediction functional block and a sequential fetch logic functional block. The sequential fetch logic functional block determines, based on a record associated with a CTI, that a specified number of fetch groups of instructions that were previously determined to include no CTIs are to be fetched for execution in sequence following the CTI. When each of the specified number of fetch groups is fetched and prepared for execution, the sequential fetch logic prevents corresponding accesses of the branch prediction functional block for acquiring branch prediction information for instructions in that fetch group.
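
The effect of suppressing predictor accesses can be sketched with a simple counter. The code assumes a record that maps the address of a fetch group containing a CTI to the number of following fetch groups known to contain no CTIs. The function name, the record layout, and the addresses are hypothetical and chosen only for illustration.

```python
def count_predictor_accesses(fetch_groups, cti_free_after):
    """Count branch predictor accesses for a sequence of fetch groups.

    fetch_groups: fetch-group addresses in fetch order.
    cti_free_after: assumed record, address of a group with a CTI ->
        number of following groups that were found to hold no CTIs.
    """
    accesses = 0
    skip = 0
    for addr in fetch_groups:
        if skip > 0:
            skip -= 1  # access suppressed: group is known to be CTI-free
            continue
        accesses += 1
        skip = cti_free_after.get(addr, 0)
    return accesses


groups = [0x00, 0x40, 0x80, 0xC0, 0x100]
with_record = count_predictor_accesses(groups, {0x00: 3})
without_record = count_predictor_accesses(groups, {})
```

With the record in place, only the first and last fetch groups consult the predictor; without it, all five do.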

System and method for obfuscating opcode commands in a semiconductor device
11509461 · 2022-11-22

A method for securing an integrated circuit chip includes obtaining a first value from a first storage area in the chip, obtaining a second value from a second storage area in the chip, generating a third value based on the first value and the second value, and converting a first opcode command obfuscated as a second opcode command into a non-obfuscated form of the first opcode command based on the third value. The first value corresponds to a physically unclonable function (PUF) of the chip. The second value is a key including information indicating a type of obfuscation performed to obfuscate the first opcode command as the second opcode command. The third value may be an inversion flag indicating a type of obfuscation performed to obfuscate the first opcode command as the second opcode command.
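
The abstract does not say how the third value is derived or how an opcode is inverted, so the following is only a sketch under loudly stated assumptions: the PUF-derived value and the key are combined with XOR, the low bit of the result serves as the inversion flag, and obfuscation is a bitwise inversion of an 8-bit opcode. All names and widths here are hypothetical.

```python
OPCODE_BITS = 8  # assumed opcode width


def inversion_flag(puf_value, key):
    """Assumed combination: XOR the two stored values and keep the low bit."""
    return (puf_value ^ key) & 1


def deobfuscate(opcode, puf_value, key):
    """Undo the assumed bitwise inversion when the flag is set."""
    if inversion_flag(puf_value, key):
        return ~opcode & ((1 << OPCODE_BITS) - 1)
    return opcode


restored = deobfuscate(0b11110000, puf_value=0b1010, key=0b0101)
unchanged = deobfuscate(0b11110000, puf_value=0b1010, key=0b0100)
```

Because the flag depends on a chip-specific value, the same stored opcode decodes differently on a chip whose PUF value differs.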

VARIABLE FORMATTING OF BRANCH TARGET BUFFER

Embodiments include a hierarchical metadata prediction system that includes a first line-based predictor having a first line for storage of metadata entries, and a second line-based predictor configured to store metadata entries from the first line-based predictor. The second line-based predictor has a second line, the second line including a plurality of containers, the plurality of containers including at least a first set of containers having a first size and a second set of containers having a second size. The system also includes a processing device configured to transfer one or more metadata entries between the first line-based predictor and the second line-based predictor. Embodiments also include a computer-implemented method and a computer program product.