G06F9/3557

Distinct system registers for logical processors

Distinct system registers for logical processors are disclosed. In one example of the disclosed technology, a processor includes a plurality of block-based physical processor cores for executing a program comprising a plurality of instruction blocks. The processor also includes a thread scheduler configured to schedule a thread of the program for execution, the thread using the one or more instruction blocks. The processor further includes at least one system register. The at least one system register stores data indicating a number and placement of the plurality of physical processor cores to form a logical processor. The logical processor executes the scheduled thread. The logical processor is configured to execute the thread in a continuous instruction window.

Systems and methods for selectively bypassing address-generation hardware in processor instruction pipelines

Systems and methods selectively bypass address-generation hardware in processor instruction pipelines. In an embodiment, a processor includes an address-generation stage and an address-generation-bypass-determination unit (ABDU). The ABDU receives a load/store instruction. If an effective address for the load/store instruction is not known at the ABDU, the ABDU routes the load/store instruction via the address-generation stage of the processor. If, however, the effective address of the load/store instruction is known at the ABDU, the ABDU routes the load/store instruction to bypass the address-generation stage of the processor.

Predicting a table of contents pointer value responsive to branching to a subroutine

Predicting a Table of Contents (TOC) pointer value responsive to branching to a subroutine. A subroutine is called from a calling module executing on a processor. Based on calling the subroutine, a value of a pointer to a reference data structure, such as a TOC, is predicted. The predicting is performed prior to executing a sequence of one or more instructions in the subroutine to compute the value. The value that is predicted is used to access the reference data structure to obtain a variable value for a variable of the subroutine.

Secure communication between operating system and processes

According to an embodiment, an information processing apparatus includes one or more processor. The processor is configured to run a process and a process manager to manage the process. The process includes a first key generator, a first authentication code generator, and a first output unit. The first key generator is configured to generate a first message authentication key by using process unique data assigned by the process manager. The first authentication code generator is configured to generate a first message authentication code by using the first message authentication key and a first message. The first output unit is configured to transmit the first message and the first message authentication code to the process manager.

Dense read encoding for dataflow ISA

Apparatus and methods are disclosed for controlling execution of memory access instructions in a block-based processor architecture using an instruction decoder that decodes instructions having variable numbers of target operands. In one example of the disclosed technology, a block-based processor core includes an instruction decoder configured to decode target operands for an instruction in an instruction block, the instruction being encoded to allow for a variable number of target operands and a control unit configured to send data for at least one of the decoded target operands for an operation performed by the at least one of the cores. In some examples, the instruction indicates target instructions with a vector encoding. In other examples, a variable length format allows for the indication of one or more targets.

UNIVERSAL FLOATING-POINT INSTRUCTION SET ARCHITECTURE FOR COMPUTING DIRECTLY WITH DECIMAL CHARACTER SEQUENCES AND BINARY FORMATS IN ANY COMBINATION
20210049011 · 2021-02-18 ·

A universal floating-point Instruction Set Architecture (ISA) implemented entirely in hardware. Using a single instruction, the universal floating-point ISA has the ability, in hardware, to compute directly with dual decimal character sequences up to IEEE 754-2008 H=20 in length, without first having to explicitly perform a conversion-to-binary-format process in software before computing with these human-readable floating-point or integer representations. The ISA does not employ opcodes, but rather pushes and pulls gobs of data without the encumbering opcode fetch, decode, and execute bottleneck. Instead, the ISA employs stand-alone, memory-mapped operators, complete with their own pipeline that is completely decoupled from the processor's primary push-pull pipeline. The ISA employs special three-port, 1024-bit wide SRAMS; a special dual asymmetric system stack; memory-mapped stand-alone hardware operators with private result buffers having simultaneously readable side-A and side-B read ports; and dual hardware H=20 convertFromDecimalCharacter conversion operators.

Implementing a received add program counter immediate shift (ADDPCIS) instruction using a micro-coded or cracked sequence

A computer program product for implementing a received add program counter immediate shift (ADDPCIS) instruction using a micro-coded or cracked sequence is provided. The computer program product includes a computer readable storage medium having program instructions embodied therewith. The program instructions are readable and executable by a processing circuit to cause the processing circuit to recognize register operand and integer terms associated with the ADDPCIS instruction, set a value of a target register associated with the ADDPCIS instruction in accordance with the integer term summed with another term by obtaining a next instruction address (NIA), moving an architecturally defined register file from a first temporary register to a general purpose register and adding a shifted immediate constant to a value stored in a second temporary register.

Implementing a received add program counter immediate shift (ADDPCIS) instruction using a micro-coded or cracked sequence

A computer program product for implementing a received add program counter immediate shift (ADDPCIS) instruction using a micro-coded or cracked sequence is provided. The computer program product includes a computer readable storage medium having program instructions embodied therewith. The program instructions are readable and executable by a processing circuit to cause the processing circuit to recognize register operand and integer terms associated with the ADDPCIS instruction, set a value of a target register associated with the ADDPCIS instruction in accordance with the integer term summed with another term by obtaining a next instruction address (NIA), moving an architecturally defined register file from a first temporary register to a general purpose register and adding a shifted immediate constant to a value stored in a second temporary register.

Register read/write ordering

Apparatus and methods are disclosed for controlling execution of register access instructions in a block-based processor architecture using a hardware structure that indicates a relative ordering of register access instruction in an instruction block. In one example of the disclosed technology, a method of operating a processor includes selecting a register access instruction of the plurality of instructions to execute based at least in part on dependencies encoded within a previous block of instructions and on stored data indicating which of the register write instructions have executed for the previous block, and executing the selected instruction. In some examples, one or more of a write mask, a read mask, a register write vector register, or a counter are used to determine register read/write dependences. Based on the encoded dependencies and the masked write vector, the next instruction block can issue when its register dependencies are available.

SYSTEM AND METHOD FOR ADDRESSING DATA IN MEMORY

A digital signal processor having a CPU with a program counter register and, optionally, an event context stack pointer register for saving and restoring the event handler context when higher priority event preempts a lower priority event handler. The CPU is configured to use a minimized set of addressing modes that includes using the event context stack pointer register and program counter register to compute an address for storing data in memory. The CPU may also eliminate post-decrement, pre-increment and post-decrement addressing and rely only on post-increment addressing.