G06F9/30174

Inter-environment communication with environment isolation
11537411 · 2022-12-27

Described techniques enable inter-environment communication, including isolating two runtime environments from one another as needed to ensure that operations of one runtime environment do not negatively affect operations of the other runtime environment during the inter-environment communication. Such isolation may be maintained when the two runtime environments use different addressing schemes, and when the two runtime environments use different call linkage techniques for identifying, locating, and passing stored parameters or other data.

Instruction decoding using hash tables

Systems and methods for instruction decoding using hash tables. An example method of constructing a decoding tree comprises: generating an aggregated vector of differentiating bit scores representing at least a subset of a set of processor instructions; identifying, based on the aggregated vector of differentiating bit scores, one or more opcode bit positions; and constructing a hash table implementing a current level of a decoding tree representing the subset of the set of processor instructions, wherein the hash table is indexed by one or more opcode bits identified by the one or more opcode bit positions.
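
The construction described above can be illustrated with a minimal sketch. The abstract does not define how a differentiating bit score is computed, so the scoring rule here is an assumption: a bit position scores higher the more evenly it splits the instructions that all fix that bit. The 8-bit toy instruction set, the `Insn` type, and every function name are hypothetical and are not taken from the patent.

```python
from typing import NamedTuple

WIDTH = 8  # toy instruction width in bits (assumption)

class Insn(NamedTuple):
    name: str
    mask: int   # 1-bits mark fixed opcode bits
    value: int  # required values of those fixed bits

def bit_scores(insns):
    """Aggregated vector: one score per bit position for this subset."""
    scores = []
    for pos in range(WIDTH):
        bit = 1 << pos
        if any(not (i.mask & bit) for i in insns):
            scores.append(0)  # not an opcode bit for every instruction
            continue
        ones = sum(1 for i in insns if i.value & bit)
        scores.append(min(ones, len(insns) - ones))  # evenness of the split
    return scores

def key_of(word, positions):
    """Extract the selected opcode bits as a hashable key."""
    return tuple((word >> p) & 1 for p in positions)

def build_level(insns, max_bits=2):
    """Build one level of the decoding tree as a hash table (dict)."""
    if len(insns) == 1:
        return insns[0]
    scores = bit_scores(insns)
    ranked = sorted(range(WIDTH), key=lambda p: -scores[p])
    positions = [p for p in ranked if scores[p] > 0][:max_bits]
    if not positions:
        raise ValueError("instructions cannot be distinguished")
    buckets = {}
    for i in insns:
        buckets.setdefault(key_of(i.value, positions), []).append(i)
    table = {k: build_level(v, max_bits) for k, v in buckets.items()}
    return {"positions": positions, "table": table}

def decode(tree, word):
    """Walk the hash tables, then verify the full opcode mask at the leaf."""
    while isinstance(tree, dict):
        tree = tree["table"].get(key_of(word, tree["positions"]))
    if isinstance(tree, Insn) and (word & tree.mask) == tree.value:
        return tree.name
    return None

ISA = [
    Insn("ADD", 0xF0, 0x10),
    Insn("SUB", 0xF0, 0x20),
    Insn("LD", 0xF0, 0x30),
    Insn("NOP", 0xFF, 0x00),
    Insn("HALT", 0xFF, 0x01),
]
TREE = build_level(ISA)
```

In this toy set the top level is indexed by bits 4 and 5, and the bucket shared by NOP and HALT gets a second-level table indexed by bit 0. The final mask check at the leaf rejects words that reach a leaf without matching its full opcode.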

COMPUTER-READABLE RECORDING MEDIUM STORING COMMAND CONVERSION PROGRAM, COMMAND CONVERSION METHOD, AND COMMAND CONVERSION APPARATUS
20230056168 · 2023-02-23

A recording medium stores a program for causing a computer to execute a process including: converting, in a first source code corresponding to a first-type processor, a first load command for a first mask register included in the first-type processor into a second load command for a second mask register included in a second-type processor; and converting, when a first SIMD command for performing an arithmetic operation using the first mask register exists after the first load command in the first source code and a state of a value of the first mask register does not coincide with a state of a value of the second mask register, the first SIMD command into a second SIMD command corresponding to the second-type processor and a change command for changing the state of the value of the second mask register to the state of the value of the first mask register.

Architecture for table-based mathematical operations for inference acceleration in machine learning

A processing unit to support inference acceleration for machine learning (ML) comprises an inline post processing unit configured to accept and maintain one or more lookup tables for performing each of one or more non-linear mathematical operations. The inline post processing unit is further configured to accept data from a set of registers maintaining output from a processing block instead of streaming the data from an on-chip memory (OCM), perform the one or more non-linear mathematical operations on elements of the data from the processing block via their corresponding lookup tables, and stream post processing result of the one or more non-linear mathematical operations back to the OCM after the one or more non-linear mathematical operations are complete.
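
As a software analogy only, table-based evaluation of a non-linear operation can be sketched as follows. The table size, the input range, nearest-entry lookup with clamping, and all names are assumptions made for illustration; the abstract does not specify them, and the sketch does not model the registers or the OCM streaming path.

```python
import math

def build_lut(fn, lo, hi, entries):
    """Precompute fn at evenly spaced points in [lo, hi]."""
    step = (hi - lo) / (entries - 1)
    table = [fn(lo + i * step) for i in range(entries)]
    return table, lo, step

def lut_apply(lut, x):
    """Approximate fn(x) by the nearest table entry, clamping out-of-range inputs."""
    table, lo, step = lut
    idx = round((x - lo) / step)
    idx = max(0, min(len(table) - 1, idx))
    return table[idx]

def post_process(register_values, lut):
    """Apply the table lookup element-wise to a block of outputs."""
    return [lut_apply(lut, v) for v in register_values]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical configuration: 257 entries covering [-8, 8].
SIGMOID_LUT = build_lut(sigmoid, -8.0, 8.0, 257)
```

For example, `post_process([-1.0, 0.0, 2.5], SIGMOID_LUT)` returns values close to the exact sigmoid of each input. Accuracy depends on the table spacing, which is the usual trade-off between table size and approximation error.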

COMPUTATIONAL STORAGE WITH PRE-PROGRAMMED SLOTS USING DEDICATED PROCESSOR CORE
20220350604 · 2022-11-03

The technology disclosed herein provides a method including determining one or more dedicated computational storage programs (CSPs) used in a target market for a computational storage device, storing the dedicated CSPs in one or more pre-programmed computing instruction set (CIS) slots in the computational storage device, translating one or more instructions of the dedicated CSPs for processing using a native processor, loading one or more instructions of programmable CSPs to a CSP processor implemented within an application specific integrated circuit (ASIC) of the computational storage device, and processing the one or more instructions of the programmable CSPs using the CSP processor.

COMPUTER-READABLE RECORDING MEDIUM STORING TRANSLATION PROGRAM AND TRANSLATION METHOD
20230030788 · 2023-02-02

A recording medium stores a program for causing a computer to execute processing including: incrementing a counter every time translating a CISC instruction into a RISC instruction; updating previously referenced translation timing of a register to be used for translation with a value of the counter; in a case where a use register number that stores a register number to be used for translation of the memory operand is an initial value, selecting the register number, and updating the use register number to the selected register number; in a case where the use register number is not the initial value, and when the use register number is not used, skipping data restoration and data saving for the register of the use register number, and generating an instruction to read data of the memory operand by using the register; and generating the RISC instruction equivalent to the CISC instruction.

Performance scaling for binary translation

Embodiments relate to improving user experiences when executing binary code that has been translated from other binary code. Binary code (instructions) for a source instruction set architecture (ISA) cannot natively execute on a processor that implements a target ISA. The instructions in the source ISA are binary-translated to instructions in the target ISA and are executed on the processor. The overhead of performing binary translation and/or the overhead of executing binary-translated code are compensated for by increasing the speed at which the translated code is executed, relative to non-translated code. Translated code may be executed on hardware that has one or more power-performance parameters of the processor set to increase the performance of the processor with respect to the translated code. The increase in power-performance for translated code may be proportional to the degree of translation overhead.
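
One way to picture a boost that is proportional to translation overhead is the small sketch below. The overhead measure (slowdown of translated code relative to native code), the choice of clock frequency as the power-performance parameter, and the function name are all assumptions; the abstract does not commit to a particular parameter or formula.

```python
def boosted_frequency(base_mhz, native_seconds, translated_seconds, max_mhz):
    """Scale the clock in proportion to measured translation overhead.

    Overhead is taken as the fractional slowdown of translated code
    relative to native code (an assumed metric), and the result is
    capped at max_mhz.
    """
    overhead = max(0.0, translated_seconds / native_seconds - 1.0)
    return min(max_mhz, base_mhz * (1.0 + overhead))
```

With a 2000 MHz base clock and translated code that runs 25% slower than native code, the sketch requests 2500 MHz; larger overheads saturate at the cap.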

Instruction translation support method and information processing apparatus
11635947 · 2023-04-25

A process includes receiving a table data set that represents mappings between a plurality of operand patterns indicating types of operands possibly included in a first instruction used in a first assembly language and a plurality of second instructions used in a second assembly language or a machine language corresponding to the second assembly language. The table data set maps two or more of the second instructions to each of the operand patterns. The process also includes generating, based on the table data set, a translation program used to translate first code written in the first assembly language into second code written in the second assembly language or the machine language. The translation program defines a process of determining an operand pattern of an instruction included in the first code and outputting two or more instructions of the second code according to the determined operand pattern.
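
The table-driven generation described above can be sketched as a function that takes a table and returns a translator. The mnemonics, the scratch register `tmp`, the operand classes, and the table contents are all hypothetical; they stand in for whatever the two assembly languages actually are.

```python
def classify(operand):
    """Assign an operand to a type: memory, immediate, or register (assumed classes)."""
    if operand.startswith("["):
        return "mem"
    if operand.lstrip("-").isdigit():
        return "imm"
    return "reg"

def make_translator(mnemonic, table):
    """Generate a translation function for one first-language instruction.

    `table` maps an operand pattern (tuple of operand types) to two or
    more second-language instruction templates.
    """
    def translate(line):
        op, _, rest = line.strip().partition(" ")
        if op != mnemonic:
            raise ValueError(f"expected {mnemonic!r}, got {op!r}")
        operands = [o.strip() for o in rest.split(",")]
        pattern = tuple(classify(o) for o in operands)
        return [template.format(*operands) for template in table[pattern]]
    return translate

# Hypothetical table data set for a two-operand "add" instruction.
ADD_TABLE = {
    ("reg", "reg"): ["mov tmp, {1}", "add {0}, {0}, tmp"],
    ("reg", "mem"): ["ldr tmp, {1}", "add {0}, {0}, tmp"],
    ("reg", "imm"): ["mov tmp, #{1}", "add {0}, {0}, tmp"],
}

translate_add = make_translator("add", ADD_TABLE)
```

Calling `translate_add("add r1, [r2]")` determines the pattern `("reg", "mem")` and emits the two mapped instructions. Keeping the mappings in data means a new operand pattern is supported by adding a table row rather than changing the translator.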

INSTRUCTION EXECUTION METHOD AND INSTRUCTION EXECUTION DEVICE
20230161594 · 2023-05-25

An instruction configuration and execution method includes the following steps. A target instruction is received through an instruction cache. The target instruction is decoded by an instruction translator. It is determined whether the target instruction has the authority to read or write the model specific register in an unprivileged state. It is determined whether the model specific register index of the specific instruction corresponds to a specific model specific register, so as to order the microprocessor to perform an instruction serialization operation.

Method and system for instruction block to execution unit grouping
11656875 · 2023-05-23

A method for emulating a guest centralized flag architecture by using a native distributed flag architecture. The method includes receiving an incoming instruction sequence using a global front end; grouping the instructions to form instruction blocks, wherein each of the instruction blocks comprises two half blocks; scheduling the instructions of the instruction block to execute in accordance with a scheduler; and using a distributed flag architecture to emulate a centralized flag architecture for the emulation of guest instruction execution.