G06F9/30156

SM4 ACCELERATION PROCESSORS, METHODS, SYSTEMS, AND INSTRUCTIONS
20190109705 · 2019-04-11 ·

A processor of an aspect includes a plurality of packed data registers, and a decode unit to decode an instruction. The instruction is to indicate one or more source packed data operands. The one or more source packed data operands are to have four 32-bit results of four prior SM4 cryptographic rounds, and four 32-bit values. The processor also includes an execution unit coupled with the decode unit and the plurality of the packed data registers. The execution unit, in response to the instruction, is to store four 32-bit results of four immediately subsequent and sequential SM4 cryptographic rounds in a destination storage location that is to be indicated by the instruction.

INSTRUCTION COMPRESSION METHOD, INSTRUCTION DECOMPRESSION METHOD AND PROCESS COMPRESSION METHOD
20240231828 · 2024-07-11 ·

A process compression method for compressing a process that includes a jump instruction includes the following steps: dividing the process into multiple blocks according to a position of the jump instruction in the process and a destination of the jump instruction; storing a jump relationship between these blocks; performing instruction compression on these blocks; updating a jump address of the jump instruction according to the jump relationship; determining multiple groups according to the sizes of the blocks and the jump relationship; and determining whether the jump instruction is a jump instruction of a first type or a jump instruction of a second type according to the relationship between the jump instruction and the groups.

Transient side-channel aware architecture for cryptographic computing

In one embodiment, a processor includes circuitry to decode an instruction referencing an encoded data pointer that includes a set of plaintext linear address bits and a set of encrypted linear address bits. The processor also includes circuitry to perform a speculative lookup in a translation lookaside buffer (TLB) using the plaintext linear address bits to obtain physical address, buffer a set of architectural predictor state values based on the speculative TLB lookup, and speculatively execute the instruction using the physical address obtained from the speculative TLB lookup. The processor also includes circuitry to determine whether the speculative TLB lookup was correct and update a set of architectural predictor state values of the core using the buffered architectural predictor state values based on a determination that the speculative TLB lookup was correct.

Data processing

A processing element comprises a plurality of function units (16) operable to execute respective functions in dependence upon received instructions in parallel with one another. An instruction controller includes an instruction register (41) having a plurality of register entries, each of which is operable to store an instruction word therein, and a plurality of instruction pipelines (42). Each of the pipelines (42) is associated with a function unit (16), and is operable to deliver instructions to the function unit concerned for execution thereby. Each pipeline also includes a timing controller operable to receive timing information for a received instruction, and to determine an initial location in the pipeline into which the instruction is to be loaded, and an instruction handler operable to receive an instruction for the function unit associated with the instruction pipeline concerned, and to load that instruction into the initial location determined by the timing controller.

SM4 ACCELERATION PROCESSORS, METHODS, SYSTEMS, AND INSTRUCTIONS
20190036683 · 2019-01-31 ·

A processor of an aspect includes a plurality of packed data registers, and a decode unit to decode an instruction. The instruction is to indicate one or more source packed data operands. The one or more source packed data operands are to have four 32-bit results of four prior SM4 cryptographic rounds, and four 32-bit values. The processor also includes an execution unit coupled with the decode unit and the plurality of the packed data registers. The execution unit, in response to the instruction, is to store four 32-bit results of four immediately subsequent and sequential SM4 cryptographic rounds in a destination storage location that is to be indicated by the instruction.

SM4 ACCELERATION PROCESSORS, METHODS, SYSTEMS, AND INSTRUCTIONS
20190036684 · 2019-01-31 ·

A processor of an aspect includes a plurality of packed data registers, and a decode unit to decode an instruction. The instruction is to indicate one or more source packed data operands. The one or more source packed data operands are to have four 32-bit results of four prior SM4 cryptographic rounds, and four 32-bit values. The processor also includes an execution unit coupled with the decode unit and the plurality of the packed data registers. The execution unit, in response to the instruction, is to store four 32-bit results of four immediately subsequent and sequential SM4 cryptographic rounds in a destination storage location that is to be indicated by the instruction.

SM4 ACCELERATION PROCESSORS, METHODS, SYSTEMS, AND INSTRUCTIONS
20180375642 · 2018-12-27 ·

A processor of an aspect includes a plurality of packed data registers, and a decode unit to decode an instruction. The instruction is to indicate one or more source packed data operands. The one or more source packed data operands are to have four 32-bit results of four prior SM4 cryptographic rounds, and four 32-bit values. The processor also includes an execution unit coupled with the decode unit and the plurality of the packed data registers. The execution unit, in response to the instruction, is to store four 32-bit results of four immediately subsequent and sequential SM4 cryptographic rounds in a destination storage location that is to be indicated by the instruction.

Method and apparatus to process SHA-2 secure hashing algorithm

A processor includes an instruction decoder to receive a first instruction to process a secure hash algorithm 2 (SHA-2) hash algorithm, the first instruction having a first operand associated with a first storage location to store a SHA-2 state and a second operand associated with a second storage location to store a plurality of messages and round constants. The processor further includes an execution unit coupled to the instruction decoder to perform one or more iterations of the SHA-2 hash algorithm on the SHA-2 state specified by the first operand and the plurality of messages and round constants specified by the second operand, in response to the first instruction.

Method and apparatus to process SHA-2 secure hashing algorithm

A processor includes an instruction decoder to receive a first instruction to process a secure hash algorithm 2 (SHA-2) hash algorithm, the first instruction having a first operand associated with a first storage location to store a SHA-2 state and a second operand associated with a second storage location to store a plurality of messages and round constants. The processor further includes an execution unit coupled to the instruction decoder to perform one or more iterations of the SHA-2 hash algorithm on the SHA-2 state specified by the first operand and the plurality of messages and round constants specified by the second operand, in response to the first instruction.

Methods and apparatus for storage and translation of an entropy encoded instruction sequence to executable form

A method of compressing a sequence of program instructions begins by examining a program instruction stream to identify a sequence of two or more instructions that meet a parameter. The identified sequence of two or more instructions is replaced by a selected type of layout instruction which is then compressed. A method of decompressing accesses an X-index and a Y-index together as a compressed value. The compressed value is decompressed to a selected type of layout instruction which is decoded and replaced with a sequence of two or more instructions. An apparatus for decompressing includes a storage subsystem configured for storing compressed instructions, wherein a compressed instruction comprises an X-index and a Y-index. A decompressor is configured for translating an X-index and Y-index accessed from the storage subsystem to a selected type of layout instruction which is decoded and replaced with a sequence of two or more instructions.