
G06F9/382

Streaming engine with error detection, correction and restart

11099933 · 2021-08-24 ·

Texas Instruments Incorporated

Disclosed embodiments relate to a streaming engine employed in, for example, a digital signal processor. A fixed data stream sequence including plural nested loops is specified by a control register. The streaming engine includes an address generator producing addresses of data elements and a stream head register storing the data elements next to be supplied as operands. The streaming engine fetches stream data into a stream buffer ahead of use by the central processing unit core. Parity bits are formed when data is stored in the stream buffer and are stored with the corresponding data. Upon transfer to the stream head register, a second parity is calculated and compared with the stored parity. The streaming engine signals a parity fault if the parities do not match. The streaming engine preferably restarts fetching the data stream at the data element generating the parity fault.
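The parity check and restart described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions: the names `StreamBuffer`, `ParityFault`, `parity_bit`, and `consume_stream` are hypothetical, a single parity bit per element is assumed, and the real design is hardware rather than Python.

```python
class ParityFault(Exception):
    """Raised when stored and recomputed parity disagree (hypothetical name)."""

    def __init__(self, address):
        super().__init__(f"parity fault at address {address:#x}")
        self.address = address


def parity_bit(word):
    """Return 1 if the word has an odd number of set bits, else 0."""
    return bin(word).count("1") & 1


class StreamBuffer:
    """Toy stream buffer that keeps a parity bit next to each data element."""

    def __init__(self):
        self.slots = {}  # address -> (data, stored parity)

    def store(self, address, data):
        # Parity is formed when the data is stored and kept with it.
        self.slots[address] = (data, parity_bit(data))

    def transfer_to_head(self, address):
        # A second parity is computed on transfer and compared with the stored one.
        data, stored_parity = self.slots[address]
        if parity_bit(data) != stored_parity:
            raise ParityFault(address)
        return data


def consume_stream(memory, addresses, buffer):
    """Supply operands in order; on a fault, refetch from the faulting element on."""
    operands, restarts, index = [], 0, 0
    while index < len(addresses):
        try:
            operands.append(buffer.transfer_to_head(addresses[index]))
            index += 1
        except ParityFault:
            # Assumed restart policy: refill the buffer starting at the faulting element.
            for address in addresses[index:]:
                buffer.store(address, memory[address])
            restarts += 1
    return operands, restarts
```

Flipping one bit of a buffered element after it has been stored makes the next transfer of that element raise a fault, after which the model refetches from memory and continues.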

Processing device and method of controlling processing device

11080057 · 2021-08-03 ·

Fujitsu Limited

Atushi Fusejima

A processing device includes an instruction decode circuit including decoders that decode instructions, each instruction being assigned an instruction number that is determined for every one of the decoders; an instruction execution circuit that executes the instructions decoded by the instruction decode circuit; an instruction complete holding circuit including hold blocks that are provided in correspondence with each of the decoders, that respectively include hold regions assigned the instruction number, and that are used for an instruction complete process; and an instruction complete controller that stores instruction information, generated when the decoders decode the instructions, in one of the hold regions of the hold block corresponding to the decoder that decodes the instruction, based on the instruction number, and that obtains, in order, the instruction information corresponding to the instructions executed by the instruction execution circuit from the instruction complete holding circuit, to perform the instruction complete process.

Compressed instruction format

11048507 · 2021-06-29 ·

Intel Corporation

A technique for decoding an instruction in a variable-length instruction set. In one embodiment, an instruction encoding is described, in which legacy, present, and future instruction set extensions are supported, and increased functionality is provided, without expanding the code size and, in some cases, reducing the code size.

System, apparatus and method for a hybrid reservation station for a processor

11126438 · 2021-09-21 ·

Intel Corporation

In one embodiment, a reservation station of a processor includes: a plurality of first lanes having a plurality of entries to store information for instructions having in-order dependencies; a variable latency tracking table including a second plurality of entries to store information for instructions having a variable latency; and a scheduler circuit to access a head entry of the plurality of first lanes to schedule, for execution on at least one execution unit, at least one instruction from the head entry of at least one of the plurality of first lanes. Other embodiments are described and claimed.

PROCESSOR AND INSTRUCTION SET FOR FLEXIBLE QUBIT CONTROL WITH LOW MEMORY OVERHEAD

20210182071 · 2021-06-17 ·

Nader Khammassi

Apparatus and method for specifying quantum operations such as qubit rotations in a quantum instruction. For example, one embodiment of an apparatus comprises: a quantum instruction processing pipeline to process a quantum instruction having one or more opcodes to specify quantum operations and one or more operands and/or fields to specify values to be used to perform the quantum operations; a quantum waveform synthesizer to synthesize a waveform to control a qubit based on the values specified by the operands and/or fields of the quantum instruction.

INSTRUCTIONS AND LOGIC FOR VECTOR MULTIPLY ADD WITH ZERO SKIPPING

20210191724 · 2021-06-24 ·

Intel Corporation

Embodiments described herein provide for an instruction and associated logic to enable a vector multiply add instruction with automatic zero skipping for sparse input. One embodiment provides for a general-purpose graphics processor comprising logic to perform operations comprising fetching a hardware macro instruction having a predicate mask, a repeat count, and a set of initial operands, where the initial operands include a destination operand and multiple source operands. The hardware macro instruction is configured to perform one or more multiply/add operations on input data associated with a set of matrices.
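To make the idea of zero skipping concrete, the following sketch models a multiply add over vector lanes that omits any multiplication with a zero operand. The function name `vector_madd_zero_skip` is hypothetical, and the way the predicate mask and repeat count are applied here (one mask bit per lane, one row of source operands per repeat) is an assumption, not something the abstract specifies.

```python
def vector_madd_zero_skip(dest, src_a, src_b, predicate_mask, repeat_count):
    """Accumulate src_a[step][lane] * src_b[step][lane] into dest[lane].

    Lanes disabled by the predicate mask are ignored, and products with a
    zero operand are skipped. Returns the result and the number of
    multiplies actually performed.
    """
    result = list(dest)
    multiplies = 0
    for step in range(repeat_count):
        for lane, enabled in enumerate(predicate_mask):
            if not enabled:
                continue  # lane masked off by the predicate
            a = src_a[step][lane]
            b = src_b[step][lane]
            if a == 0 or b == 0:
                continue  # zero skipping: the product cannot change the sum
            result[lane] += a * b
            multiplies += 1
    return result, multiplies
```

For sparse inputs the count of performed multiplies falls well below `repeat_count * len(predicate_mask)`, which is the saving the technique aims for.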

METHODS AND SYSTEMS FOR PROCESSING DATA IN A PROGRAMMABLE DATA PROCESSING PIPELINE THAT INCLUDES OUT-OF-PIPELINE PROCESSING

20210263744 · 2021-08-26 ·

Methods and systems for processing data in a programmable processing pipeline are disclosed. In an embodiment, a method for processing packets in a programmable packet processing pipeline is disclosed. The method involves processing data corresponding to a packet through a match-action pipeline of a programmable packet processing pipeline, and diverting the processing of data corresponding to the packet from the match-action pipeline to a processor core for out-of-pipeline processing.

METHOD AND APPARATUS FOR VIRTUALIZING THE MICRO-OP CACHE

20210149672 · 2021-05-20 ·

Systems, apparatuses, and methods for virtualizing a micro-operation cache are disclosed. A processor includes at least a micro-operation cache, a conventional cache subsystem, a decode unit, and control logic. The decode unit decodes instructions into micro-operations which are then stored in the micro-operation cache. The micro-operation cache has limited capacity for storing micro-operations. When new micro-operations are decoded from pending instructions, existing micro-operations are evicted from the micro-operation cache to make room for the new micro-operations. Rather than being discarded, micro-operations evicted from the micro-operation cache are stored in the conventional cache subsystem. This prevents the original instruction from having to be decoded again on subsequent executions. When the control logic determines that micro-operations for one or more fetched instructions are stored in either the micro-operation cache or the conventional cache subsystem, the control logic causes the decode unit to transition to a reduced-power state.
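The eviction path described in this abstract can be sketched as a two-level lookup. Everything named here is hypothetical (`VirtualizedUopCache`, `decode`, the least-recently-used eviction policy, and the choice to move a reloaded entry back into the micro-operation cache); the abstract states only that evicted micro-operations go to the conventional cache subsystem and that the decode unit can be powered down on a hit in either place.

```python
from collections import OrderedDict


def decode(instruction):
    """Stand-in decoder: pretend every instruction yields two micro-ops."""
    return (f"{instruction}.uop0", f"{instruction}.uop1")


class VirtualizedUopCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.uop_cache = OrderedDict()   # pc -> micro-ops, oldest first (assumed LRU)
        self.conventional_cache = {}     # holds evicted micro-ops instead of dropping them
        self.decode_count = 0

    def fetch(self, pc, instruction):
        """Return (micro-ops, decoder_needed) for the instruction at pc."""
        if pc in self.uop_cache:
            self.uop_cache.move_to_end(pc)
            return self.uop_cache[pc], False
        if pc in self.conventional_cache:
            # Hit in the conventional cache: no decode, decoder can stay in low power.
            uops = self.conventional_cache.pop(pc)
            decoder_needed = False
        else:
            uops = decode(instruction)
            self.decode_count += 1
            decoder_needed = True
        self._insert(pc, uops)
        return uops, decoder_needed

    def _insert(self, pc, uops):
        if len(self.uop_cache) >= self.capacity:
            victim_pc, victim_uops = self.uop_cache.popitem(last=False)
            self.conventional_cache[victim_pc] = victim_uops
        self.uop_cache[pc] = uops
```

With a capacity of one entry, fetching two different instructions and then the first one again decodes only twice, because the third fetch is served from the conventional cache.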

System and method for executing instructions

11016776 · 2021-05-25 ·

Alibaba Group Holding Limited

The present disclosure provides systems and methods for executing instructions. The system can include: a processing unit having a core configured to execute instructions; and a host unit configured to: compile computer code into a plurality of instructions that includes a set of instructions that are determined to be executed in parallel on the core, wherein each instruction of the set includes an operation instruction and an indication bit, and wherein the indication bit is set to identify the last instruction of the set of instructions; and provide the set of instructions to the core.
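One way to read the indication bit is as an end-of-group marker that lets the core split a flat instruction stream into sets that may run in parallel. The sketch below assumes that reading; the function name `split_parallel_sets` and the `(operation, last_bit)` tuple encoding are hypothetical.

```python
def split_parallel_sets(instructions):
    """Group (operation, last_bit) pairs into parallel sets.

    A set ends at the instruction whose indication bit is set.
    """
    sets, current = [], []
    for operation, last_bit in instructions:
        current.append(operation)
        if last_bit:
            sets.append(current)
            current = []
    if current:
        # The stream ended without a terminating indication bit.
        raise ValueError("unterminated instruction set")
    return sets
```

For example, a stream in which only the third and fourth instructions carry the bit yields one set of three instructions followed by a set of one.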

Entering protected pipeline mode without annulling pending instructions

11029997 · 2021-06-08 ·

Texas Instruments Incorporated

Techniques related to executing a plurality of instructions by a processor, comprising: receiving a first instruction for execution on an instruction execution pipeline, wherein the instruction execution pipeline is in a first execution mode, and wherein the first instruction is configured to utilize a first memory location; beginning execution of the first instruction on the instruction execution pipeline; receiving an execution mode instruction to switch the instruction execution pipeline to a second execution mode; switching the instruction execution pipeline to the second execution mode based on the received execution mode instruction; receiving a second instruction for execution on the instruction execution pipeline, the second instruction configured to utilize the first memory location; determining that the first instruction and the second instruction utilize the first memory location; and stalling execution of the second instruction based on the determining.
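The sequence of steps in this abstract can be walked through with a toy model. This is a loose sketch, not the patented mechanism: the `Pipeline` class and its methods are hypothetical, and the assumption that the conflict check applies only after the mode switch comes from the title, not from the abstract text.

```python
class Pipeline:
    """Toy pipeline that keeps pending instructions across a mode switch."""

    def __init__(self):
        self.protected = False
        self.in_flight = []  # (name, memory location) of instructions still executing

    def enter_protected_mode(self):
        # Pending instructions are kept, not annulled, when the mode changes.
        self.protected = True

    def issue(self, name, location):
        # Assumed rule: in protected mode, an instruction that uses a location
        # still in use by an in-flight instruction is stalled.
        if self.protected and any(loc == location for _, loc in self.in_flight):
            return "stalled"
        self.in_flight.append((name, location))
        return "issued"

    def retire_oldest(self):
        return self.in_flight.pop(0)
```

In this model, a second instruction that targets the same location as a still-executing first instruction is stalled after the switch, and can issue once the first instruction retires.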
