G06F9/3822

Variable-length instruction buffer management

A vector processor is disclosed including a variety of variable-length instructions. Computer-implemented methods are disclosed for efficiently carrying out a variety of operations in a time-conscious, memory-efficient, and power-efficient manner. Methods for more efficiently managing a buffer by controlling the threshold based on the length of delay line instructions are disclosed. Methods for disposing multi-type and multi-size operations in hardware are disclosed. Methods for condensing look-up tables are disclosed. Methods for in-line alteration of variables are disclosed.

METHOD AND APPARATUS FOR VECTOR SORTING USING VECTOR PERMUTATION LOGIC
20230037321 · 2023-02-09 ·

A method for sorting of a vector in a processor is provided that includes performing, by the processor in response to a vector sort instruction, generating a control input vector for vector permutation logic comprised in the processor based on values in lanes of the vector and a sort order for the vector indicated by the vector sort instruction and storing the control input vector in a storage location.

Instruction packing scheme for VLIW CPU architecture

A processor is provided and includes a core that is configured to perform a decode operation on a multi-instruction packet comprising multiple instructions. The decode operation includes receiving the multi-instruction packet that includes first and second instructions. The first instruction includes a primary portion at a fixed first location and a secondary portion. The second instruction includes a primary portion at a fixed second location between the primary portion of the first instruction and the secondary portion of the first instruction. An operational code portion of the primary portion of each of the first and second instructions is accessed and decoded. An instruction packet including the primary and secondary portions of the first instruction is created, and a second instruction packet including the primary portion of the second instruction is created. The first and second instructions packets are dispatched to respective first and second functional units.

Streaming engine with multi dimensional circular addressing selectable at each dimension
11709779 · 2023-07-25 · ·

A streaming engine employed in a digital data processor may specify a fixed read-only data stream defined by plural nested loops. An address generator produces address of data elements for the nested loops. A steam head register stores data elements next to be supplied to functional units for use as operands. A stream template register independently specifies a linear address or a circular address mode for each of the nested loops.

METHOD AND APPARATUS TO SORT A VECTOR FOR A BITONIC SORTING ALGORITHM
20230229448 · 2023-07-20 ·

A method is provided that includes performing, by a processor in response to a vector sort instruction, sorting of values stored in lanes of the vector to generate a sorted vector, wherein the values in a first portion of the lanes are sorted in a first order indicated by the vector sort instruction and the values in a second portion of the lanes are sorted in a second order indicated by the vector sort instruction; and storing the sorted vector in a storage location.

Decompression engine for decompressing compressed input data that includes multiple streams of data
11561797 · 2023-01-24 · ·

An electronic device that includes a decompression engine that includes N decoders and a decompressor decompresses compressed input data that includes N streams of data. Upon receiving a command to decompress compressed input data, the decompression engine causes each of the N decoders to decode a respective one of the N streams from the compressed input data separately and substantially in parallel with others of the N decoders. Each decoder outputs a stream of decoded data of a respective type for generating commands associated with a compression standard for decompressing the compressed input data. The decompressor next generates, from the streams of decoded data output by the N decoders, commands for decompressing the data using the compression standard to recreate the original data. The decompressor next executes the commands to recreate the original data and stores the original data in a memory or provides the original data to another entity.

PROCESSOR CIRCUIT AND DATA PROCESSING METHOD
20230020505 · 2023-01-19 ·

A processor circuit includes an instruction decode unit, an instruction detector, an address generator and a data buffer. The instruction decode unit is configured to decode a first load instruction included in a plurality of load instructions to generate a first decoding result. The instruction detector, coupled to the instruction decode unit, is configured to detect if the load instructions use a same register. The address generator, coupled to the instruction decode unit, is configured to generate a first address requested by the first load instruction according to the first decoding result. The data buffer is coupled to the instruction detector and the address generator. When the instruction detector detects that the load instructions use the same register, the data buffer is configured to store the first address generated from the address generator, and store data requested by the first load instruction according to the first address.

INSTRUCTION PACKING SCHEME FOR VLIW CPU ARCHITECTURE

A processor is provided and includes a core that is configured to perform a decode operation on a multi-instruction packet comprising multiple instructions. The decode operation includes receiving the multi-instruction packet that includes first and second instructions. The first instruction includes a primary portion at a fixed first location and a secondary portion. The second instruction includes a primary portion at a fixed second location between the primary portion of the first instruction and the secondary portion of the first instruction. An operational code portion of the primary portion of each of the first and second instructions is accessed and decoded. An instruction packet including the primary and secondary portions of the first instruction is created, and a second instruction packet including the primary portion of the second instruction is created. The first and second instructions packets are dispatched to respective first and second functional units.

LOCATION AGNOSTIC DATA ACCESS

Apparatuses, systems, and techniques to enable a program to access data regardless of where said data is stored. In at least one embodiment, a system enables a program to access data regardless of where said data is stored, based on, for example, one or more locations encoding one or more addresses of said data.

Mechanism for interrupting and resuming execution on an unprotected pipeline processor

Techniques related to executing a plurality of instructions by a processor comprising receiving a first instruction for execution on an instruction execution pipeline, beginning execution of the first instruction, receiving one or more second instructions for execution on the instruction execution pipeline, the one or more second instructions associated with a higher priority task than the first instruction, storing a register state associated with the execution of the first instruction in one or more registers of a capture queue associated with the instruction execution pipeline, copying the register state from the capture queue to a memory, determining that the one or more second instructions have been executed, copying the register state from the memory to the one or more registers of the capture queue, and restoring the register state to the instruction execution pipeline from the capture queue.