Patent classifications
H03H17/0664
Method and apparatus for dual issue multiply instructions
A method is provided that includes performing, by a processor in response to a dual issue multiply instruction, multiplication of operands of the dual issue multiply instruction using multiplication units comprised in a data path of the processor and configured to operate together to determine a product of the operands, and storing, by the processor, the product in a storage location indicated by the dual issue multiply instruction.
METHOD AND APPARATUS FOR VECTOR SORTING
A method for sorting of a vector in a processor is provided that includes performing, by the processor in response to a vector sort instruction, sorting of values stored in lanes of the vector to generate a sorted vector, wherein the values are sorted in an order indicated by the vector sort instruction, and storing the sorted vector in a storage location.
METHOD AND APPARATUS FOR VECTOR PERMUTATION
A method is provided that includes performing, by a processor in response to a vector permutation instruction, permutation of values stored in lanes of a vector to generate a permuted vector, wherein the permutation is responsive to a control storage location storing permute control input for each lane of the permuted vector, wherein the permute control input corresponding to each lane of the permuted vector indicates a value to be stored in the lane of the permuted vector, wherein the permute control input for at least one lane of the permuted vector indicates a value of a selected lane of the vector is to be stored in the at least one lane, and storing the permuted vector in a storage location indicated by an operand of the vector permutation instruction.
METHOD AND APPARATUS FOR PERMUTING STREAMED DATA ELEMENTS
A method is provided that includes receiving, in a permute network, a plurality of data elements for a vector instruction from a streaming engine, and mapping, by the permute network, the plurality of data elements to vector locations for execution of the vector instruction by a vector functional unit in a vector data path of a processor.
Method and Apparatus for Vector Based Finite Impulse Response (FIR) Filtering
A method includes executing, by a processor a vector finite impulse response (VFIR) filter instruction that specifies coefficients, data elements, and a storage location. The executing includes reordering a subset of the data elements to provide each data element of the reordered subset of the data elements to a respective slice multiply component of a vector multiplier of the processor, generating, by the vector multiplier, filter outputs based on the coefficients and data elements, and storing the filter outputs in the storage location.
Tracking streaming engine vector predicates to control processor execution
In a method of operating a computer system, an instruction loop is executed by a processor in which each iteration of the instruction loop accesses a current data vector and an associated current vector predicate. The instruction loop is repeated when the current vector predicate indicates the current data vector contains at least one valid data element and the instruction loop is exited when the current vector predicate indicates the current data vector contains no valid data elements.
METHOD AND APPARATUS FOR VECTOR SORTING
A method for sorting of a vector in a processor is provided that includes performing, by the processor in response to a vector sort instruction, sorting of values stored in lanes of the vector to generate a sorted vector, wherein the values are sorted in an order indicated by the vector sort instruction, and storing the sorted vector in a storage location.
Method and apparatus to sort a vector for a bitonic sorting algorithm
A method is provided that includes performing, by a processor in response to a vector sort instruction, sorting of values stored in lanes of the vector to generate a sorted vector, wherein the values in a first portion of the lanes are sorted in a first order indicated by the vector sort instruction and the values in a second portion of the lanes are sorted in a second order indicated by the vector sort instruction; and storing the sorted vector in a storage location.
Configurable multiplier-free multirate filter
A finite impulse response (FIR) filter including a delay line and a plurality of arithmetic units. Each arithmetic unit is coupled to a different one of a plurality of tap points of the delay line, is configured to receive a respective signal value over the delay line, and is associated with a respective coefficient. Any given one of the arithmetic units is configured to receive a respective control word. The respective control word specifying: (i) a plurality of trivial multiplication operations, and (ii) a plurality of bit shift operations. Any given one of the arithmetic units is further configured to estimate or calculate a product of the respective signal of the arithmetic unit respective signal value and the respective coefficient of the arithmetic unit by performing the trivial multiplication operations and bit shift operations that are specified by the respective control word that is received at the given arithmetic unit.
Method and Apparatus for Dual Issue Multiply Instructions
Various configurations of processors are provided. In a configuration, the processor comprises first and second multiplication unit. Each of these multiplication units includes carry-save adder circuitry with a respective outputs, partial product alignment multiplexing logic coupled to the outputs of the associated carry-save adder circuitry. The processor further comprises communication paths coupled between the outputs of the carry-save adder circuitry of the first multiplication unit and the partial product alignment multiplexing logic of the second multiplication unit. In other configurations, each of the first and second multiplication units may include one or more instances of masking logic, one or more instances of a multiplier array coupled to the associated instance(s) of masking logic, and one or more instances of a multiplexer set coupled to the associated instance(s) of multiplier array(s). Each of multiplexer set instance(s) of a particular multiplication unit is coupled to the carry-save adder circuitry of that multiplication unit.