G06F9/30149

Method and apparatus for dual issue multiply instructions

A method is provided that includes performing, by a processor in response to a dual issue multiply instruction, multiplication of operands of the dual issue multiply instruction using multiplication units comprised in a data path of the processor and configured to operate together to determine a product of the operands, and storing, by the processor, the product in a storage location indicated by the dual issue multiply instruction.

Address generation method, related apparatus, and storage medium

A system parses a very long instruction word (VLIW) to obtain an execution parameter. The system obtains a first sliding window width count, a first sliding window height count, a first feature map width count, and a first feature map height count that correspond to first target data. In accordance with a determination that the first sliding window width count falls within the sliding window width range, the first sliding window height count falls within the sliding window height range, (the first feature map width count falls within the feature map width range, and the first feature map height count falls within the feature map height range, the system determines an offset of the first target data. The system also obtains a starting address of the first target data, and adds the starting address to the offset to obtain a first target address of the first target data.

VECTOR FRIENDLY INSTRUCTION FORMAT AND EXECUTION THEREOF

A vector friendly instruction format and execution thereof. According to one embodiment of the invention, a processor is configured to execute an instruction set. The instruction set includes a vector friendly instruction format. The vector friendly instruction format has a plurality of fields including a base operation field, a modifier field, an augmentation operation field, and a data element width field, wherein the first instruction format supports different versions of base operations and different augmentation operations through placement of different values in the base operation field, the modifier field, the alpha field, the beta field, and the data element width field, and wherein only one of the different values may be placed in each of the base operation field, the modifier field, the alpha field, the beta field, and the data element width field on each occurrence of an instruction in the first instruction format in instruction streams.

Vector length querying instruction

A data processing system 2 supporting vector processing operations uses scaling vector length querying instructions. The scaling vector length querying instructions return a result which is dependent upon a number of elements in a vector for a variable vector element size specified by the instruction and multiplied by a scaling value specified by the instruction. The scaling vector length querying instructions may be in the form of count instructions, increment instructions or decrement instructions. The instructions may include a pattern constraint applying a constraint, such as modulo(M) or power of 2 to the partial result value representing the number of vector elements provided for the register element size specified for the instruction.

BIT STRING COMPRESSION
20220021399 · 2022-01-20 ·

Systems, apparatuses, and methods related to bit string compression are described. A method for bit string compression can include determining that a particular operation is to be performed using a bit string formatted according to a universal number format or a posit format to alter a bit width associated with the bit string from a first bit width to a second bit width and performing a compression operation on a bit string formatted according to a universal number format or a posit format to alter a bit width associated with the bit string from a first bit width to a second bit width. The method can further include writing the bit string having the second bit width to a first register, performing an arithmetic operation or a logical operation, or both using the bit string having the second bit string width, and monitoring a quantity of bits of a result of the operation.

Pairing issue queues for complex instructions and instruction fusion

Support for instruction fusion is provided. An indication whether an instruction is a paired instruction is received from an instruction decoder. Based on the indication, one dispatch slot or a paired dispatch slot is allocated in the instruction dispatcher queue. A mapper converts logical addresses of sources and targets of the instruction to physical addresses. Either one issue slot or a paired issue slot is allocated in an issue queue based on the indication from the instruction decoder. The instruction execution environment is loaded into the issue queue and issued to an execution unit.

METHOD AND APPARATUS FOR VECTOR SORTING
20210357218 · 2021-11-18 ·

A method for sorting of a vector in a processor is provided that includes performing, by the processor in response to a vector sort instruction, sorting of values stored in lanes of the vector to generate a sorted vector, wherein the values are sorted in an order indicated by the vector sort instruction, and storing the sorted vector in a storage location.

Reconfigurable Crypto-Processor
20220012052 · 2022-01-13 ·

The present disclosure relates to systems and methods that provide a reconfigurable cryptographic coprocessor. An example system includes an instruction memory configured to provide ARX instructions and mode control instructions. The system also includes an adjustable-width arithmetic logic unit, an adjustable-width rotator, and a coefficient memory. A bit width of the adjustable-width arithmetic logic unit and a bit width of the adjustable-width rotator are adjusted according to the mode control instructions. The coefficient memory is configured to provide variable-width words to the arithmetic logic unit and the rotator. The arithmetic logic unit and the rotator are configured to carry out the ARX instructions on the provided variable-width words. The systems and methods described herein could accelerate various applications, such as deep learning, by assigning one or more of the disclosed reconfigurable coprocessors to work as a central computation unit in a neural network.

METHOD AND APPARATUS FOR VECTOR PERMUTATION

A method is provided that includes performing, by a processor in response to a vector permutation instruction, permutation of values stored in lanes of a vector to generate a permuted vector, wherein the permutation is responsive to a control storage location storing permute control input for each lane of the permuted vector, wherein the permute control input corresponding to each lane of the permuted vector indicates a value to be stored in the lane of the permuted vector, wherein the permute control input for at least one lane of the permuted vector indicates a value of a selected lane of the vector is to be stored in the at least one lane, and storing the permuted vector in a storage location indicated by an operand of the vector permutation instruction.

CHIP AND CHIP-BASED DATA PROCESSING METHOD

Embodiments of the present specification provide chips and chip-based data processing methods. In an embodiment, a method comprises: obtaining data associated with one or more neural networks transmitted from a server; for each layer of a neural network of the one or more neural networks, configuring, based on the data, a plurality of operator units based on a type of computation each operator unit performs; and invoking the plurality of operator units to perform computations, based on neurons of a layer of the neural network immediately above, of the data for each neuron to produce a value of the neuron.