G06F9/30192

APPARATUSES, METHODS, AND SYSTEMS FOR INSTRUCTIONS TO MULTIPLY VALUES OF ONE
20210182056 · 2021-06-17

Systems, methods, and apparatuses relating to instructions to multiply values of one are described. In one embodiment, a hardware processor includes a decoder to decode a single instruction into a decoded single instruction, the single instruction having a first field that identifies a first number, a second field that identifies a second number, and a third field that indicates a number format for the first number and the second number; and an execution circuit to execute the decoded single instruction to: cause a first comparison of the first number to a one value in the number format of the first number, cause a second comparison of the second number to a one value in the number format of the second number, provide as a resultant of the single instruction the first number when the second comparison indicates the second number equals the one value in the number format of the second number, provide as the resultant of the single instruction the second number when the first comparison indicates the first number equals the one value in the number format of the first number, and provide as the resultant of the single instruction a product of a multiplication of the first number and the second number when the first comparison indicates the first number does not equal the one value in the number format of the first number and the second comparison indicates the second number does not equal the one value in the number format of the second number.
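
The selection logic in this abstract can be illustrated in software. The following is a minimal sketch, not the claimed hardware: the names `ONE_BITS`, `to_bits`, `from_bits`, and `multiply_one` are hypothetical, the `fmt` argument stands in for the third field, and only IEEE 754 binary16 and binary32 are modeled. The boolean in the return value is an added illustration of whether a real multiplication was needed.

```python
import struct

# Bit pattern of the value 1.0 in each modeled number format
# (IEEE 754 binary16 and binary32). The format names are assumptions.
ONE_BITS = {"fp16": 0x3C00, "fp32": 0x3F800000}

# struct codes for (unsigned integer view, floating-point view) of each format.
_CODES = {"fp16": ("<H", "<e"), "fp32": ("<I", "<f")}


def to_bits(value, fmt):
    """Return the raw bit pattern of `value` encoded in format `fmt`."""
    int_code, float_code = _CODES[fmt]
    return struct.unpack(int_code, struct.pack(float_code, value))[0]


def from_bits(bits, fmt):
    """Return the Python float encoded by the bit pattern `bits`."""
    int_code, float_code = _CODES[fmt]
    return struct.unpack(float_code, struct.pack(int_code, bits))[0]


def multiply_one(a_bits, b_bits, fmt):
    """Return (result_bits, multiplier_used) for a * b in format `fmt`.

    Each operand is compared to the one value of the format. If either
    operand equals one, the other operand is passed through unchanged and
    no multiplication is performed.
    """
    one = ONE_BITS[fmt]
    if b_bits == one:          # second comparison: b is one, result is a
        return a_bits, False
    if a_bits == one:          # first comparison: a is one, result is b
        return b_bits, False
    product = from_bits(a_bits, fmt) * from_bits(b_bits, fmt)
    return to_bits(product, fmt), True
```

For example, `multiply_one(to_bits(2.5, "fp32"), to_bits(1.0, "fp32"), "fp32")` returns the bit pattern of 2.5 together with `False`, since the multiplier is bypassed.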

Instruction and logic for processing text strings

Method, apparatus, and program means for performing a string comparison operation. In one embodiment, an apparatus includes execution resources to execute a first instruction. In response to the first instruction, said execution resources store a result of a comparison between each data element of a first and second operand corresponding to a first and second text string, respectively.
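
As a rough illustration of storing a per-element comparison result, the sketch below compares two text strings position by position. It assumes fixed-width operands of byte-sized elements padded with zero bytes, and an equality comparison at matching positions; the abstract does not specify these details, and `compare_elements` is a hypothetical name.

```python
def compare_elements(first, second, width=16):
    """Compare two ASCII strings element by element.

    Both strings are truncated or zero-padded to `width` bytes (an assumed
    operand width). The result holds 1 where the elements at the same
    position are equal and 0 otherwise; padding positions that match are
    also reported as equal.
    """
    a = first.encode("ascii")[:width].ljust(width, b"\0")
    b = second.encode("ascii")[:width].ljust(width, b"\0")
    return [int(x == y) for x, y in zip(a, b)]
```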

Apparatus and method for interpreting permissions associated with a capability
11023237 · 2021-06-01

An apparatus and method are provided for interpreting permissions associated with a capability. The apparatus has processing circuitry for executing instructions in order to perform operations, and a capability storage element accessible to the processing circuitry and arranged to store a capability used to constrain at least one operation performed by the processing circuitry when executing the instructions. The capability identifies a plurality N of default permissions whose state, in accordance with a default interpretation, is determined from N permission flags provided in the capability. In accordance with the default interpretation, each permission flag is associated with one of the default permissions. The processing circuitry is then arranged to analyse the capability in accordance with an alternative interpretation, in order to derive, from logical combinations of the N permission flags, state for an enlarged set of permissions, where the enlarged set comprises at least N+1 permissions. This provides a mechanism for encoding additional permissions into capabilities without increasing the number of permission flags required, whilst still retaining desirable behaviour.
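
The idea of deriving N+1 permissions from N flags can be sketched as follows. This is a hypothetical mapping chosen for illustration with N = 3: the flag names `R`, `W`, `X` and the extra `seal` permission are assumptions and are not taken from the abstract, which does not say which logical combinations are used.

```python
from enum import IntFlag


class Flag(IntFlag):
    """Three permission flags stored in a capability (hypothetical names)."""
    R = 0b001
    W = 0b010
    X = 0b100


def default_permissions(flags):
    """Default interpretation: each flag maps to exactly one permission."""
    return {
        "read": bool(flags & Flag.R),
        "write": bool(flags & Flag.W),
        "execute": bool(flags & Flag.X),
    }


def alternative_permissions(flags):
    """Alternative interpretation: four permissions from the same three flags.

    The extra 'seal' permission is derived from a logical combination
    (W AND X) rather than from a dedicated flag.
    """
    perms = default_permissions(flags)
    perms["seal"] = bool(flags & Flag.W) and bool(flags & Flag.X)
    return perms
```

Because this particular mapping uses only AND combinations, clearing a flag can remove permissions but never grants one. That property is a design choice of the sketch.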

DATA STRUCTURE DESCRIPTORS FOR DEEP LEARNING ACCELERATION

Techniques in advanced deep learning provide improvements in one or more of accuracy, performance, and energy efficiency. An array of processing elements performs flow-based computations on wavelets of data. Each processing element has a respective compute element and a respective routing element. Instructions executed by the compute element include operand specifiers, some specifying a data structure register storing a data structure descriptor describing an operand as a fabric vector or a memory vector. The data structure descriptor further describes the memory vector as one of a one-dimensional vector, a four-dimensional vector, or a circular buffer vector. Optionally, the data structure descriptor specifies an extended data structure register storing an extended data structure descriptor. The extended data structure descriptor specifies parameters relating to a four-dimensional vector or a circular buffer vector.
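
One way to picture the descriptor described here is as a small record. The sketch below is an assumption-laden model: the class and field names are hypothetical, and the validation rules (a fabric vector has no memory layout, and an extended register is only allowed for four-dimensional or circular buffer vectors) are one reading of the abstract, not a stated requirement.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class OperandKind(Enum):
    FABRIC_VECTOR = "fabric"
    MEMORY_VECTOR = "memory"


class MemoryLayout(Enum):
    ONE_D = "1d"
    FOUR_D = "4d"
    CIRCULAR_BUFFER = "circular"


@dataclass(frozen=True)
class DataStructureDescriptor:
    kind: OperandKind
    layout: Optional[MemoryLayout] = None      # only meaningful for memory vectors
    extended_register: Optional[int] = None    # index of an extended descriptor register

    def __post_init__(self):
        if self.kind is OperandKind.FABRIC_VECTOR and self.layout is not None:
            raise ValueError("a fabric vector has no memory layout")
        if self.kind is OperandKind.MEMORY_VECTOR and self.layout is None:
            raise ValueError("a memory vector needs a layout")
        extended_ok = (MemoryLayout.FOUR_D, MemoryLayout.CIRCULAR_BUFFER)
        if self.extended_register is not None and self.layout not in extended_ok:
            raise ValueError("extended descriptor assumed only for 4D or circular vectors")
```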

DATA PROCESSING METHOD AND APPARATUS, AND RELATED PRODUCT

The present disclosure provides a data processing method, an apparatus, and related products. The products include a control module comprising an instruction caching unit, an instruction processing unit, and a storage queue unit. The instruction caching unit is configured to store computation instructions associated with an artificial neural network operation; the instruction processing unit is configured to parse the computation instructions to obtain a plurality of operation instructions; and the storage queue unit is configured to store an instruction queue, where the instruction queue includes a plurality of operation instructions or computation instructions to be executed in the sequence of the queue. By adopting the above-mentioned method, the present disclosure can improve the operation efficiency of related products when performing operations of a neural network model.
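
The three units of the control module can be modeled in a few lines. This is a minimal sketch under stated assumptions: the textual instruction encoding (operation names separated by semicolons), the `parse` function, and the `ControlModule` class are hypothetical and only illustrate the cache, parse, and in-order queue roles named in the abstract.

```python
from collections import deque


def parse(computation_instruction):
    """Split one computation instruction into its operation instructions.

    Assumes a made-up textual encoding such as "MATMUL;ADD_BIAS;RELU".
    """
    return [op.strip() for op in computation_instruction.split(";") if op.strip()]


class ControlModule:
    def __init__(self):
        self.instruction_cache = []   # instruction caching unit
        self.queue = deque()          # storage queue unit (FIFO)

    def load(self, computation_instruction):
        """Store a computation instruction in the cache."""
        self.instruction_cache.append(computation_instruction)

    def process(self):
        """Instruction processing unit: parse cached instructions into the queue."""
        for instruction in self.instruction_cache:
            self.queue.extend(parse(instruction))
        self.instruction_cache.clear()

    def run(self):
        """Drain the queue in order and return the operations as executed."""
        executed = []
        while self.queue:
            executed.append(self.queue.popleft())
        return executed
```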

Instruction and logic for processing text strings

Method, apparatus, and program means for performing a string comparison operation. In one embodiment, an apparatus includes execution resources to execute a first instruction. In response to the first instruction, said execution resources store a result of a comparison between each data element of a first and second operand corresponding to a first and second text string, respectively.

Backpressure for Accelerated Deep Learning

Techniques in advanced deep learning provide improvements in one or more of accuracy, performance, and energy efficiency. An array of processing elements performs flow-based computations on wavelets of data. Each processing element comprises a respective compute element and a respective routing element. Each compute element comprises virtual input queues. Each router enables communication via wavelets with at least nearest neighbors in a 2D mesh. Routing is controlled by respective virtual channel specifiers in each wavelet and routing configuration information in each router. Each router comprises data queues. The virtual input queues of the compute element and the data queues of the router are managed in accordance with the virtual channels. Backpressure information, per each of the virtual channels, is generated, communicated, and used to prevent overrun of the virtual input queues and the data queues.
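
Per-virtual-channel backpressure can be sketched with bounded queues. The model below is an illustration only: `Receiver`, `send`, and the fixed per-channel `capacity` are hypothetical, and the sender simply reads the receiver's backpressure state rather than receiving it over a separate signal as hardware would.

```python
from collections import deque


class Receiver:
    """Holds one bounded queue per virtual channel."""

    def __init__(self, channels, capacity):
        self.capacity = capacity
        self.queues = {vc: deque() for vc in channels}

    def backpressure(self, vc):
        """True when the queue for virtual channel `vc` cannot take more data."""
        return len(self.queues[vc]) >= self.capacity

    def accept(self, vc, wavelet):
        if self.backpressure(vc):
            raise OverflowError("queue overrun on virtual channel %r" % (vc,))
        self.queues[vc].append(wavelet)

    def consume(self, vc):
        """Remove and return the oldest wavelet on virtual channel `vc`."""
        return self.queues[vc].popleft()


def send(receiver, vc, wavelet):
    """Send only if the channel is not backpressured; return whether it was sent."""
    if receiver.backpressure(vc):
        return False   # sender stalls on this channel; other channels are unaffected
    receiver.accept(vc, wavelet)
    return True
```

Keeping the state per channel means a full queue on one virtual channel does not block traffic on another.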

MANAGED MULTIPLE DIE MEMORY QOS
20210109756 · 2021-04-15

Devices and techniques for implementing quality-of-service (QoS) parameters in a managed memory device having a number of memory dies are disclosed herein.

Semiconductor device and debug method

Debugging of a program in an apparatus using a lockstep method is performed more efficiently. A semiconductor apparatus includes a first processor core, a second processor core, a first debug circuit, a second debug circuit, and an error control circuit capable of outputting an error signal for stopping execution of a program by the first processor core and the second processor core. The second debug circuit applies to the second processor core a debug setting different from that of the first processor core. Even if a first processing result of the first processor core and a second processing result of the second processor core do not coincide with each other, the error control circuit invalidates the output of the error signal when the first processor core executes the program and the second processor core stops execution of the program based on the setting regarding debugging.
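
The error control behaviour can be expressed as a single predicate. This is a sketch with hypothetical parameter names; it models only the masking condition stated in the abstract and not the debug circuits themselves.

```python
def error_signal(first_result, second_result,
                 first_core_running, second_core_stopped_by_debug):
    """Return True when the lockstep error signal should be output.

    A mismatch between the two cores normally raises the error. The error
    is invalidated (masked) when the first core is executing the program
    while the second core is stopped because of its debug setting.
    """
    mismatch = first_result != second_result
    masked = first_core_running and second_core_stopped_by_debug
    return mismatch and not masked
```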

DATA REFORMAT OPERATION

Devices, methods, and systems are provided. In one example, a device is described to include circuitry that collects data received from a data source, references a descriptor that describes a data reformat operation to perform on the data received from the data source, reformats the data received from the data source according to the data reformat operation, and provides the reformatted data to a data target via a second device interface.