Patent classifications
G06F7/48
Floating point multiply-add, accumulate unit with carry-save accumulator
Floating point Multiply-Add, Accumulate Unit supporting BF16 format for Multiply-Accumulate operations and FP32 Single-Precision Addition complying with the IEEE 754 Standard. The Multiply-Accumulate unit uses a higher radix and a longer internal 2's complement significand representation to facilitate precision as well as comparison and operation with negative numbers. The addition is performed in Carry-Save format to avoid long carry propagation and speed up the operation. Operations including overflow detection, zero detection and sign extension are adapted for 2's complement and Carry-Save format. The handling of Overflow and Sign Extension allows for fast operation that is relatively independent of the size of the accumulator.
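The carry-save idea in the abstract above can be illustrated with a minimal Python sketch. It is not the patented design: it only shows a generic 3:2 compressor that keeps the running total as a (sum, carry) pair in two's complement, so that each accumulation step needs no carry propagation and only the final read-out performs one ordinary addition. The function names and the 32-bit default width are assumptions made for illustration.

```python
def carry_save_add(s, c, x, width=32):
    """Fold x into the (sum, carry) pair with one 3:2 compressor step.

    Every bit position is handled independently, so no carry ripples
    across the word. Invariant: (s + c) mod 2**width equals the total.
    """
    mask = (1 << width) - 1
    new_s = (s ^ c ^ x) & mask                           # bitwise sum
    new_c = (((s & c) | (s & x) | (c & x)) << 1) & mask  # majority, shifted left
    return new_s, new_c


def resolve(s, c, width=32):
    """Do the single carry-propagating add and read it as two's complement."""
    mask = (1 << width) - 1
    v = (s + c) & mask
    return v - (1 << width) if v >> (width - 1) else v


def accumulate(values, width=32):
    """Accumulate signed integers in carry-save form (hypothetical helper)."""
    mask = (1 << width) - 1
    s = c = 0
    for v in values:
        # v & mask yields the two's complement encoding of a negative v
        s, c = carry_save_add(s, c, v & mask, width)
    return resolve(s, c, width)
```

For example, `accumulate([5, -3, 7, -20])` resolves to `-11`. Overflow detection, zero detection and sign extension on the unresolved (sum, carry) pair, which the abstract mentions, are deliberately left out of this sketch.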
Multiplication circuit, system on chip, and electronic device
A multiplication circuit includes: an addition subcircuit configured to obtain logarithmic field data a and b that correspond to A and B, and perform an addition operation on a and b to obtain c, where c includes an integral part and a fractional part; an exponentiation operation subcircuit configured to perform an exponentiation operation in which a base is 2 and an exponent is the fractional part of c, to obtain an exponentiation operation result; a shift subcircuit configured to shift the exponentiation operation result based on the integral part of c to obtain a shift result; and an output subcircuit configured to output a product of A and B based on signs of a and b and with reference to the shift result.
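The data flow described above (add the logarithms, exponentiate the fractional part, shift by the integral part, apply the sign) can be sketched in floating-point Python. This is only an arithmetic illustration under simplifying assumptions: a real circuit would presumably use fixed-point logarithms and carry the sign inside the logarithmic data, whereas here the sign is read directly from A and B, zero is handled as a special case, and the function name is hypothetical.

```python
import math


def log_domain_multiply(A, B):
    """Multiply A and B by adding base-2 logarithms (illustrative sketch)."""
    if A == 0 or B == 0:
        return 0.0  # log2(0) is undefined, so treat zero separately

    # Addition step: a and b are the logarithms of |A| and |B|
    a = math.log2(abs(A))
    b = math.log2(abs(B))
    c = a + b

    # Split c into an integral part and a fractional part in [0, 1)
    int_part = math.floor(c)
    frac_part = c - int_part

    # Exponentiation step: 2 ** fractional part lies in [1, 2)
    mantissa = 2.0 ** frac_part

    # Shift step: scale by 2 ** int_part (a binary shift in hardware)
    magnitude = math.ldexp(mantissa, int_part)

    # Output step: the product is negative when the operand signs differ
    sign = -1.0 if (A < 0) != (B < 0) else 1.0
    return sign * magnitude
```

Because the logarithms are rounded, the result matches the exact product only up to floating-point error; `log_domain_multiply(3.0, -4.0)` is approximately `-12.0`.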
DMA CONTROLLER WITH ARITHMETIC UNIT
A digital signal processor (DSP) includes a CPU, and a DMA controller. The DMA controller transfers data from a source to a destination as a function of an initialization command from the CPU. The DMA controller has a logic unit that performs filter operations and other arithmetic operations on-the-fly on a data stream transferred therethrough. The filter operations include multiplication by filter coefficients and addition, without processing by the CPU. The DMA controller may have subsets of hardware configurations that can perform different operations that are selectable as a function of the initialization command.
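A software analogy may help picture the on-the-fly filtering described above. In the sketch below, the hypothetical `dma_transfer` function stands in for the DMA controller and its optional `coeffs` argument stands in for the initialization command that selects a configuration: without coefficients the data is copied unchanged, and with coefficients each sample passes through a FIR filter (multiplication by coefficients plus addition) as it is transferred. None of these names come from the source.

```python
from collections import deque


def dma_transfer(source, destination, coeffs=None):
    """Move samples from source to destination, optionally FIR-filtering them.

    coeffs=None models a plain copy; a list of coefficients models the
    filter configuration chosen by the initialization command.
    """
    if coeffs is None:
        destination.extend(source)
        return destination

    # Delay line holding the most recent len(coeffs) samples, newest first
    taps = deque([0] * len(coeffs), maxlen=len(coeffs))
    for sample in source:
        taps.appendleft(sample)  # the oldest sample drops off the right end
        destination.append(sum(k * x for k, x in zip(coeffs, taps)))
    return destination
```

With a two-tap averaging filter, `dma_transfer([1, 2, 3, 4], [], [0.5, 0.5])` yields `[0.5, 1.5, 2.5, 3.5]`, and the caller never touches the individual samples.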
APPARATUS FOR HARDWARE ACCELERATED MACHINE LEARNING
An architecture and associated techniques of an apparatus for hardware accelerated machine learning are disclosed. The architecture features multiple memory banks storing tensor data. The tensor data may be concurrently fetched by a number of execution units working in parallel. Each operational unit supports an instruction set specific to certain primitive operations for machine learning. An instruction decoder is employed to decode a machine learning instruction and reveal one or more of the primitive operations to be performed by the execution units, as well as the memory addresses of the operands of the primitive operations as stored in the memory banks. The primitive operations, upon being performed or executed by the execution units, may generate some output that can be saved into the memory banks. The fetching of the operands and the saving of the output may involve permutation and duplication of the data elements involved.
EXECUTING INTERRUPT PROCESSING OF VIRTUAL MACHINES USING PROCESSOR'S ARITHMETIC UNIT
A data processing device that can properly monitor the state of the interrupt processing of a virtual machine is provided. The data processing device according to an aspect of the present disclosure includes an arithmetic unit that executes multiple virtual machines, and an interrupt controller that instructs the arithmetic unit to execute the interrupt processing, together with virtual machine information that specifies at least one of the multiple virtual machines. The interrupt controller includes a counter to count the number of interrupts for each virtual machine based on the virtual machine information.
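The per-virtual-machine counting described above can be pictured with a small Python model. It is a sketch under assumptions, not the disclosed hardware: the class name, the `dispatch` callback standing in for the arithmetic unit, and the use of string identifiers as virtual machine information are all hypothetical.

```python
from collections import Counter


class InterruptController:
    """Toy model: forwards interrupts and counts them per virtual machine."""

    def __init__(self, dispatch):
        # dispatch(vm_id, irq) stands in for the arithmetic unit that
        # actually executes the interrupt processing
        self.dispatch = dispatch
        self.counts = Counter()

    def raise_interrupt(self, vm_id, irq):
        # Count first, so the total is visible even if dispatch fails
        self.counts[vm_id] += 1
        self.dispatch(vm_id, irq)
```

A monitor can then read `controller.counts["vm0"]` to see how many interrupts were directed at that virtual machine.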
MEASUREMENT BASED UNCOMPUTATION FOR QUANTUM CIRCUIT OPTIMIZATION
Methods and apparatus for optimizing a quantum circuit. In one aspect, a method includes identifying one or more sequences of operations in the quantum circuit that un-compute respective qubits on which the quantum circuit operates; generating an adjusted quantum circuit, comprising, for each identified sequence of operations in the quantum circuit, replacing the sequence of operations with an X basis measurement and a classically-controlled phase correction operation, wherein a result of the X basis measurement acts as a control for the classically-controlled phase correction operation; and executing the adjusted quantum circuit.
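One familiar setting where an X basis measurement plus a phase correction replaces an un-computation is an ancilla qubit that holds the AND of two other qubits. The sketch below simulates that case with plain Python amplitudes; whether it matches the specific circuits the abstract has in mind is an assumption. The function name, the dictionary representation of the state, and the `outcome` argument (which forces a measurement result so both branches can be checked) are hypothetical.

```python
import math
import random


def uncompute_and_by_measurement(amps, outcome=None, rng=random):
    """Remove an ancilla holding (a AND b) by measuring it in the X basis.

    amps maps (a, b) to the amplitude of |a, b>; the ancilla is implicit.
    Returns (measurement result, amplitudes over (a, b) after correction).
    """
    # Joint state: the ancilla t equals a AND b in every branch
    state = {(a, b, a & b): amp for (a, b), amp in amps.items()}

    # A Hadamard on the ancilla followed by a Z measurement is an
    # X basis measurement
    after_h = {}
    for (a, b, t), amp in state.items():
        for t2 in (0, 1):
            sign = -1 if (t and t2) else 1
            key = (a, b, t2)
            after_h[key] = after_h.get(key, 0) + sign * amp / math.sqrt(2)

    # Probability of reading 1, then sample (or force) the result
    p1 = sum(abs(v) ** 2 for (a, b, t), v in after_h.items() if t == 1)
    m = outcome if outcome is not None else (1 if rng.random() < p1 else 0)

    # Project onto the observed result and renormalize
    norm = math.sqrt(p1 if m else 1 - p1)
    post = {(a, b): v / norm for (a, b, t), v in after_h.items() if t == m}

    # Classically controlled phase correction: apply CZ on (a, b) only
    # when the measurement returned 1
    if m == 1:
        post = {(a, b): (-v if a & b else v) for (a, b), v in post.items()}
    return m, post
```

For either measurement result, the returned amplitudes equal the input amplitudes, which is what running the un-computation gates would also have produced, but here no multi-qubit un-computation sequence is executed.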
Data Processing Device Having A Logic Circuit for Calculating a Modified Cross Sum
A logic circuit configured to calculate a quotient Q based on a modified cross-sum of an input word CP, a digital circuit having a first input for the input word CP that is a bit-wise inverted value of a number N of M-bit digits having a radix 2^M from a least significant digit to a most significant digit, the circuit configured to calculate a quotient Q, M and N being positive integer numbers larger than one, wherein the digital circuit has a second input RIN that is configured to be set to zero, or to receive a remainder value from another logic circuit, and wherein the digital circuit provides for an output word Q having N digits, each digit of radix 2^M, the output word Q being a raw quotient of the bit-wise inverted value of the input word CP.
NEUROMORPHIC OPERATIONS USING POSITS
Systems, apparatuses, and methods related to a neuron built with posits are described. An example system may include a memory device and the memory device may include a plurality of memory cells. The plurality of memory cells can store data including a bit string in an analog format. A neuromorphic operation can be performed on the data in the analog format. The example system may include an analog to digital converter coupled to the memory device. The analog to digital converter may convert the bit string in the analog format stored in at least one of the plurality of memory cells to a format that supports arithmetic operations to a particular level of precision.