G06F2207/382

Signed multiplication using unsigned multiplier with dynamic fine-grained operand isolation

An N×N multiplier may include an N/2×N first multiplier, an N/2×N/2 second multiplier, and an N/2×N/2 third multiplier. The N×N multiplier receives two operands to multiply. The first, second, and/or third multipliers are selectively disabled if an operand equals zero or has a small value. If both operands are less than 2^(N/2), the second or the third multiplier is used to multiply the operands. If one operand is less than 2^(N/2) and the other is equal to or greater than 2^(N/2), either the first multiplier or the second and third multipliers are used to multiply the operands. If both operands are equal to or greater than 2^(N/2), the first, second, and third multipliers are all used to multiply the operands.
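
The case analysis above can be sketched in Python. This is a minimal model, not the patented circuit: the function name `decomposed_mul` is hypothetical, and operand isolation is modeled simply by not invoking a sub-multiplier whose partial product is known to be zero.

```python
def decomposed_mul(a, b, n):
    """N x N unsigned multiply built from an N/2 x N first multiplier and
    two N/2 x N/2 multipliers; disabled multipliers are never invoked."""
    h = n // 2
    half = 1 << h
    a_hi, a_lo = a >> h, a & (half - 1)
    b_hi, b_lo = b >> h, b & (half - 1)
    if a == 0 or b == 0:
        return 0                                   # all three multipliers disabled
    if a < half and b < half:
        return a_lo * b_lo                         # one N/2 x N/2 multiplier
    if b < half:
        # a >= 2^(n/2), b < 2^(n/2): second and third multipliers
        return ((a_hi * b_lo) << h) + a_lo * b_lo
    if a < half:
        # symmetric case
        return ((b_hi * a_lo) << h) + b_lo * a_lo
    # both operands >= 2^(n/2): the first multiplier computes a_hi * b
    # (an N/2 x N product), the second a_lo * b_hi, the third a_lo * b_lo
    return ((a_hi * b) << h) + ((a_lo * b_hi) << h) + a_lo * b_lo
```

Note that the full case needs only three partial products, not four, because the wide first multiplier absorbs both high-half terms: a*b = (a_hi*b)*2^(N/2) + (a_lo*b_hi)*2^(N/2) + a_lo*b_lo.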

Multiplication and accumulation (MAC) operator
11579870 · 2023-02-14

A MAC operator includes a plurality of multipliers, a plurality of floating-point-to-fixed-point converters, an adder tree, an accumulator, and a fixed-point-to-floating-point converter. Each of the plurality of multipliers may perform a multiplication operation on first data and second data of a single-precision floating-point (FP32) format to output multiplication result data of the FP32 format. Each of the plurality of floating-point-to-fixed-point converters may convert the FP32 format into a fixed-point format. The adder tree may perform a first addition operation on the data of the fixed-point format. The accumulator may perform an accumulation operation on the data output from the adder tree. The fixed-point-to-floating-point converter may convert the data of the fixed-point format into data of the FP32 format.
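
The dataflow through this operator can be sketched in Python. This is a behavioral model only: plain Python floats stand in for the FP32 multipliers, and `FRAC_BITS` is an assumed fixed-point fractional width, not a value taken from the patent.

```python
FRAC_BITS = 16  # assumed fixed-point fractional width

def to_fixed(x):
    """Floating-point-to-fixed-point converter (round to nearest)."""
    return int(round(x * (1 << FRAC_BITS)))

def to_float(q):
    """Fixed-point-to-floating-point converter."""
    return q / (1 << FRAC_BITS)

def mac(first, second, acc=0):
    """One pass of the pipeline: multiplies, per-product conversion to
    fixed point, a pairwise adder tree, then accumulation; returns the
    raw accumulator and its floating-point view."""
    products = [a * b for a, b in zip(first, second)]   # the multipliers
    fixed = [to_fixed(p) for p in products]             # the converters
    while len(fixed) > 1:                               # the adder tree
        fixed = [sum(fixed[i:i + 2]) for i in range(0, len(fixed), 2)]
    acc += fixed[0]                                     # the accumulator
    return acc, to_float(acc)
```

Converting each product to fixed point before the adder tree lets the tree and accumulator use exact integer addition, deferring any rounding to the final conversion back to FP32.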

Computing device and method

The present disclosure provides a computation device. The computation device is configured to perform a machine learning computation and includes a storage unit, a controller unit, and an operation unit. The storage unit is configured to obtain input data and a computation instruction. The controller unit is configured to extract and parse the computation instruction from the storage unit to obtain one or more operation instructions, and to send the one or more operation instructions and the input data to the operation unit. The operation unit is configured to perform operations on the input data according to the one or more operation instructions to obtain a computation result of the computation instruction. In the examples of the present disclosure, the input data involved in machine learning computations is represented by fixed-point data, thereby improving the processing speed and efficiency of training operations.
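
The speed-up comes from replacing floating-point arithmetic with integer arithmetic on fixed-point representations. A minimal sketch, with hypothetical names and an assumed Q-format fractional width:

```python
def to_fixed_point(xs, frac_bits=8):
    """Represent input data as fixed-point integers (assumed Q-format)."""
    return [int(round(x * (1 << frac_bits))) for x in xs]

def fixed_dot(a_q, b_q, frac_bits=8):
    """Integer multiply-accumulate on fixed-point data; a single rescale
    at the end replaces per-element floating-point arithmetic."""
    acc = sum(x * y for x, y in zip(a_q, b_q))
    return acc / (1 << (2 * frac_bits))
```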

Information processor, information processing method, and storage medium
11551087 · 2023-01-10

An information processor includes a memory and a processor coupled to the memory, the processor being configured to: acquire first statistical information about the distribution of the most significant bit position that is not a sign, or the least significant bit position that is not zero, for each of a plurality of first fixed-point number data, the data being a computation result of the computation in a first layer; execute computation on a plurality of output data of the first layer according to a predetermined rule, in the computation in a second layer; and acquire second statistical information based on the predetermined rule and the first statistical information, and determine, based on the second statistical information, a bit range for limiting a bit width when a plurality of second fixed-point number data, the data being a computation result of the computation in the second layer, are stored in a register.
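
The first statistical information (the MSB-position half of it) and a bit-range decision can be sketched as follows. The selection rule here is a deliberately simplified placeholder, not the patented rule derived from the second layer's computation:

```python
def msb_stats(values, width=8):
    """Histogram of the most significant bit position that is not a sign,
    per fixed-point value (the 'first statistical information')."""
    hist = [0] * (width + 1)
    for v in values:
        mag = v if v >= 0 else ~v       # strip the sign for negatives
        hist[mag.bit_length()] += 1     # position 0 = no non-sign bits set
    return hist

def choose_bit_range(hist, reg_width=4):
    """Place the register window just below the largest populated MSB
    position (a simplified selection rule)."""
    top = max((i for i, c in enumerate(hist) if c), default=0)
    lsb = max(0, top - reg_width + 1)
    return top, lsb
```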

Variable accuracy computing system
11693626 · 2023-07-04

The present disclosure relates to a computing system. The computing system comprises a data input configured to receive an input data signal, a computation unit having an input coupled with the data input, the computation unit being operative to apply a weight to a signal received at its input to generate a weighted output signal, and a controller. The controller is configured to monitor a parameter of the input signal and/or a parameter of the output signal and to issue a control signal to the computation unit to control a level of accuracy of the weighted output signal based at least in part on the monitored parameter.
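
The monitor-and-control loop can be sketched as a toy controller. Both the monitored parameter (input magnitude) and the accuracy policy (wider weight quantization for larger signals) are assumptions for illustration, not the disclosed design:

```python
def control_accuracy(x_in, weight, threshold=1.0):
    """Monitor the input signal's magnitude and pick an accuracy level
    for the weighted output: small signals get a coarser weight."""
    bits = 16 if abs(x_in) >= threshold else 8      # the control signal
    scale = 1 << (bits - 1)
    w_q = round(weight * scale) / scale             # weight at chosen accuracy
    return x_in * w_q, bits
```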

Neural network method and apparatus with floating point processing

A processor-implemented method includes: receiving a first floating point operand and a second floating point operand, each having an n-bit format comprising a sign field, an exponent field, and a significand field; normalizing a binary value obtained by performing arithmetic operations on corresponding fields of the first and second floating point operands for an n-bit multiplication operation; determining whether the normalized binary value is a number that is representable in the n-bit format or an extended normal number that is not representable in the n-bit format; according to a result of the determining, encoding the normalized binary value using an extended bit format in which an extension bit, identifying whether the normalized binary value is the extended normal number, is added to the n-bit format; and outputting the encoded binary value in the extended bit format as a result of the n-bit multiplication operation.
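
The representability check that drives the extension bit can be sketched for a binary32-like format. This reduces the decision to an exponent range check, which is a simplification of the described method:

```python
def classify_product(exp_unbiased, exp_bits=8, bias=127):
    """After normalization, check whether the result's exponent still fits
    the n-bit format; if not, it is an 'extended normal' number and the
    extension bit is set (simplified: exponent range check only)."""
    biased = exp_unbiased + bias
    representable = 1 <= biased <= (1 << exp_bits) - 2  # all-0/all-1 reserved
    return 0 if representable else 1                    # the extension bit
```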

PROCESSING ELEMENT AND NEURAL PROCESSING DEVICE INCLUDING SAME
20220374691 · 2022-11-24

The present disclosure discloses a processing element and a neural processing device including the processing element. The processing element includes: a weight register configured to store a weight; an input activation register configured to store an input activation; a flexible multiplier configured to receive a first sub-weight of a first precision included in the weight, receive a first sub-input activation of the first precision included in the input activation, and generate result data by performing a multiplication of the first sub-weight and the first sub-input activation at the first precision, or at a second precision different from the first precision, according to the first sub-weight and the first sub-input activation; and a saturating adder configured to generate a partial sum using the result data.
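
A toy model of the two building blocks follows. The rule for switching precision (fall back to the wide format only when an operand does not fit the narrow one) is an assumption for illustration:

```python
def to_signed(v, bits):
    """Reinterpret the low `bits` bits of v as a two's-complement value."""
    v &= (1 << bits) - 1
    return v - (1 << bits) if v >= (1 << (bits - 1)) else v

def flexible_mul(w, x, narrow=4, wide=8):
    """Multiply at the narrow (first) precision when both sub-operands
    fit it, otherwise at the wide (second) precision."""
    lim = 1 << (narrow - 1)
    if -lim <= w < lim and -lim <= x < lim:
        return w * x                                  # first-precision path
    return to_signed(w, wide) * to_signed(x, wide)    # second-precision path

def saturating_add(a, b, bits=16):
    """Saturating adder folding result data into the partial sum."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, a + b))
```

The saturating adder clamps instead of wrapping, so an overflowing partial sum degrades gracefully to the representable extreme rather than flipping sign.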

Adaptive quantization and mixed precision in a network

A method of adaptive quantization for a convolutional neural network includes at least one of: receiving an acceptable model accuracy; determining a float-value multiply-accumulate for a layer based on a float-value weight and a float-value input; quantizing the float-value weight at multiple weight quantization precisions; quantizing the float-value input at multiple input quantization precisions; determining a multiply-accumulate at multiple multiply-accumulate quantization precisions based on the weight quantization precisions and the input quantization precisions; determining multiple quantization errors based on differences between the float-value multiply-accumulate and the multiple multiply-accumulate quantization precisions; and selecting one of the multiple weight quantization precisions, one of the multiple input quantization precisions, and one of the multiple multiply-accumulate quantization precisions based on the acceptable model accuracy and the multiple quantization errors.
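
The search over precision pairs can be sketched directly. The candidate bit widths, the fixed-point scaling (`frac = bits - 2`, which assumes values in a small range), and the cheapest-first selection are all illustrative assumptions:

```python
def quantize(x, bits, frac):
    """Quantize to a signed fixed-point grid and back to float."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    q = max(lo, min(hi, round(x * (1 << frac))))
    return q / (1 << frac)

def select_precisions(w, x, candidate_bits=(4, 8, 16), tol=1e-3):
    """Quantize weights and inputs at each candidate precision, measure the
    MAC error against the float reference, and return the first (cheapest)
    pair whose error meets the accuracy tolerance."""
    ref = sum(a * b for a, b in zip(w, x))     # float-value multiply-accumulate
    best = None
    for wb in candidate_bits:                  # scanned cheapest-first
        for xb in candidate_bits:
            qw = [quantize(a, wb, wb - 2) for a in w]
            qx = [quantize(b, xb, xb - 2) for b in x]
            err = abs(ref - sum(a * b for a, b in zip(qw, qx)))
            if err <= tol:
                return wb, xb, err
            if best is None or err < best[2]:
                best = (wb, xb, err)
    return best                                # nothing met tol: least error
```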

Reconfigurable Processor Circuit Architecture

A representative reconfigurable processing circuit and a reconfigurable arithmetic circuit are disclosed, each of which may include input reordering queues; a multiplier, shifter, and combiner network coupled to the input reordering queues; an accumulator circuit; and a control logic circuit, along with a processor and various interconnection networks. A representative reconfigurable arithmetic circuit has a plurality of operating modes, such as floating-point and integer arithmetic modes, logical manipulation modes, Boolean logic, shift, rotate, conditional operations, and format conversion, and is configurable for a wide variety of multiplication modes. Dedicated routing connecting the multiplier adder trees allows multiple reconfigurable arithmetic circuits to be reconfigurably combined, in pair or quad configurations, for larger adders, complex multiplies, and general sum-of-products use, for example.

METHOD AND APPARATUS WITH CALCULATION
20230058095 · 2023-02-23

A processor-implemented method includes: receiving a plurality of pieces of input data expressed in floating point; adjusting a bit-width of the mantissa by performing masking on the mantissa of each piece of the input data based on the size of the exponent of that piece; and performing an operation between the pieces of input data with the adjusted bit-widths.
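
Exponent-based mantissa masking can be sketched for the FP32 layout. The policy here (values whose exponent is far below the batch maximum keep fewer mantissa bits) is a toy rule assumed for illustration, not the claimed method:

```python
import math
import struct

def mask_mantissa(x, keep_bits):
    """Zero the low (23 - keep_bits) mantissa bits of an FP32 value."""
    (u,) = struct.unpack('<I', struct.pack('<f', x))
    u &= 0xFFFFFFFF & ~((1 << (23 - keep_bits)) - 1)
    return struct.unpack('<f', struct.pack('<I', u))[0]

def adjust_widths(values, budget=10):
    """Toy exponent-based rule: the larger the gap between a value's
    exponent and the largest exponent in the batch, the fewer mantissa
    bits that value keeps."""
    exps = [math.frexp(v)[1] if v else 0 for v in values]
    top = max(exps)
    return [mask_mantissa(v, max(0, budget - (top - e)))
            for v, e in zip(values, exps)]
```

The intuition: in a subsequent addition, small-exponent operands lose their low mantissa bits to alignment anyway, so masking them early costs little accuracy while narrowing the datapath.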