Patent classifications
G06F7/4876
ELECTRONIC MULTIPLICATION CIRCUIT AND CORRESPONDING MULTIPLICATION METHOD
In an embodiment, after a first phase of multiplication, in an electronic multiplication circuit, of a first operand by a second operand leading to a successive delivery of least significant words of the result of the first multiplication, a second multiplication, of the first operand by a supplementary operand is implemented in the electronic multiplication circuit, during a second phase of multiplication. The supplementary operands are not all identical.
SPARSE MATRIX MULTIPLICATION IN HARDWARE
Aspects of the disclosure provide for methods, systems, and apparatuses, including computer-readable storage media, for sparse matrix multiplication. A system for matrix multiplication includes an array of sparse shards. Each sparse shard can be configured to receive an input sub-matrix and an input sub-vector, where the input sub-matrix has a number of non-zero values equal to or less than a predetermined maximum non-zero threshold. The sparse shard can, by a plurality of multiplier circuits, compute one or more products of vector values multiplied with respective non-zero values of the input sub-matrix. The sparse shard can generate, as output to the sparse shard and using the one or more products, a shard output vector that is the product of applying the shard input vector to the shard input matrix.
HARDWARE ACCELERATOR METHOD AND DEVICE
A processor-implemented hardware accelerator method includes: receiving input data; loading a lookup table (LUT); determining an address of the LUT by inputting the input data to a comparator; obtaining a value of the LUT corresponding to the input data based on the address; and determining a value of a nonlinear function corresponding to the input data based on the value of the LUT, wherein the LUT is determined based on a weight of a neural network that outputs the value of the nonlinear function.
Neural network method and apparatus with floating point processing
A processor-implemented includes receiving a first floating point operand and a second floating point operand, each having an n-bit format comprising a sign field, an exponent field, and a significand field, normalizing a binary value obtained by performing arithmetic operations for fields corresponding to each other in the first and second floating point operands for an n-bit multiplication operation, determining whether the normalized binary value is a number that is representable in the n-bit format or an extended normal number that is not representable in the n-bit format, according to a result of the determining, encoding the normalized binary value using an extension bit format in which an extension pin identifying whether the normalized binary value is the extended normal number is added to the n-bit format, and outputting the encoded binary value using the extended bit format, as a result of the n-bit multiplication operation.
MULTI-MODE FUSION MULTIPLIER
A multiplier is configured to implement a binary single-multiplication operation A[m.sub.1-1:0]×B[m.sub.2-1:0], or an accumulated sum operation of 2N binary multiplications A0[m.sub.3-1:0]×B0[m.sub.4-1:0]. The multiplier includes P precoders, Q groups of fusion coders, and a compressor. The P precoders and the Q groups of fusion coders are configured to code a first value and a second value in the single-multiplication operation or the multi-multiplication accumulated sum operation, and output a plurality of partial products to the compressor. The compressor may be configured to compress the plurality of partial products corresponding to the single-multiplication operation or the multi-multiplication accumulated sum operation to obtain two accumulated values.
METHOD AND APPARATUS WITH CALCULATION
A processor-implemented method includes: receiving a plurality of pieces of input data expressed as floating point; adjusting a bit-width of mantissa by performing masking on the mantissa of each piece of the input data based on a size of an exponent of each piece of the input data; and performing an operation between the input data with the adjusted bit-width.
METHOD AND APPARATUS WITH FLOATING POINT PROCESSING
A processor-implemented includes receiving a first floating point operand and a second floating point operand, each having an n-bit format comprising a sign field, an exponent field, and a significand field, normalizing a binary value obtained by performing arithmetic operations for fields corresponding to each other in the first and second floating point operands for an n-bit multiplication operation, determining whether the normalized binary value is a number that is representable in the n-bit format or an extended normal number that is not representable in the n-bit format, according to a result of the determining, encoding the normalized binary value using an extension bit format in which an extension pin identifying whether the normalized binary value is the extended normal number is added to the n-bit format, and outputting the encoded binary value using the extended bit format, as a result of the n-bit multiplication operation.
MEMORY LOOKUP COMPUTING MECHANISMS
According to some example embodiments of the present disclosure, in a method for a memory lookup mechanism in a high-bandwidth memory system, the method includes: using a memory die to conduct a multiplication operation using a lookup table (LUT) methodology by accessing a LUT, which includes floating point operation results, stored on the memory die; sending, by the memory die, a result of the multiplication operation to a logic die including a processor and a buffer; and conducting, by the logic die, a matrix multiplication operation using computation units.
Outlier quantization for training and inference
Machine learning may include training and drawing inference from artificial neural networks, processes which may include performing convolution and matrix multiplication operations. Convolution and matrix multiplication operations are performed using vectors of block floating-point (BFP) values that may include outliers. BFP format stores floating-point values using a plurality of mantissas of a fixed bit width and a shared exponent. Elements are outliers when they are too large to be represented precisely with the fixed bit width mantissa and shared exponent. Outlier values are split into two mantissas. One mantissa is stored in the vector with non-outliers, while the other mantissa is stored outside the vector. Operations, such as a dot product, may be performed on the vectors in part by combining the in-vector mantissa and exponent of an outlier value with the out-of-vector mantissa and exponent.
CIRCUIT AND METHOD OF TRANSMITTING DIGITAL DATA WITH ERROR DETECTION
There is disclosed a system for transmitting digital data with error detection, the system comprising a sender, configured to receive source data and to send transfer data, and a receiver configured to receive the transfer data and to output result data, wherein the sender is further configured to receive the source data, to numerically multiply the source data by an integer number greater than 2, and to output the multiplied source data as the transfer data, and wherein the receiver is further configured to receive the transfer data, to check if dividing the transfer data by the integer number results in an integer result, and, if the checking fails, to output an error indication, and, if the checking succeeds, to output the transfer data divided by the integer number as the result data. Also, a corresponding method is disclosed.