Patent classifications
G06F2207/483
Variable precision floating-point multiplier
Integrated circuits with specialized processing blocks are provided. The specialized processing blocks may include floating-point multiplier circuits that can be configured to support variable precision. A multiplier circuit may include a first carry-propagate adder (CPA), a second carry-propagate adder (CPA), and an associated rounding circuit. The first CPA may be wide enough to handle the required precision of the mantissa. In a bridged mode, the first CPA may borrow an additional bit from the second CPA while the rounding circuit will monitor the appropriate bits to select the proper multiplier output. A parallel prefix tree operable in a non-bridged mode or the bridged mode may be used to compute multiple multiplier outputs. The multiplier circuit may also include exponent and exception handling circuitry using various masks corresponding to the desired precision width.
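As an illustrative aside, the effect of a configurable mantissa width on the rounded product can be sketched in software. This is a minimal behavioral sketch, not the circuit described in the abstract: it does not model the two CPAs, the bridged mode, or the parallel prefix tree. The function name `multiply_mantissas`, its parameters, and the choice of round-to-nearest-even are assumptions made for illustration.

```python
def multiply_mantissas(a, b, precision):
    """Multiply two normalized integer significands and round the product
    back to `precision` bits using round-to-nearest-even.

    `a` and `b` include the leading one, so both lie in
    [2**(precision-1), 2**precision). Returns (mantissa, exponent_adjust).
    Hypothetical helper; the rounding mode is an assumption.
    """
    if precision < 2:
        raise ValueError("precision must be at least 2 bits")
    low, high = 1 << (precision - 1), 1 << precision
    if not (low <= a < high and low <= b < high):
        raise ValueError("operands must be normalized to the given precision")

    product = a * b  # has either 2*precision - 1 or 2*precision bits
    exponent_adjust = 0
    shift = precision - 1
    if product >= 1 << (2 * precision - 1):
        # The product is in [2, 4): normalize by one extra bit.
        shift = precision
        exponent_adjust = 1

    kept = product >> shift
    remainder = product & ((1 << shift) - 1)  # mask depends on the precision
    half = 1 << (shift - 1)
    if remainder > half or (remainder == half and kept & 1):
        kept += 1
    if kept == high:
        # Rounding overflowed the mantissa; renormalize.
        kept >>= 1
        exponent_adjust += 1
    return kept, exponent_adjust


print(multiply_mantissas(0b1111, 0b1111, 4))  # 4-bit significands
print(multiply_mantissas(255, 255, 8))        # 8-bit significands
```

The same routine serves any width because only the shift amount and the remainder mask change with `precision`, which loosely mirrors the abstract's use of masks that correspond to the desired precision width.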
Floating point unit with support for variable length numbers
Embodiments of a processor are disclosed for performing arithmetic operations on a machine independent number format. The processor may include a floating point unit, and a number unit. The number format may include a sign/exponent block, a length block, and multiple mantissa digits. The number unit may be configured to perform an operation on two operands by converting the digit format of each mantissa digit of each operand, to perform the operation using the converted mantissa digits, and then to convert each mantissa digit of the result of the operation back into the original digit format.
USING EMBEDDING FUNCTIONS WITH A DEEP NETWORK
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for using embedding functions with a deep network. One of the methods includes receiving an input comprising a plurality of features, wherein each of the features is of a different feature type; processing each of the features using a respective embedding function to generate one or more numeric values, wherein each of the embedding functions operates independently of each other embedding function, and wherein each of the embedding functions is used for features of a respective feature type; processing the numeric values using a deep network to generate a first alternative representation of the input, wherein the deep network is a machine learning model composed of a plurality of levels of non-linear operations; and processing the first alternative representation of the input using a logistic regression classifier to predict a label for the input.
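The pipeline in this abstract (independent per-type embedding functions, then a deep network, then a logistic regression classifier) can be sketched in plain Python. This is a toy sketch under stated assumptions: the feature types `word` and `count`, the lookup table, and every weight below are made-up values, and the network is untrained.

```python
import math

# Hypothetical lookup table for a categorical feature type (values are made up).
WORD_TABLE = {"cat": [0.2, -0.1], "dog": [0.4, 0.3]}


def embed_word(word):
    # Categorical feature: table lookup; unknown words map to a zero vector.
    return WORD_TABLE.get(word, [0.0, 0.0])


def embed_count(count):
    # Numeric feature: a single log-scaled value.
    return [math.log1p(count)]


# One embedding function per feature type; each runs independently of the others.
EMBEDDERS = {"count": embed_count, "word": embed_word}

# Made-up weights for two non-linear levels and the final classifier.
W1, B1 = [[0.5, -0.3, 0.8], [0.1, 0.7, -0.2]], [0.0, 0.1]
W2, B2 = [[0.6, -0.4], [0.3, 0.9]], [0.05, -0.05]
W_OUT, B_OUT = [1.2, -0.7], 0.1


def dense(inputs, weights, biases, activation):
    # One fully connected level followed by an element-wise activation.
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(features):
    """features: dict mapping feature type -> raw value. Returns P(label = 1)."""
    numeric = []
    for feature_type in sorted(features):  # fixed order: count, then word
        numeric.extend(EMBEDDERS[feature_type](features[feature_type]))
    hidden = dense(numeric, W1, B1, math.tanh)
    representation = dense(hidden, W2, B2, math.tanh)  # alternative representation
    return dense(representation, [W_OUT], [B_OUT], sigmoid)[0]


print(predict({"word": "cat", "count": 3}))
```

Keeping the embedding functions in a dictionary keyed by feature type makes their independence explicit: adding a new feature type means adding one entry and widening the first weight matrix, without touching the other embedders.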
Hybrid analog-digital floating point number representation and arithmetic
A hybrid floating-point arithmetic processor includes a scheduler, a hybrid register file, and a hybrid arithmetic operation circuit. The scheduler has an input for receiving floating-point instructions, and an output for providing decoded register numbers in response to the floating-point instructions. The hybrid register file is coupled to the scheduler and contains circuitry for storing a plurality of floating-point numbers each represented by a digital sign bit, a digital exponent, and an analog mantissa. The hybrid register file has an output for providing selected ones of the plurality of floating-point numbers in response to the decoded register numbers. The hybrid arithmetic operation circuit is coupled to the scheduler and to the hybrid register file, for performing a hybrid arithmetic operation between two floating-point numbers selected by the scheduler and providing a hybrid result represented by a result digital sign bit, a result digital exponent, and a result analog mantissa.
REDUCED FLOATING-POINT PRECISION ARITHMETIC CIRCUITRY
The present embodiments relate to performing reduced-precision floating-point arithmetic operations using specialized processing blocks with higher-precision floating-point arithmetic circuitry. A specialized processing block may receive four floating-point numbers that represent two single-precision floating-point numbers, each separated into an LSB portion and an MSB portion, or four half-precision floating-point numbers. A first partial product generator may generate a first partial product of first and second input signals, while a second partial product generator may generate a second partial product of third and fourth input signals. A compressor circuit may generate carry and sum vector signals based on the first and second partial products; and circuitry may anticipate rounding and normalization operations by generating in parallel based on the carry and sum vector signals at least two results when performing the single-precision floating-point operation and at least four results when performing the two half-precision floating-point operations.
Method and apparatus for converting real number modeling to synthesizable register-transfer level emulation in digital mixed signal environments
A method for converting real number modeling to a synthesizable register-transfer level emulation in digital mixed signal environments is provided. The method includes verifying an input in a file including a real number modeling code and cleaning the real number modeling code in the file. The method also includes separating a clean register-transfer level code from the real number modeling code, converting the file to a cycle-driven simulation interface file, and verifying the cycle-driven simulation interface file. The method further includes converting the cycle-driven simulation interface file into a register-transfer level file suitable to perform a circuit emulation in digital mixed signal environments, and verifying that the register-transfer level file is ready to perform circuit emulation in the digital mixed signal environments. A system and a non-transitory, computer readable medium storing commands to perform the above method are also provided.
REPRODUCIBLE STOCHASTIC ROUNDING FOR OUT OF ORDER PROCESSORS
A method for generating a random number for use in a stochastic rounding operation is provided. The method includes executing an instruction that causes at least two operands to produce an intermediate result and incrementing a state of a random number generator. The method further includes causing the random number generator to generate a random number in accordance with the state and producing a final result by utilizing the random number to determine a rounding of the intermediate result.
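The steps in this abstract can be sketched with a counter-based generator, where the random number is a pure function of a seed and an incrementing state, so replaying the same instruction sequence reproduces the same roundings. This is a minimal sketch under stated assumptions, not the patented method: the names `CounterRng` and `stochastic_round_mul`, the 64-bit mixing function (written in the style of SplitMix64), and the use of integer multiplication as the instruction are all illustrative choices.

```python
MASK64 = (1 << 64) - 1


def _mix(z):
    # 64-bit mixing function in the style of SplitMix64 (an assumed choice).
    z = (z + 0x9E3779B97F4A7C15) & MASK64
    z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
    z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK64
    return z ^ (z >> 31)


class CounterRng:
    """Hypothetical generator whose output depends only on the seed and state."""

    def __init__(self, seed):
        self.seed = seed & MASK64
        self.state = 0

    def next(self):
        self.state += 1  # increment the state once per instruction
        return _mix((self.seed + self.state) & MASK64)


def stochastic_round_mul(a, b, drop_bits, rng):
    """Multiply non-negative integers and stochastically drop `drop_bits` low bits.

    The result is rounded up with probability equal to the discarded
    fraction, so it is always floor or floor + 1.
    """
    if not 1 <= drop_bits <= 64:
        raise ValueError("drop_bits must be between 1 and 64")
    intermediate = a * b  # intermediate result of the instruction
    fraction = intermediate & ((1 << drop_bits) - 1)
    threshold = rng.next() >> (64 - drop_bits)  # uniform in [0, 2**drop_bits)
    return (intermediate >> drop_bits) + (1 if threshold < fraction else 0)


rng_a, rng_b = CounterRng(42), CounterRng(42)
run_a = [stochastic_round_mul(x, 7, 4, rng_a) for x in range(1, 50)]
run_b = [stochastic_round_mul(x, 7, 4, rng_b) for x in range(1, 50)]
print(run_a == run_b)  # same seed and same instruction order give the same roundings
```

Because each instruction's random number is tied to the state value it consumed, a processor that records that value alongside the instruction could, in principle, reproduce the rounding even if completion order varies. The sketch above only demonstrates the in-order case.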