Patent classifications
G06F7/556
CONFIGURABLE NONLINEAR ACTIVATION FUNCTION CIRCUITS
Certain aspects of the present disclosure provide a method for processing input data by a set of configurable nonlinear activation function circuits, including generating an exponent output by processing input data using one or more first configurable nonlinear activation function circuits configured to perform an exponential function, summing the exponent output of the one or more first configurable nonlinear activation function circuits, and generating an approximated log softmax output by processing the summed exponent output using a second configurable nonlinear activation function circuit configured to perform a natural logarithm function.
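The data flow in this abstract (exponentiate, sum, take the natural logarithm) can be sketched in software as follows. This is a minimal sketch only: the function name `log_softmax` is hypothetical, and the final subtraction of the log-sum from each input follows the standard definition of log softmax, which is an assumption because the abstract does not spell that step out.

```python
import math


def log_softmax(inputs):
    """Software model of the exp -> sum -> ln pipeline (illustrative only)."""
    # Stage 1: one exponential per input (the "first" activation circuits).
    exponent_outputs = [math.exp(x) for x in inputs]
    # Stage 2: sum the exponent outputs.
    summed = sum(exponent_outputs)
    # Stage 3: natural logarithm of the sum (the "second" activation circuit).
    log_sum = math.log(summed)
    # Assumed final step: log softmax(x_i) = x_i - ln(sum_j exp(x_j)).
    return [x - log_sum for x in inputs]


if __name__ == "__main__":
    print(log_softmax([1.0, 2.0, 3.0]))
```

A production implementation would usually subtract the maximum input before exponentiating to avoid overflow; that refinement is not described in the abstract and is left out here.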
Verification of Hardware Designs to Implement Floating Point Power Functions
A method of exhaustively verifying a property of a hardware design to implement a floating point power function. The method includes formally verifying that the hardware design is recurrent over sets of β input exponents, wherein β is an integer that is a multiple of the reciprocal of the exponent of the power function; and, for each recurrent input range of the hardware design, exhaustively simulating the hardware design over a simulation range to verify that the property is true over the simulation range, wherein the simulation range comprises only β input exponents.
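The recurrence idea can be illustrated in software with the square root as an assumed example power function: its exponent is 1/2, so β = 2, and scaling the input by 2^β only shifts the output exponent while leaving the output mantissa unchanged. The sketch below checks that behaviour on sample values; the names `BETA` and `is_recurrent` are hypothetical, and this is a floating point experiment, not the formal proof step the abstract describes.

```python
import math

# Assumed example: power function x ** (1/2), so the reciprocal of the
# exponent is 2 and beta = 2.
BETA = 2


def is_recurrent(x, beta=BETA):
    """Check that sqrt(x) and sqrt(x * 2**beta) share the same mantissa."""
    mantissa_a, exponent_a = math.frexp(math.sqrt(x))
    # ldexp scales by an exact power of two, so no rounding is introduced.
    mantissa_b, exponent_b = math.frexp(math.sqrt(math.ldexp(x, beta)))
    # The output exponent moves by beta * (1/2); the mantissa repeats.
    return mantissa_a == mantissa_b and exponent_b == exponent_a + beta // 2


if __name__ == "__main__":
    samples = [1.0, 1.5, 3.7, 0.1, 123456.789]
    print(all(is_recurrent(x) for x in samples))
```

Once such a recurrence is established for a design, it is enough to simulate the mantissas of one window of β input exponents, since every other window reproduces the same output mantissas.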
EFFICIENT HARDWARE IMPLEMENTATION OF THE EXPONENTIAL FUNCTION USING HYPERBOLIC FUNCTIONS
Apparatus and associated methods relate to determining a natural exponent from a digital word input by splitting the digital word, and retrieving a precalculated and predetermined value from a data store at an address defined by the first word. In an illustrative example, the retrieved value may be a hyperbolic sum. The hyperbolic sum may be multiplied by the second word. The hyperbolic sum may be scaled, and summed with the multiplication result to generate a scaled exponential value. The scaled exponential value may be scaled to produce an exponential value representing e^X. In various examples, the digital word input may be in a fixed point or a floating point format, or converted therebetween. In various embodiments, the data store may be a lookup table. Various examples may provide a compact and versatile architecture for determining a natural exponent with minimized hardware resources.
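One possible reading of this scheme is sketched below. It relies on the identity cosh(a) + sinh(a) = e^a for the table entries and on the first-order approximation e^(a+b) ≈ e^a · (1 + b) for a small residual b. The word width, the 8/8 bit split, and the names `FRAC_BITS`, `HIGH_BITS`, `LUT`, and `approx_exp` are all assumptions made for illustration; the scaling steps mentioned in the abstract are omitted.

```python
import math

FRAC_BITS = 16                      # assumed: unsigned fixed point, x in [0, 1)
HIGH_BITS = 8                       # assumed: upper bits address the table
LOW_BITS = FRAC_BITS - HIGH_BITS    # remaining bits form the residual

# Table of hyperbolic sums: cosh(a) + sinh(a) equals e**a.
LUT = [
    math.cosh(i / 2**HIGH_BITS) + math.sinh(i / 2**HIGH_BITS)
    for i in range(2**HIGH_BITS)
]


def approx_exp(word):
    """Approximate e**x for x = word / 2**FRAC_BITS."""
    first_word = word >> LOW_BITS               # table address
    second_word = word & (2**LOW_BITS - 1)      # low-order residual bits
    residual = second_word / 2**FRAC_BITS       # b, smaller than 2**-HIGH_BITS
    hyperbolic_sum = LUT[first_word]            # e**a
    # e**(a + b) ~= e**a + e**a * b
    return hyperbolic_sum + hyperbolic_sum * residual


if __name__ == "__main__":
    word = 0x8123
    print(approx_exp(word), math.exp(word / 2**FRAC_BITS))
```

With this split the residual is below 2^-8, so the relative error of the first-order term stays below roughly b²/2, on the order of 10^-5.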
LOGARITHM AND POWER (EXPONENTIATION) COMPUTATIONS USING MODERN COMPUTER ARCHITECTURES
Embodiments of the present invention may provide the capability to evaluate logarithm and power (exponentiation) functions using either hardware specific instructions, or a hardware specific implementation with reduced memory requirements. An input comprising a floating point representation of a real number may be received and a mantissa and an exponent may be extracted. A function of a logarithm of a mantissa of the real number may be approximated by utilizing a polynomial based on the mantissa. The approximated function of the logarithm may be combined with the exponent for calculating a value comprising a logarithm of the real number. Likewise, an input comprising a floating point representation of a real number and a representation of a second number may be received and an approximation of the real number to the power of the second number may be generated.
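The mantissa/exponent decomposition can be sketched as below, using ln(x) = e · ln(2) + ln(m) with a polynomial for ln(m). The polynomial here is a truncated Taylor series of ln(1 + t), chosen only because its coefficients are easy to verify; it is an assumption and not the polynomial of the patent, and a practical design would more likely use tuned coefficients. The names `approx_log`, `approx_pow`, and `DEGREE` are hypothetical.

```python
import math

DEGREE = 12  # assumed truncation order for the Taylor polynomial
# Coefficients of ln(1 + t) = t - t**2/2 + t**3/3 - ...
COEFFS = [(-1) ** (k + 1) / k for k in range(1, DEGREE + 1)]


def approx_log(x):
    """Approximate ln(x) for x > 0 from its mantissa and exponent."""
    if x <= 0.0:
        raise ValueError("x must be positive")
    mantissa, exponent = math.frexp(x)      # x = mantissa * 2**exponent
    # Re-center the mantissa around 1 so the series converges quickly.
    if mantissa < math.sqrt(0.5):
        mantissa *= 2.0
        exponent -= 1
    t = mantissa - 1.0
    # Horner evaluation of c1 + c2*t + ... + c12*t**11, then multiply by t.
    acc = 0.0
    for coeff in reversed(COEFFS):
        acc = acc * t + coeff
    return exponent * math.log(2.0) + acc * t


def approx_pow(x, y):
    """Approximate x**y as exp(y * ln(x)), using the approximate logarithm."""
    return math.exp(y * approx_log(x))


if __name__ == "__main__":
    print(approx_log(10.0), math.log(10.0))
    print(approx_pow(2.0, 10.0))
```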
PIPELINED PROCESSING OF POLYNOMIAL COMPUTATION
Circuits and methods for computing an order N polynomial include V decimation stages, each stage including respective multiply-and-accumulate circuitry. The multiply-and-accumulate circuitry in each stage k, in response to an input r-term and a plurality of input z-terms 0 through (N_k − 1), generates output z-terms 0 through (N_k/2 − 1) and an output r-term as a square of the input r-term. Each output z-term i is a sum of input z-term (2i+1) of the input z-terms and a product of input z-term 2i and the input r-term. The multiply-and-accumulate circuitry in decimation stages k for k≤(V−1) provides the output r-term and one or more output z-terms from decimation stage k as the input r-term and one or more input z-terms to the respective multiply-and-accumulate circuitry of decimation stage k+1. A recursive stage inputs from decimation stage V, the output r-term as a recursive r-term and the output z-terms as a-terms, and generates a polynomial output value z by a recursive evaluation of the recursive r-term, the a-terms, and a modulus, p.
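The decimation recurrence can be modelled in software as sketched below. Each stage halves the number of terms with z_out[i] = z_in[2i+1] + z_in[2i] · r and squares r, and a final Horner loop plays the role of the recursive stage. Several points are assumptions: the coefficients are ordered highest degree first, the term count is divisible by 2^V, the reduction modulo p is applied at every step, and the modulus 2^61 − 1 is picked only as a convenient prime. The function names are hypothetical.

```python
P = 2**61 - 1  # assumed modulus, chosen only for illustration


def decimation_stage(z_terms, r, p=P):
    """Halve the term count: out[i] = z[2i+1] + z[2i] * r; r becomes r**2."""
    if len(z_terms) % 2 != 0:
        raise ValueError("each decimation stage needs an even number of z-terms")
    out = [(z_terms[2 * i + 1] + z_terms[2 * i] * r) % p
           for i in range(len(z_terms) // 2)]
    return out, (r * r) % p


def evaluate(coeffs, r, stages, p=P):
    """Evaluate sum(coeffs[j] * r**(n-1-j)) mod p using decimation stages."""
    z_terms = [c % p for c in coeffs]
    r %= p
    for _ in range(stages):
        z_terms, r = decimation_stage(z_terms, r, p)
    # Recursive stage: Horner evaluation in the repeatedly squared r-term.
    acc = 0
    for a_term in z_terms:
        acc = (acc * r + a_term) % p
    return acc


def horner_reference(coeffs, r, p=P):
    """Plain Horner evaluation, used only to cross-check the staged version."""
    acc = 0
    for c in coeffs:
        acc = (acc * r + c) % p
    return acc


if __name__ == "__main__":
    coeffs = [3, 1, 4, 1, 5, 9, 2, 6]
    print(evaluate(coeffs, 7, stages=2) == horner_reference(coeffs, 7))
```

The pairing works because a polynomial in r with coefficients ordered highest first can be regrouped as a polynomial in r² whose coefficients are z[2i] · r + z[2i+1], which is what allows the stages to run as a pipeline.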
Communication system and method for achieving high data rates using modified nearly-equiangular tight frame (NETF) matrices
A method includes generating a set of symbols based on an incoming data vector. The set of symbols includes K symbols, K being a positive integer. A first transformation matrix including an equiangular tight frame (ETF) transformation or a nearly equiangular tight frame (NETF) transformation is generated, having dimensions N×K, where N is a positive integer and has a value less than K. A second transformation matrix having dimensions K×K is generated based on the first transformation matrix. A third transformation matrix having dimensions K×K is generated by performing a series of unitary transformations on the second transformation matrix. A first data vector is transformed into a second data vector having a length N based on the third transformation matrix and the set of symbols. A signal representing the second data vector is sent to a transmitter for transmission of a signal representing the second data vector to a receiver.
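The ETF building block can be illustrated with the smallest well-known example, the three-vector "Mercedes-Benz" frame in two dimensions (N = 2, K = 3). The sketch checks the two defining properties, equal absolute inner products between distinct columns and a frame operator proportional to the identity, and maps K symbols to a length-N vector. It does not model the second and third K×K matrices or the unitary transformations described in the abstract, and all function names are hypothetical.

```python
import math


def mercedes_benz_frame():
    """Return a 2 x 3 matrix (as rows) whose columns are 120 degrees apart."""
    angles = [math.pi / 2 + 2 * math.pi * k / 3 for k in range(3)]
    return [[math.cos(a) for a in angles], [math.sin(a) for a in angles]]


def gram(frame):
    """K x K matrix of inner products between the frame's columns."""
    n_rows, k_cols = len(frame), len(frame[0])
    return [[sum(frame[n][i] * frame[n][j] for n in range(n_rows))
             for j in range(k_cols)] for i in range(k_cols)]


def frame_operator(frame):
    """N x N matrix F F^T; equals (K/N) * I for a unit-norm tight frame."""
    n_rows, k_cols = len(frame), len(frame[0])
    return [[sum(frame[a][k] * frame[b][k] for k in range(k_cols))
             for b in range(n_rows)] for a in range(n_rows)]


def welch_bound(n, k):
    """Lower bound on the largest coherence of K unit vectors in N dimensions."""
    return math.sqrt((k - n) / (n * (k - 1)))


def encode(frame, symbols):
    """Map K symbols to a length-N vector by applying the N x K matrix."""
    return [sum(row[k] * symbols[k] for k in range(len(symbols)))
            for row in frame]


if __name__ == "__main__":
    frame = mercedes_benz_frame()
    print(gram(frame))
    print(frame_operator(frame))
    print(encode(frame, [1.0, -1.0, 1.0]))
```

For this frame every off-diagonal Gram entry has magnitude 1/2, which matches the Welch bound for N = 2 and K = 3.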