Patent classifications
G06F7/49936
SYSTEMS AND METHODS OF HIGH SPEED SCRUBBING OF AIRSPACE RADAR RETURNS
High speed scrubbing of airspace radar returns is provided. A system can include a central processing unit (“CPU”) and a graphical processing unit (“GPU”). The CPU loads time-ordered airspace radar return data that includes radar returns each encoded as an object with location information, time information, and property information. The GPU generates arrays including the location information, the time information, and the property information reorganized into a location array, a time array, and a property-based array. The GPU receives an indication to scrub a display of at least a portion of the airspace radar return data to a time window prior to a current display time or subsequent to the current display time. The GPU retrieves, from the arrays, a location entry and a property-based entry that satisfy the time window. The GPU renders frames with pixels corresponding to the location entry, the time entry, and the property-based entry.
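The reorganization into parallel arrays and the time-window lookup described above can be sketched as follows. This is a minimal CPU-only illustration, not the GPU implementation the abstract describes. The field names (`loc`, `t`, `alt`), the sample values, and the use of binary search are all assumptions made for the example.

```python
from bisect import bisect_left, bisect_right

# Hypothetical time-ordered radar returns, each encoded as an object with
# location, time, and one property (altitude is an assumed example property).
returns = [
    {"loc": (40.1, -75.2), "t": 10.0, "alt": 3000},
    {"loc": (40.2, -75.1), "t": 11.5, "alt": 3100},
    {"loc": (40.3, -75.0), "t": 13.0, "alt": 3200},
    {"loc": (40.4, -74.9), "t": 14.5, "alt": 3300},
]

def to_arrays(objs):
    """Reorganize the objects into a location, a time, and a property array."""
    locs = [o["loc"] for o in objs]
    times = [o["t"] for o in objs]
    alts = [o["alt"] for o in objs]
    return locs, times, alts

def scrub(locs, times, alts, t_start, t_end):
    """Return location and property entries with t_start <= time <= t_end.

    Because the time array is sorted, the window maps to one contiguous
    index range that applies to every parallel array.
    """
    lo = bisect_left(times, t_start)
    hi = bisect_right(times, t_end)
    return locs[lo:hi], alts[lo:hi]

locs, times, alts = to_arrays(returns)
window_locs, window_alts = scrub(locs, times, alts, 11.0, 14.0)
```

Keeping each field in its own array means a scrub request only has to touch the time array to find the window, after which the matching slices of the other arrays can be read directly.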
Look Ahead Normaliser
An apparatus includes hardware logic arranged to normalise an n-bit input number. The hardware logic comprises at least a first hardware logic stage, an intermediate hardware logic stage and a final hardware logic stage. Each stage comprises a left shifting logic element; the first and intermediate stages each also comprise a plurality of OR-reduction logic elements, and the intermediate and final stages each also comprise one or more multiplexers. The OR-reduction logic elements operate on different subsets of bits from the number input to the particular stage. In the intermediate and final hardware logic stages, a first of the multiplexers selects an OR-reduction result received from a previous hardware logic stage and the left shifting logic element is arranged to perform left shifting on the updated binary number received from an immediately previous hardware logic stage dependent upon the selected OR-reduction result.
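For orientation, the behaviour of a staged normaliser can be modelled in software. The sketch below is a conventional stage-by-stage normaliser in which each stage OR-reduces the top bits of its own input. It is not the look-ahead arrangement of the abstract, where OR-reduction results are computed one stage early and selected by multiplexers. The function name and the power-of-two width restriction are assumptions.

```python
def normalise(x, n=16):
    """Left-shift an n-bit value until its most significant bit is set.

    Returns (normalised_value, total_shift). Stages shift by n/2, n/4, ..., 1.
    Assumes n is a power of two.
    """
    assert n > 0 and n & (n - 1) == 0, "n must be a power of two"
    assert 0 <= x < (1 << n)
    mask = (1 << n) - 1
    total = 0
    step = n // 2
    while step >= 1:
        top = x >> (n - step)        # the `step` most significant bits
        if top == 0:                 # OR-reduction of those bits is 0
            x = (x << step) & mask   # so this stage shifts left by `step`
            total += step
        step //= 2
    return x, total
```

In this conventional form each stage must wait for the previous shift before it can OR-reduce its input, which is the serial dependency that a look-ahead design aims to shorten.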
SYSTOLIC ARRAY INCLUDING FUSED MULTIPLY ACCUMULATE WITH EFFICIENT PRENORMALIZATION AND EXTENDED DYNAMIC RANGE
Systems and methods are provided to perform multiply-accumulate operations of normalized numbers in a systolic array to enable greater computational density, reduce the size of systolic arrays required to perform multiply-accumulate operations of normalized numbers, and/or enable higher throughput operation. The systolic array can be provided normalized numbers by a column of normalizers and can lack support for denormal numbers. Each normalizer can normalize the inputs to each processing element in the systolic array. The systolic array can include a multiplier and an adder. The multiplier can have multiple data paths that correspond to the data type of the input. The multiplier and adder can employ expanded exponent range to operate on normalized floating-point numbers and can lack support for denormal numbers.
Computational Units for Batch Normalization
Herein are disclosed computation units for batch normalization. A computation unit may include a first circuit to traverse a batch of input elements x_i having a first format, to produce a mean μ_1 in the first format and a mean μ_2 in a second format, the second format having more bits than the first format. The computation unit may further include a second circuit operatively coupled to the first circuit to traverse the batch of input elements x_i to produce a standard deviation σ for the batch using the mean μ_1 in the first format. The computation unit may also include a third circuit operatively coupled to the second circuit to traverse the batch of input elements x_i to produce a normalized set of values y_i using the mean μ_2 in the second format and the standard deviation σ.
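The three traversals can be illustrated with a small software model. The abstract does not name the two formats, so this sketch assumes IEEE half precision for the first format and single precision for the second, emulated through `struct`. The `eps` term and the helper names are also assumptions.

```python
import math
import struct

def to_f16(v):
    """Round a Python float to the nearest half-precision value (assumed first format)."""
    return struct.unpack("e", struct.pack("e", v))[0]

def to_f32(v):
    """Round a Python float to the nearest single-precision value (assumed second format)."""
    return struct.unpack("f", struct.pack("f", v))[0]

def batch_norm(xs, eps=1e-5):
    # Traversal 1: the mean, kept in both the narrow and the wide format.
    mean = sum(xs) / len(xs)
    mu2 = to_f32(mean)   # wider format
    mu1 = to_f16(mean)   # narrower format
    # Traversal 2: standard deviation computed against the narrow mean mu1.
    var = sum((x - mu1) ** 2 for x in xs) / len(xs)
    sigma = math.sqrt(var + eps)   # eps is an assumed stabiliser, not from the abstract
    # Traversal 3: normalise against the wide mean mu2 and sigma.
    return [to_f16((x - mu2) / sigma) for x in xs]

ys = batch_norm([1.0, 2.0, 3.0, 4.0])
```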
Systems and methods for efficient scaling of quantized integers
The disclosed computer-implemented method may include receiving an input value and a floating-point scaling factor and determining (1) an integer scaling factor based on the floating-point scaling factor, (2) a pre-scaling adjustment value representative of a number of places by which to shift a binary representation of the input value prior to a scaling operation, and (3) a post-scaling adjustment value representative of a number of places by which to shift the binary representation of the input value following the scaling operation. The method may further include calculating a scaled result value by (1) shifting rightwards the binary representation of the input value by the pre-scaling adjustment value, (2) scaling the shifted binary representation of the input value by the integer scaling factor, and (3) shifting rightwards the shifted and scaled binary value by the post-scaling adjustment value. Various other methods, systems, and computer-readable media are also disclosed.
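A minimal sketch of the shift, multiply, shift sequence follows. How the method actually derives the integer factor and the two adjustment values is not stated in the abstract. Here the pre-scaling shift is supplied by the caller and the factor is fitted into an assumed 15-bit budget. The names `derive_params`, `scale_quantized`, and `factor_bits` are hypothetical.

```python
def derive_params(scale, factor_bits=15, pre_shift=0):
    """Pick an integer factor m and a post shift s with m / 2**s ~= scale * 2**pre_shift.

    The post shift grows for as long as the rounded factor still fits in
    `factor_bits` bits (an assumed budget).
    """
    assert scale > 0
    target = scale * (1 << pre_shift)
    post_shift = 0
    while round(target * (1 << (post_shift + 1))) < (1 << factor_bits):
        post_shift += 1
    m = round(target * (1 << post_shift))
    return m, post_shift

def scale_quantized(x, m, pre_shift, post_shift):
    """Scale integer x using only shifts and one integer multiply."""
    shifted = x >> pre_shift       # (1) pre-scaling right shift
    scaled = shifted * m           # (2) multiply by the integer scaling factor
    return scaled >> post_shift    # (3) post-scaling right shift

m, s = derive_params(0.3)
result = scale_quantized(1000, m, 0, s)
```

A larger pre-scaling shift discards low-order bits of the input in exchange for a narrower intermediate product, so the split between the two shifts is a trade-off between precision and multiplier width.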
Transcendental calculation unit apparatus and method
A transcendental calculation unit includes a configuration table that stores a set of constants and provides a selected one of the constants, a power series multiplier that iteratively develops a power series, a coefficient series multiplier and accumulator that develops an accumulated product of the power series and the constant, and a round and normalize stage that rounds the accumulated product and normalizes the rounded product.
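The data flow can be pictured with a short software model: a table supplies coefficients, one multiplier develops successive powers of the input, and a second multiplier with an accumulator sums coefficient times power. The table contents (Taylor coefficients), the term count, and the decimal rounding that stands in for the round and normalize stage are all assumptions.

```python
import math

# Hypothetical configuration table: one coefficient list per function.
TABLE = {
    "exp": [1 / math.factorial(k) for k in range(12)],
    "sin": [0 if k % 2 == 0 else (-1) ** (k // 2) / math.factorial(k)
            for k in range(12)],
}

def evaluate(func, x, digits=6):
    """Accumulate sum(c_k * x**k) and round the result once at the end."""
    power = 1.0   # x**0, developed iteratively by the power series multiplier
    acc = 0.0     # accumulated product
    for c in TABLE[func]:
        acc += c * power   # coefficient series multiply and accumulate
        power *= x         # next power of x
    return round(acc, digits)   # stand-in for the round and normalize stage
```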
RESIDUE CHECKING OF ENTIRE NORMALIZER OUTPUT OF AN EXTENDED RESULT
A method includes generating an extended result from a first operation circuitry having a result register bit width greater than a bus width associated with a residue check path of a second operation circuitry associated with a floating point unit. An extended result residue less a first portion residue of the extended result received from the residue check path is stored as a first partial result residue. The first partial result residue is compared with a first result residue of the second operation circuitry. The extended result residue, less both the first partial result residue and a second portion residue of the extended result received from the residue check path, forms a second partial result residue that is compared with a second result residue of the second operation circuitry.
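The arithmetic fact underlying this kind of check is that the residue of a wide value can be rebuilt from the residues of its bus-width portions. The sketch below shows only that fact and is not the claimed sequence of partial residues. The modulus 3, the 64-bit path width, and the function names are assumptions.

```python
MOD = 3    # assumed check modulus
BUS = 64   # assumed residue check path width in bits

def residue(v):
    return v % MOD

def portions_match(portions, expected_residue):
    """Combine per-portion residues and compare with the expected residue.

    portions[0] is the least significant bus-width slice of the result.
    """
    combined = 0
    for i, p in enumerate(portions):
        weight = pow(2, i * BUS, MOD)   # residue of the slice's positional weight
        combined = (combined + residue(p) * weight) % MOD
    return combined == expected_residue

ext = 0x0123456789ABCDEF_FEDCBA9876543210   # a 128-bit extended result
lo, hi = ext & ((1 << BUS) - 1), ext >> BUS
ok = portions_match([lo, hi], residue(ext))
corrupted = portions_match([lo ^ 1, hi], residue(ext))   # single bit flip in the low slice
```

With modulus 3, any single-bit flip changes the value by a power of two, which is never a multiple of 3, so the mismatch is detected.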
FUSED MULTIPLY-ADD OPERATOR FOR MIXED PRECISION FLOATING-POINT NUMBERS WITH CORRECT ROUNDING
A fused multiply-add hardware operator comprising a multiplier receiving two multiplicands as floating-point numbers encoded in a first precision format; an alignment circuit associated with the multiplier configured to convert the result of the multiplication into a first fixed-point number; and an adder configured to add the first fixed-point number and an addition operand. The addition operand is a floating-point number encoded in a second precision format, and the operator comprises an alignment circuit associated with the addition operand, configured to convert the addition operand into a second fixed-point number of reduced dynamic range relative to the dynamic range of the addition operand, having a number of bits equal to the number of bits of the first fixed-point number, extended on both sides by at least the size of the mantissa of the addition operand; the adder configured to add the first and second fixed-point numbers without loss.
FLOATING POINT DOT-PRODUCT OPERATOR WITH CORRECT ROUNDING
The disclosure relates to a hardware operator for dot-product computation, comprising a plurality of multipliers each receiving two multiplicands in the form of floating-point numbers encoded in a first precision format; an alignment circuit associated with each multiplier, configured to, based on the exponents of the corresponding multiplicands, convert the result of the multiplication into a respective fixed-point number having a sufficient number of bits to cover the full dynamic range of the multiplication; and a multi-adder configured to add without loss the fixed-point numbers provided by the multipliers, providing a sum in the form of a fixed-point number.
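The idea of converting each product to fixed point, summing without loss, and rounding once can be modelled in software. This sketch uses Python doubles as the input format and arbitrary-precision integers as the fixed-point accumulator. Both are assumptions, since the abstract refers to hardware operating on an unspecified first precision format.

```python
import math
from fractions import Fraction

def exact_dot(xs, ys):
    """Dot product with lossless accumulation and a single final rounding."""
    terms = []
    for x, y in zip(xs, ys):
        mx, ex = math.frexp(x)   # x == mx * 2**ex with 0.5 <= |mx| < 1 (or mx == 0)
        my, ey = math.frexp(y)
        ix = int(mx * (1 << 53))   # exact: a double mantissa fits in 53 bits
        iy = int(my * (1 << 53))
        terms.append((ix * iy, ex + ey - 106))   # exact product as integer * 2**exponent
    if not terms:
        return 0.0
    base = min(e for _, e in terms)                    # common fixed-point position
    total = sum(m << (e - base) for m, e in terms)     # lossless integer addition
    return float(Fraction(total) * Fraction(2) ** base)   # round once at the end

xs = [1e16, 1.0, -1e16]
ys = [1.0, 1.0, 1.0]
naive = sum(x * y for x, y in zip(xs, ys))   # rounds after every addition
exact = exact_dot(xs, ys)
```

Here the naive sum loses the middle term to rounding, while the fixed-point accumulation keeps it because no rounding happens until the final conversion.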