H03M7/46

NEAR-OPTIMAL TRANSITION ENCODING CODES
20230041347 · 2023-02-09 ·

A method of encoding input data includes dividing the input data into a plurality of data packets, an input packet of the plurality of data packets including a plurality of digits in a first base system, base-converting the input packet from the first base system to generate a base-converted packet including a plurality of converted digits in a second base system, the second base system having a base value lower than that of the first base system, and incrementing the converted digits to generate a coded packet for transmission through a communication channel.

NEAR-OPTIMAL TRANSITION ENCODING CODES
20230041347 · 2023-02-09 ·

A method of encoding input data includes dividing the input data into a plurality of data packets, an input packet of the plurality of data packets including a plurality of digits in a first base system, base-converting the input packet from the first base system to generate a base-converted packet including a plurality of converted digits in a second base system, the second base system having a base value lower than that of the first base system, and incrementing the converted digits to generate a coded packet for transmission through a communication channel.

Graphics processors and graphics processing units having dot product accumulate instruction for hybrid floating point format

Described herein is a graphics processing unit (GPU) comprising a first processing cluster to perform parallel processing operations, the parallel processing operations including a ray tracing operation and a matrix multiply operation; and a second processing cluster coupled to the first processing cluster, wherein the first processing cluster includes a floating-point unit to perform floating point operations, the floating-point unit is configured to process an instruction using a bfloat16 (BF16) format with a multiplier to multiply second and third source operands while an accumulator adds a first source operand with output from the multiplier.

Graphics processors and graphics processing units having dot product accumulate instruction for hybrid floating point format

Described herein is a graphics processing unit (GPU) comprising a first processing cluster to perform parallel processing operations, the parallel processing operations including a ray tracing operation and a matrix multiply operation; and a second processing cluster coupled to the first processing cluster, wherein the first processing cluster includes a floating-point unit to perform floating point operations, the floating-point unit is configured to process an instruction using a bfloat16 (BF16) format with a multiplier to multiply second and third source operands while an accumulator adds a first source operand with output from the multiplier.

Neural network activation compression with non-uniform mantissas

Apparatus and methods for training a neural network accelerator using quantized precision data formats are disclosed, and in particular for storing activation values from a neural network in a compressed format having lossy or non-uniform mantissas for use during forward and backward propagation training of the neural network. In certain examples of the disclosed technology, a computing system includes processors, memory, and a compressor in communication with the memory. The computing system is configured to perform forward propagation for a layer of a neural network to produced first activation values in a first block floating-point format. In some examples, activation values generated by forward propagation are converted by the compressor to a second block floating-point format having a non-uniform and/or lossy mantissa. The compressed activation values are stored in the memory, where they can be retrieved for use during back propagation.

Neural network activation compression with non-uniform mantissas

Apparatus and methods for training a neural network accelerator using quantized precision data formats are disclosed, and in particular for storing activation values from a neural network in a compressed format having lossy or non-uniform mantissas for use during forward and backward propagation training of the neural network. In certain examples of the disclosed technology, a computing system includes processors, memory, and a compressor in communication with the memory. The computing system is configured to perform forward propagation for a layer of a neural network to produced first activation values in a first block floating-point format. In some examples, activation values generated by forward propagation are converted by the compressor to a second block floating-point format having a non-uniform and/or lossy mantissa. The compressed activation values are stored in the memory, where they can be retrieved for use during back propagation.

Data compression techniques
11558067 · 2023-01-17 · ·

Techniques and solutions are described for compressing data and facilitating access to compressed data. Compression can be applied to proper data subsets of a data set, such as to columns of a table. Using various methods, the proper data subsets can be evaluated to be included in a group of proper data subsets to be compressed using a first compression technique, where unselected proper data subsets are not compressed using the first compression technique. Data in the data set can be reordered based on a reordering sequence for the proper data subsets. Reordering data in the data set can improve compression when at least a portion of the proper data subsets are compressed. A data structure is provided that facilitates accessing specified data stored in a compressed format.

Data compression techniques
11558067 · 2023-01-17 · ·

Techniques and solutions are described for compressing data and facilitating access to compressed data. Compression can be applied to proper data subsets of a data set, such as to columns of a table. Using various methods, the proper data subsets can be evaluated to be included in a group of proper data subsets to be compressed using a first compression technique, where unselected proper data subsets are not compressed using the first compression technique. Data in the data set can be reordered based on a reordering sequence for the proper data subsets. Reordering data in the data set can improve compression when at least a portion of the proper data subsets are compressed. A data structure is provided that facilitates accessing specified data stored in a compressed format.

Weight data compression method, weight data decompression method, weight data compression device, and weight data decompression device
11700014 · 2023-07-11 · ·

A weight data compression method includes: generating a 4-bit data string of 4-bit data items each expressed as any one of nine 4-bit values, by dividing ternary weight data into data items each having 4 bits; and generating first compressed data including a first flag value string and a first non-zero value string by (i) generating the first flag value string by assigning one of 0 and 1 as a first flag value of a 1-bit flag to a 4-bit data item 0000 and assigning an other of 0 and 1 as a second flag value of the 1-bit flag to a 4-bit data item other than 0000 among the 4-bit data items in the 4-bit data string and (ii) generating the first non-zero value string by converting the 4-bit data item other than 0000 into a 3-bit data item having any one of eight 3-bit values.

Feature reordering based on sparsity for improved memory compression transfers during machine learning jobs

A processing device for executing a machine learning neural network operation includes memory and a processor. The processor is configured to receive input data at a layer of the machine learning neural network operation, receive a plurality of sorted filters to be applied to the input data, apply the plurality of sorted filters to the input data to produce a plurality of different feature maps, compress the plurality of different feature maps according to a sparsity of the feature maps and store the plurality of different feature maps in the memory.