Patent classifications
H03M7/4037
ENTROPY ENCODING AND DECODING SCHEME
Decomposing a value range of the respective syntax elements into a sequence of n partitions with coding the components of z laying within the respective partitions separately with at least one by VLC coding and with at least one by PIPE or entropy coding is used to greatly increase the compression efficiency at a moderate coding overhead since the coding scheme used may be better adapted to the syntax element statistics. Accordingly, syntax elements are decomposed into a respective number n of source symbols s.sub.i with i=1 . . . n, the respective number n of source symbols depending on as to which of a sequence of n partitions into which a value range of the respective syntax elements is sub-divided, a value z of the respective syntax elements falls into, so that a sum of values of the respective number of source symbols s.sub.i yields z, and, if n>1, for all i=1 . . . n−1, the value of s.sub.i corresponds to a range of the i.sup.th partition.
PARTITIONAL DATA COMPRESSION
A system collects statistical data for a data page, divides the data page into parts, analyzes the data page and the statistical data, based on compression efficiency of one or more compression methods for each part of each page, to determine a compression method for each part of page, and compresses, based on the analyzing, the parts of the data page.
Selection of the maximum dynamic range of transformed data and the data precision of transform matrices according to the bit depth of input data
A method of encoding image data, including: frequency-transforming input image data to generate an array of frequency-transformed input image coefficients by a matrix-multiplication process, according to a maximum dynamic range of the transformed data and using transform matrices having a data precision; and selecting the maximum dynamic range and/or the data precision of the transform matrices according to the bit depth of the input image data.
METHODS AND APPARATUS TO PARALLELIZE DATA DECOMPRESSION
Methods and apparatus to parallelize data decompression are disclosed. An example method selecting initial starting positions in a compressed data bitstream; adjusting a first one of the initial starting positions to determine a first adjusted starting position by decoding the bitstream starting at a training position in the bitstream, the decoding including traversing the bitstream from the training position as though first data located at the training position is a valid token; outputting first decoded data generated by decoding a first segment of the bitstream starting from the first adjusted starting position; and merging the first decoded data with second decoded data generated by decoding a second segment of the bitstream, the decoding of the second segment starting from a second position in the bitstream and being performed in parallel with the decoding of the first segment, and the second segment preceding the first segment in the bitstream.
Entropy encoding and decoding scheme
Decomposing a value range of the respective syntax elements into a sequence of n partitions with coding the components of z laying within the respective partitions separately with at least one by VLC coding and with at least one by PIPE or entropy coding is used to greatly increase the compression efficiency at a moderate coding overhead since the coding scheme used may be better adapted to the syntax element statistics. Accordingly, syntax elements are decomposed into a respective number n of source symbols s.sub.i with i=1 . . . n, the respective number n of source symbols depending on as to which of a sequence of n partitions into which a value range of the respective syntax elements is sub-divided, a value z of the respective syntax elements falls into, so that a sum of values of the respective number of source symbols s.sub.i yields z, and, if n>1, for all i=1 . . . n−1, the value of s.sub.i corresponds to a range of the i.sup.th partition.
Apparatus and method for packing a bit stream
A data packer forma bit stream for forwarding values to memory. The bit stream includes the values and respective prefixes for identifying the values and the data packer is configured to insert the prefixes at predetermined boundaries in the bit stream, such that a prefix for identifying one value is inserted between bits that define a value identified by a preceding prefix. A data unpacker unpacks a bit stream that comprises values and respective prefixes for identifying those values that are located at predetermined boundaries in the bit stream, such that a prefix for identifying one value is inserted between bits that define a value identified by a preceding prefix. The data unpacker identifies a prefix at a predetermined boundary in the bit stream and determine, in dependence on that prefix and the predetermined boundaries, a location of the next prefix in the bit stream.
SEMI-SORTING COMPRESSION WITH ENCODING AND DECODING TABLES
A data processing platform, method, and program product perform compression and decompression of a set of data items. Suffix data and a prefix are selected for each respective data item in the set of data items based on data content of the respective data item. The set of data items is sorted based on the prefixes. The prefixes are encoded by querying multiple encoding tables to create a code word containing compressed information representing values of all prefixes for the set of data items. The code word and suffix data for each of the data items are stored in memory. The code word is decompressed to recover the prefixes. The recovered prefixes are paired with their respective suffix data.
Semi-sorting compression with encoding and decoding tables
A data processing platform, method, and program product perform compression and decompression of a set of data items. Suffix data and a prefix are selected for each respective data item in the set of data items based on data content of the respective data item. The set of data items is sorted based on the prefixes. The prefixes are encoded by querying multiple encoding tables to create a code word containing compressed information representing values of all prefixes for the set of data items. The code word and suffix data for each of the data items are stored in memory. The code word is decompressed to recover the prefixes. The recovered prefixes are paired with their respective suffix data.
SPLIT ACCUMULATOR FOR CONVOLUTIONAL NEURAL NETWORK ACCELERATOR
Disclosed embodiments relate to a split accumulator for a convolutional neural network accelerator, comprising: arranging original weights in a computation sequence and aligning by bit to obtain a weight matrix, removing slack bits in the weight matrix, allowing essential bits in each column of the weight matrix to fill the vacancies according to the computation sequence to obtain an intermediate matrix, removing null rows in the intermediate matrix, obtain a kneading matrix, wherein each row of the kneading matrix serves as a kneading weight; obtaining positional information of the activation corresponding to each bit of the kneading weight; divides the kneading weight by bit into multiple weight segments, processing summation of the weight segments and the corresponding activations according to the positional information, and sending a processing result to an adder tree to obtain an output feature map by means of executing shift-and-add on the processing result.
CONVOLUTIONAL NEURAL NETWORK ACCELERATOR
Disclosed embodiments relate to a convolutional neural network accelerator, comprising: arranging original weights in a computation sequence and aligning by bit to obtain a weight matrix, removing slack bits in the weight matrix, allowing essential bits in each column of the weight matrix to fill the vacancies according to the computation sequence to obtain an intermediate matrix, removing null rows in the intermediate matrix, obtain a kneading matrix, wherein each row of the kneading matrix serves as a kneading weight; obtaining positional information of the activation corresponding to each bit of the kneading weight; divides the kneading weight by bit into multiple weight segments, processing summation of the weight segments and the corresponding activations according to the positional information, and sending a processing result to an adder tree to obtain an output feature map by means of executing shift-and-add on the processing result.