Patent classifications
H03M7/46
NEURAL NETWORK PROCESSOR USING COMPRESSION AND DECOMPRESSION OF ACTIVATION DATA TO REDUCE MEMORY BANDWIDTH UTILIZATION
A deep neural network (“DNN”) module can compress and decompress neuron-generated activation data to reduce the utilization of memory bus bandwidth. The compression unit can receive an uncompressed chunk of data generated by a neuron in the DNN module. The compression unit generates a mask portion and a data portion of a compressed output chunk. The mask portion encodes the presence and location of the zero and non-zero bytes in the uncompressed chunk of data. The data portion stores truncated non-zero bytes from the uncompressed chunk of data. A decompression unit can receive a compressed chunk of data from memory in the DNN processor or memory of an application host. The decompression unit decompresses the compressed chunk of data using the mask portion and the data portion. This can reduce memory bus utilization, allow a DNN module to complete processing operations more quickly, and reduce power consumption.
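The mask-and-data layout described above can be illustrated with a minimal sketch. It assumes one mask bit per input byte and stores the non-zero bytes unmodified; the truncation step mentioned in the abstract is omitted, and the function names `compress_chunk` and `decompress_chunk` are hypothetical, not taken from the patent.

```python
def compress_chunk(chunk: bytes) -> tuple[int, bytes]:
    """Split a chunk into a bit mask and a packed run of its non-zero bytes.

    Assumption: bit i of the mask is 1 when byte i of the chunk is non-zero.
    """
    mask = 0
    data = bytearray()
    for i, b in enumerate(chunk):
        if b != 0:
            mask |= 1 << i   # record the position of a non-zero byte
            data.append(b)   # keep only the non-zero bytes
    return mask, bytes(data)


def decompress_chunk(mask: int, data: bytes, length: int) -> bytes:
    """Rebuild the original chunk from the mask and the packed bytes."""
    out = bytearray(length)  # starts as all zeros
    packed = iter(data)
    for i in range(length):
        if (mask >> i) & 1:
            out[i] = next(packed)
    return bytes(out)


chunk = bytes([0, 5, 0, 0, 7, 0, 0, 0])
mask, data = compress_chunk(chunk)
restored = decompress_chunk(mask, data, len(chunk))
```

For sparse activation data, the mask plus the packed bytes is smaller than the original chunk, which is where the bandwidth saving would come from in this simplified picture.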
Deflate compression using sub-literals for reduced complexity Huffman coding
A literal element that has a plurality of bits is received. The plurality of bits in the literal element is divided into a first sub-literal comprising a first set of bits and a second sub-literal comprising a second set of bits. The first sub-literal is encoded using a first Huffman code tree to obtain a first sub-literal codeword; the second sub-literal is encoded using a second Huffman code tree to obtain a second sub-literal codeword. Encoded data that includes information associated with the first Huffman code tree, information associated with the second Huffman code tree, the first sub-literal codeword, and the second sub-literal codeword is output.
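As a rough illustration of the sub-literal idea, the sketch below splits each 8-bit literal into a high and a low 4-bit half and builds a separate Huffman code for each half. The 4+4 split, the bit-string representation, and all function names are assumptions made for the example; the abstract does not specify them, and this is not the Deflate bitstream format.

```python
import heapq
from collections import Counter


def huffman_codes(freqs):
    """Build a prefix code {symbol: bit string} from symbol frequencies."""
    if not freqs:
        return {}
    # Heap entries: (weight, smallest symbol as tie-break, {symbol: code}).
    heap = [(w, sym, {sym: ""}) for sym, w in freqs.items()]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {sym: "0" for sym in heap[0][2]}
    while len(heap) > 1:
        w1, t1, c1 = heapq.heappop(heap)
        w2, t2, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, min(t1, t2), merged))
    return heap[0][2]


def encode_literals(data: bytes):
    """Encode each byte as a high-nibble codeword followed by a low-nibble one."""
    highs = [b >> 4 for b in data]
    lows = [b & 0x0F for b in data]
    high_codes = huffman_codes(Counter(highs))  # first tree, at most 16 symbols
    low_codes = huffman_codes(Counter(lows))    # second tree, at most 16 symbols
    bits = "".join(high_codes[h] + low_codes[l] for h, l in zip(highs, lows))
    return high_codes, low_codes, bits


def decode_literals(high_codes, low_codes, bits, count):
    """Invert encode_literals by alternating between the two code tables."""
    tables = [{c: s for s, c in high_codes.items()},
              {c: s for s, c in low_codes.items()}]
    out = bytearray()
    pos = 0
    for _ in range(count):
        nibbles = []
        for table in tables:
            code = ""
            while code not in table:
                code += bits[pos]
                pos += 1
            nibbles.append(table[code])
        out.append((nibbles[0] << 4) | nibbles[1])
    return bytes(out)


message = b"hello world, hello huffman"
high_codes, low_codes, bits = encode_literals(message)
decoded = decode_literals(high_codes, low_codes, bits, len(message))
```

Under this split, each tree has at most 16 leaves instead of up to 256, which is one way to read the "reduced complexity" in the title.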
DATA COMPRESSION TECHNIQUES
Techniques and solutions are described for compressing data and facilitating access to compressed data. Compression can be applied to proper data subsets of a data set, such as to columns of a table. Using various methods, the proper data subsets can be evaluated to be included in a group of proper data subsets to be compressed using a first compression technique, where unselected proper data subsets are not compressed using the first compression technique. Data in the data set can be reordered based on a reordering sequence for the proper data subsets. Reordering data in the data set can improve compression when at least a portion of the proper data subsets are compressed. A data structure is provided that facilitates accessing specified data stored in a compressed format.
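A minimal sketch of the select, reorder, and compress flow follows. The abstract names neither the selection criterion nor the compression technique, so the distinct-value ratio threshold, the sort-based reordering, and the use of run-length encoding are all assumptions made for illustration, as are the function names.

```python
def run_length_encode(values):
    """Collapse consecutive equal values into (value, count) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [tuple(r) for r in runs]


def compress_table(rows, max_distinct_ratio=0.5):
    """Pick low-cardinality columns, reorder rows, and run-length encode them.

    Assumption: a column is selected when distinct values / rows is at or
    below max_distinct_ratio; other columns are left uncompressed.
    """
    columns = list(zip(*rows))
    selected = [c for c in range(len(columns))
                if len(set(columns[c])) / len(rows) <= max_distinct_ratio]
    # Reordering sequence: sort rows by the selected columns, lowest
    # cardinality first, so equal values end up adjacent.
    order = sorted(selected, key=lambda c: len(set(columns[c])))
    reordered = sorted(rows, key=lambda r: tuple(r[c] for c in order))
    new_columns = list(zip(*reordered))
    compressed = {c: run_length_encode(new_columns[c]) for c in selected}
    return reordered, compressed


rows = [("b", 1, "x"), ("a", 2, "y"), ("b", 3, "x"), ("a", 4, "y")]
reordered, compressed = compress_table(rows)
```

In this example the first column alternates in the original row order and would need four runs; after reordering it needs two.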
SYSTEM AND METHOD FOR TRANSITION ENCODING WITH REDUCED ERROR PROPAGATION
A method of encoding input data includes receiving the input data that includes a plurality of input words including a first input word and a second input word, generating a plurality of converted words including a first converted word and a second converted word, the first converted word being based at least on the first input word, the second converted word being based on the first converted word and the second input word, identifying a key value based on the plurality of converted words, and generating a plurality of coded words based on the key value and the plurality of converted words.
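The abstract does not state which operations produce the converted words, the key value, or the coded words. Purely to illustrate the data flow, the sketch below assumes XOR chaining for the converted words, a key chosen as the smallest value absent from the converted words, and coded words formed by XOR with that key. These choices and the function names are assumptions, not the patented method.

```python
def encode(words, width=4):
    """Chain the input words, pick a key, and derive the coded words.

    Assumptions: each converted word is the previous converted word XOR the
    current input word; the key is the smallest `width`-bit value that does
    not appear among the converted words (requires fewer than 2**width
    distinct converted words).
    """
    converted = []
    prev = 0
    for w in words:
        prev ^= w            # first converted word equals the first input word
        converted.append(prev)
    used = set(converted)
    key = next(k for k in range(1 << width) if k not in used)
    coded = [c ^ key for c in converted]
    return key, coded


def decode(key, coded):
    """Recover the input words from the key and the coded words."""
    words = []
    prev = 0
    for y in coded:
        c = y ^ key          # undo the key to get the converted word
        words.append(c ^ prev)
        prev = c
    return words


key, coded = encode([3, 3, 5, 0])
recovered = decode(key, coded)
```

Because the key differs from every converted word in this sketch, no coded word is zero. That is a property of the assumed construction and is not claimed by the abstract.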
Systems and methods for improving cache efficiency and utilization
- Altug Koker
- Joydeep Ray
- Ben Ashbaugh
- Jonathan Pearce
- Abhishek Appu
- Vasanth Ranganathan
- Lakshminarayanan Striramassarma
- Elmoustapha Ould-Ahmed-Vall
- Aravindh Anantaraman
- Valentin Andrei
- Nicolas Galoppo von Borries
- Varghese George
- Yoav Harel
- Arthur Hunter, Jr.
- Brent Insko
- Scott Janus
- Pattabhiraman K
- Mike Macpherson
- Subramaniam Maiyuran
- Marian Alin Petre
- Murali Ramadoss
- Shailesh Shah
- Kamal Sinha
- Prasoonkumar Surti
- Vikranth Vemulapalli
Systems and methods for improving cache efficiency and utilization are disclosed. In one embodiment, a graphics processor includes processing resources to perform graphics operations and a cache controller of a cache coupled to the processing resources. The cache controller is configured to control cache priority by determining whether default settings or an instruction will control cache operations for the cache.
CONVERSION OF FILLED AREAS TO RUN LENGTH ENCODED VECTORS
A method and system for converting a filled shape to a run-length encoded (RLE) vector are disclosed. The method includes creating a virtual pixel array of pixel cells corresponding to a graphical array of pixels comprising the filled shape. The method includes determining a border on the virtual pixel array corresponding to the filled shape, storing a pixel-type value within each pixel cell that corresponds to a border line element within the pixel, and creating a shape RLE group corresponding to a line of pixels aligned along a first axis of the virtual pixel array. Once created, the position and length of the shape RLE group are stored as an RLE vector. A method for clipping filled shapes is also disclosed, which includes converting a clipping region to a clip RLE group, then comparing the clip RLE group to the shape RLE group to form a clipped image RLE vector.
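The run extraction and clipping steps can be sketched as follows. The sketch assumes the filled shape is already available as a grid of 0/1 pixels, so it skips the border-determination step described in the abstract, and it represents each run as a hypothetical `(row, start, length)` tuple.

```python
def rows_to_rle(grid):
    """Convert a 0/1 pixel grid into (row, start, length) runs along each row."""
    runs = []
    for y, row in enumerate(grid):
        x = 0
        while x < len(row):
            if row[x]:
                start = x
                while x < len(row) and row[x]:
                    x += 1
                runs.append((y, start, x - start))
            else:
                x += 1
    return runs


def clip_rle(shape_runs, clip_runs):
    """Intersect shape runs with clip runs that lie on the same row."""
    clipped = []
    for y, s, n in shape_runs:
        for cy, cs, cn in clip_runs:
            if cy != y:
                continue
            lo = max(s, cs)
            hi = min(s + n, cs + cn)
            if lo < hi:      # keep only the overlapping part of the two runs
                clipped.append((y, lo, hi - lo))
    return clipped


grid = [
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 1, 1, 0],
]
shape_runs = rows_to_rle(grid)
clip_runs = [(0, 0, 2), (1, 0, 2)]   # clip region: left two columns of rows 0 and 1
clipped = clip_rle(shape_runs, clip_runs)
```

Comparing runs instead of individual pixels means the clipping cost depends on the number of runs per row, not on the pixel count.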