Patent classifications
H03M7/702
DATA COMPRESSION FOR A NEURAL NETWORK
Systems and methods for generating a representative value of a data set by first compressing a portion of values in the data set to determine a first common value and further compressing a subset of the portion of values to determine a second common value. The representative value is generated by taking the difference between the first common value and the second common value, wherein the representative value corresponds to a mathematical relationship between the first and second common values and each value within the subset of the portion of values. The representative value requires less storage than the first and second common values.
COMPRESSION OF DEEP NEURAL NETWORKS
In an approach for compressing a neural network, a processor receives a neural network, wherein the neural network has been trained on a set of training data. A processor receives a compression ratio. A processor compresses the neural network based on the compression ratio using an optimization model to solve for sparse weights. A processor re-trains the compressed neural network with the sparse weights. A processor outputs the re-trained neural network.
ARTIFICIAL NEURAL NETWORK COMPRESSION VIA ITERATIVE HYBRID REINFORCEMENT LEARNING APPROACH
Systems and computer-implemented methods for facilitating automated compression of artificial neural networks using an iterative hybrid reinforcement learning approach are provided. In various embodiments, a compression architecture can receive as input an original neural network to be compressed. The architecture can perform one or more compression actions to compress the original neural network into a compressed neural network. The architecture can then generate a reward signal quantifying how well the original neural network was compressed. In an α-proportion of compression iterations/episodes, where α ∈ [0,1], the reward signal can be computed in model-free fashion based on a compression ratio and accuracy ratio of the compressed neural network. In the remaining (1-α)-proportion of compression iterations/episodes, the reward signal can be predicted in model-based fashion using a compression model learned/trained on the reward signals computed in model-free fashion. This hybrid model-free-and-model-based architecture can greatly reduce convergence time without sacrificing substantial accuracy.
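The split between model-free and model-based reward computation can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the parameter `alpha` for the model-free proportion, the reward formula (product of compression ratio and accuracy ratio), and the nearest-neighbour `RewardModel`, which stands in for whatever learned compression model the abstract has in mind.

```python
import random


def model_free_reward(compression_ratio, accuracy_ratio):
    # Assumed reward shape: the abstract only says the reward is "based on"
    # these two ratios, so a simple product is used here.
    return compression_ratio * accuracy_ratio


class RewardModel:
    """Hypothetical stand-in for the learned model: nearest-neighbour lookup."""

    def __init__(self):
        self.samples = []  # (action, reward) pairs observed model-free

    def fit(self, action, reward):
        self.samples.append((action, reward))

    def predict(self, action):
        nearest = min(self.samples, key=lambda s: abs(s[0] - action))
        return nearest[1]


def hybrid_reward(action, evaluate, model, alpha, rng):
    """Return (reward, source) for one compression episode.

    With probability alpha (or while the model has no data yet) the network
    is actually evaluated; otherwise the reward is predicted by the model.
    """
    if not model.samples or rng.random() < alpha:
        compression_ratio, accuracy_ratio = evaluate(action)
        reward = model_free_reward(compression_ratio, accuracy_ratio)
        model.fit(action, reward)  # the model learns from model-free rewards
        return reward, "model-free"
    return model.predict(action), "model-based"


def toy_evaluate(prune_fraction):
    # Toy environment: pruning more weights raises the compression ratio
    # and lowers the accuracy ratio.
    return 1.0 / (1.0 - prune_fraction), 1.0 - prune_fraction ** 2


rng = random.Random(0)
model = RewardModel()
for step in range(1, 9):
    reward, source = hybrid_reward(0.1 * step, toy_evaluate, model, 0.25, rng)
    print(f"prune={0.1 * step:.1f} reward={reward:.3f} ({source})")
```

The expensive step is `evaluate`, which in practice would mean compressing and testing a network; the smaller `alpha` is, the more often that step is replaced by a cheap prediction.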
NEURAL NETWORK ACTIVATION COMPRESSION WITH NON-UNIFORM MANTISSAS
Apparatus and methods for training a neural network accelerator using quantized precision data formats are disclosed, and in particular for storing activation values from a neural network in a compressed format having lossy or non-uniform mantissas for use during forward and backward propagation training of the neural network. In certain examples of the disclosed technology, a computing system includes processors, memory, and a compressor in communication with the memory. The computing system is configured to perform forward propagation for a layer of a neural network to produce first activation values in a first block floating-point format. In some examples, activation values generated by forward propagation are converted by the compressor to a second block floating-point format having a non-uniform and/or lossy mantissa. The compressed activation values are stored in the memory, where they can be retrieved for use during back propagation.
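As a rough illustration of converting activations from one block floating-point format to a narrower one, the Python sketch below quantizes a block of values to integer mantissas that share a single exponent. It is a simplification under stated assumptions: the second format here is just a narrower uniform mantissa (lossy), not the non-uniform mantissa mapping the abstract mentions, and the function names and bit widths are hypothetical.

```python
import math


def to_block_fp(values, mantissa_bits):
    """Quantize a block to signed integer mantissas sharing one exponent."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return 0, [0] * len(values)
    # frexp gives max_abs = m * 2**shared_exp with 0.5 <= m < 1, so every
    # value divided by 2**shared_exp lies strictly inside (-1, 1).
    _, shared_exp = math.frexp(max_abs)
    scale = 2.0 ** (mantissa_bits - 1 - shared_exp)
    limit = 2 ** (mantissa_bits - 1) - 1  # largest signed mantissa magnitude
    mantissas = [max(-limit, min(limit, round(v * scale))) for v in values]
    return shared_exp, mantissas


def from_block_fp(shared_exp, mantissas, mantissa_bits):
    """Reconstruct approximate floating-point values from a block."""
    scale = 2.0 ** (shared_exp - (mantissa_bits - 1))
    return [m * scale for m in mantissas]


def compress_activations(shared_exp, mantissas, wide_bits, narrow_bits):
    """Convert a block from the wide (first) format to the narrow (second)."""
    return to_block_fp(from_block_fp(shared_exp, mantissas, wide_bits), narrow_bits)


activations = [0.5, -0.25, 0.3, 0.0]
exp8, man8 = to_block_fp(activations, 8)           # first format, 8-bit mantissas
exp4, man4 = compress_activations(exp8, man8, 8, 4)  # second, lossy format
print(man8, from_block_fp(exp8, man8, 8))
print(man4, from_block_fp(exp4, man4, 4))
```

The narrow block is what would be kept in memory between forward and backward propagation; the value 0.3 comes back as 0.25, which shows the precision given up for the smaller footprint.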
REDUCING STORAGE OF BLOCKCHAIN METADATA VIA DICTIONARY-STYLE COMPRESSION
A method of reducing the storage requirements of blockchain metadata via dictionary-style compression includes receiving a request to add a transaction block to a blockchain. The method further includes determining an identifier (ID) of a dictionary block most recently stored on the blockchain. The method further includes compressing, by a processing device, one or more transactions of the transaction block based on the dictionary block to generate a compressed transaction block. The method further includes adding the ID of the dictionary block to the compressed transaction block. The method further includes providing the compressed transaction block, including the ID of the dictionary block, for storage on the blockchain.
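The flow described above can be approximated in Python with the standard `zlib` module, which accepts a preset dictionary through the `zdict` argument of `zlib.compressobj` and `zlib.decompressobj`. This is only a sketch: the in-memory `dictionaries` mapping stands in for dictionary blocks stored on a chain, and the block layout, function names, and sample transactions are assumptions for illustration.

```python
import zlib


def latest_dictionary_id(dictionaries):
    # Assumption: IDs increase over time, so the largest ID is the
    # dictionary block most recently stored.
    return max(dictionaries)


def compress_transaction_block(transactions, dictionaries):
    """Compress transactions against the latest dictionary and tag the ID."""
    dict_id = latest_dictionary_id(dictionaries)
    compressor = zlib.compressobj(zdict=dictionaries[dict_id])
    payload = compressor.compress(transactions) + compressor.flush()
    return {"dictionary_id": dict_id, "payload": payload}


def decompress_transaction_block(block, dictionaries):
    """Look up the dictionary named in the block and decompress with it."""
    zdict = dictionaries[block["dictionary_id"]]
    decompressor = zlib.decompressobj(zdict=zdict)
    return decompressor.decompress(block["payload"]) + decompressor.flush()


# Hypothetical dictionary blocks holding byte strings common to transactions.
dictionaries = {
    1: b'{"from": "", "to": "", "amount": }',
    2: b'{"from": "alice", "to": "bob", "amount": , "memo": ""}',
}
transactions = b'{"from": "alice", "to": "bob", "amount": 5, "memo": "rent"}'

block = compress_transaction_block(transactions, dictionaries)
restored = decompress_transaction_block(block, dictionaries)
print(block["dictionary_id"], len(transactions), len(block["payload"]))
print(restored == transactions)
```

Storing the dictionary ID with the compressed block is what allows later dictionary blocks to be added without breaking decompression of older transaction blocks.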
Encoding and Decoding Variable Length Instructions
Methods of encoding and decoding are described which use a variable number of instruction words to encode instructions from an instruction set, such that different instructions within the instruction set may be encoded using different numbers of instruction words. To encode an instruction, the bits within the instruction are reordered and formed into instruction words based upon their variance as determined using empirical or simulation data. The bits in the instruction words are compared to corresponding predicted values and some or all of the instruction words that match the predicted values are omitted from the encoded instruction.
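A minimal Python sketch of this idea follows. It makes several simplifying assumptions that are not in the abstract: instructions are 16 bits wide, words are 8 bits, the bit ordering `ORDER` and the `PREDICTED` words are hypothetical constants (in practice they would come from empirical or simulation data), only trailing words that match their prediction are omitted, and the decoder is told the word count by the length of the list rather than by a marker in the bitstream.

```python
WORD_BITS = 8
WORD_MASK = (1 << WORD_BITS) - 1

# Hypothetical ordering: ORDER[i] is the original bit position placed at
# reordered position i, listed from most to least variable bit.
ORDER = list(range(15, -1, -1))
# Hypothetical predicted value for each instruction word.
PREDICTED = [0x00, 0x00]


def reorder_bits(instr, order):
    """Move bit order[i] of instr to position i."""
    out = 0
    for new_pos, old_pos in enumerate(order):
        out |= ((instr >> old_pos) & 1) << new_pos
    return out


def restore_bits(value, order):
    """Inverse of reorder_bits."""
    out = 0
    for new_pos, old_pos in enumerate(order):
        out |= ((value >> new_pos) & 1) << old_pos
    return out


def encode_instruction(instr, order=ORDER, predicted=PREDICTED):
    reordered = reorder_bits(instr, order)
    words = [(reordered >> (WORD_BITS * i)) & WORD_MASK
             for i in range(len(predicted))]
    # Omit trailing words that equal their predicted value; always keep one.
    while len(words) > 1 and words[-1] == predicted[len(words) - 1]:
        words.pop()
    return words


def decode_instruction(words, order=ORDER, predicted=PREDICTED):
    # Fill in every omitted word from the predictions, then undo the reorder.
    full = list(words) + predicted[len(words):]
    value = 0
    for i, word in enumerate(full):
        value |= word << (WORD_BITS * i)
    return restore_bits(value, order)


for instr in (0xA500, 0xA501):
    words = encode_instruction(instr)
    print(hex(instr), [hex(w) for w in words], hex(decode_instruction(words)))
```

Placing the most variable bits in the first word makes it more likely that later words match their predictions, which is what allows common instructions to be stored in fewer words.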
LOSSY COMPRESSION OF NEURAL NETWORK ACTIVATION MAPS
A system and a method provide compression and decompression of an activation map of a layer of a neural network. For compression, the values of the activation map are sparsified and the activation map is configured as a tensor having a tensor size of H×W×C in which H represents a height of the tensor, W represents a width of the tensor, and C represents a number of channels of the tensor. The tensor is formatted into at least one block of values. Each block is encoded independently from other blocks of the tensor using at least one lossless compression mode. For decoding, each block is decoded independently from other blocks using at least one decompression mode corresponding to the at least one compression mode used to compress the block; and deformatted into a tensor having the size of H×W×C.
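The pipeline above (sparsify, format into blocks, encode each block independently, decode, deformat) can be sketched in plain Python. The choices below are assumptions for illustration only: sparsification is a magnitude threshold, the single lossless mode is a zero run-length code, and the tensor is a nested list indexed as `tensor[h][w][c]`.

```python
def sparsify(tensor, threshold):
    """Lossy step: zero out values whose magnitude is below the threshold."""
    return [[[v if abs(v) >= threshold else 0 for v in pixel]
             for pixel in row] for row in tensor]


def format_blocks(tensor, block_size):
    """Flatten the H x W x C tensor and cut it into fixed-size blocks."""
    flat = [v for row in tensor for pixel in row for v in pixel]
    return [flat[i:i + block_size] for i in range(0, len(flat), block_size)]


def encode_block(block):
    """Lossless zero run-length code: ('Z', run_length) or ('V', value)."""
    tokens, zeros = [], 0
    for v in block:
        if v == 0:
            zeros += 1
            continue
        if zeros:
            tokens.append(("Z", zeros))
            zeros = 0
        tokens.append(("V", v))
    if zeros:
        tokens.append(("Z", zeros))
    return tokens


def decode_block(tokens):
    block = []
    for kind, payload in tokens:
        block.extend([0] * payload if kind == "Z" else [payload])
    return block


def deformat(blocks, height, width, channels):
    """Rebuild the H x W x C nested-list tensor from decoded blocks."""
    flat = [v for block in blocks for v in block]
    return [[flat[(h * width + w) * channels:(h * width + w + 1) * channels]
             for w in range(width)] for h in range(height)]


activation_map = [[[0.9, 0.05], [0.0, 0.02]],
                  [[0.01, 0.7], [0.0, 0.0]]]  # H=2, W=2, C=2
sparse = sparsify(activation_map, threshold=0.1)
encoded = [encode_block(b) for b in format_blocks(sparse, block_size=4)]
decoded = deformat([decode_block(t) for t in encoded], 2, 2, 2)
print(encoded)
print(decoded == sparse)
```

Because each block is encoded on its own, any single block can be decoded without touching the others; the only loss comes from the sparsification step, not from the block coding.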
Decompression Engine for Executable Microcontroller Code
A code decompression engine reads compressed code from a memory containing a compressed code part and a dictionary part. The compressed code part comprises a series of instructions comprising either an uncompressed instruction preceded by an uncompressed code bit, or a compressed instruction comprising a compressed code bit followed by a number of segments field followed by segments, followed by a directory index indicating a directory location to read. Each segment consists of a mask type, a mask offset, and a mask.
MULTI-PIXEL CACHING SCHEME FOR LOSSLESS ENCODING
Systems and methods are provided for a multi-pixel caching scheme for lossless encoders. The systems and methods can include obtaining a sequence of pixels; determining repeating sub-sequences of the sequence of pixels, each consisting of a single repeated pixel, and non-repeating sub-sequences of the sequence of pixels; and, responsive to the determination, encoding the repeating sub-sequences using a run-length of the repeated pixel and encoding the non-repeating sub-sequences using a multi-pixel cache. The encoding using the multi-pixel cache comprises encoding non-repeating sub-sequences stored in the multi-pixel cache as the location of those sub-sequences in the multi-pixel cache, and encoding non-repeating sub-sequences not stored in the multi-pixel cache using the values of the pixels in those sub-sequences.
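The combination of run-length coding and a multi-pixel cache might look roughly like the Python sketch below. The token format, the unbounded cache keyed by the whole sub-sequence, and the rule that a run is two or more identical pixels are all assumptions made for illustration; a practical encoder would bound the cache and define an eviction policy.

```python
def split_segments(pixels):
    """Split into ('run', (pixel, length)) and ('literal', tuple) segments."""
    segments, literal, i = [], [], 0
    while i < len(pixels):
        j = i
        while j < len(pixels) and pixels[j] == pixels[i]:
            j += 1
        if j - i >= 2:  # a repeating sub-sequence of one pixel value
            if literal:
                segments.append(("literal", tuple(literal)))
                literal = []
            segments.append(("run", (pixels[i], j - i)))
        else:
            literal.append(pixels[i])
        i = j
    if literal:
        segments.append(("literal", tuple(literal)))
    return segments


def encode_pixels(pixels):
    cache = {}  # non-repeating sub-sequence -> cache location
    tokens = []
    for kind, seg in split_segments(pixels):
        if kind == "run":
            tokens.append(("run", seg[0], seg[1]))
        elif seg in cache:
            tokens.append(("hit", cache[seg]))  # emit only the cache location
        else:
            cache[seg] = len(cache)
            tokens.append(("raw", seg))         # emit the pixel values
    return tokens


def decode_pixels(tokens):
    cache, pixels = [], []  # the decoder rebuilds the same cache in order
    for token in tokens:
        if token[0] == "run":
            pixels.extend([token[1]] * token[2])
        elif token[0] == "hit":
            pixels.extend(cache[token[1]])
        else:
            cache.append(token[1])
            pixels.extend(token[1])
    return pixels


pixels = [1, 2, 3, 7, 7, 7, 1, 2, 3, 9, 9]
tokens = encode_pixels(pixels)
print(tokens)
print(decode_pixels(tokens) == pixels)
```

The second occurrence of the sub-sequence `(1, 2, 3)` is emitted as a single cache location instead of three pixel values, which is where the saving over pixel-by-pixel literals comes from.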