H03M7/702

MULTI-PIXEL CACHING SCHEME FOR LOSSLESS ENCODING
20220116634 · 2022-04-14

Systems and methods are provided for a multi-pixel caching scheme for lossless encoders. The systems and methods can include obtaining a sequence of pixels and determining repeating sub-sequences, each consisting of a single repeated pixel, and non-repeating sub-sequences of the sequence of pixels. Responsive to the determination, the repeating sub-sequences are encoded using a run-length of the repeated pixel, and the non-repeating sub-sequences are encoded using a multi-pixel cache: non-repeating sub-sequences already stored in the multi-pixel cache are encoded as their location in the cache, and non-repeating sub-sequences not stored in the cache are encoded using the values of their pixels.
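The following is a minimal Python sketch of this hybrid scheme; the token format, cache size, minimum run length, and FIFO eviction policy are illustrative assumptions, not details taken from the publication.

    from collections import OrderedDict

    def encode(pixels, cache_size=256, min_run=3):
        """Hybrid run-length / multi-pixel-cache encoder (illustrative)."""
        cache = OrderedDict()            # insertion-ordered cache of sub-sequences
        tokens = []

        def flush(segment):
            if not segment:
                return
            key = tuple(segment)
            if key in cache:
                # Cached sub-sequence: emit only its location in the cache.
                tokens.append(("CACHE", list(cache).index(key)))
            else:
                # Unseen sub-sequence: emit literal pixel values, then cache it.
                tokens.append(("RAW", key))
                cache[key] = True
                if len(cache) > cache_size:
                    cache.popitem(last=False)   # evict the oldest entry

        i, n, segment = 0, len(pixels), []
        while i < n:
            j = i
            while j < n and pixels[j] == pixels[i]:
                j += 1
            if j - i >= min_run:
                flush(segment)
                segment = []
                # Repeating sub-sequence: encode the pixel and its run length.
                tokens.append(("RUN", pixels[i], j - i))
            else:
                segment.extend(pixels[i:j])
            i = j
        flush(segment)
        return tokens

    print(encode([7, 7, 7, 7, 1, 2, 3, 9, 9, 9, 1, 2, 3]))
    # [('RUN', 7, 4), ('RAW', (1, 2, 3)), ('RUN', 9, 3), ('CACHE', 0)]

A decoder that replays the same cache insertions and evictions computes the same cache locations, which is what lets the ("CACHE", index) token stand in for the literal pixels.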

NEURAL NETWORK COMPRESSION
20220067527 · 2022-03-03

A neural network model is trained, where the training includes multiple training iterations. Weights of a particular layer of the neural network are pruned during a forward pass of a particular one of the training iterations. During the same forward pass of the particular training iteration, values of weights of the particular layer are quantized to determine a quantized-sparsified subset of weights for the particular layer. A compressed version of the neural network model is generated from the training based at least in part on the quantized-sparsified subset of weights.
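A minimal sketch of one such forward-pass step, assuming magnitude-based pruning and uniform symmetric quantization; the publication does not prescribe these particular rules, and the sparsity target and bit width here are illustrative:

    import numpy as np

    def prune_and_quantize(weights, sparsity=0.5, bits=8):
        """One forward-pass step: magnitude pruning, then uniform
        quantization of the surviving weights (illustrative sketch)."""
        w = weights.copy()
        # Prune: zero out the smallest-magnitude fraction of the weights.
        threshold = np.quantile(np.abs(w), sparsity)
        mask = np.abs(w) >= threshold
        w *= mask
        # Quantize survivors to a signed integer grid, then dequantize.
        scale = np.abs(w).max() / (2 ** (bits - 1) - 1) or 1.0
        q = np.round(w / scale).astype(np.int8)
        return q * scale * mask, mask   # quantized-sparsified weights + mask

    layer = np.random.randn(4, 4).astype(np.float32)
    sparse_q, mask = prune_and_quantize(layer)
    print(f"nonzero weights kept: {mask.sum()} of {mask.size}")

The returned mask and quantized values together form the quantized-sparsified subset from which a compressed model can be assembled.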

EFFICIENT STORAGE AND RETRIEVAL OF RESOURCE DATA
20220075940 · 2022-03-10

A method and system for compressing and decompressing a localized software resource are disclosed. The method may include receiving a software resource in a first language and receiving, for compression, a localized software resource in a second language, where the software resource in the first language is a counterpart of the localized software resource. Upon receiving the software resources, a first local dictionary is created for the localized software resource based at least in part on one or more first-language words in the software resource and on data from a global dictionary, and the localized software resource is compressed based on the local dictionary.
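A rough Python sketch of the dictionary flow, assuming a toy global dictionary, whitespace tokenization, and a simple index/literal token format (GLOBAL_DICTIONARY and all names here are hypothetical):

    GLOBAL_DICTIONARY = ["file", "open", "save", "cancel", "error"]

    def build_local_dictionary(source_resource, global_dictionary):
        """Local dictionary seeded from the global one plus the
        first-language resource's words (selection rules are assumed)."""
        words = list(dict.fromkeys(global_dictionary))   # dedupe, keep order
        for word in source_resource.split():
            if word not in words:
                words.append(word)
        return words

    def compress(localized_resource, local_dictionary):
        """Replace dictionary hits with compact index tokens."""
        index = {w: i for i, w in enumerate(local_dictionary)}
        return [("D", index[w]) if w in index else ("L", w)
                for w in localized_resource.split()]

    src = "save file error"
    loc = "guardar archivo error"    # hypothetical second-language resource
    d = build_local_dictionary(src, GLOBAL_DICTIONARY)
    print(compress(loc, d))   # [('L', 'guardar'), ('L', 'archivo'), ('D', 4)]

Tokens shared between the two resources (such as "error" above) compress to short dictionary indices even in the second-language text.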

METHODS AND APPARATUS FOR THREAD-BASED SCHEDULING IN MULTICORE NEURAL NETWORKS
20220012060 · 2022-01-13

Systems, apparatus, and methods for thread-based scheduling within a multicore processor. Neural networking uses a network of connected nodes (aka neurons) to loosely model the neuro-biological functionality found in the human brain. Various embodiments of the present disclosure use thread dependency graph analysis to decouple scheduling across many distributed cores. Rather than using thread dependency graphs to generate a sequential ordering for a centralized scheduler, the individual thread dependencies define a count value for each thread at compile-time. Threads and their thread dependency counts are distributed to each core at run-time. Thereafter, each core can dynamically determine which threads to execute based on fulfilled thread dependencies, without requiring a centralized scheduler.
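A single-core Python simulation of the counting scheme: dependency counts are derived once from the graph ("compile-time"), and a thread becomes runnable the moment its own count reaches zero, with no central scheduler consulting a global ordering. The graph encoding and helper names are illustrative:

    from collections import deque

    def compile_counts(edges, threads):
        """Compile-time: derive each thread's dependency count.
        `edges` maps a thread to the threads that depend on it."""
        counts = {t: 0 for t in threads}
        for dependents in edges.values():
            for d in dependents:
                counts[d] += 1
        return counts

    def run(edges, threads):
        """Run-time: each completion decrements its dependents' counts;
        a thread executes once its count hits zero (no central scheduler)."""
        counts = compile_counts(edges, threads)
        ready = deque(t for t in threads if counts[t] == 0)
        order = []
        while ready:
            t = ready.popleft()
            order.append(t)                  # "execute" the thread
            for d in edges.get(t, []):       # notify dependents of completion
                counts[d] -= 1
                if counts[d] == 0:
                    ready.append(d)
        return order

    edges = {"A": ["C"], "B": ["C"], "C": ["D"]}
    print(run(edges, ["A", "B", "C", "D"]))   # ['A', 'B', 'C', 'D']

On real hardware each core would hold its own threads and counts, with completion notifications crossing cores; the counting logic is the same.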

METHODS AND APPARATUS FOR LOCALIZED PROCESSING WITHIN MULTICORE NEURAL NETWORKS

Methods and apparatus for localized processing within multicore neural networks. Unlike existing solutions that rely on commodity software and hardware to perform “brute force” large-scale neural network processing, the various techniques described herein map and partition a neural network within the hardware limitations of a target platform. Specifically, the various implementations described herein synergistically leverage localization, sparsity, and distributed scheduling to enable neural network processing within embedded hardware applications. As described herein, hardware-aware mapping/partitioning enhances neural network performance by, e.g., avoiding pin-limited memory accesses, processing data in compressed formats/skipping unnecessary operations, and decoupling scheduling between cores.
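As a loose illustration of hardware-aware partitioning, the sketch below tiles a layer's weight matrix so that each tile fits a core's local memory; the memory budget, row-wise tiling rule, and all parameters are assumptions for illustration only:

    def partition_layer(rows, cols, bytes_per_weight, core_mem_bytes):
        """Tile a weight matrix so each tile fits one core's local memory
        (hardware-aware partitioning sketch; tiling rule is assumed)."""
        max_weights = core_mem_bytes // bytes_per_weight
        tile_rows = max(1, min(rows, max_weights // cols))
        # One (start, end) row range per core-resident tile.
        return [(r, min(r + tile_rows, rows))
                for r in range(0, rows, tile_rows)]

    # e.g. a 1024x1024 fp16 layer on cores with 256 KiB of local SRAM
    print(partition_layer(1024, 1024, 2, 256 * 1024))
    # [(0, 128), (128, 256), ..., (896, 1024)] -> 8 tiles of 128 rows

Keeping each tile resident in a core's local memory is what avoids the pin-limited external memory accesses the abstract refers to.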

METHODS AND APPARATUS FOR MATRIX AND VECTOR STORAGE AND OPERATIONS

Methods and apparatus for matrix and vector storage and operations are disclosed. Vectors and matrices may be represented differently to further enhance performance of operations. Exemplary embodiments compress sparse neural network data structures based on actual, non-null, connectivity (rather than all possible connections). This greatly reduces storage requirements as well as computational complexity. In some variants, the compression and reduction in complexity is sized to fit within the memory footprint and processing capabilities of a core. The exemplary compression schemes represent sparse matrices with links to compressed column data structures, where each compressed column data structure only stores non-null entries to optimize column-based lookups of non-null entries. Similarly, sparse vector addressing skips nulled entries to optimize for vector-specific non-null multiply-accumulate operations.
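A compact Python sketch of the column-compressed representation and a multiply-accumulate that skips nulls on both the matrix and vector sides; the data layout is a CSC-flavoured simplification of what the publication describes:

    def compress_columns(dense):
        """Store each column as (row_index, value) pairs for non-null
        entries only, linked from a per-column list."""
        return [[(i, row[j]) for i, row in enumerate(dense) if row[j]]
                for j in range(len(dense[0]))]

    def sparse_matvec(cols, sparse_vec):
        """Multiply-accumulate that skips nulled entries on both sides.
        `sparse_vec` maps index -> non-null value."""
        n_rows = 1 + max(i for col in cols for i, _ in col)
        out = [0.0] * n_rows
        for j, x in sparse_vec.items():      # only non-null vector entries
            for i, v in cols[j]:             # only non-null column entries
                out[i] += v * x
        return out

    m = [[0, 2, 0],
         [1, 0, 0],
         [0, 0, 3]]
    print(sparse_matvec(compress_columns(m), {1: 10.0}))  # [20.0, 0.0, 0.0]

Only actual connections are stored and only non-null products are computed, which is the source of both the storage and complexity reductions described above.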

REDUCING STORAGE OF BLOCKCHAIN METADATA VIA DICTIONARY-STYLE COMPRESSION

A method of reducing the storage requirements of blockchain metadata via dictionary-style compression includes receiving a request to add a transaction block to a blockchain. The method further includes determining an identifier (ID) of a dictionary block most recently stored on the blockchain. The method further includes compressing, by a processing device, one or more transactions of the transaction block based on the dictionary block to generate a compressed transaction block. The method further includes adding the ID of the dictionary block to the compressed transaction block. The method further includes providing the compressed transaction block, including the ID of the dictionary block, for storage on the blockchain.
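A toy Python sketch of the flow, using zlib's preset-dictionary support as a stand-in for whatever compressor the claims cover; the block layout and field names are assumptions:

    import zlib

    def latest_dictionary_block(chain):
        """Find the ID and payload of the dictionary block most recently
        stored on the chain (block layout here is assumed)."""
        for block in reversed(chain):
            if block["type"] == "dictionary":
                return block["id"], block["payload"]
        raise LookupError("no dictionary block on chain")

    def add_compressed_block(chain, transactions):
        dict_id, dictionary = latest_dictionary_block(chain)
        comp = zlib.compressobj(zdict=dictionary)   # seed with dictionary
        payload = comp.compress("\n".join(transactions).encode()) + comp.flush()
        block = {"type": "tx", "id": len(chain),
                 "dict_id": dict_id,                # dictionary block ID
                 "payload": payload}
        chain.append(block)                         # provide for storage
        return block

    chain = [{"type": "dictionary", "id": 0,
              "payload": b'{"from":"","to":"","amount":}'}]
    blk = add_compressed_block(chain, ['{"from":"a","to":"b","amount":5}'])
    print(blk["dict_id"], len(blk["payload"]))

Carrying the dictionary block's ID inside the compressed block lets any node decompress the transactions later by fetching that same dictionary from the chain.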

NEURAL NETWORK MODEL COMPRESSION WITH QUANTIZABILITY REGULARIZATION
20210201157 · 2021-07-01

A method, computer program, and computer system are provided for compressing a neural network model. A multi-dimensional tensor corresponding to a set of weight coefficients associated with a neural network is reshaped. A subset of weight coefficients is identified from among the set of weight coefficients. A model of the neural network is compressed based on the identified subset of weight coefficients.
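A brief numpy sketch of the three steps; magnitude-based selection is an assumed criterion, since the abstract does not say how the subset is identified:

    import numpy as np

    def compress_weights(tensor, keep_ratio=0.25):
        """Reshape a multi-dimensional weight tensor, identify a subset of
        coefficients (here: largest magnitude), and store only that subset."""
        flat = tensor.reshape(tensor.shape[0], -1)       # reshape step
        k = max(1, int(flat.size * keep_ratio))
        idx = np.argsort(np.abs(flat), axis=None)[-k:]   # identified subset
        return {"shape": tensor.shape,
                "indices": idx,
                "values": flat.flat[idx]}

    def decompress(compressed):
        flat = np.zeros(np.prod(compressed["shape"]))
        flat[compressed["indices"]] = compressed["values"]
        return flat.reshape(compressed["shape"])

    t = np.random.randn(3, 4, 5)
    print(decompress(compress_weights(t)).shape)         # (3, 4, 5)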

DEEP LEARNING NUMERIC DATA AND SPARSE MATRIX COMPRESSION
20210174259 · 2021-06-10

An apparatus to facilitate deep learning numeric data and sparse matrix compression is disclosed. The apparatus includes a processor comprising a compression engine to: receive a data packet comprising a plurality of cycles of data samples, and for each cycle of the data samples: pass the data samples of the cycle to a compressor dictionary; identify, from the compressor dictionary, tags for each of the data samples, wherein the compressor dictionary comprises at least a first tag for data having a value of zero and a second tag for data having a value of one; and compress the data samples into compressed cycle data by storing the tags as compressed data, wherein the data samples identified with the first tag are compressed using the first tag and the data samples identified with the second tag are compressed using the second tag, while the values of the data samples identified with the first tag or the second tag are excluded from the compressed cycle data.
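A simplified Python sketch of the per-cycle tag scheme: zero- and one-valued samples are carried by their tags alone, and their values are excluded from the compressed cycle data (the tag encodings and the literal side-channel are illustrative):

    TAG_ZERO, TAG_ONE, TAG_LITERAL = 0b00, 0b01, 0b10   # illustrative codes

    def compress_cycle(samples):
        """Zero- and one-valued samples compress to their tag alone;
        other values travel as literals alongside the tag stream."""
        tags, literals = [], []
        for s in samples:
            if s == 0:
                tags.append(TAG_ZERO)       # value excluded from output
            elif s == 1:
                tags.append(TAG_ONE)        # value excluded from output
            else:
                tags.append(TAG_LITERAL)
                literals.append(s)
        return tags, literals

    def decompress_cycle(tags, literals):
        it = iter(literals)
        return [0 if t == TAG_ZERO else 1 if t == TAG_ONE else next(it)
                for t in tags]

    cycle = [0, 0, 1, 5, 0, 1, 7]
    print(decompress_cycle(*compress_cycle(cycle)) == cycle)   # True

Because zeros and ones dominate sparse deep-learning tensors, representing them as two-bit tags with no value payload yields most of the compression.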

METHODS AND APPARATUS TO FACILITATE MALWARE DETECTION USING COMPRESSED DATA
20210192048 · 2021-06-24

Methods, apparatus, systems, and articles of manufacture are disclosed to facilitate malware detection using compressed data. An example apparatus includes an input processor to obtain a model, the model identifying a first sequence associated with a first trace of data known to be repetitive; a sequence identifier to identify a second sequence associated with a second trace of data; a comparator to compare the first sequence with the second sequence; and an output processor to, when the first sequence matches the second sequence, transmit an encoded representation of the second sequence to a central processing facility using a first channel of communication, and, when the first sequence fails to match the second sequence, transmit the second sequence to the central processing facility using a second channel of communication, the second sequence to be analyzed by the central processing facility to identify whether the second sequence is indicative of malware.
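A schematic Python sketch of the two-channel output path; the digest used as the "encoded representation" and the send(channel, payload) transport hook are hypothetical stand-ins:

    import hashlib

    def process_trace(model_sequence, trace_sequence, send):
        """Matches go out as a compact encoded representation on one
        channel; mismatches go out in full on another for analysis."""
        if trace_sequence == model_sequence:
            # Known-repetitive trace: a digest stands in for the sequence.
            encoded = hashlib.sha256(bytes(trace_sequence)).hexdigest()
            send("channel-1", encoded)
        else:
            # Unrecognized trace: ship it whole for malware analysis.
            send("channel-2", bytes(trace_sequence))

    log = []
    send = lambda ch, payload: log.append((ch, payload))
    process_trace([1, 2, 3], [1, 2, 3], send)
    process_trace([1, 2, 3], [9, 9, 9], send)
    print([ch for ch, _ in log])    # ['channel-1', 'channel-2']

Routing known-repetitive traces as compact encodings keeps the first channel cheap, while the second channel carries only the rarer, unrecognized traces that actually need inspection.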