H03M7/4062

Derived data dictionary for optimizing transformations of encoded data

A database-management system evaluates a query that retrieves and transforms encoded symbols stored in a database. If the stored symbols assume a relatively small set of distinct values, the system initially performs the transformation on every value in the set. During execution of subsequent queries, rather than performing the transformation upon every stored symbol fetched from the database, the system merely returns the previously derived encoded transformation results that correspond to the decoded value of each fetched symbol. If the symbols stored in the database span a relatively large set of distinct values, the system does not initially perform the transformation upon every value in the set. Instead, the first time the system fetches a symbol that has a particular value, it saves that symbol's encoded transformation result and reuses that result the next time it fetches an encoded symbol with the same value.
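
The eager and lazy strategies in this abstract can be illustrated with a minimal sketch. It assumes a dictionary-encoded column in which integer codes stand for decoded values. The class name `DerivedDictionary`, the cutoff `EAGER_THRESHOLD`, and the `calls` counter are hypothetical names introduced for illustration, and the results are cached in plain rather than encoded form. The patent does not necessarily work this way.

```python
from typing import Callable, Dict, List

EAGER_THRESHOLD = 4  # hypothetical cutoff between a "small" and a "large" value set


class DerivedDictionary:
    """Caches one transformation result per distinct code of an encoded column."""

    def __init__(self, dictionary: Dict[int, str], transform: Callable[[str], str]):
        self.dictionary = dictionary        # code -> decoded value
        self.transform = transform
        self.derived: Dict[int, str] = {}   # code -> transformation result
        self.calls = 0                      # how many times transform actually ran
        if len(dictionary) <= EAGER_THRESHOLD:
            # Small value set: transform every distinct value up front.
            for code in dictionary:
                self._compute(code)

    def _compute(self, code: int) -> str:
        self.calls += 1
        result = self.transform(self.dictionary[code])
        self.derived[code] = result
        return result

    def lookup(self, code: int) -> str:
        # Large value set: compute on first sight, reuse on later fetches.
        if code in self.derived:
            return self.derived[code]
        return self._compute(code)

    def query(self, column: List[int]) -> List[str]:
        return [self.lookup(code) for code in column]


states = {0: "ca", 1: "ny", 2: "tx"}
dd = DerivedDictionary(states, str.upper)
column = [0, 1, 0, 2, 2, 0, 1]
result = dd.query(column)
```

In this example the column holds seven stored symbols but only three distinct values, so the transformation runs three times instead of seven.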

Realtime multimodel lossless data compression system and method
11128935 · 2021-09-21

Methods and systems for processing telemetry data that contains multiple data types are disclosed. Optimum multimodal encoding approaches can achieve data-specific compression performance for heterogeneous datasets by distinguishing data types and their characteristics in real time and applying the most effective compression method to each data type. Using an optimum encoding diagram for heterogeneous data, a data classification algorithm classifies input data blocks into predefined categories, such as Unicode, telemetry, RCS and IR for telemetry datasets, plus an unknown class that covers non-studied data types, and then assigns them to corresponding compression models.

Real-time history-based byte stream compression
11025272 · 2021-06-01

Systems and methods for stream-based compression include an encoder of a first device that may receive an input stream of bytes including a first byte preceded by one or more second bytes. The encoder may determine to identify a prefix code for the first byte. The encoder may select a prefix code table using the one or more second bytes. The encoder may identify, from the selected prefix code table, the prefix code of the first byte. The encoder may generate an output stream of bytes by replacing the first byte in the input stream with the prefix code of the first byte. The encoder may transmit the output stream from the encoder of the first device to a decoder of a second device. The output stream may have fewer bits than the input stream.
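
The table-selection step can be sketched as follows, assuming a history of exactly one preceding byte. The contents of `TABLES`, the escape mechanism `ESC`, and the use of a bit string instead of a packed byte stream are all assumptions made for illustration; the abstract does not specify them.

```python
from typing import Dict, Optional

ESC = None  # escape marker: the code is followed by the raw byte in 8 bits

# Hypothetical prefix code tables keyed by the preceding byte (the history).
# Each table is prefix-free: no code is a prefix of another code in the table.
TABLES: Dict[Optional[int], Dict[Optional[int], str]] = {
    ord("q"): {ord("u"): "0", ESC: "1"},
    ord("t"): {ord("h"): "0", ord("o"): "10", ESC: "11"},
    None: {ord("e"): "00", ord("t"): "01", ESC: "1"},  # fallback for any other history
}


def select_table(prev: Optional[int]) -> Dict[Optional[int], str]:
    """Pick the prefix code table from the preceding byte."""
    return TABLES.get(prev, TABLES[None])


def encode(data: bytes) -> str:
    bits = []
    prev = None
    for byte in data:
        table = select_table(prev)
        if byte in table:
            bits.append(table[byte])                       # replace byte with its prefix code
        else:
            bits.append(table[ESC] + format(byte, "08b"))  # escape, then the raw byte
        prev = byte
    return "".join(bits)


def decode(bits: str) -> bytes:
    out = bytearray()
    prev = None
    pos = 0
    while pos < len(bits):
        inverse = {code: sym for sym, code in select_table(prev).items()}
        code = ""
        while code not in inverse:   # read bits until a complete code is seen
            code += bits[pos]
            pos += 1
        sym = inverse[code]
        if sym is ESC:
            sym = int(bits[pos:pos + 8], 2)
            pos += 8
        out.append(sym)
        prev = sym
    return bytes(out)


data = b"to the queue"
encoded = encode(data)
```

Because the decoder tracks the same history as the encoder, it selects the same table at every step without any side information. Bytes that are poorly predicted by the tables cost 9 or 10 bits each in this sketch, so the output is shorter only when the tables match the data.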

Computerized data compression and analysis using potentially non-adjacent pairs
20210157818 · 2021-05-27

A computerized method of compressing symbolic information organized into a plurality of documents, each document having a plurality of symbols, includes: (i) automatically identifying a plurality of sequential and non-sequential symbol pairs in an input document; (ii) counting the number of appearances of each unique symbol pair; and (iii) producing a compressed document that includes a replacement symbol at each position associated with one of the plurality of symbol pairs, at least one of which corresponds to a non-sequential symbol pair. For each non-sequential pair the compressed document includes corresponding indicia indicating a distance between locations of the non-sequential symbols of the pair in the input document. In some instances the plurality of symbol pairs includes only those pairs of non-sequential symbols for which the distance between locations of the non-sequential symbols of the pair in the input document is less than a numeric distance cap.
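
Steps (i) and (ii) of this method, together with the distance cap, can be sketched as below. The function name `count_pairs` is hypothetical, and the sketch assumes that a pair is identified by its two symbols regardless of how far apart they are. The abstract leaves that detail open, and the replacement step (iii) is not shown.

```python
from collections import Counter
from typing import List


def count_pairs(symbols: List[str], distance_cap: int) -> Counter:
    """Count every ordered symbol pair whose distance is below distance_cap.

    Distance 1 is a sequential (adjacent) pair; larger distances are
    non-sequential pairs.
    """
    counts: Counter = Counter()
    for i, first in enumerate(symbols):
        for distance in range(1, distance_cap):
            j = i + distance
            if j >= len(symbols):
                break
            counts[(first, symbols[j])] += 1
    return counts


document = "a b c a b".split()
pair_counts = count_pairs(document, distance_cap=3)
```

With a cap of 3, only pairs at distance 1 or 2 are counted, so the pair `("a", "a")` at distance 3 is excluded.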

Compression of machine learned models
10970470 · 2021-04-06

Devices and techniques are generally described for compression of natural language processing models. A first index value to a first address of a weight table may be stored in a hash table. The first address may store a first weight associated with a first feature of a natural language processing model. A second index value to a second address of the weight table may be stored in the hash table. The second address may store a second weight associated with a second feature of the natural language processing model. A first code associated with the first feature and comprising a first number of bits may be generated. A second code associated with the second feature may be generated, comprising a second number of bits greater than the first number of bits, based on a magnitude of the second weight being greater than a magnitude of the first weight.

Systems and methods for variable length codeword based, hybrid data encoding and decoding using dynamic memory allocation

A data encoding system includes a non-transitory memory, a processor, a digital-to-analog converter (DAC) and a transmitter. The non-transitory memory stores a predetermined file size threshold. The processor is in operable communication with the memory, and is configured to receive data. The processor detects a file size associated with the data. When the file size is below the predetermined file size threshold, the processor compresses the data using a variable length codeword (VLC) encoder. When the file size is not below the predetermined file size threshold, the processor compresses the data using a hash table algorithm. The DAC is configured to receive a digital representation of the compressed data from the processor and convert the digital representation of the compressed data into an analog representation of the compressed data. The transmitter is coupled to the DAC and configured to transmit the analog representation of the compressed data.

SEMI-SORTING COMPRESSION WITH ENCODING AND DECODING TABLES

A data processing platform, method, and program product perform compression and decompression of a set of data items. Suffix data and a prefix are selected for each respective data item in the set of data items based on data content of the respective data item. The set of data items is sorted based on the prefixes. The prefixes are encoded by querying multiple encoding tables to create a code word containing compressed information representing values of all prefixes for the set of data items. The code word and suffix data for each of the data items are stored in memory. The code word is decompressed to recover the prefixes. The recovered prefixes are paired with their respective suffix data.
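
The split, sort, and re-pair flow can be sketched as follows. This sketch assumes 8-bit integer items with a 2-bit prefix and a 6-bit suffix, and it stands in a simple per-prefix count list for the code word. The abstract's encoding tables are not reproduced, and all names and bit widths here are hypothetical.

```python
from typing import List, Tuple

PREFIX_BITS = 2   # assumed width of the prefix (high bits)
SUFFIX_BITS = 6   # assumed width of the suffix (low bits)
SUFFIX_MASK = (1 << SUFFIX_BITS) - 1


def compress(items: List[int]) -> Tuple[List[int], List[int]]:
    """Sort items by prefix, then store prefix counts plus raw suffixes."""
    ordered = sorted(items, key=lambda item: item >> SUFFIX_BITS)
    counts = [0] * (1 << PREFIX_BITS)   # stand-in code word: one count per prefix value
    suffixes = []
    for item in ordered:
        counts[item >> SUFFIX_BITS] += 1
        suffixes.append(item & SUFFIX_MASK)
    return counts, suffixes


def decompress(counts: List[int], suffixes: List[int]) -> List[int]:
    """Recover the sorted prefixes from the counts and pair them with suffixes."""
    prefixes = [prefix for prefix, n in enumerate(counts) for _ in range(n)]
    return [(p << SUFFIX_BITS) | s for p, s in zip(prefixes, suffixes)]


items = [200, 13, 77, 130, 5]
code_word, suffix_data = compress(items)
restored = decompress(code_word, suffix_data)
```

Sorting by prefix is what makes the prefixes cheap to store, since a sorted sequence is fully described by how often each value occurs. The trade-off in this sketch is that the original item order is not preserved; the items come back grouped by prefix.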

FEATURE DICTIONARY FOR BANDWIDTH ENHANCEMENT

A system has multiple devices that can host different versions of an artificial neural network (ANN) as well as different versions of a feature dictionary. In the system, encoded inputs for the ANN can be decoded by the feature dictionary, which allows encoded input to be sent over a network to a master version of the ANN instead of the original input, which usually includes more data than the encoded input. Thus, using the feature dictionary for training of a master ANN can reduce data transmission.

Data processing unit having hardware-based range encoding and decoding

A highly programmable device, referred to generally as a data processing unit, having multiple processing units for processing streams of information, such as network packets or storage packets, is described. The data processing unit includes one or more specialized hardware accelerators configured to perform acceleration for various data-processing functions. This disclosure describes examples of retrieving speculative probability values for range coding a plurality of bits with a single read instruction to an on-chip memory that stores a table of probability values. This disclosure also describes examples of storing state information used for context-coding packets of a data stream so that the state information is available after switching between data streams.

REAL-TIME HISTORY-BASED BYTE STREAM COMPRESSION
20210013900 · 2021-01-14

Described embodiments provide systems and methods for stream-based compression. An encoder of a first device receives an input stream of bytes including a first byte preceded by one or more second bytes. The encoder may determine to identify a prefix code for the first byte. The encoder may select a prefix code table using the one or more second bytes. The encoder may identify, from the selected prefix code table, the prefix code of the first byte. The encoder may generate an output stream of bytes by replacing the first byte in the input stream with the prefix code of the first byte. The encoder may transmit the output stream from the encoder of the first device to a decoder of a second device. The output stream may have fewer bits than the input stream.