H03M7/3079

Techniques for optimizing entropy computations

Techniques for data processing may include: determining a data layout for a configuration of counters stored in registers, wherein each of the registers is configured to store at least two counters, and each counter is associated with a particular data item allowable in the data set and denotes a current frequency of the particular data item; receiving data items of a data chunk of the data set; for each data item received, performing processing including: determining a first of the counters corresponding to the data item, wherein the first counter is stored in a first of the registers and denotes a current frequency of the data item; and incrementing the first counter stored in the first register by one; and determining, in accordance with the counters stored in the registers, an entropy value for the data chunk.
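
The packed-counter layout described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes byte-valued data items, 16-bit counters, and four counters per 64-bit register, and the names `count_chunk`, `read_counter`, and `entropy` are hypothetical. Plain Python integers stand in for hardware registers.

```python
import math

COUNTER_BITS = 16        # assumed counter width
COUNTERS_PER_REG = 4     # assumed: four 16-bit counters share one 64-bit register
NUM_SYMBOLS = 256        # assumed: each data item is one byte
NUM_REGS = NUM_SYMBOLS // COUNTERS_PER_REG
MASK = (1 << COUNTER_BITS) - 1


def count_chunk(chunk: bytes) -> list[int]:
    """Count item frequencies using several counters packed into each register."""
    regs = [0] * NUM_REGS
    for item in chunk:
        reg_index, slot = divmod(item, COUNTERS_PER_REG)
        # Adding 1 at the slot's bit offset increments only that counter,
        # provided the counter does not overflow its 16 bits.
        regs[reg_index] += 1 << (slot * COUNTER_BITS)
    return regs


def read_counter(regs: list[int], item: int) -> int:
    """Extract the counter for one data item from its register."""
    reg_index, slot = divmod(item, COUNTERS_PER_REG)
    return (regs[reg_index] >> (slot * COUNTER_BITS)) & MASK


def entropy(regs: list[int], total: int) -> float:
    """Shannon entropy in bits per item, computed from the packed counters."""
    h = 0.0
    for item in range(NUM_SYMBOLS):
        count = read_counter(regs, item)
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h
```

Under these assumptions a chunk must contain fewer than 65,536 occurrences of any single item, otherwise a counter would carry into its neighbor in the same register.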

Data compression with inline compression metadata
10489350 · 2019-11-26

Techniques are disclosed for handling data compression in which metadata indicates which portions of data are compressed and which portions are not. Segments of a buffer referred to as block groups store compressed blocks of data along with uncompressed blocks of data and hash blocks. If a block group includes a block that is a hash of another block in the block group, then the other block is considered to be compressed. If the block group does not include a block that is a hash of another block in the block group, then the blocks in the block group are uncompressed. The hash function used to generate the hash is selected to prevent collisions, which occur when the data being stored in the buffer makes it possible for a hash block and an uncompressed block to be identical.
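
The block-group rule above can be illustrated with a short sketch. The abstract does not name a hash function or block size, so SHA-256 and 32-byte blocks are assumptions made only so that a hash fits in one block; `block_hash` and `compressed_blocks` are hypothetical names.

```python
import hashlib

BLOCK_SIZE = 32  # assumed block size, chosen to equal the SHA-256 digest size


def block_hash(block: bytes) -> bytes:
    """Hash of one block (SHA-256 is an assumption, not from the abstract)."""
    return hashlib.sha256(block).digest()


def compressed_blocks(group: list[bytes]) -> list[int]:
    """Indices of blocks whose hash also appears as a block in the same group.

    Per the rule above, such blocks are treated as compressed. An empty
    result means every block in the group is uncompressed.
    """
    present = set(group)
    return [i for i, block in enumerate(group) if block_hash(block) in present]
```

For example, a group holding block `A`, the hash of `A`, and an unrelated block `B` yields `[0]`: only `A` is marked as compressed, and no separate metadata structure is needed to record that.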

Systems and methods for variable length codeword based, hybrid data encoding and decoding using dynamic memory allocation

A data encoding system includes a non-transitory memory, a processor, a digital-to-analog converter (DAC) and a transmitter. The non-transitory memory stores a predetermined file size threshold. The processor is in operable communication with the memory, and is configured to receive data. The processor detects a file size associated with the data. When the file size is below the predetermined file size threshold, the processor compresses the data using a variable length codeword (VLC) encoder. When the file size is not below the predetermined file size threshold, the processor compresses the data using a hash table algorithm. The DAC is configured to receive a digital representation of the compressed data from the processor and convert it into an analog representation of the compressed data. The transmitter is coupled to the DAC and configured to transmit the analog representation of the compressed data.

SYSTEMS AND METHODS FOR SCALABLE HIERARCHICAL COREFERENCE

A scalable hierarchical coreference method employs a homomorphic compression scheme that supports addition and partial subtraction to more efficiently represent the data and the evolving intermediate results of probabilistic inference. The method may encode the features underlying conditional random field models of coreference resolution so that cosine similarities can be efficiently computed. The method may be applied to compressing features and intermediate inference results for conditional random fields. The method may allow compressed representations to be added and subtracted in a way that preserves the cosine similarities.

PROCESS AWARE DATA COMPRESSION

Determining an expected compression rate for a prospective process in a federated system includes obtaining compression rate data for existing processes in the federated system, compiling the compression rate data into a plurality of entries in a process name table according to process identifier, client, and industry, determining a specific entry in the process name table for an existing process that most closely matches the prospective process, and determining an expected compression rate of the prospective process based on the compression rate data for the specific entry. Compression rate data may be provided by a driver at host systems that sends compression rate information to a central repository. The central repository may be provided by a host system at a data center of the federated system. The compression rate data may use a sliding average that weights more recent data more heavily.
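
One way to picture the table and the recency-weighted average is the sketch below. It assumes an exponentially weighted average as the "sliding average" and a simple count of matching fields as the closeness measure; neither choice is stated in the abstract, and `update_average`, `record`, and `closest_entry` are hypothetical names.

```python
def update_average(current, sample, alpha=0.5):
    """Exponentially weighted average; a larger alpha favors recent samples."""
    if current is None:
        return sample
    return alpha * sample + (1 - alpha) * current


def record(table, process, client, industry, rate):
    """Fold a reported compression rate into the entry for this key."""
    key = (process, client, industry)
    table[key] = update_average(table.get(key), rate)


def closest_entry(table, process, client, industry):
    """Key of the entry sharing the most fields with the prospective process.

    Ties go to the entry inserted first (an arbitrary choice for this sketch).
    """
    query = (process, client, industry)
    return max(table, key=lambda key: sum(a == b for a, b in zip(key, query)))
```

A prospective process would then take `table[closest_entry(table, ...)]` as its expected compression rate.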

Systems and methods for coding

The present disclosure relates to systems and methods for coding. The methods may include receiving at least two contexts and, for each of the at least two contexts: obtaining at least one coding parameter corresponding to the context from at least one lookup table; determining a probability interval value corresponding to the context based on a previous probability interval value and the at least one coding parameter; determining a normalized probability interval value corresponding to the context by performing a normalization operation on the probability interval value; determining a probability interval lower limit corresponding to the context based on a previous probability interval lower limit and the at least one coding parameter; determining a normalized probability interval lower limit corresponding to the context by performing the normalization operation on the probability interval lower limit; and outputting at least one byte based on the normalized probability interval lower limit.
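
The per-context interval update can be sketched with a small binary arithmetic coder. This is an illustration under stated assumptions, not the disclosed method: the lookup table `P0_TABLE` and its values are hypothetical, the lower limit is kept as an arbitrary-precision integer so no carry handling is needed, and all bytes are emitted at the end rather than during normalization.

```python
# Hypothetical lookup table: context id -> probability of a 0 bit,
# as a 16-bit fixed-point value strictly between 0 and 65536.
P0_TABLE = {0: 52000, 1: 20000}

TOP = 1 << 32     # initial probability interval value (width)
BOTTOM = 1 << 24  # normalize whenever the width drops below this


def encode(bits, contexts):
    """Encode bits, each with its own context, into a byte string."""
    low, width, shifts = 0, TOP, 0
    for bit, ctx in zip(bits, contexts):
        p0 = P0_TABLE[ctx]                 # coding parameter from the table
        width0 = (width * p0) >> 16        # sub-interval assigned to a 0 bit
        if bit == 0:
            width = width0
        else:
            low += width0                  # move the lower limit past the 0 part
            width -= width0
        while width < BOTTOM:              # normalization: scale both by 256
            low <<= 8
            width <<= 8
            shifts += 1
    # The lower limit itself lies inside the final interval, so emit it.
    return low.to_bytes(4 + shifts, "big")


def decode(data, contexts):
    """Recover the bits by replaying the same interval updates."""
    value = int.from_bytes(data, "big")
    remaining = len(data) - 4              # normalization shifts not yet replayed
    low, width, bits = 0, TOP, []
    for ctx in contexts:
        width0 = (width * P0_TABLE[ctx]) >> 16
        offset = (value >> (8 * remaining)) - low
        if offset < width0:
            bits.append(0)
            width = width0
        else:
            bits.append(1)
            low += width0
            width -= width0
        while width < BOTTOM:
            low <<= 8
            width <<= 8
            remaining -= 1
    return bits
```

The decoder needs the same context sequence as the encoder, which in a real codec would be derived from previously decoded data.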

Methods and systems for dynamic compression and transmission of application log data

Certain aspects of the present disclosure provide techniques for committing log data in an application to a log data repository. An example method generally includes receiving, from an application, data to be committed to a remote storage location. A type of the received data is determined. The type of the received data is generally associated with a prioritization level and a compression mechanism to be used in committing the data to the remote storage location. An application execution context associated with the received data is determined. At a dispatch time associated with the prioritization level of the received data and the application execution context associated with the received data, a compressed data payload is generated and transmitted to the remote storage location. To generate the compressed data payload, at least the received data is compressed based on the determined compression mechanism.

Predicting compression ratio of data with compressible decision

A data-compression analyzer can rapidly make a binary decision to compress or not compress an input data block or can use a slower neural network to predict the block's compression ratio with a regression model. A Concentration Value (CV) that is the sum of the squares of the frequencies and a Number of Zero (NZ) symbols are calculated from an un-sorted symbol frequency table. A rapid decision to compress is signaled when their product CV*NZ exceeds a horizontal threshold THH. During training, CV*NZ is plotted as a function of compression ratio C % for many training data blocks. Different test values of THH are applied to the plot to determine true and false positive rates that are plotted as a Receiver Operating Characteristic (ROC) curve. The point on the ROC curve having the largest Youden index is selected as the optimum THH for use in future binary decisions.
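
The quantities named above can be sketched as follows. The sketch assumes byte symbols and relative (normalized) frequencies for CV, which the abstract does not specify, and it assumes the training set contains at least one compressible and one incompressible block; `cv_nz` and `best_threshold` are hypothetical names.

```python
from collections import Counter


def cv_nz(block: bytes, alphabet_size: int = 256):
    """Concentration Value (sum of squared frequencies) and Number of Zero symbols."""
    counts = Counter(block)
    n = len(block)
    cv = sum((c / n) ** 2 for c in counts.values())  # assumes relative frequencies
    nz = alphabet_size - len(counts)                 # symbols that never occur
    return cv, nz


def best_threshold(scores, labels):
    """Pick the threshold THH with the largest Youden index J = TPR - FPR.

    scores: CV*NZ for each training block.
    labels: True where the block is considered compressible.
    Requires at least one True and one False label.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s > t and y)
        fp = sum(1 for s, y in zip(scores, labels) if s > t and not y)
        j = tp / positives - fp / negatives
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j
```

At run time the rapid decision is then `cv * nz > thh`, with `thh` taken from `best_threshold` on the training data.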

QUANTIZER DETERMINATION, COMPUTER-READABLE MEDIUM AND APPARATUS THAT IMPLEMENTS AT LEAST TWO QUANTIZERS
20190333250 · 2019-10-31

A method for determining a second quantizer for quantizing digital images, wherein the second quantizer is determined for a specified number of levels, which is at least two. For the determination, a first quantizer with a lower number of levels than the specified one is taken into consideration. Furthermore, a method for coding an image comprising a plurality of pixels, a computer-readable medium, an apparatus that implements at least two quantizers as a digital circuit, and a digital camera with such an apparatus are disclosed.

Method and system for facilitating compression
10454496 · 2019-10-22

During operation, embodiments of the subject matter can perform compression of an n-dimensional m-channel patch based on unsupervised learning (clustering). Embodiments of the subject matter can perform multiple such compressions of patches tessellated (tiled) across a space. Embodiments of the subject matter can also perform hierarchical compression through recursive application of embodiments of the subject matter. Embodiments of the subject matter can compress but are not limited to compressing the following: a database, a sequence, an image, a video, and a volumetric video.
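
Clustering-based compression of a single patch can be sketched with plain k-means vector quantization. The abstract does not name a clustering algorithm, so k-means, the deterministic initialization, and the names `compress_patch` and `decompress_patch` are assumptions for illustration only. Each pixel is an m-channel tuple; the compressed form is a small codebook plus one index per pixel.

```python
def dist2(a, b):
    """Squared Euclidean distance between two m-channel pixels."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(pixels, centers):
    """Index of the nearest center for every pixel."""
    return [min(range(len(centers)), key=lambda i: dist2(p, centers[i]))
            for p in pixels]


def compress_patch(pixels, k, iters=10):
    """Return (codebook, indices) for a flattened patch of m-channel pixels."""
    # Deterministic start: the first k distinct pixels (chosen for reproducibility).
    centers = []
    for p in pixels:
        if p not in centers:
            centers.append(p)
        if len(centers) == k:
            break
    for _ in range(iters):
        labels = assign(pixels, centers)
        for i in range(len(centers)):
            members = [p for p, lab in zip(pixels, labels) if lab == i]
            if members:
                # Move the center to the per-channel mean of its members.
                centers[i] = tuple(sum(ch) / len(members) for ch in zip(*members))
    return centers, assign(pixels, centers)


def decompress_patch(centers, labels):
    """Rebuild an approximate patch from the codebook and indices."""
    return [centers[lab] for lab in labels]
```

Tiling a larger space amounts to calling `compress_patch` once per patch. A hierarchical variant could apply the same function again to the resulting codebooks, though how the disclosed embodiments recurse is not described here.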