Patent classifications
H03M7/3079
SYSTEM AND METHOD FOR COMPUTER DATA TYPE IDENTIFICATION
A system and method for file type identification involving extraction of a file-print of a file, the file-print being a unique or practically-unique representation of statistical characteristics associated with the distribution of bits in the binary contents of the file, similar to a fingerprint. The file-print is then passed to a machine learning algorithm that has been trained to recognize file types from their file-prints. The machine learning algorithm returns a predicted file type and, in some cases, a probability of correctness of the prediction. The file may then be encoded using an encoding algorithm chosen based on the predicted file type.
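The file-print idea in this abstract can be illustrated with a small sketch. The abstract does not specify which statistics are used or which learning algorithm is trained, so everything concrete here is an assumption: the file-print is taken to be a normalized byte-value histogram plus a scaled Shannon entropy, and a nearest-centroid classifier stands in for the trained machine learning algorithm. The names `file_print`, `train_centroids`, and `predict` are hypothetical.

```python
import math
from collections import Counter

def file_print(data):
    """Assumed file-print: 256-bin normalized byte histogram plus entropy scaled to [0, 1]."""
    if not data:
        return [0.0] * 257
    counts = Counter(data)
    n = len(data)
    hist = [counts.get(b, 0) / n for b in range(256)]
    entropy = -sum(p * math.log2(p) for p in hist if p > 0)  # bits per byte
    return hist + [entropy / 8.0]

def train_centroids(samples):
    """samples maps a file-type label to a list of byte strings; returns label -> mean file-print."""
    centroids = {}
    for label, blobs in samples.items():
        prints = [file_print(b) for b in blobs]
        centroids[label] = [sum(col) / len(prints) for col in zip(*prints)]
    return centroids

def predict(centroids, data):
    """Return the label whose centroid is closest (Euclidean distance) to the file-print of data."""
    fp = file_print(data)
    return min(centroids, key=lambda label: math.dist(fp, centroids[label]))

samples = {
    "text": [b"hello world " * 20, b"the quick brown fox " * 10],
    "binary": [bytes(range(256)) * 4],
}
centroids = train_centroids(samples)
predicted = predict(centroids, b"some plain english words here")
```

A production classifier would likely also return a probability estimate, as the abstract mentions; a nearest-centroid rule does not provide one directly.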
DATA COMPRESSION METHOD AND APPARATUS, AND COMPUTER DEVICE
A data compression method includes: obtaining a to-be-compressed object; searching a recommendation record for a recommended compression coding rule that meets a compression rate condition, the recommendation record being configured to record a compression coding rule of a historical compressed object and corresponding compression rate information, and the historical compressed object being of a same type as the to-be-compressed object; and if the recommended compression coding rule that meets the compression rate condition is found, compressing the to-be-compressed object by using the recommended compression coding rule; and if the recommended compression coding rule that meets the compression rate condition is not found, starting a regular compression coding process to obtain estimated compression rates of a plurality of compression coding rules for the to-be-compressed object, selecting a target compression coding rule based on at least the estimated compression rates, and compressing the to-be-compressed object by using the target compression coding rule.
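The lookup-then-fallback flow in this abstract might be sketched as follows. This is a sketch under stated assumptions, not the claimed method: the "compression coding rules" are represented by three standard-library codecs, the "compression rate condition" is assumed to be a maximum compressed-to-original size ratio, and the estimate in the fallback path is taken by compressing a leading sample. The function `compress_with_recommendation` and its parameters are hypothetical.

```python
import bz2
import lzma
import zlib

# Stand-ins for the "compression coding rules" of the abstract (assumption).
CODECS = {"zlib": zlib.compress, "bz2": bz2.compress, "lzma": lzma.compress}

def compress_with_recommendation(data, obj_type, record, max_ratio=0.5, sample_size=4096):
    """record maps an object type to (codec name, observed compressed/original ratio)."""
    hit = record.get(obj_type)
    if hit is not None and hit[1] <= max_ratio:
        # A recommended rule for this object type meets the assumed rate condition.
        name = hit[0]
    else:
        # Regular process: estimate each rule's ratio on a sample and pick the best.
        sample = data[:sample_size]
        estimates = {n: len(f(sample)) / max(len(sample), 1) for n, f in CODECS.items()}
        name = min(estimates, key=estimates.get)
    out = CODECS[name](data)
    # Store the observed ratio so later objects of the same type can reuse it.
    record[obj_type] = (name, len(out) / max(len(data), 1))
    return name, out

record = {"log": ("bz2", 0.1)}
payload = b"2024-01-01 INFO request ok\n" * 400
used, blob = compress_with_recommendation(payload, "log", record)
```

With the record above, the recommended rule is reused without running the estimation step; an unseen object type goes through the fallback path and populates the record.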
Methods and apparatuses for compressing parameters of neural networks
An encoder for encoding weight parameters of a neural network is configured to obtain a plurality of weight parameters of the neural network, to encode the weight parameters of the neural network using a context-dependent arithmetic coding, to select a context for an encoding of a weight parameter, or for an encoding of a syntax element of a number representation of the weight parameter, in dependence on one or more previously encoded weight parameters and/or in dependence on one or more previously encoded syntax elements of a number representation of one or more weight parameters, and to encode the weight parameter, or a syntax element of the weight parameter, using the selected context. Corresponding decoder, quantizer, methods and computer programs are also described.
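The effect of selecting a context from previously encoded weights can be illustrated without a full arithmetic coder. The sketch below is an assumption-laden simplification: it models only a single binary syntax element (whether a weight is zero), picks the context from the immediately preceding weight, and reports the ideal code length (the sum of -log2 of the adaptive probability) that an arithmetic coder would approach. The names `context_of` and `estimated_bits` are hypothetical.

```python
import math

def context_of(prev_weight):
    """Hypothetical context rule: context 0 after a zero weight, context 1 otherwise."""
    return 0 if prev_weight == 0 else 1

def estimated_bits(weights, use_context=True):
    """Ideal code length in bits for the 'is zero' flag under adaptive per-context counts."""
    counts = {}  # context -> [count of zero flags, count of non-zero flags]
    bits = 0.0
    prev = 0
    for w in weights:
        ctx = context_of(prev) if use_context else 0
        c = counts.setdefault(ctx, [1, 1])  # start from a uniform estimate
        sym = 0 if w == 0 else 1
        bits += -math.log2(c[sym] / (c[0] + c[1]))
        c[sym] += 1  # adapt the model after coding the symbol
        prev = w
    return bits

weights = ([0] * 20 + [3] * 20) * 5
with_ctx = estimated_bits(weights, use_context=True)
without_ctx = estimated_bits(weights, use_context=False)
```

On input where zeros and non-zeros arrive in runs, the context-dependent estimate comes out smaller than the single-context estimate, which is the motivation for context selection.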
ENCODER, DECODER, ENCODING METHOD, DECODING METHOD AND PROGRAM
A sequence of integer values is encoded and decoded with a substantially non-integer (fractional) number of bits assigned per sample, and/or with a smaller memory footprint or amount of computation than in the prior art. The encoder receives the sequence of integer values as input and outputs an integer code corresponding to the sequence of integer values. An integer transformer (11) obtains one integer value (transformed integer) through algebraically-representable bijective transformation for each of a plurality of sets of integer values included in the inputted sequence of integer values. An integer encoder (12) encodes the transformed integer to thereby obtain an integer code.
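A minimal sketch of the transform-then-encode structure follows. The abstract does not name the bijection or the integer code, so both are assumptions here: the Cantor pairing function stands in for the algebraically representable bijective transformation, applied to sets of two non-negative integers, and Elias gamma coding stands in for the integer encoder. All function names are hypothetical.

```python
from math import isqrt

def pair(a, b):
    """Cantor pairing: a bijection from pairs of non-negative integers to non-negative integers."""
    s = a + b
    return s * (s + 1) // 2 + b

def unpair(z):
    """Inverse of pair()."""
    w = (isqrt(8 * z + 1) - 1) // 2
    b = z - w * (w + 1) // 2
    return w - b, b

def gamma_encode(n):
    """Elias gamma code of n + 1, as a string of '0'/'1' characters."""
    bits = bin(n + 1)[2:]
    return "0" * (len(bits) - 1) + bits

def encode_sequence(values):
    """Encode an even-length sequence of non-negative integers, two samples per code word."""
    return "".join(gamma_encode(pair(values[i], values[i + 1]))
                   for i in range(0, len(values), 2))

def decode_sequence(code):
    out = []
    i = 0
    while i < len(code):
        zeros = 0
        while code[i] == "0":  # count the unary length prefix
            zeros += 1
            i += 1
        n = int(code[i:i + zeros + 1], 2) - 1
        i += zeros + 1
        out.extend(unpair(n))
    return out

samples = [3, 0, 1, 2, 0, 0, 5, 7]
code = encode_sequence(samples)
```

In this sketch a pair of zeros maps to the transformed integer 0, which gamma-codes to the single bit "1", so two samples share one bit; that is one way to read the fractional bits-per-sample goal.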
Partitional data compression
A system collects statistical data for a data page, divides the data page into parts, analyzes the data page and the statistical data, based on compression efficiency of one or more compression methods for each part of each page, to determine a compression method for each part of the page, and compresses, based on the analyzing, the parts of the data page.
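The per-part method selection can be sketched as below. This is an illustration under assumptions: the page is split into fixed-size parts, the "analysis" is replaced by simply trying each candidate method on each part and keeping the smallest result, and the candidate methods are standard-library codecs plus an uncompressed fallback. The names `METHODS`, `compress_page`, and `decompress_page` are hypothetical.

```python
import bz2
import lzma
import zlib

# Candidate methods (assumption): name -> (compress, decompress).
METHODS = {
    "raw": (lambda b: b, lambda b: b),
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}

def compress_page(page, part_size=1024):
    """Split the page into parts and keep, for each part, the smallest candidate output."""
    parts = []
    for off in range(0, len(page), part_size):
        part = page[off:off + part_size]
        candidates = {name: enc(part) for name, (enc, _) in METHODS.items()}
        best = min(candidates, key=lambda name: len(candidates[name]))
        parts.append((best, candidates[best]))
    return parts

def decompress_page(parts):
    """Decode each part with the method recorded for it and reassemble the page."""
    return b"".join(METHODS[name][1](blob) for name, blob in parts)

page = b"A" * 1024 + bytes(range(256)) * 4
parts = compress_page(page)
```

Trial compression of every part with every method is the simplest possible selector; a system that relies on collected statistics, as the abstract describes, would presumably avoid most of those trials.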
CODE TABLE GENERATION DEVICE, MEMORY SYSTEM, AND CODE TABLE GENERATION METHOD
According to one embodiment, a code table generation device includes a table generation unit, a merge unit and a tree generation unit. The table generation unit generates a frequency table including symbols and frequencies of occurrence respectively associated with the symbols, based on a frequency of occurrence for each symbol of input symbols. The merge unit acquires top K symbols in descending order of the frequencies of occurrence and remaining symbols from the symbols, divides the remaining symbols into one or more symbol sets, and determines a frequency of occurrence associated with a root node of each of subtrees corresponding to the respective symbol sets. The tree generation unit generates a Huffman tree using the K symbols and the root node of each of the subtrees.
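The three stages named in this abstract (frequency table, merge of low-frequency symbols into subtrees, Huffman construction over the reduced node set) might look like the following sketch. Several details are assumptions because the abstract leaves them open: remaining symbols are grouped into fixed-size sets in frequency order, each subtree is balanced (fixed-length suffix codes), and a subtree root's frequency is the sum of its members' frequencies. The names `huffman_codes` and `build_code_table` are hypothetical.

```python
import heapq
from collections import Counter
from itertools import count

def huffman_codes(weights):
    """Standard Huffman construction; weights maps node -> frequency, returns node -> bit string."""
    if len(weights) == 1:
        return {next(iter(weights)): "0"}
    tie = count()  # unique tie-breaker so the dicts in the heap are never compared
    heap = [(w, next(tie), {node: ""}) for node, w in weights.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w0, _, c0 = heapq.heappop(heap)
        w1, _, c1 = heapq.heappop(heap)
        merged = {n: "0" + c for n, c in c0.items()}
        merged.update({n: "1" + c for n, c in c1.items()})
        heapq.heappush(heap, (w0 + w1, next(tie), merged))
    return heap[0][2]

def build_code_table(data, k=4, set_size=4):
    freq = Counter(data)                            # frequency table
    ranked = [s for s, _ in freq.most_common()]
    top, rest = ranked[:k], ranked[k:]              # top K symbols and remaining symbols
    groups = [rest[i:i + set_size] for i in range(0, len(rest), set_size)]
    weights = {("sym", s): freq[s] for s in top}
    for gi, group in enumerate(groups):
        weights[("set", gi)] = sum(freq[s] for s in group)  # assumed root frequency
    outer = huffman_codes(weights)                  # Huffman tree over K symbols + subtree roots
    table = {s: outer[("sym", s)] for s in top}
    for gi, group in enumerate(groups):
        width = (len(group) - 1).bit_length()       # balanced subtree depth
        for idx, s in enumerate(group):
            suffix = format(idx, "b").zfill(width) if width else ""
            table[s] = outer[("set", gi)] + suffix
    return table

data = b"aaaaaaaabbbbbbccccdddeefghij"
table = build_code_table(data)
```

Because the Huffman procedure runs over only K symbols plus a handful of subtree roots, the tree stays small even for large alphabets, at the cost of slightly longer codes for rare symbols.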
Train-linking lossless compressor of numeric values
A train-linking lossless data compressor examines a block of data and uses a same coder to generate a same code when all data values in the input block are identical. When the input data is not all the same value, then a Gaussian coder, a Laplace coder, and a delta coder are activated in parallel. The three compressed code lengths are compared and the smallest code length is output as the compressed code when it is smaller than a copy code length. The copy code is a tag followed by copying all the data in the input block. When the smallest of the three compressed code lengths is larger than the copy code length, the block is not compressible, and the copy code is output. No frequency table is required so latency is low. The delta coder subtracts data values from an average value of the last data block.
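The selection logic in this abstract (same code, smallest of several candidate codes, copy code as fallback) can be sketched by comparing code lengths only. This sketch does not implement the Gaussian or Laplace coders, whose details the abstract does not give; only a stand-in delta coder is shown, and it is assumed to zigzag-map the difference from the previous block's average and cost it with an Elias gamma code. The tag width, the value width, and all function names are hypothetical.

```python
def zigzag(v):
    """Map a signed integer to a non-negative one: 0, -1, 1, -2, ... -> 0, 1, 2, 3, ..."""
    return 2 * v if v >= 0 else -2 * v - 1

def gamma_len(n):
    """Length in bits of the Elias gamma code of n + 1."""
    return 2 * (n + 1).bit_length() - 1

def delta_code_len(block, prev_avg):
    """Stand-in delta coder cost: each value is subtracted from the previous block's average."""
    return sum(gamma_len(zigzag(prev_avg - v)) for v in block)

def choose_code(block, prev_avg, coders, bits_per_value=16, tag_bits=2):
    """Return (code name, code length in bits) following the same/smallest/copy rule."""
    if all(v == block[0] for v in block):
        return "same", tag_bits + bits_per_value
    copy_len = tag_bits + bits_per_value * len(block)
    name, length = min(((n, tag_bits + f(block, prev_avg)) for n, f in coders.items()),
                       key=lambda item: item[1])
    if length < copy_len:
        return name, length
    return "copy", copy_len

coders = {"delta": delta_code_len}  # Gaussian and Laplace coders would be added here
result = choose_code([100, 101, 99, 100], 100, coders)
```

In hardware the candidate coders would run in parallel; the dictionary of coder functions here only models the comparison of their output lengths.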
REORDERING DATASETS IN A TABLE FOR INCREASED COMPRESSION RATIO
Selecting tables for compression by threshold statistical values. Identified tables are reordered according to fields having the lowest cardinality to increase the size of character strings replaced by keys during compression. Field locations are mapped between the original table and the reordered table. Dictionary-based compression is performed on reordered tables.
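A small sketch of the reordering step follows. It assumes the table is a list of equal-length row tuples, places the fields with the lowest cardinality first, and additionally sorts the rows so that repeated leading values sit next to each other; the row sort is an assumption that goes beyond what the abstract states. The returned `order` list is the field-location mapping between the reordered and original tables. Both function names are hypothetical.

```python
def reorder_table(rows):
    """Return (order, reordered rows); order[new_pos] is the original field position."""
    ncols = len(rows[0])
    cardinality = [len({row[c] for row in rows}) for c in range(ncols)]
    order = sorted(range(ncols), key=lambda c: cardinality[c])  # lowest cardinality first
    reordered = sorted(tuple(row[c] for c in order) for row in rows)
    return order, reordered

def restore_fields(order, reordered):
    """Use the field mapping to put every value back in its original field position."""
    restored = []
    for row in reordered:
        original = [None] * len(order)
        for new_pos, old_pos in enumerate(order):
            original[old_pos] = row[new_pos]
        restored.append(tuple(original))
    return restored

rows = [
    ("u1", "US", "active"),
    ("u2", "US", "inactive"),
    ("u3", "DE", "active"),
]
order, reordered = reorder_table(rows)
```

After reordering, rows that share low-cardinality values begin with the same run of fields, which gives a dictionary-based compressor longer repeated strings to replace.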
METHOD AND ELECTRONIC DEVICE FOR DECODING A DATA STREAM, AND ASSOCIATED COMPUTER PROGRAM AND DATA STREAMS
A method for decoding a data stream, including a plurality of identifiers and a bit sequence, into a sequence of data of respective predetermined types includes the following operations for obtaining each item of data of the sequence: determining a context on the basis of an identifier, from among the plurality of identifiers, associated with the type of the relevant item of data; and decoding one portion of the bit sequence by an entropy decoder which receives the bit sequence as an input and is parameterized in the determined context. An electronic decoding device and an associated computer program are also provided.