Patent classifications
H03M7/34
Adaptive code generation with a cost model for JIT compiled execution in a database system
The disclosure relates to technology for query compilation in a database management system. In response to receiving at least one database query, a first execution time of code for the query is estimated without applying any code generation method. For each of one or more code generation methods, a compilation cost and a second execution time of the code as modified by that method are estimated. A cost savings for each of the one or more code generation methods is calculated as the first execution time, less the second execution time under that code generation method, less that method's compilation cost. Either the code generation method with the highest cost savings, or no code generation, is then selected.
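The cost-savings selection described above can be sketched as follows; the function and method names are hypothetical, and the time estimates are placeholder inputs rather than the patent's estimator:

```python
def select_codegen_method(base_exec_time, methods):
    """Pick the code generation method (or no codegen) with the highest
    cost savings, where
        savings = base_exec_time - exec_time_with_method - compile_cost.

    methods: dict mapping method name -> (compile_cost, exec_time_with_method).
    A method is only worth applying if its savings are positive; otherwise
    the baseline ("no-codegen") wins with savings of zero.
    """
    best_name, best_savings = "no-codegen", 0.0
    for name, (compile_cost, exec_time) in methods.items():
        savings = base_exec_time - exec_time - compile_cost
        if savings > best_savings:
            best_name, best_savings = name, savings
    return best_name

# Hypothetical example: interpreted baseline takes 100 ms.
choice = select_codegen_method(100.0, {
    "jit-full": (30.0, 40.0),     # savings = 100 - 40 - 30 = 30
    "jit-partial": (5.0, 70.0),   # savings = 100 - 70 - 5  = 25
})
# -> "jit-full"
```

Note that because compilation cost is charged against the savings, a cheap query is correctly left uncompiled even when compiled code would run faster.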
Data compression by hamming distance categorization
Data is compressed based on non-identical similarity between a first data set and a second data set. A representation of the differences is used to represent one of the data sets. For example, a probabilistically unique value may be generated as a new block label. Probabilistic comparison of the new block label with a plurality of training labels associated with training blocks produces a plurality of training labels that are potentially similar to the new block label. The Hamming distance between each potentially similar training label and the new block label is determined to select the training label with the smallest calculated Hamming distance from the new block label. A bitmap of differences between the new block and the training block associated with the selected training label is compressed and stored as a compressed representation of the new block.
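The label-matching and difference-bitmap steps above can be sketched as follows; the probabilistic pre-filter is omitted, and the helper names and block representation are illustrative assumptions, not the patented implementation:

```python
def hamming(a: int, b: int) -> int:
    """Hamming distance between two equal-width integer labels."""
    return bin(a ^ b).count("1")

def best_training_label(new_label: int, candidates: list) -> int:
    """Select the training label with the smallest Hamming distance
    from the new block's label."""
    return min(candidates, key=lambda c: hamming(new_label, c))

def diff_bitmap(new_block: bytes, train_block: bytes) -> bytes:
    """XOR bitmap of differences between the new block and the chosen
    training block; when the blocks are similar it is mostly zeros,
    so it compresses well and can stand in for the new block."""
    return bytes(x ^ y for x, y in zip(new_block, train_block))
```

The candidate list passed to `best_training_label` would be the output of the probabilistic comparison step, so the exact Hamming computation only runs over a small set of potentially similar labels.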
Methods and systems for data analysis and compression
The present disclosure provides computer implemented methods and systems for analyzing datasets, such as large data sets output from nucleic acid sequencing technologies. In particular, the present disclosure provides for data analysis comprising computing the Burrows-Wheeler transform (BWT) of a collection of strings in an incremental, character-by-character manner. The present disclosure also provides compression boosting strategies resulting in a BWT of a reordered collection of data that is more compressible by second-stage compression methods than the BWT of the non-reordered collection.
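For reference, the transform itself can be illustrated with a naive sorted-rotations construction; the disclosure's incremental, character-by-character construction avoids materializing the rotations, so this is only a sketch of what the BWT computes, not of the patented method:

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Burrows-Wheeler transform via sorted rotations (naive, O(n^2 log n)).

    The sentinel terminates the string and must sort before all other
    characters. The output is the last column of the sorted rotation
    matrix: same characters, reordered so that runs cluster together,
    which is what makes the result more compressible downstream.
    """
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)
```

For example, `bwt("banana")` yields `"annb$aa"`, grouping the three `a` characters that were scattered through the input.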
Data compression of electronic data transaction request messages
A data transaction processing system receives electronic data transaction request messages from client computers over a data communication network and groups a subset of the electronic data transaction request messages. The data transaction processing system may preprocess the group of electronic data transaction request messages based on the other messages in the same group before forwarding the electronic data transaction request messages to a transaction processor.
Technologies for efficient LZ77-based data decompression
Technologies for data decompression include a computing device that reads a symbol tag byte from an input stream. The computing device determines whether the symbol can be decoded using a fast-path routine, and if not, executes a slow-path routine to decompress the symbol. The slow-path routine may include data-dependent branch instructions that may be unpredictable using branch prediction hardware. For the fast-path routine, the computing device determines a next symbol increment value, a literal increment value, a data length, and an offset based on the tag byte, without executing an unpredictable branch instruction. The computing device sets a source pointer to either literal data or reference data as a function of the tag byte, without executing an unpredictable branch instruction. The computing device may set the source pointer using a conditional move instruction. The computing device copies the data and processes remaining symbols. Other embodiments are described and claimed.
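The table-driven fast path can be sketched as follows, using a toy tag-byte layout (low bit = literal flag, upper bits = length) that is an assumption for illustration, not the patented format. Python cannot express conditional-move instructions, so the sketch shows only the idea of precomputing the next-symbol increment, literal increment, and length per tag byte so the hot loop needs no data-dependent decoding branches:

```python
# Precompute decode info for every possible tag byte, once.
# Entry: (is_literal, data_length, next_symbol_increment, literal_increment)
TABLE = []
for tag in range(256):
    is_literal = tag & 1
    length = tag >> 1
    # Advance past 1 tag byte plus its payload: literal bytes for a
    # literal, or a 1-byte offset for a back-reference copy.
    sym_inc = 1 + (length if is_literal else 1)
    lit_inc = length if is_literal else 0
    TABLE.append((is_literal, length, sym_inc, lit_inc))

def decompress(data: bytes) -> bytes:
    """Decode the toy format: literals copy bytes from the input,
    copies repeat `length` bytes starting `offset` back in the output
    (overlapping copies are handled byte by byte)."""
    out = bytearray()
    i = 0
    while i < len(data):
        is_lit, length, sym_inc, _ = TABLE[data[i]]
        if is_lit:
            out += data[i + 1 : i + 1 + length]
        else:
            offset = data[i + 1]
            for _ in range(length):
                out.append(out[-offset])
        i += sym_inc
    return bytes(out)
```

In the hardware-oriented fast path, the `if is_lit` source selection would become a conditional move between a literal pointer and a reference pointer; the table lookup already fixes all lengths and increments without branching.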
Coding method, decoding method, coder, and decoder
A coding method, a decoding method, a coder, and a decoder. The coding method includes obtaining the distribution, on a track, of the pulses to be encoded on that track; determining a distribution identifier that identifies the pulse distribution; and generating a coding index that includes the distribution identifier. The decoding method includes receiving a coding index; obtaining from the coding index a distribution identifier, which identifies the distribution, on a track, of the pulses encoded on that track; determining the distribution of all the pulses on the track according to the distribution identifier; and reconstructing the pulse order on the track according to the pulse distribution.
Efficient adaptive seismic data flow lossless compression and decompression method
An efficient adaptive lossless compression and decompression method for seismic data streams, which addresses the problem that such data occupies storage space and reduces transmission efficiency; it is used to compress geophysical instrument data efficiently, particularly seismic data produced by 24-bit analog-to-digital conversion. In the method, a data stream is compressed losslessly in real time, and each sample is adaptively encoded into 1, 2, or 3 bytes from its original 24 bits (3 bytes). Signed 24-bit integers falling outside those adaptively encodable ranges are expressed in 4 bytes after passing through the compression algorithm. The method saves a large amount of storage space and markedly increases data transmission efficiency.
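One way to realize this kind of adaptive 1/2/3-byte coding with a 4-byte escape is sketched below; the zigzag mapping, the 2-bit length prefix, and the exact value ranges are assumptions chosen for illustration, not the patented code layout:

```python
def zigzag(n: int) -> int:
    """Map signed to unsigned so small-magnitude samples stay small:
    0, -1, 1, -2, 2, ... -> 0, 1, 2, 3, 4, ..."""
    return (n << 1) ^ (n >> 31)

def encode_sample(n: int) -> bytes:
    """Encode one signed 24-bit sample into 1, 2, or 3 bytes; the top
    2 bits of the first byte store the length code, so each size holds
    6, 14, or 22 payload bits. Anything larger escapes to 4 bytes."""
    z = zigzag(n)
    for nbytes in (1, 2, 3):
        if z < 1 << (8 * nbytes - 2):              # fits beside the prefix
            header = (nbytes - 1) << (8 * nbytes - 2)
            return (header | z).to_bytes(nbytes, "big")
    return ((3 << 30) | z).to_bytes(4, "big")      # escape: 4-byte form

def decode_sample(buf: bytes) -> tuple:
    """Return (value, bytes_consumed); the length code in the top 2 bits
    of the first byte makes the stream self-delimiting."""
    nbytes = (buf[0] >> 6) + 1
    raw = int.from_bytes(buf[:nbytes], "big")
    z = raw & ((1 << (8 * nbytes - 2)) - 1)
    return (z >> 1) ^ -(z & 1), nbytes             # undo zigzag
```

Small-amplitude samples, which dominate quiet portions of a seismic trace, therefore cost a single byte instead of three, while full-scale samples pay a one-byte penalty.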
Differential data creating apparatus, data updating apparatus, and differential data creating method
The present invention aims to provide a technology capable of enhancing the reduction in size of differential data. A bit shift unit shifts either the old data or the new data forward and backward along its bit string by each of 0, 1, 2, . . . , n bit(s) to generate a plurality of shifted data. A copy bit string extracting unit extracts information on copy bit strings based on the plurality of shifted data and the other, non-shifted data. An additional bit string extracting unit excludes the copy bit strings from the new data to extract information on additional bit strings. A differential data generating unit creates the differential data based on the information on copy bit strings and the information on additional bit strings.
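The shift-and-match idea can be sketched as follows, with bit strings held as integers; scoring shifts by the number of differing bit positions is a simplification of the copy-bit-string extraction (which finds runs of matching bits), and the function names are hypothetical:

```python
def bit_shifts(data: int, width: int, n: int):
    """Yield (shift, shifted_data) for shifts of 0..n bits in both the
    forward and backward direction, truncated to `width` bits."""
    mask = (1 << width) - 1
    for k in range(n + 1):
        yield k, (data << k) & mask       # forward shift
        if k:
            yield -k, (data >> k) & mask  # backward shift

def best_shift(old: int, new: int, width: int, n: int = 4):
    """Pick the shift of the old data that agrees with the new data in
    the most bit positions; fewer differing bits means longer copy bit
    strings and shorter additional bit strings in the differential."""
    return min(bit_shifts(old, width, n),
               key=lambda s: bin(s[1] ^ new).count("1"))
```

When new firmware differs from old mainly by inserted bits, a plain bytewise diff sees almost nothing in common, while one of the shifted copies realigns the shared content almost exactly.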
Pad encoding and decoding
A system, method and computer program product for encoding an input string of binary characters representing alphanumeric characters. A system includes: a character writing engine for writing a binary character to an empty cell of a multi-dimensional shape, beginning with a starting empty cell; a next cell determination engine for determining a next empty cell by traversing neighboring cells in the multi-dimensional shape until an empty cell is located; a loop facilitator for looping back to the character writing engine and the next cell determination engine until there are no more data characters or a next empty cell cannot be determined; and a serialization engine for serializing the cells into a one-dimensional binary string of characters representing an encoded string of alphanumeric characters.
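A minimal two-dimensional sketch of this scheme is shown below; the inward spiral is just one possible neighbor-traversal rule (the claims cover multi-dimensional shapes generally), and padding unused cells with `"0"` before row-major serialization is an assumption:

```python
def spiral_cells(n: int):
    """Yield (row, col) coordinates of an n x n grid in inward spiral
    order: top row left-to-right, right column down, bottom row
    right-to-left, left column up, then recurse inward."""
    top, bottom, left, right = 0, n - 1, 0, n - 1
    while top <= bottom and left <= right:
        for c in range(left, right + 1):
            yield top, c
        for r in range(top + 1, bottom + 1):
            yield r, right
        if top < bottom:
            for c in range(right - 1, left - 1, -1):
                yield bottom, c
        if left < right:
            for r in range(bottom - 1, top, -1):
                yield r, left
        top, bottom, left, right = top + 1, bottom - 1, left + 1, right - 1

def pad_encode(bits: str, n: int) -> str:
    """Write the input bits into the grid along the spiral, then
    serialize the cells row by row into a one-dimensional string."""
    grid = [["0"] * n for _ in range(n)]   # unused cells padded with "0"
    for b, (r, c) in zip(bits, spiral_cells(n)):
        grid[r][c] = b
    return "".join("".join(row) for row in grid)
```

Because the write order (spiral) differs from the serialization order (row-major), the output is a permutation of the padded input, e.g. `pad_encode("1010", 2)` yields `"1001"`.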
Methods and apparatus to parallelize data decompression
An example method to parallelize data decompression includes adjusting a first one of a set of initial starting positions to determine a first adjusted starting position, by decoding the bitstream starting at a training position in the bitstream; the decoding includes traversing the bitstream from the training position as though the first data located at the training position were a valid token. The method further includes merging, by executing an instruction with the processor, first decoded data, generated by decoding a first segment of the compressed data bitstream starting from the first adjusted starting position, with second decoded data, generated by decoding a second segment of the compressed data bitstream; the decoding of the second segment starts from a second position in the compressed data bitstream, is performed in parallel with the decoding of the first segment, and the second segment precedes the first segment in the compressed data bitstream.