Patent classifications
H03M7/3088
Streaming-friendly technology for detection of data
A method by a network device for detecting data in a data stream. The method includes receiving the data stream, where the data stream includes a sequence of original characters, generating a sequence of type-mapped characters corresponding to the sequence of original characters, converging each of two or more consecutive occurrences of a first character in the sequence of type-mapped characters into a single occurrence of the first character, inserting beginning/ending of segment indicators in the sequence of type-mapped characters, searching for occurrences of one or more predefined sequences of characters in the sequence of type-mapped characters, and responsive to finding an occurrence of any of the one or more predefined sequences of characters, extracting a sequence of characters in the sequence of original characters corresponding to the predefined sequence of characters found in the sequence of type-mapped characters.
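The steps in this abstract (type mapping, converging runs, inserting segment indicators, searching, and extracting the original characters) can be illustrated with a minimal sketch. The type symbols `D`/`A`/`S`, the `BEGIN`/`END` indicator characters, the choice to converge only letter runs, and the function names are all assumptions made for illustration; the abstract does not specify them. For brevity the sketch works on a whole string rather than an incremental stream.

```python
BEGIN, END = "\x02", "\x03"  # hypothetical beginning/ending-of-segment indicators

def type_map(ch):
    """Map an original character to a coarse type symbol (assumed mapping)."""
    if ch.isdigit():
        return "D"
    if ch.isalpha():
        return "A"
    if ch.isspace():
        return "S"
    return ch  # punctuation is kept unchanged

def detect(stream, patterns, converge="A"):
    """Return original substrings whose type-mapped form matches a pattern."""
    mapped = [BEGIN]   # type-mapped characters, runs of `converge` merged
    spans = [(0, 0)]   # (start, end) in `stream` covered by each mapped char
    for i, ch in enumerate(stream):
        t = type_map(ch)
        if t == converge and mapped[-1] == converge:
            # Converge consecutive occurrences into one, widening its span.
            spans[-1] = (spans[-1][0], i + 1)
        else:
            mapped.append(t)
            spans.append((i, i + 1))
    mapped.append(END)
    spans.append((len(stream), len(stream)))

    text = "".join(mapped)
    hits = []
    for p in patterns:
        pos = text.find(p)
        while pos != -1:
            start = spans[pos][0]
            end = spans[pos + len(p) - 1][1]
            hits.append(stream[start:end])  # extract the original characters
            pos = text.find(p, pos + 1)
    return hits

print(detect("call John at 555-1234 now", ["DDD-DDDD"]))
```

Keeping a span per type-mapped character is one simple way to recover the original characters after runs have been merged; other bookkeeping schemes would work equally well.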
Method, device and system for data compression and decompression
A method, device, and system for data compression and decompression are provided. The method for data compression comprises: converting data to be transmitted within each period from the time domain to the frequency domain, wherein a default time length is set as a period; identifying weak power frequencies in the frequency domain data according to a set identification rule; weighting data transmitted on the identified weak power frequencies to obtain corresponding weighting information; converting the other data converted to the frequency domain, together with the weighted data, back to the time domain; compressing the data converted back to the time domain; and transmitting the compressed data along with the weighting information.
USING PREDICATES IN CONDITIONAL TRANSCODER FOR COLUMN STORE
A storage device is disclosed. The storage device may comprise storage for input encoded data. A controller may process read requests and write requests from a host computer on the data in the storage. An in-storage compute controller may receive a predicate from the host computer to be applied to the input encoded data. A transcoder may include an index mapper to map an input dictionary to an output dictionary, with one entry in the input dictionary mapped to an entry in the output dictionary, and another entry in the input dictionary mapped to a “don't care” entry in the output dictionary.
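The index mapper described here can be sketched as follows, assuming a dictionary-encoded column stored as integer codes. The reserved code `DONT_CARE`, the placeholder entry, and the function names are hypothetical; the abstract only states that some input entries map to output entries and others to a "don't care" entry.

```python
DONT_CARE = 0  # hypothetical reserved code in the output dictionary

def build_index_map(input_dict, predicate):
    """Map each input-dictionary index to an output-dictionary index."""
    output_dict = ["<don't care>"]  # entry 0 stands for all rejected values
    index_map = []
    for value in input_dict:
        if predicate(value):
            index_map.append(len(output_dict))
            output_dict.append(value)
        else:
            index_map.append(DONT_CARE)
    return index_map, output_dict

def transcode(codes, index_map):
    """Rewrite encoded data from input codes to output codes."""
    return [index_map[c] for c in codes]

input_dict = ["apple", "banana", "cherry"]
index_map, output_dict = build_index_map(input_dict, lambda v: v.startswith("b"))
print(index_map, output_dict, transcode([0, 1, 2, 1], index_map))
```

Because the predicate is evaluated once per dictionary entry rather than once per row, the encoded data itself is only touched by a table lookup.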
Computer-readable recording medium, encoding device, and encoding method
The encoding device 100 extracts, when encoding a target file by using a static dictionary unit 121 and a dynamic dictionary unit 122, a registered word included in an external dictionary unit 221 from among words registered in the dynamic dictionary unit 122, in which the external dictionary associates a specific word group and a code group with each other; and registers, in the dynamic dictionary unit 122, a code of the registered word in the external dictionary unit 221 and a dynamic code assigned dynamically in association with each other.
Content estimation data compression
The present disclosure is directed to systems and methods for providing fast and efficient data compression using a combination of content dependent, content estimation, and content independent data compression. In one aspect of the disclosure a method for compressing data comprises the steps of: analyzing a data block of an input data stream to identify a data type of the data block, the input data stream comprising a plurality of disparate data types; performing content dependent data compression on the data block, if the data type of the data block is identified; performing content estimation data compression if the content is estimable; and performing content independent data compression on the data block, if the data type of the data block is not identified or estimable. In another aspect of the present invention LZDR compression is applied to simultaneously perform one method of compression while computing statistics useful in estimating the optimal form of compression to be applied.
COMPUTER-READABLE RECORDING MEDIUM, ENCODING DEVICE, ENCODING METHOD, DECODING DEVICE, AND DECODING METHOD
An encoding device 100 encodes a plurality of input text files to a plurality of encoded files by using a static dictionary unit 121 and a dynamic dictionary unit 122. The dynamic dictionary unit 122 is generated in accordance with word appearance frequencies in the plurality of text files. The encoding device 100 generates a coupled encoded file that includes the plurality of encoded files, information on the dynamic dictionary unit 122, and position information that indicates positions of the respective plurality of encoded files.
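A coupled encoded file with position information could be laid out as in the sketch below. The byte layout (a count and dictionary length, then an offset/length table, then the dictionary information, then the encoded files) is purely an assumption for illustration; the abstract does not describe a concrete format.

```python
import struct

def couple(encoded_files, dictionary_blob):
    """Concatenate encoded files with dictionary info and a position table."""
    header = struct.pack("<II", len(encoded_files), len(dictionary_blob))
    table = b""
    body = b""
    for f in encoded_files:
        # Position information: offset and length of each encoded file.
        table += struct.pack("<II", len(body), len(f))
        body += f
    return header + table + dictionary_blob + body

def extract(coupled, index):
    """Read one encoded file back using only the position table."""
    count, dict_len = struct.unpack_from("<II", coupled, 0)
    offset, length = struct.unpack_from("<II", coupled, 8 + 8 * index)
    base = 8 + 8 * count + dict_len
    return coupled[base + offset: base + offset + length]

blob = couple([b"aaa", b"bb", b"cccc"], b"DICT")
print(extract(blob, 1))
```

The position table is what lets a reader jump to one encoded file without scanning the others.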
Predicate application through partial compression dictionary match
Apparatus and systems, including computer program products, implementing and using techniques for predicate application using partial compression dictionary match. A search strategy is developed for each predicate to be applied to compressed data. The compressed data is searched using the search strategy to locate the compression symbols identified in the search strategy. In response to locating a compression symbol from the search strategy in the compressed data, a respective row is decompressed and the predicate is applied, and a respective row that matches the predicate is returned to a database engine or an application.
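The idea of searching compressed rows for selected symbols before decompressing can be sketched as below. The sketch assumes rows are stored as lists of dictionary codes and that the searched text falls entirely within a single dictionary entry; the names `select`, `needle`, and the sample dictionary are hypothetical.

```python
def decompress(row, dictionary):
    """Expand a row of compression symbols into text."""
    return "".join(dictionary[code] for code in row)

def select(rows, dictionary, needle, predicate):
    """Decompress only rows containing a symbol that could satisfy the predicate."""
    # Search strategy: the set of symbols whose expansion contains the needle.
    wanted = {code for code, text in dictionary.items() if needle in text}
    matches = []
    for row in rows:
        if wanted.intersection(row):        # symbol located in compressed data
            text = decompress(row, dictionary)
            if predicate(text):             # apply the full predicate
                matches.append(text)
    return matches

dictionary = {0: "red ", 1: "green ", 2: "apple", 3: "pear"}
rows = [[0, 2], [1, 3], [1, 2]]
print(select(rows, dictionary, "apple", lambda t: t.endswith("apple")))
```

Rows that contain none of the wanted symbols are never decompressed, which is where the saving comes from.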
High-Throughput Compression of Data
A mechanism is provided for high-throughput compression of data. Responsive to receiving an indication of a match of a current 4-byte sequence from an incoming data stream to stored hash values in a set of hash tables, numerous variables are set to initial values. Responsive to receiving a subsequent 4-byte sequence from the incoming data stream and determining that an active match variable is set to one, the subsequent 4-byte sequence is compared to data in a copy of the incoming data stream in memory at an active position with a predefined length offset. A constraint variable is set to a number of bytes for which the match is to be extended. Responsive to the constraint variable being below a predetermined number, a length, distance pair is output indicating a match to a previous pattern in the incoming data stream.
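For orientation, the general technique of hashing 4-byte sequences to find earlier matches and emitting (length, distance) pairs can be shown with a plain software sketch. This is a generic LZ77-style illustration, not the mechanism claimed in the abstract: it uses a single Python dict instead of a set of hash tables and omits the active match and constraint variables.

```python
def compress(data, min_match=4):
    """Emit literals and (length, distance) matches using a 4-byte hash table."""
    table = {}   # 4-byte sequence -> most recent position where it started
    tokens = []
    i = 0
    while i < len(data):
        key = data[i:i + min_match]
        cand = table.get(key) if len(key) == min_match else None
        if cand is not None:
            length = min_match
            # Extend the match byte by byte against the earlier data.
            while i + length < len(data) and data[cand + length] == data[i + length]:
                length += 1
            tokens.append(("match", length, i - cand))
            for j in range(i, i + length):   # index positions inside the match
                k = data[j:j + min_match]
                if len(k) == min_match:
                    table[k] = j
            i += length
        else:
            if len(key) == min_match:
                table[key] = i
            tokens.append(("lit", data[i]))
            i += 1
    return tokens

def decompress(tokens):
    """Rebuild the data; byte-wise copying handles overlapping matches."""
    buf = bytearray()
    for tok in tokens:
        if tok[0] == "lit":
            buf.append(tok[1])
        else:
            _, length, distance = tok
            for _ in range(length):
                buf.append(buf[-distance])
    return bytes(buf)

data = b"abcdabcdabcdXYZ"
print(compress(data))
```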
Selection of data compression technique based on input characteristics
A compression scheme can be selected for an input data stream based on characteristics of the input data stream. For example, when the input data stream is searched for pattern matches, input stream characteristics used to select a compression scheme can include one or more of: type and size of an input stream, a length of a pattern, a distance from a start of where the pattern is to be inserted to the beginning of where the pattern occurred previously, a gap between two pattern matches (including different or same patterns), standard deviation of a length of a pattern, standard deviation of a distance from a start of where the pattern is to be inserted to the beginning of where the pattern occurred previously, or standard deviation of a gap between two pattern matches. Criteria can be established whereby one or more characteristics are used to select a particular encoding scheme.
METHOD AND APPARATUS FOR A DICTIONARY COMPRESSION ACCELERATOR
Apparatus and method for dictionary accelerator compression. For example, one embodiment of an apparatus comprises: a plurality of cores; a compression/decompression accelerator coupled to or integral to one or more of the plurality of cores, the compression/decompression accelerator to perform decompression and compression operations in response to read and write operations, respectively, wherein responsive to notification of a compression job to compress a memory page or a portion thereof, a history buffer associated with the compression/decompression accelerator is to be initialized with pre-configured dictionary data, the compression/decompression accelerator to match portions of the pre-configured dictionary data with portions of the memory page to generate compressed output data.
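A software analogue of initializing a history buffer with pre-configured dictionary data is the preset dictionary supported by DEFLATE, exposed in Python's standard `zlib` module through the `zdict` parameter. The sketch below only illustrates that general idea; the function names and sample data are assumptions, and it says nothing about the hardware accelerator itself.

```python
import zlib

def compress_page(page, preset):
    """Compress `page` with the history primed by `preset` dictionary data."""
    comp = zlib.compressobj(zdict=preset)
    return comp.compress(page) + comp.flush()

def decompress_page(blob, preset):
    """Decompress using the same preset dictionary."""
    decomp = zlib.decompressobj(zdict=preset)
    return decomp.decompress(blob) + decomp.flush()

preset = b'{"status": "ok", "user_id": '      # hypothetical dictionary data
page = b'{"status": "ok", "user_id": 42}'      # hypothetical page contents

with_dict = compress_page(page, preset)
without_dict = zlib.compress(page)
print(len(with_dict), len(without_dict))
```

Short inputs benefit most, because matches against the preset data are available from the first byte instead of only after the history has filled up.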