H03M7/3088

System, methods, and media for compressing non-relational database objects
09727629 · 2017-08-08 · ·

Method, media, and systems for compressing objects, comprising: receiving a request to write a first object including a first key and a first value, wherein the first object is of a given type; receiving a request to write a second object including a second key and a second value, wherein the second object is of the given type; classifying the first object to a compression dictionary according to at least one rule based on a value of the first object and/or the key of the first object; classifying the second object to the compression dictionary according to at least one rule based on a value of the second object and/or the key of the second object; and compressing the first object and the second object based on the compression dictionary.
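
The classification step described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the key-prefix rule, the `DICTIONARIES` table, and the function names are all assumptions, and zlib's preset-dictionary feature stands in for whatever dictionary coder the claims cover.

```python
import zlib

# Hypothetical rule table: key prefix -> preset compression dictionary.
DICTIONARIES = {
    "user:": b'{"name": "", "email": "", "country": ""}',
    "order:": b'{"order_id": "", "sku": "", "quantity": 0}',
}


def classify(key: str) -> str:
    """Return the dictionary label chosen by a simple key-prefix rule."""
    for prefix in DICTIONARIES:
        if key.startswith(prefix):
            return prefix
    raise KeyError(f"no dictionary rule matches {key!r}")


def compress_object(key: str, value: bytes) -> bytes:
    """Compress a value with the dictionary its key was classified to."""
    comp = zlib.compressobj(zdict=DICTIONARIES[classify(key)])
    return comp.compress(value) + comp.flush()


def decompress_object(key: str, blob: bytes) -> bytes:
    """Reverse compress_object; the same dictionary must be supplied."""
    decomp = zlib.decompressobj(zdict=DICTIONARIES[classify(key)])
    return decomp.decompress(blob) + decomp.flush()
```

Two objects whose keys share a prefix are classified to the same dictionary, so field names that recur across objects of that type can be encoded as references into the dictionary rather than as literals.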

Decompression of a compressed data unit
09729168 · 2017-08-08 · ·

A method that may include retrieving, by a decompression processor, a compressed data unit; wherein the compressed data unit comprises a control section and a data section; wherein the control section comprises multiple decompression instructions for a retrieval of data portions from one or more sources; wherein the one or more sources comprise the data section; wherein the control section does not include any data portion; and executing, by the decompression processor, the multiple decompression instructions to provide a decompressed data unit.
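
A small sketch of the control/data split may help. The instruction format `(source, offset, length)` and the two source names are assumptions made for illustration; the abstract only requires that instructions retrieve data portions from one or more sources and that the control section itself carries no data.

```python
from typing import List, Tuple

# Hypothetical instruction layout: (source, offset, length).
# "data" reads from the data section, "out" reads from output produced so far.
Instruction = Tuple[str, int, int]


def decompress(control: List[Instruction], data: bytes) -> bytes:
    """Execute the control section's instructions against the data section."""
    out = bytearray()
    for source, offset, length in control:
        if source == "data":
            if offset < 0 or offset + length > len(data):
                raise ValueError("instruction reads past the data section")
            out += data[offset:offset + length]
        elif source == "out":
            if length and not 0 <= offset < len(out):
                raise ValueError("instruction reads past the output")
            # Byte-by-byte copy so a run may overlap the bytes it is producing.
            for i in range(length):
                out.append(out[offset + i])
        else:
            raise ValueError(f"unknown source {source!r}")
    return bytes(out)
```

Because every literal byte lives in the data section, the control section stays a compact list of fixed-shape instructions that can be parsed without touching the payload.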

Detection of unknown code page indexing tokens

A method for determining an encoding used for a sequence of bytes may be provided. The method comprises providing a set of candidate code pages and transforming them into different groups of sequences of bytes, wherein each group of sequences of bytes corresponds to one of the candidate code pages. Thereby each code point is transformed by applying a transformation from one of the candidate code pages to a reference code point value relating to a reference encoding for each code point. The method further comprises separating each of the transformed sequences of bytes into groups of tokens, wherein each group of tokens relates to one candidate code page, and providing an index relating to a text corpus. Furthermore, the method comprises selecting a code page from the set of candidate code pages at least partially based on how many tokens are found in the index.
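
The selection step can be sketched as below, assuming Unicode as the reference encoding, a plain Python set as the corpus index, and a simple word-character tokenizer. The function name and these choices are illustrative, not taken from the patent.

```python
import re
from typing import Iterable, Set


def detect_code_page(raw: bytes, candidates: Iterable[str], index: Set[str]) -> str:
    """Pick the candidate code page whose decoding yields the most indexed tokens."""
    best, best_hits = None, -1
    for code_page in candidates:
        try:
            # Transform the bytes to reference (Unicode) code points.
            text = raw.decode(code_page)
        except UnicodeDecodeError:
            continue  # bytes are not valid in this code page
        tokens = re.findall(r"\w+", text.lower())
        hits = sum(1 for token in tokens if token in index)
        if hits > best_hits:
            best, best_hits = code_page, hits
    if best is None:
        raise ValueError("no candidate code page could decode the input")
    return best
```

Ties go to the earliest candidate, so the order of `candidates` acts as a priority list when the index cannot tell two code pages apart.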

Encoding method and decoding method for a list of identifiers, associated computer program products, transmitter and receiver applying said methods
09722921 · 2017-08-01 · ·

This method (100) for encoding a list of identifiers in a network including a transmitter including a global list of identifiers and able to transmit a code corresponding to a coded list of identifiers, comprises the following steps: associating (105) with each identifier an ordinal number; ordering the list of identifiers in order to obtain a sorted list; defining (120) a variable equal to the number of identifiers in the list to be transmitted; if the variable is positive (125), coding (130, 135) the first identifier of the sorted list with a code corresponding to the number of sub-sets of the global list of cardinality equal to the number of identifiers of the sorted list and including at least one identifier, for which the ordinal number is in a strict order relationship with the ordinal number of the first identifier, removing (140) this identifier from the sorted list; coding (145) the list with the sum of the obtained codes.
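
Coding a list as a sum of subset counts resembles the standard combinatorial number system, sketched below. This is a swapped-in, well-known technique shown for orientation only; the exact counting rule in the claims may differ, and the function names and zero-based ordinals are assumptions.

```python
from math import comb
from typing import Iterable, List


def encode_subset(ordinals: Iterable[int]) -> int:
    """Rank a set of distinct zero-based ordinals as a single integer."""
    ordered = sorted(set(ordinals))
    # The i-th smallest ordinal c contributes C(c, i), with i starting at 1.
    return sum(comb(c, i) for i, c in enumerate(ordered, start=1))


def decode_subset(code: int, count: int) -> List[int]:
    """Recover the ordinals from the code and the number of identifiers."""
    result = []
    for i in range(count, 0, -1):
        c = i - 1
        # Largest c with C(c, i) <= remaining code.
        while comb(c + 1, i) <= code:
            c += 1
        code -= comb(c, i)
        result.append(c)
    return sorted(result)
```

In this scheme the receiver needs the number of identifiers as well as the code, and every subset of a given size maps to a distinct integer in a dense range starting at zero.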

DATA COMPRESSION FOR CELLULAR INTERNET OF THINGS (CIOT)
20170279934 · 2017-09-28 ·

Aspects of the present disclosure provide techniques for compressing data packets for cellular internet of things (CIoT) communications. An example method generally includes establishing at least one prefill buffer common to one or more UEs, wherein the prefill buffer includes a plurality of common strings, generating a compressed packet by finding matches to the common strings in at least one of a header portion or payload portion of the packet and associating identifiers with the common strings, and transmitting the packet.
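
As a rough sketch of the matching step, the code below replaces occurrences of prefill-buffer strings with their identifiers. The token representation (an `int` for an identifier, `bytes` for a literal run), the greedy longest-match policy, and the function names are assumptions; an over-the-air format would also need a byte-level framing that is not shown here.

```python
from typing import List, Union

# Hypothetical token stream: int = identifier of a common string,
# bytes = literal run that matched nothing in the prefill buffer.
Token = Union[int, bytes]


def compress_packet(packet: bytes, prefill: List[bytes]) -> List[Token]:
    """Greedily replace common strings in the packet with their identifiers."""
    # Try longer common strings first so the longest match wins.
    order = sorted(range(len(prefill)), key=lambda i: len(prefill[i]), reverse=True)
    tokens: List[Token] = []
    literal = bytearray()
    pos = 0
    while pos < len(packet):
        for ident in order:
            common = prefill[ident]
            if common and packet.startswith(common, pos):
                if literal:
                    tokens.append(bytes(literal))
                    literal.clear()
                tokens.append(ident)
                pos += len(common)
                break
        else:
            literal.append(packet[pos])
            pos += 1
    if literal:
        tokens.append(bytes(literal))
    return tokens


def decompress_packet(tokens: List[Token], prefill: List[bytes]) -> bytes:
    """Expand identifiers back into the common strings they stand for."""
    return b"".join(prefill[t] if isinstance(t, int) else t for t in tokens)
```

Because the prefill buffer is shared ahead of time, the first packet of a session can already benefit from it, which matters for the short, infrequent transmissions typical of CIoT devices.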

Data dictionary with a reduced need for rebuilding

A processor receives statistical information about a data set included in a column of a data table. The processor receives additional information about the data set that indicates a data format utilized by the data set and a type of information represented by the data set. The processor generates a data dictionary for compression of the data set based, at least in part, on the statistical information and the additional information. The data dictionary is created such that the data dictionary is capable of compressing data that is statistically predicted to be received at a future point.

SIMILARITY DEDUPLICATION
20220237155 · 2022-07-28 · ·

Dictionary-based compression is performed to compress data units using a similar data unit as the base unit (i.e., dictionary) for each candidate data unit. Similarity may be determined between data units by applying a locality-sensitive hashing scheme to each candidate data unit to produce a hash value, and by determining whether there is a matching value in a hash index of hash values for existing data units on the system. If there is a matching hash value, the candidate data unit may be compressed using the data unit corresponding to the matching hash value as the dictionary. Only a representative portion of the data unit may be hashed to produce the hash value, the portion comprised of chunks of the data unit, where each chunk is a continuous, uninterrupted section of data. The chunks themselves may not be (in some embodiments likely are not) contiguous to one another.
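
The flow above can be sketched with a single-value min-hash over sampled chunks as a stand-in locality-sensitive hash and zlib's preset dictionary as the dictionary coder. The chunk size, stride, shingle length, and class name are assumptions chosen for illustration, not parameters from the application.

```python
import zlib
from typing import Dict, Tuple

CHUNK_LEN = 64       # assumed size of each sampled chunk
CHUNK_STRIDE = 1024  # assumed gap between chunk starts (chunks are not contiguous)
SHINGLE = 8          # assumed shingle length for the min-hash


def similarity_hash(unit: bytes) -> int:
    """Min-hash over shingles taken only from the sampled chunks."""
    best = None
    for start in range(0, len(unit), CHUNK_STRIDE):
        chunk = unit[start:start + CHUNK_LEN]
        for i in range(max(1, len(chunk) - SHINGLE + 1)):
            h = zlib.crc32(chunk[i:i + SHINGLE])
            if best is None or h < best:
                best = h
    return 0 if best is None else best


class SimilarityStore:
    """Hash index mapping similarity hashes to existing (base) data units."""

    def __init__(self) -> None:
        self.index: Dict[int, bytes] = {}

    def write(self, unit: bytes) -> Tuple[int, bool, bytes]:
        """Return (hash, used_base, compressed bytes) for a candidate unit."""
        key = similarity_hash(unit)
        base = self.index.get(key)
        if base is None:
            self.index[key] = unit          # becomes a base for later units
            comp = zlib.compressobj()
        else:
            comp = zlib.compressobj(zdict=base)  # base unit acts as the dictionary
        return key, base is not None, comp.compress(unit) + comp.flush()

    def read(self, key: int, used_base: bool, blob: bytes) -> bytes:
        """Decompress, supplying the base unit when one was used."""
        if used_base:
            decomp = zlib.decompressobj(zdict=self.index[key])
        else:
            decomp = zlib.decompressobj()
        return decomp.decompress(blob) + decomp.flush()
```

Hashing only the sampled chunks keeps the cost of the similarity check low, at the price of missing units that differ inside a sampled region.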

NEAR-STORAGE ACCELERATION OF DICTIONARY DECODING

An accelerator is disclosed. The accelerator may include a memory that may store a dictionary table. An address generator may be configured to generate an address in the dictionary table based on an encoded value, which may have an encoded width. An output filter may be configured to filter a decoded value from the dictionary table based on the encoded value, the encoded width, and a decoded width of the decoded data. The accelerator may be configured to support at least two different encoded widths.
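
A software model of the address generator and output filter might look like the sketch below. The row size, the little-endian bit packing, and the function names are assumptions, and this simplified filter selects the slot from the encoded value and decoded width only; the disclosed hardware may use the encoded width in the filter as well.

```python
from typing import List

ROW_BYTES = 8  # assumed width of one dictionary-table row in memory


def unpack_codes(packed: bytes, encoded_width: int, count: int) -> List[int]:
    """Extract `count` codes of `encoded_width` bits each (little-endian packing)."""
    bits = int.from_bytes(packed, "little")
    mask = (1 << encoded_width) - 1
    return [(bits >> (i * encoded_width)) & mask for i in range(count)]


def address_generator(code: int, decoded_width: int) -> int:
    """Map an encoded value to the table row that holds its decoded value."""
    return code // (ROW_BYTES // decoded_width)


def output_filter(row: bytes, code: int, decoded_width: int) -> bytes:
    """Select the decoded value from the fetched row."""
    slot = code % (ROW_BYTES // decoded_width)
    return row[slot * decoded_width:(slot + 1) * decoded_width]


def decode(packed: bytes, encoded_width: int, count: int,
           table: bytes, decoded_width: int) -> List[bytes]:
    """Decode a packed column; any encoded width can share the same table."""
    out = []
    for code in unpack_codes(packed, encoded_width, count):
        addr = address_generator(code, decoded_width)
        row = table[addr * ROW_BYTES:(addr + 1) * ROW_BYTES]
        out.append(output_filter(row, code, decoded_width))
    return out
```

The same table serves both a 3-bit and a 4-bit packing of the codes, which is the sense in which the model supports at least two encoded widths.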

Efficient storage and retrieval of localized software resource data

A method of and system for compressing and decompressing a localized software resource is disclosed. The method may include receiving a software resource in a first language and receiving a localized software resource in a second language for compression, where the software resource in the first language is a counterpart of the localized software resource in the second language. Upon receiving the software resources, the method may include creating a first local dictionary for the localized software resource based at least in part on one or more first language words in the software resource and on data from a global dictionary, and compressing the localized software resource based on the local dictionary.

VARIABLE SPREADING FACTOR CODES FOR NON-ORTHOGONAL MULTIPLE ACCESS

Aspects of the present disclosure provide techniques for variable spreading factor codes for non-orthogonal multiple access (NOMA). In an exemplary method, a base station assigns, from a first codebook of N short code sequences of length K, a subset of the short code sequences to a number of user equipments (UEs); receives a signal including uplink data or control signals from two or more of the UEs, wherein a first uplink data or control signal is sent using a first subsequence of one of the assigned short code sequences, and a second uplink data or control signal is sent using a second subsequence of one of the assigned short code sequences or using one of the assigned short code sequences; and decodes each uplink data or control signal in the signal based on the assigned short code sequences and subsequences of the assigned short code sequences.