H03M7/3091

COMPUTING SYSTEM WITH DATA TRANSFER BASED UPON DEVICE DATA FLOW CHARACTERISTICS AND RELATED METHODS
20200236196 · 2020-07-23

A computing system may include a server, and a client computing device in communication with the server. The server may be configured to provide a corresponding virtual desktop instance for the client computing device. The computing system may include a local device to be coupled to a given client computing device and to be operable in a given virtual desktop instance associated with the given client computing device, thereby generating client initialization packets. The server may be configured to generate a server mapping table. The given client computing device may be configured to generate a client mapping table, replace a client packet with a client mapping ID number to define compressed client initialization packets, and send the compressed client initialization packets to the server. The server may be configured to replace the client mapping ID number with the client packet in the compressed client initialization packets based upon the server mapping table.
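
The mapping-table substitution described above can be illustrated with a minimal Python sketch. It assumes both sides derive the same table from a shared list of known initialization packets; the function names and the choice of small integers as mapping IDs are hypothetical and not taken from the application.

```python
def build_mapping_table(known_packets):
    """Assign a mapping ID to each distinct packet (hypothetical ID scheme)."""
    table = {}
    for packet in known_packets:
        if packet not in table:
            table[packet] = len(table)
    return table


def compress(packets, client_table):
    """Client side: replace each known packet with its mapping ID."""
    return [client_table.get(p, p) for p in packets]


def decompress(wire, server_table):
    """Server side: replace each mapping ID with the original packet."""
    return [server_table[w] if isinstance(w, int) else w for w in wire]


known = [b"GET_DESCRIPTOR", b"SET_CONFIG"]
client_table = build_mapping_table(known)                 # packet -> ID
server_table = {i: p for p, i in client_table.items()}    # ID -> packet

packets = [b"GET_DESCRIPTOR", b"UNKNOWN", b"SET_CONFIG", b"GET_DESCRIPTOR"]
wire = compress(packets, client_table)
restored = decompress(wire, server_table)
```

Packets absent from the table travel unchanged, so the round trip stays lossless even when the table is incomplete.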

SCALABLE BINNING FOR BIG DATA DEDUPLICATION
20200233597 · 2020-07-23

Fast record deduplication is accomplished by providing, as input, data records having multiple attributes, together with local similarity functions of individual attributes and their local similarity thresholds. Bin IDs are then generated based on the local similarity functions and the local similarity thresholds. Each Bin ID is a unique identifier of a respective bin of records, and a bin of records is a set of records that are possibly pairwise similar. Local candidate pairs are identified based on data records that share Bin IDs. The local candidate pairs are aggregated to produce a set of global candidate pairs. The set of global candidate pairs is filtered by deciding whether each pair of data records represents a duplicate.
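
The binning pipeline can be sketched as follows, under several assumptions that the abstract does not state: attributes are numeric, the local similarity function is an absolute difference, each record is placed in two adjacent bins so that any two values within the threshold share at least one bin, and "aggregation" means intersecting the per-attribute candidate sets. All names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def bin_ids(value, threshold):
    """Two adjacent bins, so values within `threshold` always share one."""
    k = int(value // threshold)
    return (k, k + 1)


def local_candidates(records, attr, threshold):
    """Pairs of record indices that share a Bin ID for one attribute."""
    bins = defaultdict(list)
    for idx, rec in enumerate(records):
        for b in bin_ids(rec[attr], threshold):
            bins[b].append(idx)
    pairs = set()
    for members in bins.values():
        pairs.update(combinations(members, 2))
    return pairs


def deduplicate(records, thresholds):
    """Aggregate local pairs (assumed: intersection), then filter exactly."""
    per_attr = [local_candidates(records, a, t) for a, t in thresholds.items()]
    global_pairs = set.intersection(*per_attr)
    return sorted(
        (i, j)
        for i, j in global_pairs
        if all(abs(records[i][a] - records[j][a]) <= t
               for a, t in thresholds.items())
    )


records = [
    {"age": 30, "zip": 10001},
    {"age": 31, "zip": 10002},
    {"age": 55, "zip": 10001},
]
duplicates = deduplicate(records, {"age": 2, "zip": 5})
```

Only records that share a bin are ever compared, which is what keeps the number of pairwise checks well below the quadratic worst case on large inputs.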

Encoding and decoding of digital audio signals using difference data
10699721 · 2020-06-30

An audio encoder can parse a digital audio signal into a plurality of frames, each frame including a specified number of audio samples, perform a transform of the audio samples of each frame to produce a plurality of frequency-domain coefficients for each frame, partition the plurality of frequency-domain coefficients for each frame into a plurality of bands for each frame, each band having bit data that represents a number of bits allocated for the band, and encode the digital audio signal and difference data to a bit stream (e.g., an encoded digital audio signal). The difference data can produce the full bit data when combined with estimate data that can be computed from data present in the bit stream. The difference data can be compressed to a smaller size than the full bit data, which can reduce the space required in the bit stream.
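
As a rough illustration of the difference-data idea, the sketch below assumes the decoder can compute an estimate of each band's bit allocation from values already in the bit stream (here, hypothetical per-band norms), so only the residual needs to be transmitted. The estimator and its `bits_per_unit` parameter are invented for the example.

```python
def estimate_bits(band_norms, bits_per_unit=2):
    """Hypothetical estimator: both encoder and decoder can compute this."""
    return [n * bits_per_unit for n in band_norms]


def encode_difference(actual_bits, band_norms):
    """Encoder: transmit only actual minus estimate for each band."""
    estimate = estimate_bits(band_norms)
    return [a - e for a, e in zip(actual_bits, estimate)]


def decode_bits(difference, band_norms):
    """Decoder: rebuild the full bit data from estimate plus difference."""
    estimate = estimate_bits(band_norms)
    return [d + e for d, e in zip(difference, estimate)]


band_norms = [4, 3, 2, 1]      # assumed to be present in the bit stream
actual_bits = [8, 7, 4, 1]     # bits the encoder actually allocated
difference = encode_difference(actual_bits, band_norms)
recovered = decode_bits(difference, band_norms)
```

When the estimate is good, the difference values cluster near zero and are cheaper to entropy-code than the full bit data.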

Deduplication and compression of data segments in a data storage system
10678435 · 2020-06-09

Techniques for performing data deduplication and compression in data storage systems. Data deduplication is performed in a deduplication domain on a segment-by-segment basis to obtain a plurality of deduplicated data segments. Deduplicated data segments are grouped together to form a plurality of compression groups. Data compression is performed on each compression group, and the compressed group is stored on spinning media. By performing data deduplication on a segment-by-segment basis, the size of each segment can be reduced to increase the effectiveness of data deduplication. By performing data compression on compression groups, the size of each compression domain can be increased to increase the effectiveness of data compression. By storing deduplicated data segments as a compressed group on the spinning media, a sequential nature of the segments can be preserved to reduce a seek time/rotational latency of the spinning media and a number of IOPS handled by the data storage system.
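
A minimal sketch of the two-granularity approach: deduplicate on small fixed-size segments, then compress the unique segments in larger groups. The segment size, group size, SHA-256 fingerprint, and zlib compressor are all assumptions chosen for illustration.

```python
import hashlib
import zlib


def dedupe_and_compress(data, segment_size=4, group_size=3):
    """Deduplicate per segment, then compress unique segments in groups."""
    index = {}    # fingerprint -> position in `unique`
    unique = []   # first occurrence of each distinct segment, in order
    refs = []     # one reference per original segment
    for off in range(0, len(data), segment_size):
        seg = data[off:off + segment_size]
        digest = hashlib.sha256(seg).digest()
        if digest not in index:
            index[digest] = len(unique)
            unique.append(seg)
        refs.append(index[digest])
    groups = [
        zlib.compress(b"".join(unique[i:i + group_size]))
        for i in range(0, len(unique), group_size)
    ]
    return refs, groups


def restore(refs, groups, segment_size=4):
    """Decompress every group and follow the references."""
    blob = b"".join(zlib.decompress(g) for g in groups)
    unique = [blob[i:i + segment_size] for i in range(0, len(blob), segment_size)]
    return b"".join(unique[r] for r in refs)


data = b"AAAABBBBAAAACCCCBBBBDD"
refs, groups = dedupe_and_compress(data)
```

Small segments raise the chance of finding duplicates, while the larger groups give the compressor more context to work with.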

Managing inline data compression and deduplication in storage systems

A method is used in managing inline data compression and deduplication in storage systems. A block of data from data stored in a cache of a storage system is identified based on entropy. Entropy of the block of data is compared with a first threshold value. Based on the comparison, the block of data is either deduplicated or compressed without deduplication.
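
The entropy test can be sketched with byte-level Shannon entropy. The abstract does not say which side of the threshold leads to which action, so the direction below (high entropy goes to deduplication, low entropy goes to compression without deduplication) and the threshold value are assumptions.

```python
import math
from collections import Counter


def shannon_entropy(block):
    """Shannon entropy of a byte block, in bits per byte (0.0 to 8.0)."""
    if not block:
        return 0.0
    n = len(block)
    return -sum((c / n) * math.log2(c / n) for c in Counter(block).values())


def choose_action(block, threshold=4.0):
    """Assumed policy: poorly compressible blocks are deduplicated instead."""
    return "deduplicate" if shannon_entropy(block) > threshold else "compress"


low = b"\x00" * 64          # highly repetitive block
high = bytes(range(256))    # every byte value exactly once
```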

Multi-tenant encryption on distributed storage having deduplication and compression capability

A tenant's clear text data in a multi-tenant storage system can be encrypted using the tenant's cryptographic key to produce encrypted yet compressible data (cryptographic data). The cryptographic data can be encrypted using a system cryptographic key that is managed by the multi-tenant storage system and then stored. Use of the system cryptographic key allows for subsequent maintenance activities such as deduplication and compression to be performed on data stored in the multi-tenant storage system without having to access any of the tenants' cryptographic keys.

Reducing the amount of data stored in a sequence of data blocks by combining deduplication and compression
10659076 · 2020-05-19

The described technology is generally directed towards reducing the amount of data stored in a sequence of data blocks by combining deduplication and compression. According to an embodiment, a system can comprise a memory that can store computer executable components, and a processor that can execute the components stored in the memory. The components can comprise a data block identifier that can identify, for a sequence of data blocks, a first data block that corresponds to a first data, resulting in a first identified data block, and a deduplication component that can identify a second data block that corresponds to the first data, resulting in a second identified data block, wherein the deduplication component can replace the second identified data block with a key value corresponding to the first identified data block. Further, a compression component can compress the first identified data block, resulting in a compressed data block.

METHOD FOR ENCODING AND DECODING OF QUALITY VALUES OF A DATA STRUCTURE
20200153454 · 2020-05-14

Method for encoding of quality values of a data structure, whereby said data structure comprises a set of genomic reads, wherein the method comprises the following steps executable by a data processing system: ascertain the quality values of each read covering a certain index locus; determine a codebook identifier identifying a specific codebook from a plurality of codebooks for said certain index locus based on the ascertained quality values of said certain index locus, whereby each codebook provides a mapping from a quality value of a quality value alphabet to a corresponding quantized quality value of a quantized quality value alphabet; quantize all ascertained quality values at said certain index locus using the specific codebook identified by the codebook identifier at said certain index locus in order to obtain for each quality value at said certain index locus a corresponding quantized quality value; and encode all determined codebook identifiers using a first entropy encoder and encode all quantized quality values using a second entropy encoder or a set of encoders.
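
The per-locus codebook selection and quantization steps might look like the sketch below. The two codebooks, the 0 to 41 quality alphabet, and the spread-based selection rule are hypothetical, and the entropy-coding stage is omitted.

```python
# Hypothetical codebooks: each maps every quality value to a quantized value.
CODEBOOKS = {
    0: {q: 20 for q in range(42)},                       # single level
    1: {q: (10 if q < 20 else 30) for q in range(42)},   # two levels
}


def select_codebook(qualities):
    """Assumed rule: use the coarser codebook when values are close together."""
    return 0 if max(qualities) - min(qualities) < 10 else 1


def quantize_locus(qualities):
    """Return the codebook identifier and the quantized values for one locus."""
    codebook_id = select_codebook(qualities)
    codebook = CODEBOOKS[codebook_id]
    return codebook_id, [codebook[q] for q in qualities]


tight = quantize_locus([30, 32, 35])   # qualities of reads covering one locus
spread = quantize_locus([5, 38, 22])   # a locus with widely varying qualities
```

In a full encoder, the stream of codebook identifiers and the stream of quantized values would each be passed to their own entropy coder.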

Binary difference operations for navigational bit streams
10642824 · 2020-05-05

A computing device may identify a series of bits representative of a first binary large object (BLOB) for navigation data including road segments and road attributes. The computing device duplicates each bit of the series of bits a predetermined number of times to form a first bit string. The first bit string is larger than the series of bits by a factor of the predetermined number. The computing device performs a binary difference of the first bit string to a second bit string representative of a second BLOB. A result of the binary difference is stored in a navigation patch file.
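
The bit duplication and binary difference steps can be sketched on strings of `0` and `1`. Treating the binary difference as a bitwise XOR is an assumption made here for illustration; the abstract does not name the operation.

```python
def duplicate_bits(bits, n):
    """Repeat each bit n times; the result is n times longer than the input."""
    return "".join(b * n for b in bits)


def binary_difference(a, b):
    """Assumed difference operation: bitwise XOR of equal-length bit strings."""
    if len(a) != len(b):
        raise ValueError("bit strings must have equal length")
    return "".join("1" if x != y else "0" for x, y in zip(a, b))


first = duplicate_bits("101", 2)     # bits of the first BLOB, each doubled
second = "110101"                    # bit string of the second BLOB
patch = binary_difference(first, second)
```

With XOR, applying the patch to the first bit string again yields the second, which is the property a patch file needs.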

SELECTION OF HASH KEY SIZES FOR DATA DEDUPLICATION

Techniques for data processing may include: receiving a data chunk; determining a metric value denoting a degree of compressibility of the data chunk; selecting, in accordance with the metric value denoting the compressibility of the data chunk, a first size of a plurality of sizes, wherein each of the plurality of sizes denotes a different size of an amount of storage used for storing a value of said each size; and performing the data deduplication processing for the data chunk, wherein the data deduplication processing includes using a first hash value for the data chunk to determine whether the data chunk is a duplicate of another data chunk of a hash table, wherein the first hash value is stored in a storage location of the first size.
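
A minimal sketch of choosing a hash size from a compressibility metric. The metric (space saved by zlib), the size tiers, the direction of the mapping (more compressible chunks get shorter hashes), and the use of BLAKE2b with a variable digest size are all assumptions; the abstract specifies none of them.

```python
import hashlib
import zlib

HASH_SIZES = (8, 16, 32)  # hypothetical digest sizes, in bytes


def compressibility(chunk):
    """Fraction of space saved by zlib; higher means more compressible."""
    if not chunk:
        return 0.0
    return 1 - len(zlib.compress(chunk)) / len(chunk)


def select_hash_size(chunk):
    """Assumed policy: more compressible chunks get shorter hash keys."""
    metric = compressibility(chunk)
    if metric > 0.5:
        return HASH_SIZES[0]
    if metric > 0.1:
        return HASH_SIZES[1]
    return HASH_SIZES[2]


def is_duplicate(chunk, table):
    """Look the chunk up by a hash of the selected size; record it if new."""
    size = select_hash_size(chunk)
    key = hashlib.blake2b(chunk, digest_size=size).digest()
    if key in table:
        return True
    table[key] = chunk
    return False


table = {}
chunk = b"a" * 1024
first_seen = is_duplicate(chunk, table)
second_seen = is_duplicate(chunk, table)
```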