Patent classifications
H03M7/3091
Method and system for dynamic compression module selection
A computer-implemented method for compressing a data set, the method comprising receiving a first data block of the data set, selecting automatically by a compression management module a compression module from a plurality of compression modules to apply to the first data block based on projected compression efficacy or resource utilization, and compressing the first data block with the selected compression module to generate a first compressed data block.
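The selection step can be illustrated with a minimal sketch. It assumes "projected compression efficacy" is estimated by trial-compressing a short prefix of the block. The `MODULES` registry, the `sample_size` parameter, and the function names are hypothetical and do not come from the abstract.

```python
import bz2
import lzma
import zlib

# Hypothetical registry of compression modules: name -> (compress, decompress).
MODULES = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def select_module(block: bytes, sample_size: int = 4096) -> str:
    """Return the module name with the smallest output on a sample prefix.

    Assumption: the compressed size of the prefix stands in for the
    projected efficacy on the whole block.
    """
    sample = block[:sample_size]
    return min(MODULES, key=lambda name: len(MODULES[name][0](sample)))


def compress_block(block: bytes) -> tuple[str, bytes]:
    """Compress a block with the automatically selected module."""
    name = select_module(block)
    return name, MODULES[name][0](block)


def decompress_block(name: str, payload: bytes) -> bytes:
    """Reverse compress_block, given the module name stored with the payload."""
    return MODULES[name][1](payload)
```

A selector based on resource utilization could instead rank modules by measured CPU time or memory use. The module name has to be stored next to each compressed block so the matching decompressor can be found later.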
MAINTAINING DATA DEDUPLICATION REFERENCE INFORMATION
A data deduplication method includes detecting a deduplication transaction including a data pattern associated with a data pattern address (DPA) and a reference to the pattern associated with a data reference address (DRA). A deduplication key may be determined based on the DPA and the DRA by concatenating the DPA and the DRA with the DPA as the most significant bits. The key may be stored in a key field of a record in a persistent and sequentially-accessed log, which is part of a log-with-index (LWI) structure that also maintains, in RAM or SSD, a binary index of the log records. When full, the log is cleared by writing the records in key-sorted order to a new tablet. From time to time, two tablets in the tablet library are merged. Tablet merging may include two or more atomic merges, each atomic merge corresponding to a portion of the tablet.
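The key construction and the log-to-tablet flush can be sketched as follows. The 64-bit address width, the use of plain integer lists for the log and tablets, and all function names are assumptions made for illustration only.

```python
import heapq

ADDRESS_BITS = 64  # assumed width of both the DPA and the DRA


def make_key(dpa: int, dra: int) -> int:
    """Concatenate DPA and DRA with the DPA as the most significant bits."""
    limit = 1 << ADDRESS_BITS
    if not (0 <= dpa < limit and 0 <= dra < limit):
        raise ValueError("address out of range")
    return (dpa << ADDRESS_BITS) | dra


def split_key(key: int) -> tuple[int, int]:
    """Recover (DPA, DRA) from a key built by make_key."""
    return key >> ADDRESS_BITS, key & ((1 << ADDRESS_BITS) - 1)


def flush_log(log: list[int]) -> list[int]:
    """Write the log records in key-sorted order to a new tablet and clear the log."""
    tablet = sorted(log)
    log.clear()
    return tablet


def merge_tablets(a: list[int], b: list[int]) -> list[int]:
    """Merge two key-sorted tablets into one key-sorted tablet."""
    return list(heapq.merge(a, b))
```

Because the DPA occupies the high bits, sorting by key places all references to the same pattern next to each other in a tablet. The sketch merges two tablets in one pass; the atomic, per-portion merges mentioned in the abstract are not modeled.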
Apparatus and method for accessing compressed data
A system and method for storing compressed data in a memory system includes identifying user data to be compressed and compressing pages of user data to form data extents that are less than or equal to the uncompressed data. A plurality of compressed pages are combined to at least fill a page of the memory. The data may be stored as sectors of a page, where each sector includes a CRC or error correcting code for the compressed data of that sector. The stored data may also include error correcting code data for the uncompressed page and error correcting code for the compressed page. When data is read in response to a user request, the sector data is validated using the CRC prior to selecting the data from the read sectors for decompression, and the error correcting code for the uncompressed page may be used to validate the decompressed data.
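The per-sector CRC check on the read path can be sketched as below. This is a simplified illustration: it handles a single compressed page, uses CRC-32 from `zlib`, and assumes a 512-byte sector with a 4-byte trailing CRC. The packing of several compressed pages into one memory page and the page-level error correcting codes are not modeled, and the names are hypothetical.

```python
import struct
import zlib

SECTOR_PAYLOAD = 508  # assumed: 512-byte sector minus a 4-byte CRC-32


def write_sectors(page: bytes) -> list[bytes]:
    """Compress a page and split it into sectors, each ending with its CRC-32."""
    compressed = zlib.compress(page)
    sectors = []
    for start in range(0, len(compressed), SECTOR_PAYLOAD):
        chunk = compressed[start:start + SECTOR_PAYLOAD]
        sectors.append(chunk + struct.pack("<I", zlib.crc32(chunk)))
    return sectors


def read_sectors(sectors: list[bytes]) -> bytes:
    """Validate every sector's CRC, then decompress the joined payload."""
    chunks = []
    for number, sector in enumerate(sectors):
        chunk = sector[:-4]
        (stored_crc,) = struct.unpack("<I", sector[-4:])
        if zlib.crc32(chunk) != stored_crc:
            raise ValueError(f"CRC mismatch in sector {number}")
        chunks.append(chunk)
    return zlib.decompress(b"".join(chunks))
```

Checking the CRC before decompression means a corrupted sector is rejected early instead of being fed to the decompressor.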
Device and method for data compression
A device for data compression includes a processing unit, a temporary memory, and a storage device. The temporary memory is used to temporarily store data to be compressed. The storage device includes multiple physical blocks. Each physical block has the same volume size. The processing unit compresses the to-be-compressed data, generates compressed data, and stores the compressed data into one of the physical blocks. The processing unit compares the data size of the compressed data with the volume size of one physical block, and when the data size of the compressed data is smaller than the volume size, the processing unit stores remnant data into the same physical block that holds the compressed data, wherein the total data size of the remnant data plus the compressed data is equal to the volume size of that physical block.
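The size comparison and block fill can be sketched as follows. The abstract does not say what the remnant data contains, so this sketch simply takes it from a caller-supplied byte string. The 4096-byte block size and the function names are assumptions.

```python
import zlib

BLOCK_SIZE = 4096  # assumed volume size shared by every physical block


def pack_block(data: bytes, remnant: bytes,
               block_size: int = BLOCK_SIZE) -> tuple[bytes, int]:
    """Compress data and fill the rest of the block with remnant bytes.

    Returns the full block and the length of the compressed part, which
    a reader needs in order to separate the two regions again.
    """
    compressed = zlib.compress(data)
    if len(compressed) > block_size:
        raise ValueError("compressed data does not fit in one block")
    free = block_size - len(compressed)
    if len(remnant) < free:
        raise ValueError("not enough remnant data to fill the block")
    return compressed + remnant[:free], len(compressed)


def unpack_block(block: bytes, compressed_len: int) -> bytes:
    """Decompress the compressed region of a block written by pack_block."""
    return zlib.decompress(block[:compressed_len])
```

In this sketch the compressed length is returned to the caller; a real device would need to record it somewhere, for example in a mapping table, which is outside what the abstract describes.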
COMPRESSING INDICES IN A VIDEO STREAM
In one embodiment a system, apparatus, and method for optimizing index value lengths when indexing data items in an array of data items is described, the method including producing, at a first processor, an ordered series of index values, sending the ordered series of index values to an indexing processor, receiving, at the indexing processor, a data object including the array of data items, associating, at the indexing processor, a first part of one of the index values with a first one data item of the array of data items, associating, at the indexing processor, a second part of the one of the index values with a next one data item of the array of data items, repeating the steps of associating a first part of one of the index values and associating a second part of the one of the index values until all of the data items in the array of data items are indexed.
COMPUTER-READABLE RECORDING MEDIUM, ENCODING DEVICE, ENCODING METHOD, DECODING DEVICE, AND DECODING METHOD
An encoding device 100 encodes a plurality of input text files to a plurality of encoded files by using a static dictionary unit 121 and a dynamic dictionary unit 122. The dynamic dictionary unit 122 is generated in accordance with word appearance frequencies in the plurality of text files. The encoding device 100 generates a coupled encoded file that includes the plurality of encoded files, information on the dynamic dictionary unit 122, and position information that indicates positions of the respective plurality of encoded files.
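The combination of a static dictionary, a frequency-based dynamic dictionary, and per-file position information can be sketched as below. The word-level integer codes, the contents of `STATIC_DICT`, the layout of the coupled file as a Python dictionary, and the function names are all hypothetical simplifications.

```python
from collections import Counter

# Hypothetical static dictionary: fixed codes for very common words.
STATIC_DICT = {"the": 0, "of": 1, "and": 2}


def build_dynamic_dict(texts: list[str]) -> dict[str, int]:
    """Assign codes to non-static words, most frequent across all files first."""
    counts = Counter(
        word for text in texts for word in text.split() if word not in STATIC_DICT
    )
    base = len(STATIC_DICT)
    return {word: base + i for i, (word, _) in enumerate(counts.most_common())}


def encode_files(texts: list[str]) -> dict:
    """Encode all files into one coupled structure with position information."""
    dynamic = build_dynamic_dict(texts)
    codes = {**STATIC_DICT, **dynamic}
    body, positions = [], []
    for text in texts:
        positions.append(len(body))  # start offset of this encoded file
        body.extend(codes[word] for word in text.split())
    return {"dynamic": dynamic, "positions": positions, "body": body}


def decode_file(coupled: dict, index: int) -> str:
    """Decode a single file from the coupled structure using its position."""
    codes = {**STATIC_DICT, **coupled["dynamic"]}
    words = {code: word for word, code in codes.items()}
    positions = coupled["positions"]
    start = positions[index]
    end = positions[index + 1] if index + 1 < len(positions) else len(coupled["body"])
    return " ".join(words[code] for code in coupled["body"][start:end])
```

The position list lets a decoder extract one file without decoding the others. Splitting on whitespace means the sketch only round-trips text whose words are separated by single spaces.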
TECHNOLOGIES FOR EFFICIENT LZ77-BASED DATA DECOMPRESSION
Technologies for data decompression include a computing device that reads a symbol tag byte from an input stream. The computing device determines whether the symbol can be decoded using a fast-path routine, and if not, executes a slow-path routine to decompress the symbol. The slow-path routine may include data-dependent branch instructions that may be unpredictable using branch prediction hardware. For the fast-path routine, the computing device determines a next symbol increment value, a literal increment value, a data length, and an offset based on the tag byte, without executing an unpredictable branch instruction. The computing device sets a source pointer to either literal data or reference data as a function of the tag byte, without executing an unpredictable branch instruction. The computing device may set the source pointer using a conditional move instruction. The computing device copies the data and processes remaining symbols. Other embodiments are described and claimed.
Scalable deduplication system with small blocks
Exemplary method, system, and computer program product embodiments for scalable data deduplication working with small data chunks in a computing environment are provided. In one embodiment, by way of example only, for each small data chunk, a signature is generated based on a combination of a representation of characters used in selecting data to be deduplicated. The c-spectrum of the small data chunk is a sequence of representations of different characters ordered by their frequency of occurrence in the small data chunk, and the f-spectrum of the small data chunk is the corresponding sequence of frequencies of the different characters in the small data chunk.
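The two spectra can be computed with a short sketch. It assumes that characters are bytes, that the ordering is by descending frequency, and that ties are broken by byte value. The abstract fixes none of these details, and the function name is hypothetical.

```python
from collections import Counter


def spectra(chunk: bytes) -> tuple[list[int], list[int]]:
    """Return (c_spectrum, f_spectrum) for a small data chunk.

    c_spectrum: distinct byte values, most frequent first (assumed order).
    f_spectrum: the frequency of each byte in c_spectrum, in the same order.
    """
    counts = Counter(chunk)
    # Sort by descending count; break ties by byte value so the result is stable.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    c_spectrum = [byte for byte, _ in ordered]
    f_spectrum = [count for _, count in ordered]
    return c_spectrum, f_spectrum


c, f = spectra(b"aabbbc")
print(c)  # [98, 97, 99], i.e. b, a, c
print(f)  # [3, 2, 1]
```

A signature would then be derived from some combination of these sequences; how they are combined is not specified here, so the sketch stops at the spectra.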
Memory deduplication based on guest page hints
Methods, systems, and computer program products are included for de-duplicating one or more memory pages. A method includes receiving, by a hypervisor, a list of read-only memory page hints from a guest running on a virtual machine. The list of read-only memory page hints specifies a first memory page marked as writeable. The method also includes determining whether the first memory page matches a second memory page. In response to a determination that the first memory page matches the second memory page, the hypervisor may deduplicate the first and second memory pages.
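The hypervisor-side matching step can be sketched as below. Memory pages are modeled as byte strings keyed by page number, only hinted pages are compared with each other, and the function name and return format are hypothetical. A real hypervisor would also remap the duplicate page and write-protect the shared copy, which the sketch omits.

```python
def deduplicate_hinted_pages(pages: dict[int, bytes],
                             hinted: list[int]) -> dict[int, int]:
    """Map each duplicate hinted page to the first hinted page with equal content.

    pages:  page number -> page content (a stand-in for guest memory).
    hinted: page numbers the guest reported in its read-only hint list.
    """
    canonical_by_content: dict[bytes, int] = {}
    remap: dict[int, int] = {}
    for page_id in hinted:
        content = pages[page_id]
        # The first page seen with this content becomes the shared copy.
        canonical = canonical_by_content.setdefault(content, page_id)
        if canonical != page_id:
            remap[page_id] = canonical
    return remap
```

Restricting the scan to hinted pages is the point of the hint list in this sketch: pages the guest did not report are never compared.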
METHOD, ELECTRONIC DEVICE, AND COMPUTER PROGRAM PRODUCT FOR STORAGE MANAGEMENT
The present disclosure relates to a method, an electronic device, and a computer program product for storage management. According to an example, a method for storage management is provided, including: generating a to-be-stored target data stream based on a to-be-stored object, wherein the target data stream includes at least a part of the object, determining whether the target data stream matches at least one stored data stream that has been stored in a storage apparatus, wherein sizes of the target data stream and the at least one stored data stream depend on their respective content, and, if the target data stream does not match the at least one stored data stream, storing the target data stream in the storage apparatus. Therefore, the performance of storage management can be improved, and the storage costs can be reduced.
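One way to obtain data streams whose sizes depend on their content is content-defined chunking, sketched below. This is a swapped-in, generic technique rather than the method of the disclosure: the shift-and-add hash, the `mask`, `min_size`, and `max_size` parameters, the SHA-256 match test, and the dictionary used as the storage apparatus are all assumptions.

```python
import hashlib


def chunk(data: bytes, mask: int = 0x3F,
          min_size: int = 16, max_size: int = 1024) -> list[bytes]:
    """Split data at positions chosen by a simple content-based hash."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        h = ((h << 1) + byte) & 0xFFFFFFFF  # older bytes shift out of the hash
        size = i - start + 1
        if (size >= min_size and (h & mask) == 0) or size >= max_size:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


def store_object(obj: bytes, storage: dict[str, bytes]) -> list[str]:
    """Store only chunks that do not match an already stored chunk."""
    recipe = []
    for piece in chunk(obj):
        digest = hashlib.sha256(piece).hexdigest()
        if digest not in storage:
            storage[digest] = piece
        recipe.append(digest)
    return recipe


def load_object(recipe: list[str], storage: dict[str, bytes]) -> bytes:
    """Rebuild an object from its list of chunk digests."""
    return b"".join(storage[digest] for digest in recipe)
```

Because boundaries follow the content rather than fixed offsets, an insertion near the start of an object tends to change only nearby chunks, so later chunks can still match stored ones.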