Device and method for compacting compressed and uncompressed data blocks
11177825 · 2021-11-16
Assignee
Inventors
- Aleksei Valentinovich Romanovskii (Saint, RU)
- Ilya Aleksandrovich Papiev (Saint, RU)
- Jinbao Niu (Hangzhou, CN)
- Qiang XUE (Chengdu, CN)
- Shaohui Quan (Hangzhou, CN)
Cpc classification
G06F12/0238
PHYSICS
G06F12/0223
PHYSICS
G06F2212/7205
PHYSICS
G06F3/0679
PHYSICS
H03M7/46
ELECTRICITY
H03M7/40
ELECTRICITY
International classification
H03M7/40
ELECTRICITY
Abstract
A device and a method for an improved compacting of compressed and uncompressed data blocks into an output buffer are provided. The device is configured to obtain a set of input data blocks comprising at least one of a compressed data block and an uncompressed data block; compact the compressed data blocks into the output buffer, starting from a first predefined region in the output buffer, such that the compressed data blocks are sequentially compacted; and compact the uncompressed data blocks into the output buffer, starting from a second predefined region in the output buffer, such that the uncompressed data blocks are sequentially compacted.
Claims
1. A device for compacting compressed and uncompressed data blocks into an output buffer, comprising: an interface; the output buffer; and a processor coupled to the interface and the output buffer, and wherein the processor is configured to: obtain a set of input data blocks comprising at least one of a compressed data block and an uncompressed data block; compact the compressed data blocks into the output buffer, starting from a first predefined region in the output buffer, wherein the compressed data blocks are sequentially compacted; and compact the uncompressed data blocks into the output buffer, starting from a second predefined region in the output buffer, wherein the uncompressed data blocks are sequentially compacted; wherein a beginning of the second predefined region is an end of the output buffer, and the uncompressed data blocks are sequentially compacted from the end of the output buffer growing toward the beginning of the output buffer.
2. The device according to claim 1, wherein the processor is further configured to determine an upper limit in bytes for a compressed data block to be compacted into the output buffer, based on an output buffer size, a size of the compressed data block, and a header size of the compressed data block.
3. The device according to claim 1, wherein the processor is further configured to obtain an input data block, compress the obtained input data block based on a predefined compression ratio, and determine a size of the compressed data block.
4. The device according to claim 3, wherein in response to the size of the compressed data block being less than the grain unit of output buffer, the processor is further configured to perform a copy-add of the compressed data into a separate output buffer being associated to the compressed data blocks having a size smaller than the grain unit, wherein the grain unit is representative of granularity of a memory storage.
5. The device according to claim 1, wherein the first predefined region is the beginning of the output buffer, and wherein the compressed data blocks are sequentially compacted from the beginning of the output buffer growing toward the end of the output buffer.
6. The device according to claim 5, wherein the compaction of the compressed data blocks into the output buffer is gapless and/or independent from the grain unit of the output buffer.
7. The device according to claim 1, wherein the compaction of the uncompressed data blocks into the output buffer is arranged based on the grain unit of the output buffer.
8. The device according to claim 1, wherein the processor is further configured to determine a block address for the compressed data block based on determining an offset for the compressed data block in the output buffer, and estimating a length of the compressed data block in the output buffer.
9. The device according to claim 8, wherein the block address of a compressed data block having a size less than the grain unit corresponds to an index of the grain unit.
10. The device according to claim 1, wherein the processor is further configured to determine a block address for the uncompressed data block based on determining an offset for the uncompressed data block in the output buffer, and estimating a length of the uncompressed data block in output buffer.
11. The device according to claim 1, wherein the processor is further configured to generate a block leading header for each compressed data block, being representative of an offset from a beginning of the grain unit.
12. The device according to claim 1, wherein the processor is further configured to generate a block trailing header for each compressed data block, being representative of an offset of the last byte of the compressed data block from a beginning of the last grain unit of the compressed data block.
13. The device according to claim 3, wherein the processor is further configured to write the compressed data blocks and the uncompressed data blocks on the memory storage.
14. The device according to claim 13, wherein the compressed data blocks and the uncompressed data blocks are written on the memory storage based on their corresponding compaction on the output buffer.
15. The device according to claim 3, wherein the processor is further configured to read from the memory storage the compressed data blocks and the uncompressed data blocks.
16. The device according to claim 15, wherein the processor is further configured to read the compressed data blocks and the uncompressed data blocks based on the identification number of the output buffer, the size of the corresponding block, and the offset of the corresponding block from the beginning of the output buffer.
17. The device according to claim 3, wherein the memory storage is based on a volatile memory storage or a non-volatile memory storage.
18. A method for compacting compressed and uncompressed data blocks into an output buffer, the method comprising: obtaining, by a processor of a device, a set of input data blocks comprising at least one of a compressed data block and an uncompressed data block; compacting, by the processor of the device, the compressed data blocks into the output buffer, starting from a first predefined region in the output buffer, such that the compressed data blocks are sequentially compacted; and compacting, by the processor of the device, the uncompressed data blocks into the output buffer, starting from a second predefined region in the output buffer, such that the uncompressed data blocks are sequentially compacted; and wherein a beginning of the second predefined region is an end of the output buffer, and the uncompressed data blocks are sequentially compacted from the end of the output buffer growing toward the beginning of the output buffer.
19. The method according to claim 18, further comprising: determining an upper limit in bytes for a compressed data block to be compacted into the output buffer, based on an output buffer size, a size of the compressed data block, and a header size of the compressed data block.
20. The method according to claim 18, further comprising: obtaining an input data block, compress the obtained input data block based on a predefined compression ratio, and determining a size of the compressed data block.
Description
BRIEF DESCRIPTION OF DRAWINGS
(1) The above described aspects and implementation forms of the present disclosure will be explained in the following description of specific embodiments in relation to the enclosed drawings, in which
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9)
(10)
(11)
(12)
(13)
(14)
(15)
(16)
(17)
(18)
(19)
(20)
DETAILED DESCRIPTION OF EMBODIMENTS
(21)
(22) The device 100 is configured to compact the compressed data blocks 102 into the output buffer 104, starting from a first predefined region in the output buffer 104, such that the compressed data blocks 102 are sequentially compacted.
(23) The device 100 is further configured to compact the uncompressed data blocks 103 into the output buffer 104, starting from a second predefined region in the output buffer 104, such that the uncompressed data blocks 103 are sequentially compacted.
(24) For example, the device 100 may obtain the set of input data block 101. Moreover, the compressed data blocks 102 and the uncompressed data blocks 103 may be compacted differently. The compressed data blocks 102 may be compacted starting from a different region in the output buffer 104 (also referred to as a chunk), than the uncompressed data blocks 103.
(25) The compressed data blocks 102 may be compacted using so-called grain units, e.g., grain unit having fixed sizes of 1 KB and chunks with the fixed size of 1 MB, without limiting the present disclosure to a specific size of the grain unit and/or chunk. Moreover, a block address that combines Chunk ID, grain offset and length for compressed data blocks may be used.
(26) For instance, the compressed data blocks 102 may be compacted from the first predefined region, which may be the beginning of the output buffer 104, and may further grow toward the end of the output buffer 104, and may be sequentially compacted. Furthermore, the compressed data blocks 102 may be compacted without alignment to grain units of the output buffer. This compaction of the compressed data blocks 102 may eliminate wasted space. In some embodiments, however, some wasted space may be formed, for example, when 8 KB is compressed into less than 1 KB.
(27) Moreover, the uncompressed data blocks 103 may be compacted into the output buffer 104, starting from the second predefined region in the output buffer 104, which may be the end of the output buffer 104 and may further grow from the end of the output buffer toward the beginning of the output buffer 104, without limiting the present disclosure to a specific region and/or direction. Also, the uncompressed data blocks 103 may be sequentially compacted.
(28) Moreover, the compaction of the compressed data blocks 102 into the output buffer 104 may effectively eliminate wasted space between adjacent compressed data blocks 102. Similarly, the compaction of the uncompressed data blocks 103 may also eliminate wasted space between adjacent uncompressed data blocks 103. However, some wasted spaced may be formed, for example, when compressed data blocks 102 and uncompressed data blocks 103 are compacted next to each other, e.g., are neighbors.
(29)
(30) The device 100 further includes an interface 201 in the form of an adapter, which is configured to be communicatively coupled with another device, for example, a host computer, for obtaining a set of input data blocks 101 comprising at least one of a compressed data block 102 and an uncompressed data block 103. The interface 201 may be based on a computer bus interface, for example, a Serial AT Attachment (SATA), a Parallel Advanced Technology Attachment (PATA), or the like.
(31) The device 100 further includes a data compressor 202, which is configured to compress the input data block 101 obtained by the interface 201, based on a predefined compression ratio, and determine the size of the compressed data block 102. For example, the data compressor 202 may utilize various compression schemes to remove, e.g., redundant data, meta-data and/or reduce the size of the obtained input data blocks 101.
(32) Moreover, the device 100, for example, its data compressor 202, may further be configured to determine an upper limit in bytes for the compressed data block 102 to be compacted into the output buffer 104, based on the output buffer size, the size of the compressed data block, and its header size.
(33) Moreover, the device 100 may further optionally comprise a data compactor 203. The data compactor 203 may obtain the information related to, e.g., the size of the compressed data block, the header size, the output buffer size, etc., and may further perform a compaction of the compressed data blocks 102 and the uncompressed data blocks 103 into the output buffer 104.
(34) The device 100 may further optionally comprise a block address calculator 204. The block address calculator 204 may calculate a block address for the compressed data block 102 based on determining an offset for the compressed data block 102 in the output buffer 104, and estimating a length of the compressed data block 102 in the output buffer 104.
(35) As discussed, the data compressor 202 may determine the size of the compressed data block 102. Moreover, when it is determined that the size compressed data block 102 is less than the grain unit of the output buffer 104, the block address calculator 204 may assign an index of the grain unit as the block address of the compressed data block with the size less than the grain unit.
(36) The block address calculator 204 may further calculate a block address for the uncompressed data block 103 based on determining an offset for the uncompressed data block 103 in the output buffer 104, and estimating a length of the uncompressed data block 103 in output buffer 104.
(37) As discussed, the length bits in block address may be estimated. For example, the device 100 (e.g., its block address calculator 204) may assign the bit pattern “111” to the uncompressed data occupying 8 grain units. Moreover, other bit patterns, from “000” to “110” (encoding numbers from 0 to 6) may be used for compressed data block's length. The compressed data block's length may be calculated by adding 2 to the value represented by a bit pattern. For example, the bit pattern “000” represents the value of 0. Then, by adding 2 to its 0 value (i.e. 0+2=2) the present disclosure encodes a length of 2 grain units for the compressed data blocks. Therefore, the lengths of the compressed data blocks may be expressed in grain units, to be in the range of 2 to 8.
(38) In addition, a compressed data length of 8, which is encoded by bit pattern of “110” is for actual length of the compressed data block being less than 8 KB (i.e. <8 KB), and it is rounded to 8 grain units. Therefore, it is possible to have compressed data blocks with the sizes up to, e.g., 8 KB-256 bytes. A reason to have this value as upper bound for compressed data's size is that, for example, if all blocks are compressed to 8 KB-256, and further packed into the chunk without alignment, then it may be possible to have 129 compressed data blocks in one chunk, which is one block more than 128 uncompressed data blocks. In some embodiments, the present disclosure may require 1 bit per block to be used for garbage collection, rather than 1 bit per grain unit in the conventional devices.
(39) The device 100 may further optionally comprise a block header generator 205. The block header generator 205 may further be configured to generate a block leading header for each compressed data block 102, being representative of an offset from the beginning of the grain unit. Moreover, the block header generator 205 may further generate a block trailing header for each compressed data block 102, being representative of an offset of the last byte of the compressed data block from the beginning of the last grain unit of the compressed data block.
(40) As discussed, the data compressor 203 may perform a compaction of the compressed data blocks 102 and the uncompressed data blocks 103. The data compactor may compact the compressed data blocks 102 into the output buffer 104, starting from a first predefined region in the output buffer 104. The first predefined region may be the beginning of the output buffer 104, and the compressed data blocks 102 may be sequentially compacted from the beginning of the output buffer 104 growing toward the end of the output buffer 104. Moreover, the compaction of the compressed data blocks 102 into the output buffer 104 may be gapless and/or independent from the grain unit of the output buffer 104. In other words, it may be without alignment to grain units of the output buffer 104. This packing may eliminate the waste space, or the like.
(41) The data compactor may further perform a compaction of the uncompressed data blocks 103 into the output buffer 104, starting from the second predefined region. The second predefined region may be the end of the output buffer 104, and the uncompressed data blocks 103 may be sequentially compacted from the end of the output buffer 104 growing toward the beginning of the output buffer 104. Furthermore, the compaction of the uncompressed data blocks 103 into the output buffer 104 may be arranged based on the grain unit of the output buffer 104. For example, the uncompressed data blocks 103 may be packed into output buffer such that they are naturally aligned to 1 KB (i.e. a size of overall 8 KB).
(42) The device 100 may further optionally comprise a separate output buffer 206. The separate output buffer 206 may be used for the compaction of the compressed data blocks which have a small size, e.g., less than the grain unit of the output buffer 104. As discussed, the data compressor 202 may determine the size of the compressed data blocks. Moreover, wherein when it is determined that the size of the compressed data block is less than the grain unit of output buffer 104, the device (e.g., the data compactor 203) may perform a copy-add of the compressed data into a separate output buffer 206 being associated to the compressed data blocks having a size smaller than the grain unit, wherein the grain unit is representative of granularity of a memory storage 208 and/or the output buffer 104.
(43) The device 100 may further optionally comprise a memory storage 208. The memory storage 208 may be based on e.g., a volatile memory storage, a non-volatile memory storage, etc.
(44) The device 100 may further optionally comprise writing module 207. The writing module may be configured to write the compressed data blocks 102 and the uncompressed data blocks 103 on the memory storage 208. Moreover, the compressed data blocks 102 and the uncompressed data blocks 103 are written on the memory storage 208 based on their corresponding compaction on the output buffer 104 (and/or the separate output buffer 206).
(45) The device 100 may further optionally comprise read module 209. The read module 209 may be configured to read from the memory storage 208 the compressed data blocks 102 and the uncompressed data blocks 103. Moreover, the read module 209 may be configured to read the compressed data blocks 102 and the uncompressed data blocks 103 based on the identification number of the output buffer, the size of the corresponding block, and the offset of the corresponding block from the beginning of the output buffer.
(46)
(47) The device 100 obtains a set of input data blocks comprising compressed data blocks 102 including B0, B1, B3, B4, B5 and Bn, and uncompressed data blocks 103 including B2 and B6 to be compacted into the chunk 104.
(48) The compressed data blocks 102 (i.e. B0, B1, B3, B4, B5 and Bn) are grouped together at the beginning of the chunk 104. They are packed starting from the left (i.e. beginning of the chunk) and grow towards the end of the chunk 104. The compressed data blocks 102 are packed without any alignment, hence there is no waste space between any two adjacent compressed data blocks 102.
(49) Furthermore, the uncompressed data blocks 103, for example, having the size of 8 KB, 16 KB (and/or 32 KB) blocks, are grouped together at the end of the chunk 104. The uncompressed data blocks 103 are packed starting from the right (i.e. end of the chunk) and grow towards the beginning of the chunk 104. The uncompressed data blocks 103 are packed with natural alignment by block size, e.g. by 8 KB.
(50) There might be (e.g. small amount of) a wasted space 301 between the compressed data blocks 102 and the uncompressed blocks 103, for example, when chunk 104 is full and the compressed data block 102 are in adjacent of uncompressed data block 103, etc.
(51) In the embodiment of
(52)
(53) As discussed, one or more of several kinds of metadata may be used including a block address, a leading header 401 and a trailing header 402 (shown as white rectangles) for the compressed data block.
(54) The block address may, for example, combine the chunk ID, the block offset (e.g., an offset relative to the chunk start), and the block length. Moreover, the block offset and the block length may be expressed in grain units.
(55) The block addresses may occupy 64 bits. Moreover, the block addresses may be kept in RAM (read access memory) for faster access to blocks written to storage media, e.g. to a solid state device (SSD).
(56) The leading header 401 for the compressed data blocks may include, for example, the block offset in bytes for the compressed data blocks relative to the beginning of the grain unit. The leading header 401 for compressed data block may be stored at the beginning of the first grain unit of the compressed data block 102 in the chunk 104. The first grain unit of compressed data block may be at the offset in respective block address. The relationship of block address and block header is shown in
(57) The trailing header 402 for the compressed data blocks may include, for example, the (e.g. proper) trailing when the compressed data block 102 is the last compacted block. However, if the compressed data block 102 is compacted such that there is another compressed data block compacted next to it, e.g., there is a next compressed data block, then the trailing header 402 for the previous compressed data block may be and/or may represent the leading header for the next compressed data block.
(58) Each compressed data block 102 may have one leading header 401 and one trailing header 402. The leading header 401 determines offset of first byte of compressed data block 102, and the trailing header 402 determines offset of byte following the last byte of compressed data block.
(59) In
(60) The uncompressed data block may only have the block address. Moreover, the uncompressed data blocks may not have a header, e.g., the leading header and/or the trailing header.
(61)
(62) In
(63) H0 offset 501 field contains value 4 in bytes to point to B0 block actual start in the grain unit. H0 size 502 field contains value 1532 in bytes, B0 block compressed size as reported, e.g., by data compressor.
(64) H1 offset field contains 516 bytes offset to point to B1 block start in the grain unit. H1 size field contains value 2296 in bytes for compressed size of B1 block, etc.
(65)
(66) The size of the compressed block data (also referred to as length) may be measured in bytes and/or in grain units.
(67) The present disclosure does not require compressed data blocks 102 to be aligned, so the compressed data blocks 102 may start anywhere in a grain unit 601, span one or more grain unit 601, and may further end anywhere in the grain unit 601.
(68)
(69) The present disclosure defines the sizes of the compressed data blocks in grain units in the ranges from 2 to 8. Moreover, the present disclosure allows the sizes of the compressed data blocks 102 in bytes up to 8 KB-256, which can also be expressed in grain units of 8.
(70)
(71) A wasted spaced 301 may occur between two compressed blocks 102, for example, when the size of the (e.g., at least one) compressed data block 102 is less than one grain unit (e.g., less than 1 KB). In the embodiment of
(72) Moreover, a wasted space 301 may occur, for example, under the following three conditions:
(73) 1. The last part of the compressed block B0 is smaller than 1 KB, and
(74) 2. CR is very high so that next block B1 is compressed into less than 1 KB, and
(75) 3. B0 last part together with whole block B1 fit into the same unit, so there may be waste space left in that unit (block B2 with corresponding header will start from the next unit).
(76)
(77)
(78)
(79)
(80) For instance, with an assumption that the block header takes 4 bytes. The last part of the compressed data block B0 has a size bigger than 1020 and less than 1024 bytes. Therefore, a wasted space 301 is formed here which is in the ranges from 1 to 3 bytes. Moreover, the last grain unit header is placed to the next grain unit, and the size of the compressed data block 102 increases by one grain unit. Furthermore, the next compressed data block of B1 may be absent, therefore, the last grain unit contains only the trailing header. In such a case, the formed wasted space is 1020 bytes plus (1 to 3) bytes from the previous grain unit. This may occur only for every chunk 104.
(81)
(82) In some embodiments, several headers (e.g., two or more) may be grouped together and may further be placed before respective blocks. Moreover, it may be possible to further eliminate and/or decrease the wasted space.
(83) In
(84)
(85) The method 1100 comprises a first step of, obtaining 1101 a set of input data blocks 101 comprising at least one of a compressed data block 102 and an uncompressed data block 103.
(86) The method 1100 further comprises a second step of, compacting 1102 the compressed data blocks 102 into the output buffer 104, starting from a first predefined region in the output buffer 104, such that the compressed data blocks 102 are sequentially compacted.
(87) The method 1100 further comprises a third step of, compacting 1103 the uncompressed data blocks 103 into the output buffer 104, starting from a second predefined region in the output buffer 104, such that the uncompressed data blocks 103 are sequentially compacted.
(88)
(89) At 1201, the device 100 obtains an input data block and start writing the obtained input data block into a memory storage.
(90) At 1202, the device 100 compresses the obtained input data block based on a predefined compression ratio.
(91) At 1203, the device 100 determines if the input data block is compressible or not. Moreover, when it is determined that the input data block is compressible, the devices 100 further determines the size of the compressed data block, and proceeds to step 1204. However, when it is determined that the input data block is not compressible (i.e. the compression is failed), the device 100 proceeds to step 1208.
(92) At 1204, the device 100 computes block length according to compressed size and offset in current unit of the chunk.
(93) Moreover, the device determines an upper limit for the compressed block size, based on [output Buffer Size/(1+output Buffer Size/input Block Size)−2×header Size]. For example, for an output buffer of 1 MB, and input block of 8 KB, and a header Size of 4, the device determines an upper limit of 8120 bytes. In the above formula; the header size is multiplied by 2, because of the leading and trailing headers for each compressed block.
(94) Furthermore, the device 100 adjusts the upper limit with the number of bytes in the last grain unit occupied by previous compressed data block, so that the following compressed data block length in grain units should be less than or equal to the length-bit (i.e. 1<length Bits). For the case of length Bits of 3, the length of the compressed data block in grain units would be 8 (i.e. less or equal 1<3).
(95) For example, assuming that the last grain unit of previous compressed data block (together with its trailing header) is occupied by 15 bytes. The device 100 adjusts the upper limit by subtracting 15 to have overall limit for the size of the compressed data block in grain units equal to or less than grain units (1<length Bits).
(96) Moreover, the device 100 compares the adjusted upper limit to the free space left in the output buffer. When it is determined that, there is enough free space, the device 100 proceeds with the adjusted upper limit. Otherwise, the device 100 replaces the value of the upper limit by the size of free space left in the output buffer and proceeds to the step 1205.
(97) At 1205, the device 100 puts the blocks first part to the chunk, starting from offset in the current unit till the block last unit.
(98) The device 100 stores the compressed data block in the output buffer growing from the start adding the leading and the trailing headers (to be generated and added at 1206). For instance, the device 100 increments the current pointer for the compressed data blocks by the size of the compressed data block, and by adding the leading and the trailing headers, in order to write the next uncompressed data block.
(99) Moreover, when the size of the compressed data block is less than one grain unit, then the device performs a copy-add of the compressed data block to the separate buffer for small compressed data blocks, as discussed above.
(100) In addition, when the separate buffer is full, e.g., it contains two or more small compressed data blocks, the device 100 flushes its content (i.e. the content of the separate buffer) to the output buffer, and proceeds to the step 1209 for calculating the block addresses of the compressed data blocks.
(101) At 1206, the device 100 generates headers in the block first unit and the last unit.
(102) Moreover, the device 100 writes the leading header for the compressed data blocks, for example, when the first grain unit to store the compressed data blocks is empty. The leading header contains the actual start position of the compressed data blocks, as discussed above. For instance, for a header size of 4 bytes, the actual start position of the compressed data blocks is 4.
(103) As discussed above, the device 100 writes the compressed data blocks in all of the grain units but the last one to the output buffer. In the last grain unit, the device 100, initially writes the trailing header. The trailing header contains the actual start position of the free space that follows the compressed data block. The trailing header may be the leading header for the next compressed data block.
(104) In addition, the last grain unit for the compressed data block may contain the trailing header, and zero or more bytes of the compressed data block. Furthermore, it may happen that, the size of the compressed data block to be written in the last grain unit, does not allow writing the trailing header. For example, when the size of the grain unit is 1024 bytes, the header size is 4 bytes, and the size of the compressed data block to be written in the last grain unit is 1021, or 1022, or 1023 bytes. In such a case, the compressed data block may be added with 0-bytes up to the size of the grain unit. Moreover, the trailing header may be written at the beginning of the next grain unit.
(105) At 1207, the device 100 puts the block second part to the chunk, the device 100 puts remained bytes starting from the last block unit offset.
(106) In addition, the device 100 may further write zero or more bytes of the compressed data block in the last grain unit.
(107) At 1208, the device 100 puts the uncompressed data block to the end of the chunk before already packed previous uncompressed data block.
(108) For example, the device 100 determines that the compression is failed, e.g., the size of the block is bigger than or equal to the adjusted upper limit. Moreover, the device 100 stores the original block (i.e. being an uncompressed data block) in the output buffer growing from the end toward the beginning. In addition, the device 100 decrements the current pointer for the uncompressed data blocks by the original block size, in order to prepare for writing the uncompressed data blocks.
(109) Moreover, when there is not enough free space in the output buffer to store the uncompressed data block, the device 100 flushes the output buffer, initializes all data, and further stores the uncompressed data block in a newly initialized output buffer.
(110) Alternatively, if the flushing of the output buffer happens, for example, due to replacing the value of the upper limit by the size of the free space, as described above. Then, the device 100 flushes the output buffer, initializes all data, and further compresses the input data block again.
(111) At 1209, the device 100 generates the block address.
(112) At 1210, the device 100 writes the block and ends the writing of the block to the memory storage.
(113) The device 100 calculates the block address for the compressed and/or the uncompressed data block, as discussed above.
(114) The block address contains the bit-fields such as the offset, the length, etc., and are calculated as follow.
(115) The device 100 calculates the offset of the compressed or the uncompressed data block in the output buffer. For example, if the output buffer size is 1 MB and the size of the uncompressed data block is 8 KB, then the device 100 stores 128 uncompressed data blocks in the output buffer. Moreover, the device 100 allocates at least two grain units for each compressed data block, since two headers for each of the compressed data blocks are required. Hence, the device 100 may store 512 compressed data blocks in the output buffer. The device 100 further encodes the offset for any of 512 compressed data blocks by using 9 bits. This is 1 bit less than 10 bits that the device 100 uses for the 1 KB grain unit as the minimal addressable entity for the offset.
(116) Moreover, the device 100 calculates the length of the compressed and/or uncompressed data blocks in the output buffer. For example, when the output buffer is 1 MB, the data blocks are 8 KB, and 1 KB grain units. The device 100 requires 3 bits for the length. As discussed above, the bit value of 111 represents the uncompressed data block with a length equal to 8 grain units or 8 KB. Moreover, a bit values in the ranges of 000 to 011 represents the size of the compressed data blocks with a lengths in the ranges from 2 to 8 grain units or 2 KB to 8 KB.
(117) In the case of the separate output buffer, the device 100 uses an index of the small compressed data block in one grain unit. For example, an output buffer with a size of 1 MB, data blocks with sizes of 8 KB, 1 KB grain units, and 3 small blocks per grain unit. The device 100 requires 2 bits for the index. The index value of 0 (bit field value 00) means that there is no small compressed data blocks.
(118) Furthermore, the block address for the uncompressed data block contains the actual offset of the original block in the chunk, and length equal to [(1<<length Bits)−1]. For example, for the case of length Bits of 3, the block address length of the uncompressed data block is 7 (i.e. (1<3)−1=7). However, the original length in grain units is 8 (i.e. (1<3)=8). Moreover, the block address may contain the index of the small compressed data blocks in one grain unit, as discussed above. The index values in the ranges of 1 to 3 are for small compressed data blocks of 1 to 3, respectively.
(119) In addition, the compressed data block offset for its block address is the offset of the first grain unit that the compressed data block occupies. Moreover, when the compressed block data occupies one grain unit or less, the device 100 forces the length to be at least 2, due to the need of leading and trailing headers.
(120) Furthermore, the length of the compressed data block used in block address is the total number of the grain units occupied by the compressed block data minus 2, and by considering the adjusted upper limit for the size of the compressed data block. When, the device 100 determines that the difference between the current pointer for the uncompressed data block and the compressed data blocks is at least two grain units, the device 100 considers the next input data block. Otherwise, the device 100 flushes the output buffer, and initializes all data.
(121)
(122) At 1301, the device 100 starts a first step for reading a data block from the memory storage. The device 100 initiate the block reading using the block address.
(123) At 1302, the device 100 parses the block address and derives the chunk ID, offset and length, as discussed above.
(124) The device 100 uses the chunk identifier (chunk ID) to locate the chunk start on the storage media. Then, the device 100 uses the block address offset to find the location of the data block in the chunk.
(125) At 1303, the device 100 reads data from the chunk according to chunk ID, offset and length.
(126) The device 100 adjusts the length of the data block (i.e. which may be a compressed data block or an uncompressed data block), and may read the data block as a number of sequential grain units.
(127) For example, when the length of the data block, encoded in the respective bit-field of the block address is equal to [(1<<length-Bits)−1], then the data block is an uncompressed data block. And the device 100 reads the [(1<<length-Bits)] grain units. However, when the length of the data block, encoded in the respective bit-field of the block address is in the ranges of 0 to [(1<<length-Bits)-2], then the data block is a compressed data block, and the device 100 reads the [(length+2)] grain units.
(128) At 1304, the device 100 determines if the data block is a non-compressed block or not. Moreover, when it is determined that the data block is an uncompressed data block, the device 100 proceeds to step 1311. However, when it is determined that the data block is not an uncompressed data block, the device 100 proceeds to step 1305.
(129) At 1305, the device 100 parses the block start header, the device 100 derives the size of the compressed data block and bytes offset in the first grain unit.
(130) The device 100 parses the leading and the trailing headers for the compressed data block, to determine the start position of the compressed data block in the first read grain unit, and the end position of the compressed data block in the last grain unit.
(131) At 1306, the device 100 allocates a temporary buffer. The temporary buffer is corresponding to the derived block size.
(132) At 1307, the device 100 copies the block first part to the output buffer, starting from the offset till the last unit.
(133) At 1308, the device 100 skips the end header in block last unit.
(134) At 1309, the device 100 copies the block's second part to the output buffer from the last grain unit.
(135) Moreover, when the last grain unit contains a compressed data block, then the device 100 moves the compressed data block “in place” (i.e. to the beginning of the grain unit) to overwrite the trailing header, and further makes the compressed data blocks contiguous.
(136) At 1310, the device 100 decompresses the unpacked block.
(137) The device 100 decompresses contiguous compressed data and returns them to the original block content.
(138) At 1311, the device 100 reads the decompressed data block and ends the read process.
(139) Table I presents benchmarking results of compression and packing of two set of data based on a method disclosed in the prior art, and Table II presents benchmarking results of compression and packing of two set of data based on the device (or the method run by the device), as disclosed in the present disclosure.
(140) The benchmarking results for the prior art solution (a prototype implementation of the proposed solution in the prior art) and for the stand-alone application prototype for the present disclosure are obtained based on the following conditions.
(141) 1. Implemented chunk packing and block addresses calculation
(142) 2. Data read in memory—no IO (disk) overhead is taken into account
(143) 3. Repeated (packing, compression), and (unpacking, decompression) for 100 times and averages speed
(144) 4. Blocks compression/decompression is done in single thread
(145) 5. LZ-class algorithm is used as compressor in default and solution packing schemes
(146) In addition, the environment for obtaining the benchmarking results was an Intel® Xeon® CPU E5-2670 0 @ 2.60 GHz, wherein the benchmarking was compiled with GCC 6.3-O3, Linux kernel 3.19.
(147) TABLE-US-00001 TABLE I Benchmarking rresults of compression and packing of two set of data based on method disclosed in prior art. Baseline (prior art), alignment by 1 KB grain units, packing compressed data into 1 MB chunks Memory CPU CR % Compression Decompression Data set waste waste output/input speed B/sec speed, MB/sec Oracle DB 9.4 0.4 49.8 456 1593 Calgary, 6.1 4.4 60.5 314 1060 Silesia, etc.
(148) TABLE-US-00002 TABLE II Benchmarking rresults of compression and packing of two set of data based on the device (or the method run by the device) as disclosed in the present disclosure. Present disclosure: no alignment by 1 KB units, 4 byte headers while packing compressed data into 1 MB chunks Memory CPU CR % Compression Decompression Data set waste waste output/input speed, MB/sec speed, MB/sec Oracle DB 0.1 0.3 39.3 649 1681 Calgary, 0.1 4.2 53.6 418 1510 Silesia, etc.
(149) The benchmarking results of the present disclosure provides 13-27% of better data reduction ratio, and 6-42% of better speed as compared to typical inline compression packing scheme of the method disclosed in prior art. Furthermore, the present disclosure makes waste elimination close to “ideal” for selected LZ4 like algorithm.
(150) The present disclosure deals with actual compressed data sizes in packing scheme, and decreases waste of SSD memory to minimum as compared to typical inline compression packing scheme in the prior art. For example, for the Oracle DB dataset, the memory waste (% input) of 9.4% decreased down to 0.1%, and for the Standard datasets, the memory waste of 6.1% decreased down to 0.1%
(151) The present disclosure has been described in conjunction with various embodiments as examples as well as implementations. However, other variations can be understood and effected by those persons skilled in the art and practicing the claimed disclosure, from the studies of the drawings, this disclosure and the independent claims. In the claims as well as in the description the word “comprising” does not exclude other elements or steps and the indefinite article “a” or “an” does not exclude a plurality. A single element or other unit may fulfill the functions of several entities or items recited in the claims. The mere fact that certain measures are recited in the mutual different dependent claims does not indicate that a combination of these measures cannot be used in an advantageous implementation.