MAINTAINING DATA DEDUPLICATION REFERENCE INFORMATION
20170351697 · 2017-12-07
Assignee
Inventors
Cpc classification
International classification
Abstract
A data deduplication method includes detecting a deduplication transaction including a data pattern associated with a data pattern address (DPA) and a reference, to the pattern, associated with a data reference address (DRA). A deduplication key may be determined based on the DPA and the DRA by concatenating the DPA and the DRA with the DPA as the most significant bits. The key may be stored in a key field of a record in a persistent and sequentially-accessed log, which is part of a log-with-index (LWI) structure that also maintains, in RAM or SSD, a binary index of the log records. When full, the log is cleared by writing the records in key-sorted order to the new tablet. From time to time, two tablets in the tablet library are merged. Tablet merging may include two or more atomic merges, each atomic merge corresponding to a portion of the tablet.
Claims
1. A data deduplication method, comprising: detecting a deduplication transaction comprising a data reference at a data reference address and a data pattern at a data pattern address; determining a data deduplication key based on the data reference address and the data pattern address; storing the data deduplication key in a key field of a record in a log; maintaining an index of the records in the log; responsive to detecting a log full signal, performing log clear operations comprising: creating, in a tablet library comprising at least one other tablet, a new tablet; storing the records, sorted in accordance with the data deduplication keys, to the new tablet; and clearing the log of all entries; and responsive to a tablet merge signal, merging a first tablet of the tablet library and a second tablet of the tablet library to form a merged tablet and releasing the first tablet and the second tablet from the tablet library.
2. The data deduplication method of claim 1, wherein determining the data deduplication key includes appending at least a portion of the data reference address to at least a portion of the data pattern address with the portion of the data pattern address comprising the most significant bits.
3. The data deduplication method of claim 1, wherein the records in the log either: do not include a value field corresponding to the key field; or include a null value in the value field.
4. The data deduplication method of claim 1, wherein each record includes: a presence bit for distinguishing between insertion transactions and deletion transactions; and a sequence field storing a sequence value common to each record in the log; and wherein clearing the log includes incrementing the sequence number.
5. The data deduplication method of claim 1, further comprising: maintaining the log in persistent storage; and maintaining the index in memory; and inserting records comprises inserting records sequentially in the next sequential record of the log.
6. The data deduplication method of claim 1, wherein merging the first tablet and the second tablet comprises iteratively performing a plurality of atomic merges for each of a plurality of atomic portions of the first and second tablets, each atomic merge comprising: merging an atomic portion of the first tablet with an atomic portion of the second tablet to form an atomic portion of the merged tablet; and updating tablet index nodes corresponding to the atomic portion.
7. The data deduplication method of claim 6, wherein boundaries of the atomic portion are defined in terms of either: a particular range of the keys; or a particular number of fixed size tablet pages.
8. The data deduplication method of claim 6, further comprising: maintaining the tablet index as copy-on-write data wherein said updating of the tablet index nodes preserves node data until the atomic merge is committed to the merged tablet and the tablet portions of the first and second tablets are released.
9. The data deduplication method of claim 8, wherein the tablet index includes a super root node comprising a parent node of root nodes for the first, second, and merged tablets, wherein said updating of the nodes preserves node data until the atomic merge is committed to the merged tablet and the tablet portions of the first and second tablets are released.
10. The data deduplication method of claim 1, wherein the log full signal is asserted responsive to utilization of the log exceeding a threshold selected from: a percentage utilization threshold; a record count threshold; and a byte size threshold.
11. The data deduplication method of claim 1, further comprising: responsive to receiving a range query indicating a range of keys, generating a query result indicative of the range indicated in the query.
12. The data deduplication method of claim 1, further comprising, responsive to receiving a summary query indicating a range of keys, a key mask, and a maximum count, returning a result indicating a number of key values within the range of keys subject to the key mask and the maximum count.
13. An information handling system, comprising: a processor; a memory, accessible to the processor for performing operations comprising: detecting a deduplication transaction comprising a data reference at a data reference address and a data pattern at a data pattern address; determining a data deduplication key based on the data reference address and the data pattern address; storing the data deduplication key in a key field of a record in a log; responsive to detecting a log full signal, performing log clear operations comprising: creating, in a tablet library comprising at least one other tablet, a new tablet; storing the records, sorted in accordance with the data deduplication keys, to the new tablet; and clearing the log of all entries; and responsive to a tablet merge signal, merging a first tablet of the tablet library and a second tablet of the tablet library to form a merged tablet and releasing the first tablet and the second tablet from the tablet library.
14. The information handling system of claim 13, further comprising: maintaining a binary tree index of the records in the log, wherein the storing of the records to the new tablet comprises storing the records sorted in accordance with binary tree index.
15. The information handling system of claim 13, wherein determining the data deduplication key includes appending at least a portion of the data reference address to at least a portion of the data pattern address with the portion of the data pattern address comprising the most significant bits.
16. The information handling system of claim 13, wherein the records in the log either: do not include a value field corresponding to the key field; or include a null value in the value field.
17. The information handling system of claim 13, wherein each record includes: a presence bit for distinguishing between insertion transactions and deletion transactions; and a sequence field storing a sequence value common to each record in the log; and wherein clearing the log includes incrementing the sequence number.
18. The information handling system of claim 13, wherein merging the first tablet and the second tablet comprises iteratively performing a plurality of atomic merges for each of a plurality of tablet portions, each atomic merge comprising: merging an atomic portion of the first tablet with an atomic portion of the second tablet to form an atomic portion of the merged tablet; and updating a portion of the tablet index corresponding to the atomic portion; wherein the atomic portion comprises a tablet portion corresponding to a particular range of the keys
19. The information handling system of claim 13, wherein the operations include: responsive to receiving a range query indicating a range of keys, retrieving all keys within the range indicated in the query.
20. The information handling system of claim 13, wherein the operations include: responsive to receiving a summary query indicating a range of keys, a key mask, and a maximum count, returning a result indicating a number of key values within the range of keys subject to the key mask and the maximum count.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0028] The description of the illustrative embodiments can be read in conjunction with the accompanying FIGUREs. It will be appreciated that, for simplicity and clarity of illustration, elements illustrated in the FIGUREs have not necessarily been drawn to scale. For example, the dimensions of some of the elements may be exaggerated relative to other elements. Embodiments incorporating teachings of the present disclosure are shown and described with respect to the FIGUREs presented herein, in which:
[0029]
[0030]
[0031]
[0032]
[0033]
[0034]
[0035]
[0036]
DETAILED DESCRIPTION
[0037] In the following detailed description, specific exemplary embodiments in which disclosed subject matter may be practiced are described in sufficient detail to enable those skilled in the art to practice the disclosed embodiments. For example, details such as specific method orders, structures, elements, and connections have been presented herein. However, it is to be understood that the specific details presented need not be utilized to practice embodiments of disclosed subject matter. It is also to be understood that other embodiments may be utilized and that logical, architectural, programmatic, mechanical, electrical and other changes may be made within the scope of the disclosed subject matter. The following detailed description is, therefore, not to be taken as limiting the scope of the appended claims and equivalents thereof.
[0038] References within the specification to “one embodiment,” “an embodiment,” “at least one embodiment”, or “some embodiments” and the like indicate that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present disclosure. The appearance of such phrases in various places within the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Further, various features may be described which may be exhibited by some embodiments and not by others. Similarly, various requirements may be described which may be requirements for some embodiments but not for other embodiments.
[0039] It is understood that the use of specific component, device, and/or parameter names and/or corresponding acronyms thereof, such as those of the executing utility, logic, and/or firmware described herein, are for example only and not meant to imply any limitations on the described embodiments. The embodiments may thus be described with different nomenclature and/or terminology utilized to describe the components, devices, parameters, methods and/or functions herein, without limitation. References to any specific protocol or proprietary name in describing one or more elements, features or concepts of the embodiments are provided solely as examples of one implementation, and such references do not limit the extension of the claimed embodiments to embodiments in which different elements, features, protocols, or concept names are utilized. Thus, each term utilized herein is to be given its broadest interpretation given the context in which that term is utilized.
[0040] Disclosed subject matter includes a storage controller with an LWI structure that supports sequential inserts of new transactions into a non-volatile log structure that is backed by a RAM-speed index tree analogous to a log structured merge tree. The LWI structure combines a memory-based index for fast searching and sorting with an NVM-based or HDD-based, sequentially accessed log structure to provide persistence and adequate insert performance. Disclosed LWI structures are tailored for use in one or more data storage and storage system applications including one or more data deduplication applications.
[0041]
[0042] Storage system 100 is an information handling system that receives and processes I/O operations 92. The storage system 100 of
[0043] Although storage system 100 encompasses any one or more suitable types of storage media, the storage media types emphasized in the description of the following figures include hard disk drive (HDD) storage, any of various nonvolatile semiconductor-based storage media including, as non-limiting examples, a solid state drive (SSD), and traditional random access memory (RAM). An SSD generally includes a flash memory array and a storage drive interface integrated within a single device. In at least one embodiment, storage system 100 supports nonvolatile memory (NVM) express (NVMe) SSDs, in which the storage interface is a peripheral component interconnect express (PCIe) interface.
[0044]
[0045]
[0046] The metadata 145 may include substantially any information descriptive of, derived from, or otherwise associated with dataset 140. The metadata 145, illustrated in more detail in
[0047] As depicted in
[0048] The chip set 102 illustrated in
[0049] The memory 110 illustrated in
[0050] In conjunction with metadata transaction functionality supported by storage controller 100, the memory 110 illustrated in
[0051] Query processor functionality corresponding to query processor instructions 114 may support commands including: INSERT k->v (insert the mapping k->v into the transaction log), DELETE(k) (remove or invalidate an existing k->v mapping from the transaction log), and QUERY(k), (return the value v associated with a key k if k is present). In addition, disclosed embodiments may support extended metadata commands including, as non-limiting examples, RANGE (k1,k2), which may retrieve all keys in dataset 140 between k1 and k2 inclusive, and SUMMARIZE (k1,k2,c,m) for determining key counts including, for example, the count of all keys, as masked by m, from k1 to k2, to a maximum of c.
[0052]
[0053] The LWI structure 150 illustrated in
[0054] In at least one embodiment, the transaction log 152 of
[0055]
[0056] The metadata 145 of
[0057] Referring now to
[0058] In at least some embodiments, the transaction log is maintained in nonvolatile storage and transactions are incorporated into the transaction log sequentially, to insure adequate insert/delete performance. The index tree may be maintained in RAM or in an SSD and is used to maintain an index of the keys stored in the transaction log.
[0059] The metadata method 400 of
[0060] When a log full signal is generated, the metadata method 400 of
[0061] The metadata method 400 illustrated in
[0062] Tablet merges may occur periodically or based upon one or more other or additional criteria. The size of a tablet resulting from the merging of two existing tablets can range from a minimum size, equal to the size of the more recent tablet, to a maximum size, equal to the sum of the two tablets. Assuming that any two newly created tablets are of approximately the same size and a tablet merge occurs between two existing tablets each time a new tablet is created, i.e., whenever the number of tablets equals three the two oldest tablets are merged, it can be seen that the size of the merged tablet may grow monotonically over time.
[0063] To prevent an unchecked drift of tablet size over time, any one of various suitable operations may be included within metadata method 400. For example, if the size of the tablet library, excluding the oldest tablet, exceeds a particular threshold, merge operations are performed. In another example, a time stamp may be associated with each tablet. If the age of the oldest tablet exceeds a particular threshold, merge operations are performed.
[0064]
[0065] To reduce the risk associated with losing a potentially enormous amount of computational “work-in-progress” during a tablet merge, the tablet merge operation 432 depicted in
[0066] Atom sizes may be chosen in accordance with any one or more of various suitable parameters. For example, because the tablets contain records arranged in key sorted order, the atomicity of each merge may be defined according to a range of keys. Alternatively, if the tablets are formatted into fixed-size pages when stored, atomicity might be defined in terms of physical pages, e.g., each atomic merge merges N physical pages. Defining atomicity in terms of physical pages may, however, result in some duplicate entries in the merged tablet since page ranges and key ranges are not inherently aligned.
[0067] The tablet merge operation 432 illustrated in
[0068] Whenever the distribution of keys within a tablet is non-uniform, a key-defined atomicity may result in some atomic merges that process more records than others. Similarly, if the distribution of keys in a first tablet differs from the distribution of keys in a second tablet, an atomic merge may require more time to complete since fewer keys in the older tablet can be discarded due to the presence of the same key in the newer tablet. Some embodiments may impose page-based or other size-based constraints on the size of an atom to achieve a more uniform atomic risk, i.e., the risk of re-performing an atomic merge. Once the first or next successive ATP is defined, the tablet merge operation 432 of
[0069] As described below with respect to
[0070] The source tablet ATPs are then merged (operation 510) into the ATP of the merged tablet. The basic merge operation may comprise a prioritized union of the source ATPs in which duplicate records are discarded and any conflicting records are resolved in favor of the more recent record. In this manner, the merged tablet contains all the records included in the newer tablet plus any record in the older tablet for a key not found in the newer tablet. After the entire ATP of the source tablets has been merged into the corresponding ATP of the merged tablet, the tablet index nodes that were previously stored to copy-on-write storage locations can be committed (operation 512) to the tablet index and the original data can be released.
[0071] The tablet merge operation 432 illustrated in
[0072]
[0073] The tablet index 171 of
[0074]
[0075] Data deduplication keys 720 may be beneficially used in storage systems 100 that support data deduplication. Data deduplication support may include elements (not depicted in
[0076]
[0077]
[0078]
[0079] For at least some purposes useful in maintaining data deduplication metadata, it may be beneficial to configure a key that, when sorted, readily reveals the number of data deduplication references 712 to any particular data pattern 702. The data deduplication keys 720-1 through 720-4 corresponding to data deduplication references 712-1 through 712-4 respectively are produced or obtained by concatenating the applicable data reference address 714 and the applicable data pattern address 704 with the data pattern address 704 stored in the most significant bits and the data reference address 714 stored in the least significant bits. Other embodiments may produce or obtain data deduplication keys 720 differently.
[0080] In at least one embodiment, each data deduplication key 720 represents a key 162 in the log 152 of
[0081] Accordingly, the data deduplication keys 720 illustrated in
[0082] The data deduplication references 720 illustrated in
[0083] Any one or more processes or methods described above, including processes and methods associated with the
[0084] A computer readable medium, which may also be referred to as computer readable memory or computer readable storage, encompasses volatile and non-volatile media, memory, and storage, whether programmable or not, whether randomly accessible or not, and whether implemented in a semiconductor, ferro-magnetic, optical, organic, or other suitable medium. Information handling systems may include two or more different types of computer readable medium and, in such systems, program code may be stored, in whole or in part, in two or more different types of computer readable medium.
[0085] Unless indicated otherwise, operational elements of illustrated or described methods may be combined, performed simultaneously, or performed in a different order than illustrated or described. In this regard, use of the terms first, second, etc. does not necessarily denote any order, importance, or preference, but may instead merely distinguish two or more distinct elements.
[0086] Program code for effecting described operations may be written in any appropriate combination of programming languages and encompasses human readable program code including source code as well as machine readable code including object code. Program code may be executed by a general purpose processor, a special purpose processor, including, as non-limiting examples, a graphics processor, a service processor, or an embedded processor or controller.
[0087] Disclosed subject matter may be implemented in any appropriate combination of software, firmware, and hardware. Terms including circuit(s), chip(s), processor(s), device(s), computer(s), desktop(s), laptop(s), system(s), and network(s) suggest at least some hardware or structural element(s), but may encompass non-transient intangible elements including program instruction(s) and one or more data structures including one or more databases.
[0088] While the disclosure has been described with reference to exemplary embodiments, it will be understood by those skilled in the art that the disclosure encompasses various changes and equivalents substituted for elements. Therefore, the disclosure is not limited to the particular embodiments expressly disclosed, but encompasses all embodiments falling within the scope of the appended claims.
[0089] As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification indicate the presence of stated features, operations, elements, and/or components, but does not preclude the presence or addition of one or more other features, operations, elements, components, and/or groups thereof.