REBALANCE METHOD FOR BLOCKCHAIN-BASED DECENTRALIZED FILE SYSTEM

20240004842 ยท 2024-01-04

    Inventors

    Cpc classification

    International classification

    Abstract

    A rebalance method for blockchain-based decentralized file system is provided, and the method includes an encoded data rebalance method of a deleted node, and the encoded data rebalance method of the deleted node includes: broadcasting a codeword of the deleted node to all remaining nodes when one node of a node set is deleted; and decoding based on a current storage content and the codeword transmitted from the deleted node by each remaining node using a decoding function to obtain a data packet of each remaining node, and storing the data packet into each remaining node, to thereby generating a distributed target file storage system. The method can correct data skew and reduce replication factors while reducing a communication load of transmission codes during a rebalance phase, thereby ensuring optimal performance of the decentralized file system.

    Claims

    1. A rebalance method for blockchain-based decentralized file system, comprising an encoded data rebalance method of a deleted node, wherein the encoded data rebalance method of the deleted node comprises: broadcasting a codeword of the deleted node to all remaining nodes when one node of a node set is deleted; and decoding based on a current storage content and the codeword transmitted from the deleted node by each remaining node using a decoding function to obtain a data packet of each remaining node, and storing the data packet into each remaining node, to thereby generate a distributed target file storage system.

    2. The rebalance method for blockchain-based decentralized file system as claimed in claim 1, wherein the encoded data rebalance method of the deleted node comprises: setting a node m ( [ K ] k K - r - 1 ) , wherein ( [ K ] k K - r - 1 ) represents a set of subsets of Kr1 nodes selected from a node set [K]\k, [K]\k represents a set of remaining nodes in a node set [K] after deleting a node k, r represents a replicator, both K and k represent nodes; and making {p.sub.1 , . . . , p.sub.r}=[K]\(mk), wherein {p.sub.1 , . . . , P.sub.r} represents a set comprised of nodes p.sub.1 , . . . , p.sub.r, and [K]\(mk) represents a set of remaining nodes in the node set [K] after deleting the node m and the node k; for a node p.sub.i[K]\(mk), filling a data packet |W.sub.[p.sub.l.sub.,m].sup.p.sup.i|:p.sub.lp.sub.i by using a virtual null position |W.sub.[p.sub.l.sub.,m].sup.p.sup.i|=max {|W.sub.[p.sub.l.sub.,m].sup.p.sup.i|:p.sub.lp.sub.i }; transmitting X p i , m = p l p i W [ p l , m ] p i by the node p.sub.i, wherein represents an exclusive or (XOR) operation; and decoding a requirement W.sub.[p.sub.j.sub.,m].sup.p.sup.i of a node p.sub.j based on the X.sub.p.sub.i.sub.,mand a storage content of the node p.sub.j after completing the transmitting, wherein a process for the decoding is expressed as follows: X p i , m ( p l p j , p i W [ p l , m ] p i ) = ( p l p i W [ p l , m ] p i ) ( p l p j , p i W [ p l , m ] p i ) = W [ p j , m ] p i ; wherein p l p j , p i W [ p l , m ] p i represents an XOR transmission of a data packet W.sub.[p.sub.j.sub.,m].sup.p.sup.i of another node p.sub.l after deleting the node p.sub.j and the node p.sub.i; and X p i , m ( p l p j , p i W [ p l , m ] p i ) represents an XOR operation between an operation result of p l p j , p i W [ p l , m ] p i and X.sub.p.sub.i.sub.,m.sup.p.sup.i transmitted from the node p.sub.i.

    3. A rebalance method for blockchain-based decentralized file system, comprising an encoded data rebalance method of an added node, wherein the encoded data rebalance method of the added node comprises: broadcasting, based on a preset decoding function, a codeword by each initial node to a target node when the target node is added into a node set; and decoding the codeword by the target node using the preset decoding function, and deleting a corresponding data packet from each initial node, to thereby generate a distributed target file storage system.

    4. The rebalance method for blockchain-based decentralized file system as claimed in claim 3, wherein the encoded data rebalance method of the added node comprises: adopting custom-character.sub.[K] to represent an index of a bit set storing at K initial nodes, and [ K ] = ( [ K ] K - r ) , wherein ( [ K ] K - r ) represents a set of subsets of Kr1 nodes selected from a node set [K], [K] represents the node set, and r represents a replicator; setting a node mcustom-character.sub.[K], U.sub.m={W.sub.[k,m]:k[K]\m}; for a bit of a data packet W.sub.[k,m], using a node indexed by [K]\m to represent a node set initially storing the node, and [K]\m to represent remaining nodes in the node set [K] after deleting the node m; setting an initial node k[K], wherein the node m is different from the node k, and the data packet W.sub.[k,m] is labeled and existed in storage of the node m; and transmitting, by the initial node k[K], a data packet W [ k , m ] : m ( [ K ] k K - r ) to a target node K+1, and deleting the data packet W [ k , m ] : m ( [ K ] k K - r ) from the initial node, to thereby make the target node K+1 store the data packet transmitted by each initial node.

    Description

    BRIEF DESCRIPTION OF DRAWINGS

    [0020] In order to provide a clearer explanation of technical solutions in the disclosure, drawings required in embodiments descriptions will be introduced below, the following described drawings are merely some embodiments of the disclosure, other drawings can be obtained according to these drawings for those skilled in the art without creative labor.

    [0021] FIG. 1 illustrates a flowchart of an encoded data rebalance method of a deleted node according to an embodiment of the disclosure.

    [0022] FIG. 2 illustrates a flowchart of an encoded data rebalance method of an added node according to an embodiment of the disclosure.

    [0023] FIG. 3 illustrates a schematic diagram of a data packet transmitted during a rebalance process of the deleted node according to an embodiment of the disclosure.

    [0024] FIG. 4 illustrates a schematic diagram of a data packet transmitted during a rebalance process of the added node according to an embodiment of the disclosure.

    DETAILED DESCRIPTION OF EMBODIMENTS

    [0025] Technical solutions in embodiments of the disclosure will be clearly and completely described below. Obviously, the described embodiments are merely some of the embodiments of the disclosure, not all of them. Based on the embodiments of the disclosure, all other embodiments obtained by those skilled in the art without creative labor fall within a scope of protection of the disclosure.

    [0026] A blockchain technology is a peer-to-peer (P2P) network based on decentralization, the blockchain technology combines, by using an open-source software (OSS), a principle of cryptography, temporal data, and a consensus mechanism to ensure coherence and continuation of nodes in a distributed database, so that information can be verified and traced immediately, but it is difficult to tamper with and cannot be masked, thereby creating a privacy, efficient, and secure shared value system.

    [0027] In an application of the blockchain technology, uneven data distribution across storage nodes is one of main factors leading to poor performance of data storage and analysis platform. The uneven distribution of data is called data skew. During data rebalance, data moves between storage nodes so that all nodes store approximately the same amount of data, thereby reducing data skew. Moreover, when a storage system has data replicated using certain replicators, a rebalance method must ensure that these replicators are not reduced during a rebalance period. An efficient data rebalance algorithm minimizes communication involved in the rebalance process.

    [0028] The disclosure mainly addresses a design of a rebalance method for a decentralized file storage system. The decentralized distributed file storage system is r-balanced, that is a replicator of each data segment in the decentralized file storage system is r. A definition of the replicator includes the following: a distributed file storage system D and a node subset S[K] ([K] represents a node set) are considered, a file W is composed of a set of F fragment, the number of nodes storing w.sub.i(i[F]) is called the replicator of a bit w.sub.i, and w.sub.i represents i-th (i[F]) fragment of the file W. An expected number of bits stored in each node is the same. For the r-balanced file storage system, a definition of the decentralized r-balance file storage system is as follows: D(r,[K])={D.sub.n.Math.W:n[K]} represents a decentralized r-balanced file storage system of k nodes, and the decentralized r-balanced file storage system needs to satisfy the following two conditions.

    [0029] A first condition is a replicator condition, the replicator of each bit is r, and r.sub.i=r,i[F], where [F] is a fragment set of the file W.

    [0030] A second condition is a rebalance state condition, the expected number of bits stored in each node is the same. Since the number of bits of the node is rF ,which means that for n[K], E(|D.sub.n|)=F is satisfied,

    [00007] = r K

    and is a storage point.

    [0031] The disclosure provides a rebalance method of a file system for adding and deleting single node. The rebalance method can ensure that the replicator and balance attribute of the distributed file storage system are maintained at the same time.

    [0032] Overall processes of the embodiments of the disclosure are shown in FIG. 1 and FIG. 2, and the overall processes include a rebalance process of the deleted node, and a rebalance process of the added node. The embodiments shown in FIG. 1 and FIG. 2 will be described in detail by the following content.

    [0033] Embodiment 1 shows the rebalance process of the deleted node.

    [0034] In order to ensure high reliability of the file storage system, and maintain the replicator r of the file storage system, a decentralized r-balanced distributed file storage system D(r,[K]) is considered, and [K] represents a set of nodes. A node k[K] is deleted, D.sub.k(r,[K]\k)={D.sub.n.sup.k:n[K]\k} is used to represent the decentralized r-balanced distributed file storage system obtained by rebalance a new system composed by a node set [K]\k ([K]\k represents a set of remaining nodes in the node set [K] after deleting the node k). Specifically, both K and k represent nodes.

    [0035] R (k, D, D.sub.k) is used to represent a rebalance method of the deleted node k from the database D(r, [K]), and D.sub.k represents a target database after balancing. The D.sub.k includes a series encoding functions {.sub.n:n[K]\k} and decoding functions {.sub.n:n[K]\k}. Specifically, [K]\k represents the set of remaining nodes in the node set [K] after deleting the node k. For the node n[K]\k, a codeword .sub.n(D.sub.n) with a length of l.sub.n is broadcasted to all remaining nodes. Each remaining node nk can decode, by using the decoding function n , a data transmission requirement D.sub.n.sup.k of the node k through a current storage content D.sub.n and codewords received from other remaining nodes.

    [0036] custom-character.sub.k is used to represent a bit set storing at the deleted node k, and

    [00008] k = ( [ K ] k K - r ) .

    For mcustom-character.sub.k, m is a node of bits of the bit set custom-character.sub.k. W.sub.m represents a bit set that cannot be obtained at the node m and can be obtained at r-1 remaining nodes [K]\(mk) . For mcustom-character.sub.k, {W.sub.p.sub.j.sub.,m.sub.j.sup.:p.sub.j[m],m.sub.j=[m]\p.sub.j,[K]\(mk)} is used to represent a set of (K1)(r1) boxes, and p.sub.j represents a node of a node set [m]; m.sub.j=[m]\p.sub.j represents remaining nodes after deleting the node p.sub.j from the node set [m]; [K]\(mk) represents that belongs to a set of remaining nodes of the node set [K] after deleting the node m and the node k. Each bit in the bit set W.sub.m is associated with a randomly and uniformly selected box. The merging process is executed at a certain node in the node set [K]\(mk) containing the bit set W.sub.m, and [K]\(mk) represents the set of remaining nodes of the node set [K] after deleting the node m and the node k. The bit in the bit set W m is transmitted to all other nodes in the node set [K]\(mk). In this way, all these nodes have a same bit in their respective boxes, which is collectively referred to as a bit set that selects a same box as the data packet, and the bit is indexed by a tag of the box they jointly select.

    [0037] Considered arbitrary

    [00009] m ( [ K ] k K - r - 1 ) ,

    for any such m, a set P.sub.m={p.sub.1 , . . . , p.sub.r}=[K](mk) of the remaining nodes is considered. For any P.sub.iP .sub.m, a set of r1 data packets of {W.sub.[p.sub.j.sub.,m].sup.p.sup.i:p.sub.jp.sub.i} is considered. A data packet W.sub.[p.sub.j.sub.,m].sup.p.sup.i can be obtained at the deleted node, and can be also obtained at all remaining nodes p.sub.l:lj, however, the data packet W.sub.[p.sub.j.sub.,m].sup.p.sup.i cannot be obtained at the node p.sub.j. Bits in the data packet W.sub.[p.sub.j.sub.,m].sup.p.sup.iare accurately stored in the node p.sub.j. Considering that they have a same size, this structure allows these data packets to be performed an exclusive or (XOR) transmission by nodes. The node p.sub.j can decode W.sub.[p.sub.j.sub.,m].sup.p.sup.i. Please refer to FIG. 3.

    [0038] An algorithm of the encoded data rebalance method for the deleted node includes the following steps 0-2.

    [0039] In step 0, a node

    [00010] m ( [ K ] k K - r - 1 )

    is set, making {p.sub.1, . . . , p.sub.r}=[K]\(mk) , where {p.sub.1 , . . . , p.sub.r} represents a set composed by nodes p.sub.1 , . . . , p.sub.r, [K]\(mk) represents the set of remaining nodes in the node set [K] after deleting the node m and the node k.

    [0040] In step 1, for a node p.sub.i[K]\(mk), a data packet |W.sub.[p.sub.l.sub.,m].sup.p.sup.i|:p.sub.lp.sub.i is filled by using a virtual null position |W.sub.[p.sub.l.sub.,m].sup.p.sup.i|=max{|W.sub.[p.sub.l.sub.,m].sup.p.sup.i|:p.sub.lp.sub.i}.

    [0041] In step 2,

    [00011] X p i , m = p l p i W [ p l , m ] p i

    is transmitted by the node p.sub.i, where represents an XOR operation, and a process for the transmitting is finished.

    [0042] After completing the process for the transmitting, a requirement W.sub.[p.sub.j.sub.,m].sup.p.sup.i (i.e., data packet) of the node p.sub.j is decoded based on the X.sub.p.sub.i.sub.,m and a storage content of the node p.sub.j, and a process for the decoding is expressed as follows:

    [00012] X p i , m ( p l p j , p i W [ p l , m ] p i ) = ( p l p i W [ p l , m ] p i ) ( p l p j , p i W [ p l , m ] p i ) = W [ p l , m ] p i ; where p l p j , p i W [ p l , m ] p i

    represents the XOR transmission of a data packets W.sub.[p.sub.l.sub.,m].sup.p.sup.i of another node p.sub.l after deleting the node p.sub.j and the node p.sub.i. The data requirement packet W.sub.[p.sub.j.sub.,m].sup.p.sup.i is decoded in this way, and is accurately stored in the node p.sub.j. After completing the algorithm, the distributed file storage system D.sub.k(r, [K]\k) is obtained.

    [0043] Embodiment 2 shows the rebalance process of the added node.

    [0044] A target node with an index of K+1 is added into a system of the node set [K], the added target node is assumed as blank, thus the data skew of the system is caused. After executing a rebalance operation of the added node, the decentralized r-balanced distributed file storage system D*(r,[K+1])={D*.sub.nW:n[K+1]} is obtained.

    [0045] In generally, the rebalance method of the added node is composed of a series encoding functions {n:n[K]} and decoding functions {n:n[K+1]}. A codeword *.sub.n(D.sub.n) with a length of l.sub.n is broadcasted by the initial node n[K]. For a transmitted codeword, the target node uses a decoding function *.sub.K+1(*.sub.n(D.sub.n):n[K])=D*.sub.K+1 to decode. A requirement D*.sub.n of the initial node n[K] is decoded by the initial node n[K] using its decoding function such as *.sub.n(D.sub.n, (*.sub.j(D.sub.j):j[K]\n))=D*.sub.n.

    [0046] In order to restore a balance state destroyed of the file storage system D(r, [K]) after adding the target node K+1, the encoded data rebalance method of the added node is implemented. In the method, some bits from the storage content of each node in the K numbers of initial nodes are deleted , and the bits are transmitted to the target node, thereby establishing a new decentralized r-balanced distributed file storage system D*(r, [K+1]). custom-character.sub.[K] is used to represent an index of a bit set storing at the K numbers of initial nodes, and

    [00013] [ K ] = ( [ K ] K - r ) .

    [0047] A node mcustom-character.sub.[K] is set, a set U.sub.m is used to represent r boxes, and U.sub.m={W .sub.[k,m]:k[K]\m}.

    [0048] For a bit of a data packet W.sub.[k,m], a node indexed by [K]\m is used to represent a node set initially storing the node, [K]\m is used to represent remaining nodes in the node set [K] after deleting the node m. Moreover, an initial node k[K] is set, for the node m:k.Math.m, that is, the node m is different from the node k, and the data packet W.sub.[k,m] is labeled and existed in the

    [0049] storage of the node m . Please referring to FIG. 4. The data packet

    [00014] k ( [ K ] k K )

    is transmitted by the initial node k[K] to the target node K+1, and the data packet is deleted from the initial node k[K]. The data packet transmitted from the initial node is stored in the target node K+1. The obtained file storage system is defined as D*(r, [K+1]).

    [0050] An algorithm of the encoded data rebalance method of the added node includes the following steps 0-1.

    [0051] In step 0, for the node k[K] and the node

    [00015] m ( [ K ] k K - r ) , X k , m * = W [ k , m ]

    is transmitted by the node k to the node K+1.

    [0052] In step 1, the node k deletes the data packet W.sub.[k,m] from itself, and the algorithm of the encoded data rebalance method of the added node is finished.