METHODS, SYSTEMS, AND COMPUTER READABLE MEDIA FOR PROVIDING BYZANTINE FAULT TOLERANCE

Abstract

Methods, systems, and computer readable media for providing Byzantine fault tolerance (BFT) are disclosed. According to one method, a method for providing BFT occurs at a computing platform executing a BFT protocol, wherein the computing platform is acting as a leader participant of a round of the BFT protocol. The method comprising: receiving signed round-change messages from multiple participants in the round; broadcasting a signed lock message indicating that signed round-change messages have been received from a predetermined number of the participants in the round voting for a same candidate block; receiving signed commit messages from multiple participants in the round; and broadcasting a signed decide message indicating the candidate block is a finalized block after the predetermined number of the participants in the round have sent signed commit messages indicating the candidate block.

Claims

1. A method for providing Byzantine fault tolerance (BFT), the method comprising: at a computing platform executing a BFT protocol, wherein the computing platform is acting as a leader participant of a round of the BFT protocol: receiving signed round-change messages from multiple participants in the round; broadcasting a signed lock message indicating that signed round-change messages have been received from a predetermined number of the participants in the round voting for a same candidate block; receiving signed commit messages from multiple participants in the round; and broadcasting a signed decide message indicating the candidate block is a finalized block after the predetermined number of the participants in the round have sent signed commit messages indicating the candidate block.

2. The method of claim 1 wherein the predetermined number of the participants includes at least 2t+1 participants, where t represents an amount of malicious participants in the round.

3. The method of claim 1 wherein a participant in the round receives the decide message from the leader participant or another participant and sends the decide message to other participants in the round.

4. The method of claim 1 wherein the candidate block is a maximal acceptable candidate block for the round.

5. The method of claim 1 wherein the leader participant changes for a subsequent round.

6. The method of claim 1 wherein the round is associated with a blockchain height and wherein the signed decide message indicates an agreed upon blockchain height.

7. The method of claim 1 wherein a participant in the round utilizes a round synchronization technique and a height synchronization technique, wherein the round synchronization technique involves the participant incrementing by one a current blockchain height variable associated with the participant in response to receiving the decide message, and wherein the height synchronization technique involves the participant sending a signed round-change message to the leader in response to the participant receiving a signed look message, a commit message, or a decide message for a subsequent round relative to a current round variable associated with the participant.

8. The method of claim 1 wherein a participant in the round utilizes one or more timers, wherein the one or more timers includes an operation timeout timer, a round changing status timer, or a lock status timer, a commit status timer, or a lock release status timer.

9. The method of claim 1 wherein a participant in the round utilizes an application programming interface (API) for obtaining a participant list for the round or a related blockchain height or wherein the participant in the round checks a local participant list after receiving a BFT related message.

10. A system for providing Byzantine fault tolerance (BFT), the system comprising: at least one processor; and a computing platform implemented using the at least one processor, wherein the computing platform is executing a BFT protocol, wherein the computing platform is acting as a leader participant of a round of the BFT protocol, wherein the computing platform is configured for: receiving signed round-change messages from multiple participants in the round; broadcasting a signed lock message indicating that signed round-change messages have been received from a predetermined number of the participants in the round voting for a same candidate block; receiving signed commit messages from multiple participants in the round; and broadcasting a signed decide message indicating the candidate block is a finalized block after the predetermined number of the participants in the round have sent signed commit messages indicating the candidate block.

11. The system of claim 10 wherein the predetermined number of the participants includes at least 2t+1 participants, where t represents an amount of malicious participants in the round.

12. The system of claim 10 wherein a participant in the round receives the decide message from the leader participant or another participant and sends the decide message to other participants in the round.

13. The system of claim 10 wherein the candidate block is a maximal acceptable candidate block for the round.

14. The system of claim 10 wherein the leader participant changes for a subsequent round.

15. The system of claim 10 wherein the round is associated with a blockchain height and wherein the signed decide message indicates an agreed upon blockchain height.

16. The system of claim 10 wherein a participant in the round utilizes a round synchronization technique and a height synchronization technique, wherein the round synchronization technique involves the participant incrementing by one a current blockchain height variable associated with the participant in response to receiving the decide message, and wherein the height synchronization technique involves the participant sending a signed round-change message to the leader in response to the participant receiving a signed look message, a commit message, or a decide message for a subsequent round relative to a current round variable associated with the participant.

17. The system of claim 10 wherein a participant in the round utilizes one or more timers, wherein the one or more timers includes an operation timeout timer, a round changing status timer, or a lock status timer, a commit status timer, or a lock release status timer.

18. The system of claim 10 wherein a participant in the round utilizes an application programming interface (API) for obtaining a participant list for the round or a related blockchain height or wherein the participant in the round checks a local participant list after receiving a BFT related message.

19. A non-transitory computer readable medium having stored thereon executable instructions that when executed by a processor of a computer cause the computer to perform steps comprising: at a computing platform executing a Byzantine fault tolerance (BFT) protocol, wherein the computing platform is acting as a leader participant of a round: receiving signed round-change messages from multiple participants in the round; broadcasting a signed lock message indicating that signed round-change messages have been received from a predetermined number of the participants in the round voting for a same candidate block; receiving signed commit messages from multiple participants in the round; and broadcasting a signed decide message indicating the candidate block is a finalized block after the predetermined number of the participants in the round have sent signed commit messages indicating the candidate block.

20. The non-transitory computer readable medium of claim 19 wherein the predetermined number of the participants includes at least 2t+1 participants, where t represents an amount of malicious participants in the round.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

[0011] The subject matter described herein will now be explained with reference to the accompanying drawings of which:

[0012] FIG. 1 depicts a table containing information about different Byzantine fault tolerance (BFT) protocols;

[0013] FIGS. 2A-2C depict portions of a block diagram illustrating a BFT consensus algorithm;

[0014] FIGS. 3A-3C depict tables containing information for various test scenarios involving an example BFT implementation;

[0015] FIG. 4 is a block diagram illustrating an example computer system for providing BFT; and

[0016] FIG. 5 is a diagram illustrating an example process for providing BFT.

DETAILED DESCRIPTION

[0017] The subject matter described herein relates to methods, systems, and computer readable media for providing Byzantine fault tolerance (BFT).

[0018] 1 Introduction

[0019] Lamport, Shostak, and Pease [14] and Pease, Shostak, and Lamport [15] initiated the study of reaching consensus in the face of Byzantine failures and designed the first synchronous solution for Byzantine agreement. Dolev and Strong [8] proposed an improved protocol in a synchronous network with O(n.sup.3) communication complexity. By assuming the existence of digital signature schemes and a public-key infrastructure, Katz and Koo [12] proposed an expected constant-round BFT protocol in a synchronous network setting against

[00001] $[\frac{n - 1}{2}]$

Byzantine faults.

[0020] For an asynchronous network, Fischer, Lynch, and Paterson [10] showed that there is no deterministic protocol for the BFT problem in face of a single failure in an asynchronous network. Their proof is based on a diagonalization construction and has two assumptions: (1) when a process writes a bit on the output register, it is finalized and cannot change anymore; and (2) an honest process runs infinitely many steps in a run. Several researchers have tried to design BFT consensus protocols to circumvent the impossibility. For example, to circumvent this impossibility result, Ben-Or [1] initiated the probabilistic approach to BFT consensus protocols in completely asynchronous networks and Dwork, Lynch, and Stockmeyer [9] designed BFT consensus protocols in partial synchronous networks. Castro and Liskov [5] initiated the study of practical BFT (PBFT) consensus protocol design and introduced the PBFT protocol for partial synchronous networks. The core idea of PBFT has been used in the design of several widely adopted BFT systems such as Tendermint BFT [3]. Tendermint BFT has been used in more than 40% Proof of State blockchains (see, e.g., [13]) such as the Internet of Blockchain Cosmos [6]. More recently, Yin et al [20] improved the PBFT/Tendermint protocol by changing the mesh communication network in PBFT to hub-like (or star) communication networks in HotStuff and by using threshold cryptography. Facebook's Libra blockchain has adopted HotStuff in their LibraBFT protocol [17].

[0021] There are generally two kinds of partial synchronous networks for Byzantine Agreement protocols. In Type I partial synchronous networks, all messages are guaranteed to be delivered. In this type of networks, Denial of Service (DoS) attacks are not allowed and reliable point to point communication channels for all pairs of participants are required for the underlying networks. In Type II partial synchronous networks, the network becomes synchronous after an unknown Global Synchronization Time (GST). In this type of networks, Denial of Service (DoS) attacks are allowed before GST though it is not allowed after GST. The Type II network is more realistic and is commonly used in the literature.

[0022] Several partial synchronous network models for BFT design assume the existence of reliable broadcast communication channels for certain message transmission. In particular, these protocols normally leverage the gossip-based broadcast protocol in Bracha [2] which is based on the existence of reliable point-to-point communication channels for all pairs of participants. In particular, the broadcast protocol in Bracha [2] assumes a complete network to achieve a reliable message system in which no messages are lost or generated. Since our Internet infrastructure is not a complete network, one needs to be very careful in building Internet based BFT protocols using Bracha's results. Specifically, one should not assume that there is a reliable broadcast channel before GST of Type II networks.

[0023] The subject matter described herein shows that one can launch attacks against several widely deployed BFT protocols (e.g., Tendermint BFT, Ethereum's Casper FFG, and GRANDPA BFT [11]) so that participants reach a deadlock before GST and the deadlock cannot be removed after GST. Thus, after such attacks, the participants can never reach an agreement even after GST. That is, these BFT protocols cannot achieve the liveness property in type II partial synchronous networks. For Type I networks, one does not know when the message could be delivered. Thus the broadcast protocol may be unreliable until the end of a fixed unknown time period. That is, the same attack in the Type II networks could be used to show that these protocol will reach deadlock before the end of this unknown time period. On the other hand, all these protocols will change views after certain timeout and after a view change, participants would not accept messages from previous views. That is, even all messages are delivered at the end of this unknown time period, participants discard these messages if they have changed views already. Thus these protocols will remain deadlocked. In a summary, our attacks show that these BFT protocols are insecure in all types of partial synchronous networks (including both Type I and Type II networks).

[0024] It should also be noted that though Tendermint [3] BFT protocol claims security in Type II asynchronous networks, it actually uses a Type I network model since it assumes a reliable point to point communication channel for each pair of participants in the network and no message is ever lost (including messages before GST). However, our discussion in the preceding paragraph shows that Tendermint is not secure in the Type I networks either. It should also be noted that in the first version of the LibraBFT specification (accessed on Jul. 19, 2019), its network model is a Type II partial synchronous network. In the current version [17] of the LibraBFT specification (dated as Nov. 8, 2019 and accessed on Feb. 9, 2020), its network model is essentially a Type I partial synchronous network since all messages are delivered in the end (see pages 3 of Section 2 in [17]).

[0025] Based on the security requirement analysis for BFT protocols in asynchronous networks, we propose a BFT finality gadget protocol for blockchains, referred to herein as Blockchain DLS (BDLS). It should be noted that the first BFT protocol (i.e., the DLS protocol) for Type II networks was proposed by Dwork, Lynch, and Stockmeyer [9]. DLS protocol leverages a star network where participants only exchange messages via round leaders. The PBFT protocol allows all participants to broadcast their messages to all other participants. By leveraging this kind of mesh network, PBFT protocol was able to achieve consensus with reduced round complexity. By leveraging the lock-mechanisms in PBFT/Tendermint BFT protocols and changing the mesh network back to star network, HotStuff BFT/LibraBFT is able to achieve consensus with reduced communication complexity but increased round complexity. The BDLS protocol described herein is based on the original DLS protocol [9] and is able to achieve consensus with both reduced round complexity and reduced communication complexity. Specifically, BDLS has the same round complexity as PBFT and has reduced communication complexity than HotStuff BFT/LibraBFT. BDLS is proved to be secure in Type II partial synchronous networks and achieves the best performance among existing BFT protocols for blockchains. Though both BDLS and HotStuff BFT leverages star networks, BDLS employs the lock-mechanisms used in DLS protocol while HotStuff employs the lock-mechanisms used in PBFT/Tendermint BFT protocols. Thus BDLS could achieve consensus in 4 steps while HotStuff requires 7 steps to achieve consensus in synchrony.

[0026] 2 Synchronous, Asynchronous, and Partial Synchronous Networks

[0027] Assume that the time is divided into discrete units called slots

[0028] T.sub.0, T.sub.1, T.sub.3 . . . where the length of the time slots are equal. Furthermore, we assume that: (1) the current time slot is determined by a publicly-known and monotonically increasing function of current time; and (2) each participant has access to the current time. In a synchronous network, if an honest participant P.sub.1 sends a message m to a participant P.sub.2 at the start of time slot T.sub.i.sub.1, the message m is guaranteed to arrive at P.sub.2 at the end of time slot T.sub.i. In the complete asynchronous network, the adversary can selectively delay, drop, or re-order any messages sent by honest parties. In other words, if an honest participant P.sub.1 sends a message m to a participant P.sub.2 at the start of time slot T.sub.i.sub.1, P.sub.2 may never receive the message m or will receive the message m eventually at time T.sub.i.sub.2 where i.sub.2=i.sub.1+. Dwork, Lynch, and Stockmeyer [9] considered the following two kinds of partial synchronous networks: [0029] Type I asynchronous network: < is unknown. That is, there exists a A but the participants do not know the exact value of . [0030] Type II asynchronous network: < holds eventually. That is, the participant knows the value of . But this only holds after an unknown time slot T=T.sub.i. Such a time T is called the Global Stabilization Time (GST).

[0031] For Type I asynchronous networks, the protocol designer supplies the consensus protocol first, then the adversary chooses her . For Type II asynchronous networks, the adversary picks the and the protocol designer (knowing ) supplies the consensus protocol, then the adversary chooses the GST. The definition of partial synchronous networks in [5, 20, 17] is the second type of partial synchronous networks. That is, the value of is known but the value of GST is unknown. In such kind of networks, the adversary can selectively delay, drop, or re-order any messages sent by honest participants before an unknown time GST. But the network will become synchronous after GST. Several BFT protocols in the literature (e.g., Tendermint, GRANDPA, and the current version of LibraBFT dated on Nov. 8, 2019) uses Type II networks, but they also assume that no message gets lost. With this additional assumption, the network is actually a Type I network since all messages are delivered within a time period GST+ where GST is unknown and is known.

[0032] For the Type I network model, Denial of Service (DoS) attack is not allowed since message could be lost with DoS attacks. We think that it is more natural to use Type II asynchronous networks for distributed BFT protocol design and analysis. Thus the subject matter described herein generally refers to Type II network scenarios.

[0033] 3 Reliable Broadcast Communication Channels

[0034] The difference between point-to-point communication channels and broadcast communication channels has been extensively studied in the literature. A reliable broadcast channel requires that the following two properties be satisfied. [0035] 1. Correctness: If an honest participant broadcasts a message m, then every honest participant accepts m. [0036] 2. Unforgeability: If an honest participant does not broadcast a message m, then no honest participant accepts m.

[0037] For complete networks, reliable broadcast protocols have been proposed in Bracha [2]. For a given integer k, a network is called k-connected if there exist k-node disjoint paths between any two nodes within the network. In non-complete networks, it is well known that (2t+1)-connectivity is necessary for reliable communication against t Byzantine faults (see, e.g., Wang and Desmedt [19] and Desmedt-Wang-Burmester [7]). On the other hand, for broadcast communication channels, Wang and Desmedt [18] showed that there exists an efficient protocol to achieve probabilistically reliable and perfectly private communication against t Byzantine faults when the underlying communication network is (t+1)-connected. The crucial point to achieve these results is that: in a point-to-point channel, a malicious participant P.sub.1 can send a message m.sub.1 to participant P.sub.2 and send a different message m.sub.2 to participant P.sub.3 though, in a broadcast channel, the malicious participant P.sub.1 has to send the same message m to multiple participants including P.sub.2 and P.sub.3. If a malicious P.sub.1 sends different messages to different participants in a reliable broadcast channel, it will be observed by its neighbors.

[0038] Though broadcast channels at physical layers are commonly used in local area networks, it is not trivial to design reliable broadcast channels over the Internet infrastructure since the Internet connectivity is not a complete graph and some direct communication paths between participants are missing (see, e.g., [14, 19]). Quite a few broadcast primitives have been proposed in the literature using message relays (see, e.g., Srikanth and Toueg [16], Bracha [2], and Dwork-Lynch-tockmeyer [9]). In the message relay based broadcast protocol, if an honest participant accepts a message signed by another participant, it relays the signed message to other participants. However, in order for these message relay based broadcast protocol to be reliable, it requires that the network graph is complete which is not true for the Internet environments.

[0039] A broadcast channel is unreliable if a malicious participant could broadcast a message m.sub.1 to a proper subset of the participants but not to other participants. That is, some participants will receive the message m.sub.1 while other participants will receive a different message m.sub.2 or receive nothing at all. In next sections, we show that several BFT protocols are insecure due to the lack of reliable broadcast channels before GST (messages before GST could get lost or re-ordered by the definition). Thus it is important to design BFT protocols that could tolerate unreliable broadcast channels before GST.

[0040] In the following sections, if not specified explicitly, we will assume that there are n=3t+1 participants P.sub.0, . . . , P.sub.n1 for the BFT protocol and at least t of them are malicious. Furthermore, we assume that each participant has a public and private key pair where the public key is known to all participants. We use the notation <>.sub.i to denote that the message is digitally signed by the participant P.sub.i.

[0041] 4 Security Analysis of Tendermint BFT Protocol

[0042] Buchman, Kwon, and Milosevic [3] initiated the study of BFT protocols as a finality gadget for blockchains. Specifically, the authors in [3] proposed Tendermint BFT as an overlay atop a block proposal mechanism.

[0043] 4.1 Tendermint BFT Protocol

[0044] Tendermint BFT protocol [3] is based on the PBFT protocol. In Tendermint BFT, there are n=3t+1 participants P.sub.0, . . . , P.sub.n1 and at most t of them are malicious. Each participant maintains five variables step, lockedV, lockedR, validV, and ValidR throughout the protocol run. For each blockchain height h, the protocol runs from round to round until it reaches an agreement for the height h. Then the protocol moves to the next blockchain height. For each round, it contains three steps: propose, pre vote, and precommit. For each height h, the participants start the process by initializing their five variables to: step=propose, lockedV=nil, lockedR=1, validV=nil, and ValidR=1. Then it starts from round 0 until an agreement is reached for the height h. There is a public function proposer(h,r) that returns the round leader for a given round r of the height h. The round r of the height h proceeds as follows: [0045] 1. propose: The leader P.sub.i=proposer(h,r) distinguishes the two cases: [0046] r=0 or validV=nil: P.sub.i chooses her proposal v and vr=1. [0047] r>0 and validVnil: P.sub.i lets v=validV and vr=ValidR P.sub.i broadcasts the signed message

custom-character PROPOSAL,h,r,v,vr.sub.i (1)

to all participants. All other participants P.sub.j initialize the timeout counter to execute OnTimeoutPropose(h,r). [0048] 2. prevote: For all participants P.sub.j who are in step=propose, P.sub.j distinguishes the following three cases: [0049] P.sub.j receives (1) with vr=1. If lockedR=1 or validV=v, then P.sub.j broadcasts the message custom-character PREVOTE,h,r,H(v)j Otherwise, P.sub.j broadcasts the message PREVOTE,h,r,nilj. P.sub.j sets step=prevote. [0050] P.sub.j receives (1) with vr0 and P.sub.j has received 2t+1 PREVOTE,h,vr,H(v). P.sub.j distinguishes the following two cases [0051] lockedRvr or lockedV=v: P.sub.j broadcasts custom-character PREVOTE,h,r,H(v)j [0052] Otherwise: P.sub.j broadcasts the message PREVOTE,h,r,nilj. [0053] P.sub.j sets step=prevote. [0054] P.sub.j receives (1) with vr0 though P.sub.j has not received 2t+1 PREVOTE,h,vr,H(v). P.sub.j does nothing. [0055] 3. precommit: [0056] (a) As soon as a participant P.sub.j in step prevote receives 2t+1 messages custom-character PREVOTE,h,r,* for the first time, P.sub.j initializes timeout counter to execute OnTimeoutPrevote(h,r). [0057] (b) As soon as a participant P.sub.j in step prevote receives 2t+1 messages PREVOTE,h,r,nil for the first time, P.sub.j broadcasts PRECOMMIT,h,r,nil and sets step=precommit. [0058] (c) If P.sub.j is in step prevote V precommit, has received the proposal (1), and has received 2t+1 messages custom-character PREVOTE,h,r,H(v), then P.sub.j carries out the following steps [0059] If step=prevote, then P.sub.j sets lockedV=v, lockedR=r, broadcasts PRECOMMIT,h,r,H(v), and sets step=precommit. [0060] P.sub.j sets validV=v and validR=r. [0061] 4. decision: As soon as a participant P.sub.j receives 2t+1 messages custom-character PRECOMMIT,h,r,* for the first time, P.sub.j initializes timeout counter to execute OnTimeoutPrecommit(h,r). If P.sub.j has not decided a value for the height h, has received the proposal (1), and has received 2t+1 messages PRECOMMIT,h,r,H(v), then P.sub.j sets v as the decision value for height h, resets values for the five variables, and goes to round 0 of height h+1. [0062] 5. automatic update round: During any time of the protocol, if a participant P.sub.j receives t+1 messages for a round r>r, P.sub.j moves to round r. [0063] 6. Timeout functions: [0064] (a) OnTimeoutPropose(h,r): broadcast custom-character PREVOTE,h,r,nil and set step=prevote. [0065] (b) OnTimeoutPrevote(h,r): broadcast PRECOMMIT,h,r,nil and set step=precommit. [0066] (c) OnTimeoutPrecommit(h,r): move to round r+1 of height h.

[0067] 4.2 Attacks on Tendermint BFT Protocol

[0068] In this section, we show that Tendermint BFT does not achieve the liveness property in partial synchronous networks. We describe our attack in the Type II networks where the broadcast channel is unreliable before GST.

[0069] Specifically, we show that if a malicious participant could choose to broadcast a message to a subset of the users before GST, then the system will reach a deadlock and no new block will be created anymore (even after GST). In other words, the Tendermint BFT will reach deadlock before GST and the deadlock could not be removed after GST. We then extend these attacks on

[0070] Tendermint BFT to Type I networks. For simplicity, we assume that for a given height h, the leader participant is P.sub.0 and the participants in P.sub.1={P.sub.0, . . . , P.sub.t1} are malicious. Furthermore, let P.sub.2={P.sub.t, . . . , P.sub.2t}, and P.sub.3={P.sub.2t+1, . . . , P.sub.3t}.

[0071] Attack 1. In round 0 of height h, P.sub.0 chooses a minimal valid value v and broadcasts custom-character PROPOSAL,h,0,,v,1 to participants in P.sub.1P.sub.2. After receiving PROPOSAL,h,0,,v,1 from P.sub.0, each participant P.sub.1P.sub.1 broadcasts PREVOTE,h,0,H(v) to participants in P.sub.2 and each participant P.sub.jP.sub.2 broadcasts PREVOTE,h,0,H(v) to all participants and sets step=prevote. Each participant P.sub.jP.sub.2 receives 2t+1 messages custom-character PREVOTE,h,0,H(v). Thus the participant P.sub.jP.sub.2 sets lockedV=v, lockedR=0, step=precommit, validV=v, validR=0, and then broadcasts PRECOMMIT,h,0,H(v). Since each participant receives at most t+1 pre-commit messages for the value v, no decision will be made during the round 0. After timeout for round 0, all participants moves to round 1 of height h. The participants in P.sub.1 will become dormant from now on. If a participant in P.sub.2 becomes the leader of round 1, it will broadcast the proposal custom-character PROPOSAL,h,1,v,0. Since participant P.sub.j in P.sub.3 has received at most t+1 prevote messages for the value v in round 0, P.sub.j will do nothing until timeout. Thus no honest participant can collect sufficient prevote messages for v to move ahead. After timeout for round 1, the system will move to round 2 of height h. On the other hand, if a participant P.sub.j in P.sub.3 becomes the leader of round 1, it will broadcast the proposal custom-character PROPOSAL,h,1,v,1. Since P.sub.0 has selected the value v as the minimal valid value and new transactions have been inserted into the system since then, the honest leader for round 1 will select a valid value vv with high probability. Thus participants in P.sub.2 will not accept the proposal for v and will broadcast custom-character PROVOTE,h,1,nil. That is, no agreement could be made during round 1 and the system will move to round 2 of height h after timeout. This process will continue forever without making an agreement for the height h even after GST.

[0072] Attack 2. One can launch an attack on Tendermint BFT so that some participants in P.sub.2 will decide on a value v for the height h (though no participant in P.sub.3 decides on any value for the height h) and then the system moves to the deadlock. It is noted that due to the lock function in Tendermint BFT and due to the blockchain property, the adversary will not be able to let the participants in P.sub.3 to decide on a different value for the height h or h+1.

[0073] In the preceding Attack 1, the malicious user needs to control t participants in the set P.sub.1. Indeed, we can revise the attack in such a way that the malicious user only needs to control one user P.sub.0 to launch a similar attack. We use the same set P.sub.1, P.sub.2, P.sub.3. But this time, we assume that only the leader P.sub.0 is malicious and all other participants are honest.

[0074] Attack 3. In round 0 of height h, P.sub.0 chooses a minimal valid value v and broadcasts custom-character PROPOSAL,h,0,,v,1 to participants in P.sub.1P.sub.2. P.sub.0 then broadcasts PREVOTE,h,0,H(v) to participants in P.sub.1P.sub.2 and becomes dormant. After receiving PROPOSAL,h,0,v,1 from P.sub.0, each participant P.sub.j(P.sub.1{P.sub.0})P.sub.2 broadcasts PREVOTE,h,0,H(v) to all participants and sets step=prevote. Each participant P.sub.jP.sub.1P.sub.2 receives 2t+1 messages custom-character PREVOTE,h,0,H(v). The participant P.sub.j(P.sub.1{P.sub.0})P.sub.2 sets lockedV=v, lockedR=0, step=precommit, validV=v, validR=0, and broadcasts PRECOMMIT,h,0,H(v). Since each participant receives at most 2t pre-commit messages for the value v, no decision will be made during the round 0. A similar argument as in the Attack 1 can be used to show that the protocol will enter a deadlock. Please note in this Attack 3, participant P.sub.j in P.sub.3 has received at most 2t prevote messages for the value v in round 0, which is still insufficient for P.sub.1 to accept a proposal for a locked value v from other participants.

[0075] 5 Casper FFG

[0076] Buterin and Griffith [4] proposed the BFT protocol Casper the Friendly Finality Gadget (Casper FFG) as an overlay atop a block proposal mechanism. In Casper FFG, weighted participants validate and finalize blocks that are proposed by an existing proof of work chain or other mechanisms. To simplify our discussion, we assume that there are n=3t+1 validators of equal weight. The Casper FFG works on the checkpoint tree that only contains blocks of height 100*k in the underlying block tree. Each validator P.sub.j can broadcast a signed vote (P.sub.i:s,t) where s and t are two checkpoints and s is an ancestor of t on the checkpoint tree. For two checkpoints a and b, we say that a.fwdarw.b is a supermajority link if there are at least 2t+1 votes for the pair. A checkpoint a is justified if there are supermajority links a.sub.0.fwdarw.a.sub.1.fwdarw. . . . .fwdarw.a where a.sub.0 is the root. A checkpoint a is finalized if there are supermajority links a.sub.0.fwdarw.a.sub.1.fwdarw. . . . .fwdarw.a.sub.i.fwdarw.a where a.sub.0 is the root and a is the direct son of a.sub.i. In Casper FFG, an honest validator P.sub.i should not publish two distinct votes

custom-character P.sub.i:s.sub.1,t.sub.1ANDP.sub.i:s.sub.2,t.sub.2

[0077] such that either

h(t.sub.1)=h(t.sub.2) OR h(s.sub.1)<h(s.sub.2)<h(t.sub.2)<h(t.sub.1)

[0078] here h() denotes the height of the node on the checkpoint tree. Otherwise, the validator's deposit will be slashed. Casper FFG is proved to achieve accountable safety and plausible liveness in [4] where [0079] 1. achieve accountable safety means that two conflicting checkpoints cannot both be finalized (assuming that there are at most t malicious validators), and [0080] 2. plausible liveness means that supermajority links can always be added to produce new finalized checkpoints, provided there exist children extending the finalized chain.

[0081] In order to achieve the liveness property, [4] proposed to use the correct by construction fork choice rule: the underlying block proposal mechanism should follow the chain containing the justified checkpoint of the greatest height.

[0082] The authors in [4] proposed to defeat the long-range revision attacks by a fork choice rule to never revert a finalized block, as well as an expectation that each client will log on and gain a complete up-to-date view of the chain at some regular frequency (e.g., once per month). In order to defeat the catastrophic crashes where more than t validators crash-fail at the same time (i.e., they are no longer connected to the network due to a network partition, computer failure, or the validators themselves are malicious), the authors in [4] proposed to slowly drains the deposit of any validator that does not vote for checkpoints, until eventually its deposit sizes decrease low enough that the validators who are voting are a supermajority. Related mechanism to recover from related scenarios such as network partition is considered an open problem in [4].

[0083] No specific network model is provided in [4]. Thus it is important to investigate the security of Casper FFG in various network models. The specification in [4] does not have sufficient details to guarantee its claimed plausible liveness. The authors mentioned that the Casper FFG could be used on top of most proof of work chains. However, without further restrictions on the block generation mechanisms, Casper FFG can reach deadlock (so plausible liveness property will not be satisfied). Assume that, at time T, the checkpoint a is finalized (where there is a supermajority link from a to its direct child b) and no vote for b's descendant checkpoint has been broadcast by any validator yet. Now assume that the underlying block production mechanism produced a fork starting from b. That is, b has two descendant checkpoints c and d. If t honest validators vote for c, t+1 honest validators vote for d, and t malicious validators vote randomly, then we reach a deadlock (since no link from b to its descendant can have a supermajority). If the checkpoints are 100 blocks away from each other and if it is expensive/slow to generate blocks (e.g., using proof of work (PoW)) then this kind of fork may be hard to happen though there is still a possibility.

[0084] 6 Another Finality Gadget: Polkadot's GRANDPA

[0085] Based on the Casper FFG protocol, the project Polkadot (https://wiki.polkadot.network/) proposed a new BFT finality gadget protocol GRANDPA [11]. Specifically, Polkadot implements a nominated proof-of-stake (NPoS) system. At certain time period, the system elects a group of validators to serve for block production and the finality gadget. Nominators also stake their tokens as a guarantee of good behavior, and this stake gets slashed whenever their nominated validators deviate from their protocol. On the other hand, nominators also get paid when their nominated validators play by the rules. Elected validators get equal voting power in the consensus protocol. Polkadot uses BABE as its block production mechanism and GRANDPA as its BFT finality gadget. Here we are interested in the finality gadget GRANDPA (GHOST-based Recursive ANcestor Deriving Prefix Agreement) that is implemented for the Polkadot relay chain. GRANDPA contain two protocols, the first protocol works in partially synchronous networks and tolerates Byzantine participants. The second protocol works in full asynchronous networks (requiring a common random coin) and tolerates Byzantine participants. In contrast to Casper FFG, GRANDPA voters can cast votes simultaneously for blocks at different heights and GRANDPA only depends on finalized blocks to affect the fork-choice rule of the underlying block production mechanism.

[0086] The first GRANDPA protocol assumes that after an unknown time GST, the network becomes synchronous. However, it also assumes that all messages are delivered before time GST+ for some given value . That is, no message gets lost. This network model is equivalent to our Type I asynchronous network and will not tolerate DoS attacks and network partition attacks. In the following paragraphs, we will show that GRANDPA is not even secure in the synchronous network.

[0087] Assume that there are n=3t+1 participants P.sub.0, . . . , P.sub.n1 and at most t of them are malicious. Each participant stores a tree of blocks produced by the block production mechanism with the genesis block as the root. A participant can vote for a block on the tree by digitally signing it. For a set S of votes, a participant P.sub.i equivocates in S if P.sub.i has more than one vote in S. S is called tolerant if at most t participants equivocate in S. A vote set S has supermajority for a block B if

|{P.sub.j:P.sub.i votes for B*}{P.sub.i:P.sub.i eguivocates}|2t+1

[0088] where P.sub.i votes for B* mean that P.sub.i votes for B or votes for a descendant of B. The -GHOST function g(S) returns the block B of the maximal height such that S has a supermajority for B. If a tolerant vote set S has a supermajority for a block B, then there are at least t+1 voters who do vote for B or its descendant but do not equivocate. Based on this observation, it is easy to check that if s.Math.T and T is tolerant, then g(S) is an ancestor of g(T).

[0089] The authors in [11] defined the following concept of possibility fora vote set to have a supermajority for a block: We say that it is impossible for a set S to have a supermajority for a block B if at least 2t+1 voters either equivocate or vote for blocks who are not descendant of B. Otherwise it is possible for S to have a supermajority for B. Then the authors [11] claimed that a vote set S is possible to have a supermajority for a block B if and only if there exists a tolerant vote set TS such that T has a supermajority for B. However, this claim has semantic issues in practice. For example, assume that blocks B and C are inconsistent and the vote set S contains the following votes:

[0090] 1. t malicious voters vote for B, one honest voter votes for B.

[0091] 2. 2t honest voters vote for C.

[0092] By the definition of [11], S is not impossible to have a supermajority for B. Thus S is possible to have a supermajority for a block B. Since honest voters will not equivocate, there does not exist a semantically valid tolerant vote set TS such that T has a supermajority for B. This observation could easily be used to show that the GRANDPA protocol cannot achieve the liveness property (see our discussion in next paragraphs).

[0093] 6.1 GRANDPA Protocol

[0094] The GRANDPA protocol starts from round 1. For each round, one participant is designated as the primary and all participants know who is the primary. Each round consists of two phases: prevote and precommit. Let V.sub.r,i and C.sub.r,i be the sets of prevotes and precommits received by P.sub.i during round r respectively. Let E.sub.0,i be the genesis block and E.sub.r,i be the last ancestor block of g(V.sub.r,i) that is possible for C.sub.r,i to have a supermajority. If either E.sub.r,i<g(V.sub.r,i) or it is impossible for C.sub.r,i to have a supermajority for any children of g(V.sub.r,i), then we say that P.sub.i sees that round r is completable. Let be a time bound such that it suffices to send messages and gossip them to everyone. The protocol proceeds as follows. [0095] 1. P.sub.i starts round r>1 if round r1 is completable and P.sub.i has cast votes in all previous rounds. Let t.sub.r,i be the time P.sub.i starts round r. [0096] 2. If P.sub.i is the primary of round r and has not finalized E.sub.r1,i, then it broadcasts E.sub.r1,i. [0097] 3. P.sub.i waits until either it is at least time t.sub.r,i+2 or round r is completable. P.sub.i prevotes for the head of the best chain containing E.sub.r1,i unless P.sub.i receives a block B from the primary with g(V.sub.r1,i)B>E.sub.r1,i. In this case, P.sub.i uses the best chain containing B. [0098] 4. P.sub.i waits until g(V.sub.r,i)E.sub.r1,i and one of the following conditions holds [0099] (a) it is at least time t.sub.r,i+4 [0100] (b) round r is completable [0101] (c) it is impossible for V.sub.r,i to have a supermajority for any child of g(V.sub.r,i) (this is an optional condition) Then P.sub.i broadcasts a precommit for g(V.sub.r,i)

[0102] At any time after the precommit step of round r, if P.sub.i sees that B=g(C.sub.r,i) is descendant of the last finalized block and V.sub.r,i has a supermajority, then P.sub.i finalizes B.

[0103] 6.2 Attacks on GRANDPA Protocol

[0104] In this section, we show that GRANDPA protocol cannot achieve the liveness property even in the synchronous networks. Assume that E.sub.r1,0= . . . =E.sub.r1,n1. During round r, the block production mechanisms produced a fork for E.sub.r1,0. That is, two child blocks B and C of E.sub.r1,0 are produced. At round r, t+1 voters (including all malicious voters) prevote for B and the remaining honest 2t voters prevote for C. For each voter P.sub.i, we have g(V.sub.r,i)=E.sub.r1,i. Thus each P.sub.i precommits g(V.sub.r,i)=E.sub.r1,i. Now each voter P.sub.i estimates E.sub.r,i=g(V.sub.r, i)=E.sub.r1,i. Since it is possible for C.sub.r,i to have a supermajority for any child of E.sub.r,i, the round r is not completable. That is, the process stuck at round r forever.

[0105] Even if one can revise the possible definition in the GRANDPA to resolve the issues that we have discussed in the preceding paragraph, our attacks on Tendermint could be easily mounted against GRANDPA protocol also. Thus GRANDPA protocol could not be secure in Type II networks.

[0106] 7 A Secure BFT protocol in Type II Partial Synchronous Networks

[0107] In this section, we propose a Byzantine Agreement Protocol that achieves safety and liveness properties in Type II partial synchronous networks. Though our protocol could be used in other scenarios such as State Machine Replication (SMR), we present the protocol as a finality gadget for blockchains. Assume that there is a separate block proposal mechanism that produces children blocks for finalized blocks by our BFT finality gadget. Let B.sup.0, . . . , B.sup.h1 be the blockchain where B.sup.0 is the genesis block and B.sup.h1 is the most recently finalized head block. The block proposal mechanism may produce several child blocks B.sub.0.sup.h, B.sub.1.sup.h, . . . , B.sub.n.sub.0.sub.1.sup.h of the current head block B.sup.h1. These child blocks are strictly ordered. For example, in proof of stake blockchain applications, each participant has a stake value for the chain height h and these child blocks may be ordered using proposer's stake values. However, it is beyond the scope of the subject matter described herein to specify how these child blocks are ordered for general blockchains. It is the task for the BFT finality gadget to select the maximal block among these candidate child blocks as the next block B.sup.h. Though the goal of the BFT protocol is to select the maximal child block as the final version of block B.sup.h, this may not be true in certain scenarios. For example, if t+1 honest participants have seen the child block B.sub.n.sub.0.sub.2.sup.h and have not seen the maximal block B.sub.n.sub.0.sub.1.sup.h at the start of the protocol (at the same time, we may assume that the other t honest participants have seen the maximal block B.sub.n.sub.0.sub.1.sup.h), then our BFT protocol BDLS will finalize B.sub.n.sub.0.sub.2.sup.h instead of B.sub.n.sub.0.sub.1.sup.h (assuming that the t malicious participants submit the block B.sub.n.sub.0.sub.2.sup.h to the leader). Secondly, our BFT protocol leverages the fact that a candidate block is self-certified. That is, the validity of a candidate child block can be verified by using the information contained in the candidate block itself against the currently finalized blockchain.

[0108] 7.1 The BFT Protocol (BDLS)

[0109] Our BFT protocol is based on the original DLS protocol in Dwork, Lynch, and Stockmeyer [9] and we call it a Blockchain version of DLS (BDLS). For each blockchain height h, BDLS protocol runs from round to round until it reaches an agreement for the height h. Then the protocol moves to the next blockchain height h+1. Let P.sub.0, . . . , P.sub.n1 be the n=3t+1 participants of the protocol. Assume that there are .sub.n.sub.0 valid candidate proposals B.sub.0.sup.h<B.sub.1.sup.h< . . . <B.sub.n.sub.0.sub.1.sup.h for the block B.sup.h. During the protocol run, each participant P.sub.i maintains a local variable BLOCK.sub.i.Math.{B.sub.0.sup.h,B.sub.1.sup.h, . . . ,B.sub.n.sub.0.sub.1.sup.h} that contains the candidate blocks that it has learned so far. Participant P.sub.i prefers the maximal block in BLOCK to be selected as the final block for B.sup.h. The goal of the BDLS protocol is for participants P.sub.0, . . . , P.sub.n1 to reach a consensus on the finalized block B.sup.h.

[0110] Generally, we can use a robust threshold signature scheme to reduce the authenticator complexity, e.g., achieve linear authenticator complexity. For simplicity, the following protocol description is based on a standard digital signature scheme. It could be easily revised to use a threshold signature scheme. Following Dwork, Lynch, and Stockmeyer [9], we assume that all messages after the unknown global stabilization time (GST) will be delivered in the same round and messages before round GST could get lost or re-ordered. Furthermore, though all participants have a common numbering for the round, they do not know when the round GST occurs. A candidate block B is acceptable to P.sub.i if P.sub.i does not have a lock on any value except possibly B. There is a public function leader(h,r) that returns the round leader for a given round r of the height h. For each height h, the BDLS protocol proceeds from round to round (starting from round 0) until the participant decides on a value. The round r of the height h starts when at least 2t+1 participants submit a round-change message to the leader participant. The round r proceeds as follows where P.sub.i=leader(h,r) is the leader for round r: [0111] 1. Each participant P.sub.j (including P.sub.i) sends the signed message (<h,r>.sub.j,<h,r,B.sub.j>.sub.j) to the leader P.sub.i where B.sub.j BLOCK.sub.j is the maximal acceptable candidate block for P.sub.j. The message <h,r>.sub.j is considered as a round-change message. After sending the round-change message, P.sub.j will not accept messages except a decide message for round r<r anymore. [0112] 2. If P.sub.i receives at least 2t+1 round-change messages (including himself), it enters round r (see Section 7.4 for details on when P.sub.i can stop waiting for more round-change request messages). In these round-change messages, if there are at least 2t+1 signed messages from 2t+1 participants with the same candidate block BNULL, then P.sub.i broadcasts the following signed message (2) to all participants

custom-character lock,h,r,B,proof.sub.i (2) [0113] where proof is a list of at least 2t+1 signed messages showing that B is the candidate blocks for at least 2t+1 participants (the proof also shows that round-change request has been authorized by at least 2t+1 participants). If P.sub.i does not receive such a block B, then P.sub.i adds all received candidate blocks to its local variable BLOCK.sub.i and broadcasts custom-character select,h,r,B,proof where B is the candidate block B=max{B:BBLOCK.sub.i} and proof is a list of at least 2t+1 round-change messages. In some embodiments, e.g., to achieve linear communication complexity when a threshold signature scheme employed, the proof in the lock-message and select-message may be different: In the lock-message, the proof contains an assembled digital signature on the message custom-character h,r,B while, in the select-message, the proof contains an assembled digital signature on the message h,r. See Remark 3 for details. [0114] 3. If a participant P.sub.j (including P.sub.i) receives a valid select,h,r,B,proof from P.sub.i during Step 2, then it adds B to its BLOCK.sub.j. If a participant P.sub.j (including P.sub.i) receives a valid message custom-character lock,h,r,Br,proof.sub.j from P.sub.i in Step 2, then it does the following: [0115] (a) releases any potential lock on B from previous round, but does not release locks on any other potential candidate blocks [0116] (b) locks the candidate block B by recording the valid lock (2) [0117] (c) sends the following signed commit message to the leader P.sub.i.

custom-character commit,h,r,B.sub.j. (3) [0118] 4. If P.sub.i receives at least 2t+1 commit messages (3), then P.sub.i decides on the value B and broadcasts the following decide message to all participants

custom-character decide,h,r,B,proof).sub.i. (4) [0119] where proof is a list of at least 2t+1 commit messages (3). [0120] 5. If a participant P.sub.j (including P.sub.i) receives a decide message (4) from Step 4 or from its neighbor, P.sub.j decides on the block B for B.sup.h and moves to the next height h+1 (that is, run the Step 1 of height h+1 by sending the round-change message). At the same time, the participant P.sub.j propagates (broadcasts) the decide message (4) to all of its neighbors if it has not done so yet (see the following Remark 2 for more details on this). Otherwise, it goes to the following lock release step: [0121] (lock release) If a participant P.sub.j (including P.sub.i) has some locked values, it broadcasts all of its locked values with proofs. A participant releases its lock on a value custom-character lock,h,r,B,proof.sub.i if it receives a lock lock,h,r,B,proof.sub.i with rr and BB. [0122] Move to the next round r+1 (e.g., run the Step 1 of height h with r+1). [0123] 6. height synchronization: At any time during the protocol, if P.sub.j receives a finalized bock of height h (e.g., a decide message (4)), P.sub.j decides for height h and moves to height h+1. [0124] 7. round synchronization: At any time during the protocol, if P.sub.1 receives a valid lock or select or decide message for a round r>r, P.sub.j moves to round r and processes the lock or select or decide message. [0125] 8. timeout: For each step, P.sub.j should set an appropriate timeout counter. If P.sub.j does not receive enough messages to move forward before timeout counter expires, it moves to the next step. Section 7.4 and Section 8 includes additional details regarding round/height synchronization.

[0126] Remark 1: In the BDLS protocol, the lock release step is a mesh network broadcast. In some applications, one may prefer a star network to reduce the total number of messages from n.sup.2 to n, e.g., to achieve linear communication complexity. One may achieve this kind of needs by replacing the lock release step with the following additions to the protocol. At the Step 1 of round r, each participant P.sub.1 sends the message

all-locked-values, custom-character h,r,B.sub.j.sub.j

instead of only sending the message custom-character h,r,B.sub.j.sub.j to P.sub.i, where all-locked-values is the set of candidate blocks that P.sub.j has locks on. During Step 2, if P.sub.i cannot lock a candidate block during round r, then it broadcasts the candidate block B=max{B:BBLOCK.sub.i} together with all locked candidate blocks by all participants. It is straightforward to check that our security analysis in the next section remains unchanged for this protocol revision.

[0127] Remark 2: During Step 5 of the BDLS protocol, when a participant receives a decide message, it propagates/broadcasts the decide message to its neighbors. It is recommended that each participant keep broadcasting the signed decide message for height h regularly until it receives at least 2t broadcasts of the decide message for height h from other 2t participants. The importance of this propagation/broadcast is illustrated in Section 9.

[0128] Remark 3: To achieve linear communication/authenticator complexity with threshold digital signature schemes, participant 13 may send the signed message ( custom-character h,r,.sub.jh,r,B.sub.j.sub.j) to the leader P.sub.i during step 1. It should be noted that if there are 2t+1 participants that send the same B.sub.j to the leader, then the leader P.sub.i can assembly a signature for h,r,B.sub.j. If there is no such value B.sub.j, then the leader can only assembly a digital signature for custom-character h,r, which can be used for the select message. In the security proof for BDLS in the next section, the leader does not need to assemble a digital signature for B.sub.j if it only broadcasts a select message.

[0129] 7.2 Liveness and Safety

[0130] The security of BDLS protocol is proved by establishing a series of Lemmas. The proofs for Lemmas 7.1, 7.2, 7.3 and Theorem 7.4 follow from straightforward modifications of the corresponding Lemmas/Theorem in [9]. For completeness, we include these proofs here also.

[0131] Lemma 7.1 It is impossible for two candidate blocks B and B to get locked in the same round r of height h.

[0132] Proof. In order for two blocks B and B to get locked in one round r of height h, the leader P.sub.i=leader(h,r) must send two conflict lock messages (2) with different proofs. This can only happen if there exist at least t+1 participants P.sub.j each of whom equivocates two messages custom-character h,r,B.sub.j and h,r,B.sub.j to P.sub.i. This is impossible since there are at most t malicious participants.

[0133] Lemma 7.2 If the leader P.sub.i decides a block value B at round r of height h and r is the smallest round at which a decision is made. Then at least t+1 honest participants lock the candidate block B at round r. Furthermore, each of the honest participants that locks B at round r will always have a lock on B for round rr.

[0134] Proof. In order for P.sub.i to decide on B, at least 2t+1 participants send commit messages (3) to P.sub.i at round r of height h. Thus at least t+1 honest participants have locks on B at round r. Assume that the second conclusion is false. Let r>r be the first round that the lock on B is released. In this case, the lock is released during the lock release step of round r if some participant has a lock on another block BB with associated round r where rrr. Lemma 7.1 shows that it is impossible for a participant to have a lock on B in round r. Thus the participant acquired the lock on B in round r with rr>r. This implies that, at the step 1 of round r, more than 2t+1 participants send signed messages (h,r,B) to the leader participant. That is, at least 2t+1 participants have not locked B at the step 1 of round r. This contradicts the fact that at least t+1 participants have locked B at the start of round r.

[0135] Lemma 7.3 Immediately after any lock release step at or after the round GST, the set of candidate blocks locked by honest participants contains at most one value.

[0136] Proof. This follows from the lock release step.

[0137] Theorem 7.4 (Safety) Assume that there are at most t malicious participants. It is impossible for two participants to decide on different block values.

[0138] Proof. Suppose that an honest participant P.sub.i decides on B at round r and this is the smallest round at which the decision is made. Lemma 7.2 implies that at least t+1 participants will lock B in all future rounds. Consequently, no other block values other than B will be acceptable to 2t+1 participants. Thus no participants will decide on any other values than B.

[0139] Theorem 7.5 (Liveness) Assume that there are at most t malicious participants and valid candidate child blocks for B.sup.h are always produced by the block proposal mechanism before the start of first round for height h for all h. Then BDLS protocol will finalize blocks for each height h. That is, the BDLS protocol will not reach a deadlock.

[0140] Proof. We consider two cases. For the first case, assume that no decision has been made by any honest participants and no honest participant locks a candidate block at round r where rGST is the first round after GST that the leader participant is honest. In this case, if P.sub.i receives 2t+1 signed messages for a candidate block B in step 1 of round r, then all honest participants will decides on B by the end of round r. Otherwise, P.sub.i broadcasts the maximal candidate block B during step 2 of round r. Thus all honest participants will receive this maximum block and this candidate becomes the maximum acceptable candidate block for all honest participants. Then, in round r>r where r is the smallest round after r that the leader participant is honest, all honest participants decide on a maximal block.

[0141] For the second case, assume that no candidate block is locked at the start of round GST and some participants hold a lock on a candidate block B. By Lemma 7.3, there are at most one value locked by honest participants at the end of round GST. Furthermore, at the end of round GST, all the honest participants either decide on B or obtain a lock on B. Thus if no decision is made during round GST, the decision will be made during round GST+1.

[0142] 7.3 Complexity Analysis

[0143] In this section, we compare the performance of PBFT, Tendermint BFT, HotStuff BFT and our BDLS protocols. Three kinds of primitives are used in these protocol design: (1) broadcast from the leader to all participants; (2) all participants send messages to the leader; and (3) all participants broadcast. We use the following symbols to denote these primitives:

[0144] custom-character : leader broadcasts

[0145] custom-character : all participants send messages to the leader

[0146] custom-character : all participants broadcast

[0147] In the following, we compare the performance of these protocols after the network is synchronized (that is, after GST) and when the round has an honest leader. For all of these protocols, they will reach agreement within one run of the protocol assuming all participants have all the necessary input values at the start of the protocol and the leader is honest.

[0148] FIG. 1 depicts a table 100 containing information about different BFT protocols with a honest leader after GST. Table 100 indicates the steps of one run of these protocols. Furthermore, for BDLS, we use the approaches discussed in the Remarks after the BDLS protocol description to embed the lock release step into Steps 1 and 2. For each custom-character or step, there is a total of n messages communicated in the network. For each step, there is a total of n.sup.2 messages communicated in the network. The row message complexity of Table 100 indicates the total number of messages communicated in the network for each run of the protocol. That is, in the ideal synchronized network, this is the total number of messages that are needed to achieve a consensus. These numbers show that BDLS has the smallest number of messages for a consensus in the synchronized network. Another way to compare the performance of BFT protocols is to compare the number of authenticator operations (signing and verifying) that are needed to achieve a consensus (see, e.g., [20]). Assume that all these schemes (except PBFT) use threshold digital signature schemes, then the row authenticator complexity of Table 100 indicates the total number authenticator operations needed for each run of the protocol.

[0149] 8 Implementation and Performance Evaluation

[0150] 8.1 Chained BDLS and Other Implementation Related Issues

[0151] In order to improve efficiency, several blockchain BFT protocols (e.g., Ethereum Casper FFG, HotStuff BFT, and LibraBFT) adopt the chaining paradigm where the BFT protocol phases for commitment are spread across rounds. That is, every phase is carried out in a round and contains a new proposal. The same techniques could be used to construct a chained BDLS. As noted in HotStuff BFT and LibraBFT, the block tree in chained LibraBFT and chained HotStuff BFT may contain chains that have gaps in round numbers. Thus the commit logic for LibraBFT and HotStuff BFT requires a 3-chain with contiguous round numbers whose last descendant has been certified. Since BDLS is a 2-phase BFT protocol, chained BDLS decide logic requires a 2-chain with contiguous round numbers whose last descendant has been certified.

[0152] For chained BFT protocol implementation, the BFT protocol participants for various rounds/heights should be relatively static. If the BFT protocol participants change from rounds to rounds or from heights to heights, it is not realistic to implement chained BFT protocols. Thus chained BFT protocol implementation is suitable for permissioned blockchains such as Libra blockchain while it is not suitable for permissionless blockchains where BFT protocol participants change frequently. The same rule applies to threshold digital signature scheme implementation for BFT protocols. That is, for permissionless blockchains where BFT protocol participants change frequently, it may have limited advantage in using threshold digital signature schemes since the expensive key set-up process has to be run each time when the participants set changes.

[0153] In most distributed BFT protocols, when the participants could not reach an agreement in one round, participants move to a new round by submitting round-change request. Thus BFT participants may be in different status and receive different messages. It is important to maximize the period of time when at least 2t+1 honest participants are in the same round. PBFT protocol achieves round synchronization by exponentially increasing the timeout length for each round. That is, if the round 0 of height h has a timeout length of , then the round r of height h will have a timeout length of 2r . On the other hand, Tendermint BFT achieves round synchronization by linearly increasing the timeout length for each round. That is, the round r has a timeout length of r where is the timeout length for round 0 of height h. HotStuff proposes a functionality called PaceMaker to achieve round synchronization without details on how to implement the PaceMaker. LibraBFT implemented the PaceMaker functionality in the following way. When a participant gives up on a certain round r, it broadcasts a timeout message carrying a certificate for entering the round. This brings all honest participants to r within the transmission delay bound. When timeout messages are collected from a quorum of participants, they form a timeout certificate. BDLS may use any of these recommended approaches for round synchronization.

[0154] 8.2 BDLS with Pacemaker Mechanism

[0155] Though BDLS may use a PBFT mechanism to keep round synchronization (that is, the timeout period for round r is 2r ), it may be more efficient to use a pacemaker or heartbeat mechanism for BDLS round synchronization. Similar to LibraBFT, the advancement of rounds in BDLS is governed by a module referred to herein as Pacemaker. Pacemaker keeps track of votes and of time. In some embodiments, BDLS may be modified to include Pacemaker so that Pacemaker can be seamlessly integrated into the protocol without extra workload. The major change is Step 1 where Pacemaker timeout messages are combined with round-change messages for efficiency. The round r of the height h for a participant P.sub.j starts when its Pacemaker receives round-change messages from at least 2t+1 participants or if its timeout for round r1 or if it receives a lock or a select or a decide message for round r. Specifically, the round r proceeds as follows where P.sub.i=leader(h,r) is the leader for round r: [0156] 1. (If r>0, this step is done at the end of round r1 of height h. If r=0, this step is done after a decision for height h1 is made.) Pacemaker of each participant P.sub.j (including P.sub.i) broadcasts the signed message ( custom-character h,r.sub.j,h,r,B.sub.j) where B.sub.jBLOCK.sub.J is the maximal acceptable candidate block for P.sub.j of height h. The message h,r.sub.j is considered as a round-change message for round r. After P.sub.j broadcasts the round-change message for round r, it will set a timeout message .sub.0 and enters round-changing status. During round-changing status, a participant will not accept any messages except round-change messages and decide messages for the height h of any round. Furthermore, if r>0, then each participant P.sub.j (including P.sub.i) initializes all of its variables except the locked block variable. If r=0, then each participant P.sub.j (including P.sub.i) initializes all of its variables including the locked block variable. For any participant P.sub.j who is in round-changing status, if it does not enter the lock status of Step 2 before .sub.0 expires, it resends the round-change message and resets its .sub.0. [0157] 2. During any time of the protocol, if Pacemaker of P.sub.j (including P.sub.i) receives at least 2t+1 round-change messages (including a round-change message from himself) for round r (which is larger than its current round status), it enters lock status of round r. If P.sub.j has not broadcast the round-change message yet, it broadcasts now. Then P.sub.j sets the timeout counter .sub.1 for lock status. The lock status timeout counter can be set as follows. For round r=0, the timeout counter .sub.1=.sub.1,0 may be at least four network transmission delays plus some time for each participant to process the messages. For round r>0, the timeout counter may be defined as r.sub.1,0. Furthermore, as soon as the leader P.sub.i enters .sub.1<.sub.1 concurrently. Though it is sufficient for a non-leader participant to collect only 2t+1 round-change requests, the leader may collect as many round-change message as possible. In particular, the leader should try to collect all round-change messages from all participants. It is recommended that after the leader P.sub.i collects 2t+1 round-change requests and starts the lock status timeout counter .sub.1, it initiates another timeout counter .sub.1<.sub.1 to collect as many as possible round-change requests if more round-change requests still arrive. Generally, we can set .sub.1 as two network transmission delays. This mechanism is used to avoid the following attack: the malicious t participants may send random round-change messages to the leader. If the leader only checks the first 2t+1 messages (among them, t could be malicious), then the system may never reach an agreement. However, the leader should not wait forever since the t malicious participants may choose not to send round-change request at all. The leader P.sub.i stops the time counter .sub.1, P.sub.i distinguishes the two cases: [0158] (a) Among all round-change messages that P.sub.i has received, if there are at least 2t+1 signed messages from 2t+1 participants with the same candidate block BNULL, then P.sub.i broadcasts the following signed message (2) to all participants

custom-character lock,h,r,B,proof.sub.i (5) [0159] where the proof shows that at least 2t+1 participants signed messages indicating that B is the candidate block (the proof also shows that a round-change request has been authorized by at least 2t+1 participants). [0160] (b) If P.sub.i does not receive such a block B, then P.sub.i adds all received candidate blocks to its local variable BLOCK, and broadcasts

custom-character select,h,r,B,proof(6) [0161] where B is the candidate block B=max{B:BBLOCK.sub.i} and the proof shows that round-change requests have been authorized by at least 2t+1 participant from Step 1. [0162] 3. If a participant P.sub.j (including P.sub.i) does not receive a valid message from the leader P.sub.i during Step 2 and the timeout counter .sub.1 expires, P.sub.j enters commit status of round r and sets the timeout counter .sub.2 for commit status. The commit status timeout counter can be set as follow. For round r=0, the timeout counter .sub.2=.sub.2,0 may be at least two network transmission delays plus some time for each participant to process the messages. For round r>0, the timeout counter may be defined as r.sub.2,0. Otherwise, if a participant P.sub.j (including P.sub.i) receives a valid message (5) or (6) from P.sub.i before .sub.1 expires, P.sub.j stops the time counter .sub.1 and distinguishes the following two cases: [0163] If P.sub.j receives a valid custom-character select,h,r,B,proof from P.sub.i during Step 2, then it adds B to its BLOCK.sub.j and enters lock release status of round r and sets the timeout counter .sub.3 for lock-release status. [0164] If P.sub.j (including P.sub.i) receives a valid message lock,h,r,B,proof.sub.i from P.sub.i in Step 2, then it does the following and enters commit status by setting the timeout counter .sub.2: [0165] (a) releases any potential lock on B from previous round, but does not release locks on any other potential candidate blocks [0166] (b) locks the candidate block B by recording the valid lock (5) [0167] (c) sends the following signed commit message to the leader P.sub.i.

custom-character commit,h,r,B.sub.j (7) [0168] 4. If P.sub.i receives at least 2t+1 commit messages (7) for the round r of height h with the locked value B of (5) before .sub.2 expires, then P.sub.i decides on the value B and broadcasts the following decide message to all participants

custom-character decide,h,r,B, proof.sub.i (8) [0169] where proof is a list of at least 2t+1 commit messages (7). [0170] 5. If a participant P.sub.j (including P.sub.i) receives a decide message (8) from Step 4 or from its neighbor before the timeout counter .sub.2 expires, it decides on the block B for B.sup.h and the Pacemaker of P.sub.j goes to Step 1 of height h+1. At the same time, the participant P.sub.j propagates (broadcasts) the decide message (8) to all of its neighbors if it has not done so yet. Otherwise, if P.sub.j (including P.sub.i) does not receive a decide message from the leader P.sub.i or its neighbors before the timeout counter .sub.2 expires, P.sub.j enters lock release status of round r and sets the timeout counter .sub.3 for lock release status. The lock release status timeout counter can be set as follow. For round r=0, the timeout counter .sub.3=.sub.3,0 may be at least two network transmission delays plus some time for each participant to process the messages. For round r>0, the timeout counter may be defined as r.sub.3,0. [0171] 6. (lock release) If a participant P.sub.j (including P.sub.i) has some locked values, then P.sub.1 calculates

r.sub.1=max{r:P.sub.j holds a lock custom-character lock,h,r,B,proof.sub.i}. [0172] P.sub.j releases all locks lock,h,r,B,proof.sub.1 with rr.sub.1. P.sub.j then broadcasts the following lock release message

custom-character lockrelease,h,r,lock,h,r.sub.1,B,proof.sub.i.sub.1. (.sup.9) [0173] If P.sub.j receives a lock release message (lockrelease,h,r,lock,h,r.sub.1, B,proof.sub.i.sub.1 with r.sub.1>r.sub.1 from another participant before the timeout .sub.3 expires, then P.sub.j releases its lock lock,h,r.sub.1,B,proof custom-character .sub.i.sub.1 and records the lock lock,h,r.sub.1,B,proof.sub.i.sub.1. After the timeout .sub.3 expires, Pacemaker of P.sub.j goes to Step 1 for round r+1 of height h. [0174] 7. height synchronization: At any time of the protocol run, if P.sub.j receives a finalized bock of height h (e.g., a decide message (8)), P.sub.j decides for height h and moves to height h+1. [0175] 8. round synchronization: At any time of the protocol run, if P.sub.j receives a valid lock or select or decide message for a round r>r, P.sub.j moves to round r and process the lock or select or decide message. Furthermore, at any time, if P.sub.j receives from more than t+1 participants valid messages for round r>r (including round-change messages for round r), P.sub.j goes to Step 1 for round r of height h.

[0176] 8.3 BFT Consensus Algorithm

[0177] FIGS. 2A-2C depict portions of a block diagram illustrating a BFT consensus algorithm 200. In particular, FIGS. 2A-2C depicts various possible operations or actions associated with algorithm 200. For example, algorithm 200 or a variation thereof may be implemented by each participant for a given height in one or more rounds of consensus determinations.

[0178] Referring to FIG. 2A, in step 201, a message is received of a height h. If the receive message is a decide message, then step 228 occurs otherwise step 202 occurs. In step 202, when the receive message is not a decide message, it is determined whether the message round is greater than or equal to the current round of a participant and, if so, depending on what type of message is received, algorithm 200 may move to step 203, step 211, step 219, or step 223.

[0179] In step 203, it is determined whether the message is a round-change message. In step 204, if the message is a round-change message, the round-change message information is stored by the participant for the round indicated by the message. In step 205, it is determined whether the number of received round-change messages for the message round reaches or exceeds the predetermined number (e.g., 2t+1, where t is the number of malicious participants) of participants. In step 206, if the threshold is reached, the participant sends a round-change message if the participant has not already. In step 207, the participant enters a lock status for the round. In step 208, the participant sets a lock timeout timer, wherein if the lock status is removed if the timer runs out. In step 209, it is determined whether the participant is the current participant leader (for the round). If step 210, the current participant leader sets a collection timeout timer so that round-change messages can be received or collected (e.g., the timeout period may be based on round trip latency and/or other information).

[0180] Referring to FIG. 2B, in step 211, it is determined whether the message is a lock message. In step 212, if the message is a lock message, it is determined whether the message round is greater than the current round of a participant. If the message round is greater than the current round, step 213 occurs and if not then step 214 occurs. In step 213, the participant moves its current round to the message round including clearing all previous round timers and then step 214 occurs. In step 214, it is determined whether the participant is in a lock release state. If the participant is in the lock release state, step 215 occurs and if not step 217 occurs. In step 215, it is determined whether the current round is different from the round associated with the existing lock and the candidate block associated with the lock is different from the current candidate block. If so, in step 216, the existing lock is released and a new lock for the current round and candidate block is set. In step 217, the existing lock is release and a new lock for the current round and candidate block is set. In step 218, the participant sends a commit message indicating the candidate block to the current participant leader and then enters a commit status and starts a commit timeout timer (step 237 shown in FIG. 2C).

[0181] In step 219, it is determined whether the message is a select message. In step 220, if the message is a select message, it is determined whether the message round is greater than the current round of a participant. If the message round is greater than the current round, step 221 occurs and if not then step 222 occurs. In step 221, the participant moves its current round to the message round including clearing all previous round timers and then step 222 occurs. In step 222, the participant stores the candidate block from the select message as its candidate block and enters a commit status and starts a commit timeout timer (step 237 shown in FIG. 2C).

[0182] In step 223, it is determined whether the message is a commit message. In step 224, if the message is a commit message, it may be determined whether the participant is the current participant leader (for the round). If step 225, the current participant leader determines whether the current round is the same as the round in the commit message and the current candidate block is the same as the candidate block in the commit message. If so, in step 226, the current participant leader determines whether commit messages from at least 2t+1 participants. If so, in step 227, the current participant leader enters a commit status and broadcasts a decide message indicating the candidate block to other participants (step 232) and the current participant leader increments its current height by one (from the height indicated in the decide message), and then enters a round changing status.

[0183] In step 228, it may be determined whether a received message is a decide message , In step 229, if the message is a decide message, it may be determined whether the height in the message is the greater than the current height stored at the participant. If so, in step 230, the participant broadcasts the decide message to other participants. In step 231, the participant decides on the candidate block for the height indicated in the decide message and increments its current height by one (from the height indicated in the decide message), and then enters a round changing status.

[0184] After entering a round changing status, in step 233, the participant broadcasts a round-change message indicating the current (new) height and sets a round-change timeout timer (step 234), where the round-change status expires at the end of the timer.

[0185] Referring to FIG. 2C, timer related actions associated with algorithm 200 are depicted. In step 235, a particular timer for a participant is started. In step 236, if the timer is a lock timeout timer and it expires, then the participant enters a commit status and starts a commit timeout timer (step 237). In step 238, if the timer is a commit timeout timer and it expires, then the participant broadcasts a lock release message (step 239). In step 240, the participant enters a lock release status and sets a lock release timeout timer.

[0186] In step 241, if the timer is a lock release timeout timer and it expires, then the participant broadcasts a round-change message indicating a new round (e.g., increments the current round by 1) (step 242).

[0187] In step 243, if the timer is a round-change timeout timer and it expires, then the participant broadcasts a round-change message indicating a new height (e.g., increments the current height by 1) (step 244). In step 245, the participant sets a new round-change timeout timer.

[0188] In step 246, if the timer is a collect timeout timer, then before it expires, it is determined whether the participant has received round-change messages from at least 2t+1 participants, and that these messages indicate the same candidate block B and B is not NULL (step 247). If so, in step 248, the participant broadcasts a lock message to other participants, where the lock message indicates that round-change messages indicating a same candidate block have been received from a at least 2t+1 participants and, after broadcasting the lock message, the participant stops the collect timeout timer (step 249).

[0189] In step 246, if the timer is a collect timeout timer and it expires, the participant adds all received candidate blocks to its local variable BLOCK.sub.j (step 250). In step 251, the participant broadcasts a lock message to other participants, where the lock message indicates the maximal candidate block from the received candidate blocks and, after broadcasting the lock message, the participant stops the collect timeout timer (step 249).

[0190] It will be appreciated that algorithm 200 is for illustrative purposes and that different and/or additional actions may be used. It will also be appreciated that various actions described above with regard to algorithm 200 may occur in a different order or sequence.

[0191] FIG. 4 is a diagram illustrating an example computer system 400 for providing BFT. In some embodiments, computer system 400 may be a single device or node or may be distributed across multiple devices or nodes.

[0192] Referring to FIG. 4, computer system 400 includes one or more processor(s) 402, a memory 404, and storage 410 communicatively connected via a system bus 408. Computer system 400 may represent one or more computing platforms or devices. Computer system 400 may include or utilize one or more communications interface(s) 412. In some embodiments, processor(s) 402 can include a microprocessor, a central processing unit (CPU), a graphics processing unit (GPU), and/or any other like hardware based processing unit. In some embodiments, a BFT module 406 can be stored in memory 404, which can include random access memory (RAM), read only memory (ROM), optical read/write memory, cache memory, magnetic read/write memory, flash memory, or any other non-transitory computer readable medium.

[0193] BFT module 406 may include logic and/or software for performing various functions and/or operations described herein. In some embodiments,

[0194] BFT module 406 may include or utilize processor(s) 402 or other hardware to execute software and/or logic. For example, BFT module 406 may perform various functions and/or operations associated with providing BFT and/or related operations. In this example, BFT module 406 may be used in various applications, e.g., a consensus application, a blockchain application, a distributed computing application, and/or an authentication application.

[0195] In some embodiments, computer system 400 may include one or more communications interface(s) 412 for communicating with nodes, modules, and/or other entities. For example, one or more communications interface(s) 112 may be used for communications between BFT module 406 and a system operator and a same or different communications interface for communicating with other modules or network nodes.

[0196] In some embodiments, processor(s) 402 and memory 404 can be used to execute BFT module 406. In some embodiments, storage 410 can include any storage medium, storage device, or storage unit that is configured to store data accessible by processor(s) 402 via system bus 408. In some embodiments, storage 410 can include one or more databases hosted by or accessible by computer system 400.

[0197] In some embodiments, BFT module 406 may perform a method and/or technique (e.g., algorithm 200 or a variation thereof) for providing BFT in an asynchronous (e.g., partially synchronous) environment. For example, BFT module 406 may perform algorithm or a variation of BDLS described herein. In this example, BFT module 406 may perform different actions based on different types of signed messages, current states, and/or various timers when reaching a consensus decision or related functionality.

[0198] In some embodiments, BFT module 406 may be associated with participants performing a distributed computing application, e.g., blockchain generation or digital currency mining. In such embodiments, BFT module 405 may utilize algorithm 200 or a similar algorithm to determine a candidate block for a given height and round. For example, computer system 400 may utilize BFT module 406 to execute a BFT protocol, wherein computer system 400 acts as a leader participant of a round in a consensus decision. In this example, computer system 400 or BFT module 406 may receive signed round-change messages from multiple participants in the round; broadcast (e.g., send to multiple participants) a signed lock message indicating that signed round-change messages have been received from a predetermined number of participants (e.g., at least 2t+1 participants, where t represents an amount of malicious participants in the round) indicating a same candidate block (e.g., ); receiving signed commit messages from multiple participants in the round; and broadcasting a signed decide message indicating the candidate block is a finalized block (e.g., after a predetermined number of participants in the round have sent signed commit messages indicating the candidate block).

[0199] It will be appreciated that FIG. 4 is for illustrative purposes and that various nodes, their locations, and/or their functions may be changed, altered, added, or removed. For example, some nodes and/or functions may be combined into a single entity or some functionality (e.g., BFT module 406 and a pacemaker module and/or a blockchain generation program) may be separated into separate nodes or modules.

[0200] FIG. 5 is a diagram illustrating an example process 500 for providing BFT. In some embodiments, process 500 described herein, or portions thereof, may be performed at or by computer system 400, BFT module 406, processor(s) 402, and/or a module or node. For example, BFT module 406 or computer system 400 may include or be a mobile device, a smartphone, a tablet computer, a computer, a computing platform, or other equipment. In another example, BFT module 406 may include or provide an application running or executing processor(s) 402.

[0201] In some embodiments, process 500 may include steps 502-508 and may be performed by or at one or more devices or modules, e.g., a smartphone or computer implemented using at least one processor.

[0202] In some embodiments, a computing platform may execute a BFT protocol including process 500. In such embodiments, the computing platform executing process 500 may act as a leader participant of a round of the BFT protocol, e.g., for achieving consensus in bit mining or another distributed computing application.

[0203] Referring to process 500, in step 502, signed round-change messages may be received from multiple participants in a round.

[0204] In step 504, a signed lock message indicating that signed round-change messages have been received from a predetermined number of the participants in the round voting for a same candidate block may be broadcasted.

[0205] In step 506, signed commit messages may be received from multiple participants in the round.

[0206] In step 508, a signed decide message indicating the candidate block is a finalized block may be broadcasted after the predetermined number of the participants in the round have sent signed commit messages indicating the candidate block.

[0207] In some embodiments, a predetermined number of the participants in a round may include at least 2t+1 participants, where t represents an amount of malicious participants in the round.

[0208] In some embodiments, a participant in the round receives the decide message from the leader participant or another participant and sends the decide message to other participants in the round.

[0209] In some embodiments, a candidate block may be a maximal acceptable candidate block for a round.

[0210] In some embodiments, a leader participant may change for a subsequent round.

[0211] In some embodiments, a round may be associated with a blockchain height and a signed decide message may indicate an agreed upon blockchain height (e.g., agreed upon by at least a predetermined number of participants).

[0212] In some embodiments, a participant in a round may utilize a round synchronization technique and a height synchronization technique, wherein the round synchronization technique involves the participant incrementing by one a current blockchain height variable associated with the participant in response to receiving the decide message, and wherein the height synchronization technique involves the participant sending a signed round-change message to the leader in response to the participant receiving a signed look message, a commit message, or a decide message for a subsequent round relative to a current round variable associated with the participant.

[0213] In some embodiments, a participant in a round may utilize one or more timers, wherein the one or more timers may include an operation timeout timer, a round changing status timer, or a lock status timer, a commit status timer, or a lock release status timer.

[0214] In some embodiments, a participant in a round may utilize an application programming interface (API) for obtaining a participant list for the round or a related blockchain height.

[0215] In some embodiments, a participant in a round may check a local participant list after receiving a BFT related message.

[0216] It will be appreciated that process 500 is for illustrative purposes and that different and/or additional actions may be used. It will also be appreciated that various actions described herein may occur in a different order or sequence.

[0217] It should be noted that computer system 400, BFT module 406, and/or functionality described herein may constitute a special purpose computing device. Further, system 400, BFT module 406, and/or functionality described herein can improve the technological field of BFT and/or related consensus applications (e.g., blockchain applications, distributed data storage applications, etc.), by providing mechanisms and/or techniques for providing BFT using algorithm 200 or similar functionality. As such, various BFT techniques and/or mechanisms described herein can provide improved BFT relative to some existing BFT protocols. For example, such BFT techniques and/or mechanisms described herein, e.g., BDLS or algorithm 200, can provide improved liveness and safety in Type II partial synchronous networks and/or other distributed networks.

[0218] The disclosure of each of the following references is incorporated herein by reference in its entirety to the extent not inconsistent herewith and to the extent that it supplements, explains, provides a background for, or teaches methods, techniques, and/or systems employed herein.

[0219] 8.4 Performance Evaluation

[0220] In this section, performance of the BDLS consensus algorithm with a Pacemaker module in Section 8.2 implemented using Go Programming Language is evaluated. The implementation is based on algorithm 200 depicted in FIGS. 2A-2C.

[0221] A first testing platform utilized for evaluating an implementation of the BDLS consensus algorithm includes an AMD Ryzen 7 2700X eight-core processor with 64 gigabyte (GB) RAM and Linux 4.19.84-microsoft-standard operating system. A second testing platform utilized for evaluating an implementation of the BDLS consensus algorithm includes a BCM2835 Broadcom chip with 4 cores and 1 GB RAM and a Linux raspberry pi 4.19.75-v7I+ operating system (e.g., for approximating performance of the BFT implementation during a heavy load scenario).

[0222] Using the two testing platforms, scenarios involving 20 participants, 30 participants, 50 participants, 80 participants, and 100 participants were tested.

[0223] During testing, various network scenarios were simulated by changing values for the following parameters: [0224] DELAY.EXP: Expected Latency set to consensus algorithm [0225] DECIDE.AVG: Average finalization time for each height [0226] NET.MSGS: Total network number of messages exchanged in all heights [0227] NET.BYTES: Total network bytes exchanged in all heights [0228] NET.MSGRATE: Network message rate (messages/second) [0229] DELAY.MIN: Actual minimal network latency (network latency is randomized with normal distribution) [0230] DELAY.MAX: Actual maximal network latency.

[0231] FIGS. 3A-3C depict tables containing information for various test scenarios involving an example BFT implementation, e.g., based on algorithm 200. In FIG. 3A, table 300 shows test results for a 50 participants scenario involving the first testing platform. In FIG. 3B, table 302 shows test results for a 50 participants scenario involving the second testing platform. In FIG. 3C, table 304 shows DELAY.EXP and DECIDE.AVG values for different participant scenarios and testing platforms.

[0232] 8.5 Static and Dynamic BFT Participants

[0233] For blockchain environments, the BFT participants may change from height to height (or even from round to round). In such embodiments, to obtain the BFT participant team, each participant may use an API call to obtain the participant list for the height h before submitting the round-change message for a new height h. However, for a permissionless blockchain, the full participant list may not be available at the time when it submits the round-change message. Thus each time, when a participant receives a BFT message, the participant may check whether the sender of the message is in its local list of participants or not. If not, the participant may use an API to check whether the sender is a qualified participant for this height or not. If the sender is a qualified participant, the participant may expand its participant list and adjust the parameters accordingly.

[0234] On the other hand, some applications of BDLS BFT protocol may involve static BFT participants. To make the BDLS package more efficient for these applications, one may use an API call to check whether BFT participants change from round to round. If the participant list does not change, the BLDS protocol may not carry out the extra checks discussed in the preceding paragraph.

[0235] 9 Importance of Propagating Decision Messages

[0236] During Step 5 of the BDLS protocol, when a participant receives a decide message, it propagates the decide message to its neighbors. In this section, we show the importance of this process by the potential issues for the HotStuff protocol since it does not have this decision message propagation process.

[0237] 9.1 HotStuff BFT Protocol

[0238] HotStuff BFT [20] includes basic HotStuff protocol and chained HotStuff protocol. For simplicity, we only review the basic HotStuff BFT protocol. Similar to PBFT and Tendermint BFT, there are n=3t+1 participants P.sub.0, . . . , P.sub.n1 and at most t of them are malicious. The view is defined and changes in the same way as in PBFT. The major differences between PBFT and HotStuff BFT are: [0239] 1. PBFT participants broadcast signed messages to all participants though HotStuff participants send the signed messages to the leader participant in a point-to-point channel. In other words, PBFT uses a mesh topology communication network though HotStuff uses a star topology communication network. [0240] 2. PBFT uses standard digital signature schemes though HotStuff uses threshold digital signature schemes.

[0241] With these two differences, HotStuff achieves authenticator complexity O(n) for both the correct leader scenario and the faulty leader scenario. On the other hand, the corresponding authenticator complexity for PBFT is O(n.sup.2) for the correct leader scenario and O(n.sup.3) for the faulty leader scenario respectively. For simplicity, we will describe the HotStuff BFT protocol using a standard digital signature scheme instead of threshold digital signature schemes. Our analysis does not depend on the underlying signature schemes.

[0242] HotStuff BFT has revised the validRound and lockedRound variables in Tendermint BFT to its prepareQC and lockedQC variables respectively. Though Tendermint BFT participants set the values for two variables in the same phase, HotStuff BFT participants set the values for these variables in different steps.

[0243] In HotStuff BFT, each participant stores a tree of pending commands as its local data structure and keeps the following state variables viewNumber (initially 1), prepareQC(initially nil, storing the highest QC for which it voted pre-commit), and lockedQC (initially nil, storing the highest QC for which it voted commit).

[0244] Each time when a new-viewstarts, each participant should send its prepareQC variable to the leader. There is a public function LEADER(viewNumber)that determines the current leader participant. When a client sends an operation request m to the leader P.sub.i, the n participants carry out the four phases of the BFT protocol: prepare, pre-commit, commit and decide. [0245] 1. prepare: The leader P.sub.i starts the process after it has received 2t+1 newviewmessages. Each newview message contains a prepareQCvariable. P.sub.i selects highQC as the prepareQCvariable with the highest viewNumber. P.sub.i extends the tail of highQC node by creating a new leaf node proposal. P.sub.i then broadcasts the digitally signed new leaf node proposal (together with highQC for safety justification) to all participants in a preparemessage. A participant accepts this new leaf node proposal if the new node extends the currently locked node lockedQC. node or it has a higher view number than the current lockedQC. If a participant P.sub.j accepts the new leaf node proposal, it sends a prepare vote message to P.sub.i by signing it. [0246] 2. pre-commit: When P.sub.i receives 2t+1 preparevotes for the current proposal, it combines them into a prepareQC. P.sub.i broadcasts prepareQC in a pre-commit message. A participant sets its prepareQCvariable to this received prepareQC value and votes for it by sending the signed prepareQC back to P.sub.i in a pre-commit message. [0247] 3. commit: When P.sub.i receives 2t+1 pre-commitvotes. It combines them into a precommitQC and broadcasts it in a commitmessage. A participant sets its lockedQC variable to this received precommitQC value and votes for it by sending the signed precommitQC back to P.sub.i in a commit message. [0248] 4. decide: When P.sub.i receives 2t+1 commitvotes, it combines them into a commitQC. P.sub.i broadcasts commitQC in a decide message. Upon receiving a decide message, a participant considers the proposal embodied in the commitQC a committed decision, and executes the commands in the committed branch. The participant increments viewNumber and starts the next view.

[0249] 9.2 What Happens if Leader Does not Reliably Broadcast Decide Messages in HotStuff

[0250] In the following, we describe three scenarios with completely different semantics where the client receives different responses. However, the HotStuff trees are identical for these three scenarios. First assume that at the end of view v1, we have lockedQC=prepareQC and the HotStuff path corresponding to lockedQC.node is a.sub.0.fwdarw.a.sub.1.fwdarw.a.sub.l where a.sub.0 is the root.

[0251] Assume that the views v and v+1 are executed before GST. That is, the broadcast channel is not reliable before the end of view v+1. Assume that the leader for view v is P.sub.i and the leader for view v+1 is P.sub.i. Furthermore, assume that both P.sub.i and P.sub.i are malicious,

[0252] Scenario I: The leader P.sub.i for view v receives 2t+1 new-view messages that contain the identical highQC=prepareQC with the corresponding path a.sub.0.fwdarw.a.sub.1.fwdarw.a.sub.l. P.sub.i extends the path to the new path a.sub.0.fwdarw.a.sub.1.fwdarw.a.sub.l.fwdarw.b and creates a proposal for the new leaf node b. P.sub.i then broadcasts the digitally signed new leaf node proposal (together with highQC) to all participants in a preparemessage. All participant accept this new leaf node proposal and sends a preparevote message to P.sub.i by signing it. In the pre-commit phase, P.sub.i receives 2t+1 preparevotes for the current proposal, it combines them into a prepareQC and broadcasts prepareQC in a pre-commitmessage to all participants. All participant set their prepareQCvariable to this received prepareQC value and vote for it by sending the signed prepareQC back to P.sub.i. During the commit phase, P.sub.i receives 2t+1 pre-commitvotes. It combines them into a precommitQC and broadcasts it in a commitmessage. All participant set their lockedQCvariable to this received precommitQC value and vote for it by sending the signed precommitQC back to P.sub.i. In the decide phase, P.sub.i receives 2t+1 commitvotes, it combines them into a commitQC. P.sub.i only send the commitQC to one honest participant Pj but not to anyone else. After timeout, the view v+1 starts. During view v+1, the leader participant extends the path a.sub.0.fwdarw.a.sub.1.fwdarw.a.sub.l.fwdarw.b to a.sub.0.fwdarw.a.sub.1.fwdarw..fwdarw.a.sub.l.fwdarw.b.fwdarw.c by including a new client command to the node c. Assume that all messages during view v+1 are delivered and all participants behaves honestly. Thus at the end of view v+1, all participants (except P.sub.j) only executed the commands contained the node c and P.sub.j executed the commands contained both in b and c. Since the client only received one response from P.sub.j that the commands in node b is executed, it will not accept it.

[0253] Scenario II: In this scenario, the leader participant P.sub.i for view v does not send any decide message in the last step of view v. All other steps are identical to the Scenario I. Thus at the end of view v+1, all participants executed the command contained in the node c though no participants executed the command contained in the node b.

[0254] Scenario III: In this scenario, the leader participant P.sub.i for view v sends the decide message to all participants in the last step of view v. All other steps are identical to the Scenario I. Thus at the end of view v+1, all participants executed the commands contained in the nodes b and c.

[0255] For all these three scenarios, the path corresponding to the prepareQC at the end of view v+1 is a.sub.0.fwdarw.a.sub.1.fwdarw.a.sub.l.fwdarw.b.fwdarw.c though the internal states of honest participants are different.

[0256] In the HotStuff BFT protocol [20], it is mentioned that [i]n practice, a recipient who falls behind can catch up by fetching missing nodes from other replicas. For all three of the scenarios that we have described, at the end of view v+1, the participant who falls behind may fetch the prepareQC corresponding to the path a.sub.0.fwdarw.a.sub.1.fwdarw.a.sub.l.fwdarw.b.fwdarw.c. But it does not know which scenario has happened. It should be noted that in the HotStuff BFT protocol, the node on the tree only contains the following information: the hash of the parent node and the client command. However, it does not contain any information whether the command has been executed. Our analysis shows that it is important to include in the tree node whether a given command has been executed.

REFERENCES

[0257] [1] M. Ben-Or. Another advantage of free choice: Completely asynchronous agreement protocols (extended abstract). In Proc. 2nd ACM PODC, pages 27-30, 1983. [0258] [2] G. Bracha. An asynchronous [(n1)/3]-resilient consensus protocol. In Proc. 3rd ACM PODC, pages 154-162. ACM, 1984. [0259] [3] E. Buchman, J. Kwon, and Z. Milosevic. The latest gossip on BFT consensus. Preprint arXiv:1807.04938, 2018. [0260] [4] V. Buterin and V. Griffith. Casper the friendly finality gadget. arXiv preprint arXiv:1710.09437v4, 2019. [0261] [5] M. Castro and B. Liskov. Practical byzantine fault tolerance and proactive recovery. ACM TOCS, 20(4):398-461, 2002. [0262] [6] Cosmos. Cosmos Network: Internet of Blockchains https://cosm os. network. [0263] [7] Yvo Desmedt, Yongge Wang, and Mike Burmester. A complete characterization of tolerable adversary structures for secure point-to-point transmissions without feedback. In International Symposium on Algorithms and Computation, pages 277-287. Springer, 2005. [0264] [8] D. Dolev and H. R. Strong. Polynomial algorithms for multiple processor agreement. In Proc. 14th ACM STOC, pages 401-407. ACM, 1982. [0265] [9] C. Dwork, N. Lynch, and L. Stockmeyer. Consensus in the presence of partial synchrony. JACM, 35(2):288-323, 1988. [0266] [10] M. J. Fischer, N. A Lynch, and M. S. Paterson. Impossibility of distributed consensus with one faulty process. Journal of the ACM (JACM), 32(2):374-382, 1985. [0267] [11] Web3 Foundation. Byzantine finality gadgets, https://research.web3.foundation/en/latest/polkadot/GRANDPA, Apr. 17, 2019. [0268] [12] J. Katz and C.-Y. Koo. On expected constant-round protocols for byzantine agreement. Journal of Computer and System Sciences, 75(2):91-112, 2009. [0269] [13] J. Kwon. Tendermint powers 40%+ of all proof-of-stake blockchains. invest: asia, available at https://realsatoshi.net/12886/, Sep. 12, 2019. [0270] [14] L. Lamport, R. Shostak, and M. Pease. The Byzantine generals problem. ACM Transactions on Programming Languages and Systems (TOPLAS), 4(3):382-401, 1982. [0271] [15] M. Pease, R. Shostak, and L. Lamport. Reaching agreement in the presence of faults. Journal of the ACM (JACM), 27(2):228-234, 1980. [0272] [16] TK Srikanth and S. Toueg. Simulating authenticated broadcasts to derive simple fault-tolerant algorithms. Distributed Computing, 2(2):80-94, 1987. [0273] [17] The LibraBFT Team. State machine replication in the Libra Blockchain. available at https://developers.libra.org/docs/assets/papers/libra-consensus-state-machine-replication-in-the-libra-blockchain/2019-11-08. pdf, Nov. 28, 2019. [0274] [18] Y. Wang and Y. Desmedt. Secure communication in multicast channels: the answer to Franklin and Wright's question. Journal of Cryptology, 14(2):121-135, 2001. [0275] [19] Y. Wang and Y. Desmedt. Perfectly secure message transmission revisited. Information Theory, IEEE Tran., 54(6):2582-2595, 2008. [0276] [20] M. Yin, D. Malkhi, M.K. Reiter, G.G. Gueta, and I. Abraham. HotStuff:

[0277] BFT consensus in the lens of blockchain. arXiv preprint arXiv:1803.05069, 2018.

[0278] It will be understood that various details of the subject matter described herein may be changed without departing from the scope of the subject matter described herein. Furthermore, the foregoing description is for the purpose of illustration only, and not for the purpose of limitation, as the subject matter described herein is defined by the claims as set forth hereinafter.

METHODS, SYSTEMS, AND COMPUTER READABLE MEDIA FOR PROVIDING BYZANTINE FAULT TOLERANCE

Inventors

Cpc classification

Classification Explorer

G06F16/2365

PHYSICS

Classification Explorer

H04L9/3239

ELECTRICITY

Classification Explorer

G06F11/2023

PHYSICS

Classification Explorer

H04L63/1441

ELECTRICITY

Classification Explorer

H04L63/12

ELECTRICITY

Classification Explorer

G06F11/2028

PHYSICS

Classification Explorer

H04L63/1416

ELECTRICITY

Classification Explorer

G06F11/183

PHYSICS

Classification Explorer

H04L9/0643

ELECTRICITY

Classification Explorer

G06F11/1633

PHYSICS

Classification Explorer

G06F16/275

PHYSICS

Classification Explorer

H04L9/50

ELECTRICITY

International classification

Classification Explorer

G06F11/18

PHYSICS

Classification Explorer

G06F11/16

PHYSICS

Classification Explorer

G06F11/20

PHYSICS

Classification Explorer

H04L29/06

ELECTRICITY

Classification Explorer

H04L9/06

ELECTRICITY

Abstract

Claims

Description