FLOW CONTROL FOR PROBABILISTIC RELAY IN A BLOCKCHAIN NETWORK

20210160309 · 2021-05-27

    Inventors

    Cpc classification

    International classification

    Abstract

    The invention relates to method for adjusting the minimum and maximum number of peer nodes that a node on the blockchain network will connect with. The adjustment takes in to account the bandwidth and processing capability of the node. Bandwidth capacity of a node is determined based on a maximum data amount processable by the node over a time period. Data is monitored passing through interfaces of the node, to and from peer nodes, and a profile factor of the node is determined from the difference between the input data to output data. Over a plurality of time periods monitoring said data the data analysed is used to set a minimum number of peer nodes and a maximum number of peer nodes connectable to the node according to said monitored data and the maximum number of peers connectable to the node. The method enables a node to adjust the number of connections according to performance limitation factors, such as bandwidth availability and processing performance. With the number of peer node connections determined, the node can further determine a correlation matrix between the interfaces and peer nodes to which it is connected. The matrix can be compiled with correlation coefficients representing the correlation between data processed at each interface of said node. The invention also resides in a corresponding computer readable storage medium, electronic device, node of a blockchain network or blockchain network having such a node.

    Claims

    1. A computer-implemented method for a node of a blockchain network, said node having a plurality of interfaces connected to peer nodes, the computer-implemented method comprising: determining a bandwidth (B) capacity of a node based on a maximum data (D) amount processable by the node over a time period (T); monitoring data (d), comprising input data (d.sub.i) and output data (d.sub.o), passing through interfaces of the node, to and from peer nodes, and determining a profile factor (k) of the node from the difference between the input data (d.sub.i) to output data (d.sub.o); over a plurality of time periods (T) monitoring said data (d); and setting a minimum number of peer nodes (m.sub.min) and a maximum number of peer nodes (m.sub.max) connectable to the node according to said monitored data (d) and the maximum number of peers (P) connectable to the node.

    2. A method according to claim 1, wherein over at least one time period (T) the monitored data (d) is used to determine a ratio and setting the minimum number of peer nodes (m.sub.min) and a maximum number of peer nodes (m.sub.max) connectable to the node is determined using said ratio.

    3. A method according to claim 1, wherein the profile factor (k) is assigned a value according to a function of the node, wherein the profile factor has a value of one (1) for nodes configured to route data, a value greater than one (1) for nodes that collect or aggregate data, and a value less than one (1) for nodes that generate or provide data.

    4. A method according to claim 1, wherein said node has a plurality of interfaces connected to peer nodes and a correlation matrix is determined from correlation coefficients representing the correlation between data processed at each interface of said node and the time period (T) is determined from the length of time between a change to the correlation matrix.

    5. A method according to claim 4, wherein said node receives information related to the activity of nodes on the blockchain network and determines the time period (T) according to the length of time between a change to the correlation matrix.

    6. A method according to claim 1, wherein averaged data (d*) comprising an average of data (d) over the plurality of time periods (T), is used to determine the minimum and maximum number of peer nodes connectable to the node by determining the ratio of averaged data (d*) to the maximum data (D).

    7. A method according to claim 6, further including determining a variance (σ.sup.2) of the data (d) over a plurality of time periods (T), applying a scaling factor to the variance, and determining the minimum and maximum number of peer nodes connectable to the node according to the scaled variance.

    8. A method according to claim 7, wherein m min = [ d * D .Math. P - t .Math. σ 2 ] m max = [ d * D .Math. P + t .Math. σ 2 ] wherein d* is the averaged level of data (d) over the plurality of time periods (T), D is the maximum data (D) processable by the node over a time period (T), P is the maximum number of peers (P) connectable to the node, σ.sup.2 is the variance of said data (d), t is a scaling factor or variance parameter, and m.sub.max≤P

    9. A method according to claim 8, wherein the scaling factor or variance parameter t has an operational range, d j * D j .Math. P - 1 σ j 2 t P ( 1 - d j * D j ) σ j 2 and the mean value of the variance parameter t across the operational range is used when determining the minimum and maximum number of peer nodes connectable to the node.

    10. A method according to claim 1, wherein the time period (T) is adjusted by the node to a predetermined value temporarily to accommodate at least one of a local parameter update to the node, an initiation period, wherein the node accumulates data, and temporary relay of data to all peer nodes.

    11. A method according to claim 1, wherein the time period (T) is a flow control parameter (T.sub.flow), and said parameter is adjusted to modify the traffic passing through the node.

    12. A method according to claim 1, wherein the node has a plurality of interfaces connected to peer nodes, said plurality being at least a minimum number of interfaces connected to a peer node (m.sub.min) and less than or equal to a maximum number of interfaces connected to a peer node (m.sub.max), the method further comprising: determining a correlation matrix having correlation coefficients representing the correlation between data processed at each interface of said node; receiving data at a receiving interface of said node; and selecting at least one of a plurality of other interfaces of said node, and relaying said received data from the or each other interface, wherein other interfaces are selected according to a set of the correlation coefficients of the receiving interface.

    13. A method according to claim 12, wherein an indicator is derived from the correlation matrix and data is relayed if the correlation between the receiving interface and the or each other interface is lower than the indicator.

    14. A computer readable storage medium comprising computer-executable instructions which, when executed, configure a processor to perform the method of claim 1.

    15. An electronic device comprising: an interface device; one or more processor(s) coupled to the interface device; a memory coupled to the one or more processor(s), the memory having stored thereon computer executable instructions which, when executed, configure the one or more processor(s) to perform the method of claim 1.

    16. A node of a blockchain network, the node configured to perform the method of claim 1.

    17. A blockchain network having a node according to claim 16.

    Description

    BRIEF DESCRIPTION OF THE DRAWINGS

    [0055] These and other aspects of the present invention will be apparent from and elucidated with reference to, the embodiment described herein. An embodiment of the present invention will now be described, by way of example only, and with reference to the accompany drawings, in which:

    [0056] FIG. 1 illustrates network packets serially sent and received at application-level via a node on a bitcoin network;

    [0057] FIG. 2 represents the route taken by a single transaction between nodes and across the bitcoin network;

    [0058] FIG. 3 is an illustration of a correlation matrix showing the correlation coefficients of the interfaces connected to peer nodes; and

    [0059] FIG. 4 is diagram representing nodes having different functions in a blockchain network.

    DETAILED DESCRIPTION

    [0060] In this specification a solution to the problem of processing and storing large Gigabyte-sized blocks is described. Efficient transaction of data across a blockchain network can be improved by determining the connectivity between nodes and relaying data accordingly. Moreover, the specifications of the nodes and bandwidth available influence the flow of data across the network and number of connections a node has and the time periods for determination of connectivity can be adjusted or dimensioned to improve flow.

    [0061] Processing

    [0062] Types of Blockchain Network Nodes & Validation Nodes

    [0063] A blockchain network may be described as a peer-to-peer open membership network which may be joined by anyone, without invitation or without consent from other members. Distributed electronic devices running an instance of the blockchain protocol under which the blockchain network operates may participate in the blockchain network. Such distributed electronic devices may be referred to as nodes. The blockchain protocol may be a Bitcoin protocol, or other cryptocurrency, for example.

    [0064] Some blockchain networks, such as Bitcoin, use TCP for node-to-node communications. Data packets sent using TCP have an associated sequence number which is used for ordering purposes.

    [0065] In addition to this, the TCP protocol involves a three-way handshake procedure, both when establishing a connection as well as terminating one. Packets sent over TCP come with an overhead associated, they have a sequence number associated and there is a three-way handshake protocol. In establishing a connection 128-136 bytes are being transmitted, whereas closing a connection costs 160 bytes. Thus the handshake in packet transmission costs up to 296 bytes.

    [0066] Additionally, when a node receives a new transaction, it notifies the other nodes with an inventory (INV) message which contains the hash of the transaction. A node which receives an INV message checks whether the hash of that transaction has been seen before;

    [0067] if not, the node will request the transaction by sending a GETDATA message. The time necessary to transmit a transaction from Node A to Node B is T1=verification+TCP(inv+getdata+tx), where TCP( ) indicates the overhead, in terms of time, introduced by the TCP handshake procedure.

    [0068] The TCP Protocol and the Three-Way Handshake

    [0069] The current peer-to-peer Bitcoin protocol defines 10 data messages and 13 control messages. In this context, the transfer of data, or data packets, relates to object transfer may refer to individual transactions or blocks.

    [0070] A complete list of the messages used by the Bitcoin P2P protocol is known—please refer to [4] for the complete list. A sub-group of messages are related to the request or distribution of objects. These messages are BLOCK, GETDATA, INV, TX, MEMPOOL and NOTFOUND. A sub-group of messages are related to the reliability and efficiency of objects. These messages are FEEFILTER, PING, PONG and REJECT.

    [0071] By way of example, nodes ‘i’ and T on the Bitcoin network communicate using the following steps: [0072] 1. Node i transmits an INV message containing a list of transactions. [0073] 2. Node j replies with a GETDATA message asking for a subset of the transactions previously announced. [0074] 3. Node i transmits the requested transactions.

    [0075] The method herein seeks to optimise the protocol of a blockchain network to improve, at least, the dissemination of data.

    [0076] FIG. 1 illustrates the real scenario where data, in the form of network packets, is serially sent and received at application-level according to the primitives provided by the operating system. Assuming that a transaction x fits in a single Ethernet/IP packet, its transmission to m peers requires the buffering of m different output packets. Both input and output network packets, along with other information, will contain: [0077] A serialized transaction. [0078] A logical interface ID representing the TCP/IP connection to the sending/receiving peer.

    [0079] The expected time for an incoming transaction to be processed depends on the average length (in packets) of the input queue L.sup.i, while the expected time for a processed transaction to be correctly transmitted depends on the average length of the output queue L.sup.o.

    [0080] Therefore, the efficient relay of transactions relies on the reduction of both L.sup.i and L.sup.o values. However, a probabilistic model for selective relay of the transactions to the peers directly affects L.sup.o and by induction also L.sup.i.

    [0081] In the current Bitcoin implementation, INV and GETDATA message packets are queued in the I/O buffers in the same way as transactions, with a severe impact on the transmission and reception delays.

    [0082] CONNECTIVITY—Efficient Transactions Propagation—Probabilistic Relay

    [0083] If node i was allowed to directly transmit new transactions without the use of inventory exchanges, the transactions would be disseminated in the network at a faster rate. Without some sort of regulation, however, the network would be flooded.

    [0084] The method herein, therefore, uses a mechanism for the selective relay of data or objects from a node to a peer node to avoid the transmission of a huge amount of unnecessary transactions. The mechanism can be a probabilistic model.

    [0085] The mechanism, or probabilistic model, for transaction relay is based on assumptions, with reference to FIG. 2 in which three nodes i, j and k are part of a Bitcoin network and connected together. Node i is directly connected to Node j and Node k. Nodes j and k are indirectly connected via Node i or via the Bitcoin network.

    [0086] Node i is shown having two interfaces a and b that process data in the form of a transaction r, which is initiated at Node i and experiences various stages as it is propagated between nodes and across the Bitcoin network, said stages including: [0087] a first stage r.sub.1 processes the transaction for transmission from interface a, to a peer node and is received at Node j, [0088] a second stage r.sub.2 Node j processes the transaction for transmission across the Bitcoin network is received at Node k, and [0089] a third stage r.sub.3 Node k processes the transaction for transmission to peer Node I, which receives the transaction at interface b.

    [0090] For the avoidance of doubt, r.sub.1, r.sub.2 and r.sub.3 are the same transaction and the suffixes represent a relay.

    [0091] The assumptions below are made: [0092] If a node receives from an interface (b) the same transaction which was processed for transmission from another interface (a), then the two interfaces share a degree of correlation. [0093] If a transaction generated at node i reaches node j through a given j's input interface, then a second transaction generated at node i will reach j through the same interface with high probability.

    [0094] Referring again to FIG. 2, as an example of relay correlation, Nodes j and k are peers of Node i. If i generates a new transaction and relays it to node j, and later the same transactions is received at Node i from Node k, then j and k share a logical path through the network. Therefore, the originating node does not need to relay the same information to both its peer nodes. To be clear, Node i does not need to transmit transaction r to Node k because the probability of Node k receiving said transaction is high. Similarly, in reverse, Node k does not need to transmit transaction r to Node i because the probability of receiving said transaction is high. The logic is valid in both directions.

    [0095] Relationships Between Interfaces of a Node

    [0096] FIG. 3 is an example of a local correlation matrix C that can be determined for a Node i having five interfaces—a, b, c, d and e—connected to peer nodes. The matrix represents the degree of correlation for the incoming traffic. The formation and application of said matrix is described below. The values shown in FIG. 3 are for illustration purposes.

    [0097] Each node establishes a correlation matrix C by determining, by way of example, coefficients c.sub.ab representing the correlation between the transactions received from interfaces a and b. Such a coefficient is determined between all pairs of the interfaces.

    [0098] Using the list of transaction IDs received from each interface, let t.sub.a and t.sub.b be the number of transactions received from interfaces a and b respectively, and t.sub.ab be the number of duplicate transactions received from both a and b. The correlation coefficient for interfaces a and b is defined by Formula 1:

    [00004] c a .Math. b = t a .Math. b max .Math. { t a , t b } 1 ( Formula .Math. .Math. 1 )

    [0099] As a convention, for a coefficient c.sub.ab we assume a lexicographical order a<b of its indexes i.e. a is assigned ‘0’ (a=0), b is assigned ‘1’ (b=1) and c is assigned ‘2’ (c=2), and so on between interface IDs and numerical values. Given the correlation coefficients for a set of interfaces {a, b, d, e}, then interfaces a and b are more correlated than interfaces d and e if c.sub.ab>c.sub.de.

    [0100] Since the elements on the main diagonal are not significant, the matrix size can be reduced to m(m−1) elements, as shown in FIG. 3.

    [0101] Assigning a Value to a Node Interface

    [0102] An overall correlation index of an interface a is defined in Formula 2, as follows:

    [00005] c a = .Math. i = 0 a - 1 .Math. .Math. c ia + .Math. i = a + 1 m - 1 .Math. .Math. c ai ( Formula .Math. .Math. 2 )

    [0103] Using the correlation matrix in FIG. 3 as an example, the correlation index c.sub.a for interface a can be expressed as the sum of the correlation coefficients between each interface.


    c.sub.a=C.sub.ab+C.sub.ac+C.sub.ad+c.sub.ae=0.2+0.8+0.2+0.2=1.4

    [0104] This metric can be used to rank the correlation of the individual interfaces and understand the quality of relay provided by the node peers. By way of example, if the correlation index of a is significantly higher than the average value, then the transactions received from a are highly redundant.

    [0105] Conversely, if a correlation index of a is significantly lower than the average value, then either (i) the transactions received from a are somewhat unique or (ii) the peer node connected to a is behaving maliciously. Malicious behaviour is discussed in more detail below.

    [0106] It is to be noted that each node builds its own correlation matrix based only on the received transaction flows. No correlation information is exchanged between nodes or peers to avoid the propagation of malicious information.

    [0107] Indicator

    [0108] Given an incoming transaction from interface a, the node will perform the relay to a number of peers m* in the range [m.sub.min, m.sub.max].

    [0109] The current value m* depends on the current distribution of the m−1 correlation coefficients {c.sub.a} containing connection index a:


    {c.sub.a}=[c.sub.0a,c.sub.1a, . . . c.sub.am-1]

    [0110] In other words, {c.sub.a} is the list, or set, of correlation coefficients—this set having the coefficients for interface a. The connection index is a summation of the coefficients, i.e. c.sub.a=c.sub.ab+c.sub.ac+c.sub.ad+c.sub.ae, as per FIG. 3 and the example above.

    [0111] The number of interfaces m*(a) selected for relay from interface a depends upon the computation of a metric θ.sub.i.sup.(a) for each element of the set {c.sub.a}, wherein:

    [00006] m * ( a ) = .Math. i = 0 m - 1 .Math. θ i ( a )

    [0112] The metric θ.sub.i.sup.(a) functions as a switch, or selector, and indicates whether an interface will relay data, or not. For example, an interface will relay data if its metric is ‘1’ and not relay data if its metric is ‘0’.

    [0113] By defining c.sub.a as the average value of the coefficients within the set {c.sub.a}, the metric ƒ.sub.i.sup.(a) contributes to m*(a) if the corresponding correlation coefficient c.sub.ai is lower than c.sub.a:

    [00007] θ i ( a ) = { 1 , c ai c a _ 0 c ai > c a _

    [0114] θ.sub.i.sup.(a) may alternatively be based on the median value of {c.sub.a} rather than the average value. The metric θ.sub.i.sup.(a) can also be based upon, or determined from, alternative values derived from the set {c.sub.a} using statistical analysis, e.g. the metric ƒ.sub.i.sup.(a) contributes to m*(a) if the corresponding correlation coefficient c.sub.ai is at least one standard deviation beneath the mean of the set {c.sub.a}.

    [0115] An incoming transaction, by way of example from interface a in FIG. 3, will be relayed to the m*(a) least correlated interfaces according to the correlation coefficients in {c.sub.a}. A subset of {c.sub.a} selected for relay will be defined as {c*a}.

    [0116] Returning to the example of FIG. 3, if {c.sub.a}={0.2, 0.8, 0.2, 0.2}, then,


    c.sub.a=0.35


    θ.sub.b.sup.(a)=1


    θ.sub.c.sup.(a)=0


    θ.sub.d.sup.(a)=1


    θ.sub.e.sup.(a)=1

    [0117] Therefore, the number of interfaces m*(a) selected for relay from interface a is ‘3’. The coefficients in the subset falling below the average value, or indicator, which determine the metric θ.sub.i.sup.(a), are {0.2, 0.2, 0.2}, said coefficients corresponding to {c*.sub.a} and interfaces b, d, e, are selected for relay.

    [0118] The ‘cut-off” point, or level of coefficient that was used to determine the metric in the example above, was the average of the coefficients—‘0.35’. This threshold, or indicator, determines the metric θ.sub.i.sup.(a) from which m*(a) is determined.

    [0119] Overall, therefore, the method can be said to relay data from an interface according to an indicator. Said indicator can be used to determine a metric for each interface that determines whether it will relay data, or not.

    [0120] While the indicator has been used to determine the metric in the example above, other factors can influence the metric for each interface. The metric for each interface can be calculated according to changes to the correlation matrix.

    [0121] Accommodating Changes

    [0122] The determination of which interfaces are to be used to relay information, as described above, assumes a steady-state condition in which the matrix is unchanged.

    [0123] Further details of the protocol are now provided below for an explanation of how the protocol can accommodate change—for example when peer nodes join the network and make new connections with a node, or when peer nodes leave the network and are no longer connected to an interface of a node.

    [0124] When accommodating change, data can be relayed based on the metric, which takes in to account the indicator and, ergo, the correlation index.

    [0125] Start-Up

    [0126] When a node i boots up, it initializes m peer connections with other nodes in the blockchain, such as the Bitcoin network. See, for example, the Bitcoin developer reference [4] for full details. At the initial stage, node i does not have any information about the correlation between data passing through its interfaces and has a limited set of data representing data passing or flowing therethrough. Therefore, full transaction relay will be performed for an amount of time.

    [0127] During this period T.sub.bootup the metric θ.sub.i.sup.(a) contributes to m*(a) by setting each interface to ‘1’ and relaying data from all interfaces. The metric, therefore, allows data to be relayed irrespective of the indicator for a period of time.

    [0128] The length of period T.sub.bootup is a function ƒ(m) of m, i.e. a higher number of connections requires a longer amount of time to build an accurate correlation matrix. The following functions are proposed, by way of example only:


    ƒ.sub.1(m):=m ƒ.sub.2(m):=m.sup.2

    [0129] After T.sub.bootup, node i performs the selective relay as per the existing nodes presented below.

    [0130] Existing Nodes

    [0131] The connections of a generic node j will vary with time in light of a change, such changes because of other nodes either (i) joining, (ii) leaving the network, and/or (iii) behaving maliciously, said malicious nodes selectively blacklisted and the corresponding connections closed.

    [0132] Therefore, any change in the whole network graph, can be detected and parametrized by a quantity T.sub.change representing the average time between two (or more) consecutive change events. Once every T.sub.change, node j's local correlation matrix needs to be updated.

    [0133] Updates to the correlation matrix of a node can include at least one of: [0134] A periodic reset, wherein the correlation matrix is reset. [0135] As per new nodes, full transaction relay will be performed for an amount of time T.sub.bootup. Then, node j performs the selective relay on its new m* values for each interface connected to a peer node. [0136] An update, wherein the correlation matrix is updated. [0137] For a given interface a, the β highest correlated interfaces, 0<β<m.sub.min, from the selected set {c*.sub.a} will be swapped with the β least correlated interfaces not in {c*.sub.a}. [0138] In other words, while data is relayed, normally, to the m*(a) least correlated interfaces in the subset {c*.sub.a}, when the matrix is updated interfaces that are not part of the subset {c*.sub.a} and have low correlation indexes will be added to said subset in place of those having a high correlation index within said subset. This can ensure the integrity of the flow of data across the network in time of change. [0139] Returning again to the example of FIG. 3, normally,


    c.sub.d=c.sub.da+c.sub.db+c.sub.dc+c.sub.de=0.2+0.4+0.4+0.6=1.6


    {c.sub.d}={0.2,0.4,0.4,0.6}

    [0140] Taking c.sub.d as the average value of the coefficients within the set {c.sub.d}, the metric ƒ.sub.i.sup.(d) contributes to m*(d) if the corresponding correlation coefficient c.sub.di is lower than c.sub.d:

    [00008] θ i ( a ) = { 1 , c ai c a _ 0 c ai > c a _ .Math. .Math. Then , .Math. c d _ = 0.4 .Math. .Math. θ a ( d ) = 1 .Math. .Math. θ b ( d ) = 1 .Math. .Math. θ c ( d ) = 1 .Math. .Math. θ e ( d ) = 0 [0141] Normally, the number of interfaces m*(d) selected for relay from interface d is ‘3’. The coefficients in the subset falling below the average value, or indicator, which determine the metric θ.sub.i.sup.(d), are {0.2, 0.4, 0.4}, said coefficient corresponding to {c*.sub.d} and interfaces a, b and c are selected for relay. For the avoidance of doubt, the coefficient corresponding to interface e i.e. {0.6} is not within the selected subset {c*.sub.d}. [0142] When an update occurs, in relation to interface d, it happens that there are two β highest correlated interfaces within the set {c*.sub.d}—these being interfaces b and c, both with values of 0.4. [0143] There is only one interface that qualifies as the β least correlated interfaces not in {c*.sub.d}, said interface being e. Therefore, only one ‘swap’ can be made and the choice of which interface to swap with is either b or c, because both have the same coefficient of 0.4. By way of example, the selection of which interface is selected for a swap can be made according to the lexicographic priority. This interface is then swapped with the β least correlated interfaces not in {c*.sub.a}, which means the lowest coefficients will be selected from the remaining coefficients from the set, {c.sub.d}={0.4, 0.4, 0.6}. The lowest coefficients correspond to interfaces b and c, and, therefore data will be relayed to peer nodes connected to these interfaces rather than interface a.

    [0144] If a peer disconnects, its coefficients in the correlation matrix will become invalid. At the end of the T.sub.change period, the above mentioned periodical reset is required. Then, node j performs the selective relay on its new m* values for each connection interface.

    [0145] If a new peer b joins, the interface to which it is connected will be chosen for relay regardless of the current values of {c*.sub.a} for each other peer interface with node a. By way of example, the interface with b can be randomly selected for relay every y incoming transactions for an amount of time T.sub.join.

    [0146] T.sub.join is set according to T.sub.bootup and/or T.sub.change periods. This relay will help node b to build its own correlation matrix. At the end of T.sub.join, b will be selected for relay according to the updated values {c*.sub.a} for each other peer interface at node a only.

    [0147] Node i may require to check if its peer j is still alive because no incoming traffic was received from its interface. A temporary relay request for an arbitrary small amount of time T.sub.temp can be sent to j.

    [0148] Nodes who publish a new transaction, i.e. originating nodes, need to select the sets of selected peers for relay carefully to guarantee the dissemination in the network. For instance, a number of nodes m** for any interface a can be selected as first relay, with m*(a)<m**<m.

    [0149] The whole list of parameters is detailed in Table 1, below.

    TABLE-US-00001 TABLE 1 Parameter Description m Number of peers of a Bitcoin node (vertex degree). m.sub.min Minimum number of selected peers for relay. m.sub.max Maximum number of selected peers for relay. m*(a) Current number of selected peers for relay of transactions coming from interface a. m** Number of selected peers for relay of transactions at the originating node. {C.sub.a} List of peers' correlation coefficients of transactions coming from interface a. {C*.sub.a} List of selected peers' correlation coefficients for relay of transactions coming from interface a. c.sub.a Average value of {C.sub.a}. c.sub.ab Correlation coefficient for interfaces a and b. t.sub.a Number of transactions received from interface a. t.sub.ab Number of duplicate transactions received from both interfaces a and b. T.sub.bootup Time required for a new node to build a stable correlation matrix. T.sub.change Time window between two consecutive local correlation matrix updates. T.sub.join Time window for dedicated relay to a new peer. T.sub.temp Time window for temporary relay to a peer. β Number of interfaces selected for soft update. γ Fraction of the number of incoming transactions relayed to a new peer.

    [0150] Malicious Nodes

    [0151] A malicious nodes aims to make the probabilistic model for transaction dissemination less efficient. A malicious node can function or act in any of the following ways. [0152] A malicious node does not propagate the transactions that are supposed to be propagated. Honest nodes connected to this malicious node may be able to retrieve these transactions from other honest (or malicious) peers. However, a complete dissemination of a transaction in the network is not required, assuming that a set of miners can still receive and include them in a new mined block. [0153] A malicious node propagates the same legit transaction multiple times. The receiving node is able to keep track of the previously received transactions by means of a look-up table. Therefore, misbehaviour is easily detectable and malicious peers will be removed. [0154] A malicious node propagates invalid transactions. If the receiving node performs transaction validation, misbehaviour is easily detectable and malicious peers will be removed. [0155] A malicious node generates and propagates a huge number of dummy transactions. Receiver nodes can respond differently to a flood of valid incoming transactions from a peer. In response, (i) the receiver will ask the transmitter to reduce the transaction relay rate. If the problem persists, the transmitter peer will be removed, (ii) the receiver can ask (and check) for a minimum transaction fee on the relayed transactions. This makes the attack expensive for a malicious node. If the minimum transaction fee is not respected, the transmitter peer will be removed, and/or (iii) the transmitter peer is simply removed.

    [0156] Based on the previously mentioned attacks and the probabilistic model, the following properties can be inferred. [0157] Transaction relay rate depends on both bandwidth availability and processing performance of the individual nodes. For this reason, a default maximum rate is not enforced. [0158] Malicious nodes cannot simply stay silent. They must forward valid transactions to keep their connections alive.

    [0159] Messages [0160] To support the implementation of the method herein, which encompasses a probabilistic relay of data or objects from node interfaces on the Bitcoin network, new message types can be introduced. Moreover, some of the current message types detailed above in relation to the current peer-to-peer Bitcoin protocol are no longer used, while those not mentioned below are unchanged with respect to the Bitcoin P2P protocol.

    [0161] Data Messages

    [0162] The message GETDATA to request one or more data objects from another node is no longer used, or removed from use. Similarly, message INV that transmit one or more inventories of objects known to the transmitting peer is no longer used, or removed from use. The NOTFOUND message as reply to GETDATA is no longer used, or removed from use.

    [0163] The MEMPOOL message, already introduced above, requests the IDs of transactions that the receiving node has verified as valid but were not published in a block, i.e. transactions which are in the receiving node's memory pool. The response to this message is one or more INV messages containing the transaction IDs. A node sends as many INV messages as needed to reference its complete memory pool. A node may disable this feature, therefore transmission of its local MEMPOOL is not required.

    [0164] Control Messages

    [0165] PING and PONG messages to confirm that a peer is still connected are removed. Silent nodes, such as malicious nodes, are managed as described above. Message FEEFILTER is kept unchanged. However, peers who do not respect the fees threshold will be removed.

    [0166] New messages TEMP and FLOW are introduced. [0167] The TEMP message is used when a node is required to check if a peer is still alive because no incoming traffic was received from its interface. A temporary relay request TEMP (for an arbitrary small amount of time T.sub.temp) is sent to the peer connected to said interface. If the receiving peer starts relaying but does not respect the given time window T.sub.temp, it will be removed from the matrix. [0168] The FLOW message is used when a node requires a peer to modify the transaction relay rate (e.g. number of transactions per second) according to the local flow control. The rate may increase, decrease or temporarily be suspended (relay rate=0). If the receiving peer does not respect the new relay rate, it will be removed. A new FLOW request will be required to modify the current relay rate. If the receiving peer does not receive a second FLOW request to resume the relay, it will remove the sender. [0169] The MINFEE message is used when a node is required to receive transactions filtered by a minimum transaction fee. The minimum value can be set for individual transactions and/or total amount of fees in case of multiple transactions fit in a single IP packet. If the receiving peer does not respect the transaction fee limit, it will be removed.

    [0170] Dimensioning

    [0171] FIG. 4 illustrates an element of a blockchain network having nodes i, j, k and l. Nodes can have different traffic profiles and can be categorised and referred to, by way of example, as: [0172] “Providers”, which are nodes that primarily generate transactions and node i may represent one or more merchant nodes with dedicated business activities [0173] “Aggregators”, which are nodes that primarily collect transactions, and node I may represent one or more merchant nodes with dedicated business activities. [0174] “Routers”, which are nodes such as node j and node k that primarily route transactions.

    [0175] Each node is able to automatically control the amount of in/out traffic depending on the current status of its connections and the bandwidth resources locally available. The allocation of bandwidth between input and output resources or interfaces can be adjusted according to the type or function of the node. Nodes i and j will be unbalanced, while node j and node k will be balanced because the same amount of data flowing in to a Router node will flow out of said node.

    [0176] A node, for example Node j, has a plurality of interfaces connected to peer nodes. The node can determine its bandwidth (B) capacity based on a maximum amount of data (D) amount processable by the node over a time period (T). Some nodes have bandwidth caps imposed by their network providers.

    [0177] B.sub.j is the application-level bandwidth available at node j. The bandwidth B.sub.j is an overall budget which depends on both (i) the quality of the network access provided by the Internet Service Provider (ISP) and (ii) the network usage by other local applications.

    [0178] The value of B.sub.j depends on the different bandwidth requirements for the input B.sub.j.sup.j and output traffic B.sub.j.sup.o. Therefore,


    B.sub.j=B.sub.j.sup.i+B.sub.j.sup.o

    [0179] For Routers it is assumed that

    [00009] B j 2 = B j i = B j o

    [0180] For Aggregators it is assumed that


    B.sub.j.sup.i=k.sub.jB.sub.j.sup.o

    [0181] k.sub.j is a profile factor of the node and is determined, by way of example, from the type of node i.e. nominally set, or from the difference between the actual data passing in to and out of the node. Said profile factor k.sub.j is preferably determined from the ratio of incoming data to the node and data transmitted from the node, which can include relayed data and/or locally generated data.

    [0182] Data (d.sub.j) can monitored and established from the input data (d.sub.j.sup.j) and output data (d.sub.j.sup.o) passing through interfaces of the node j to and from peer nodes, and a profile factor (k.sub.j) of the node j can be determined from the difference between the input data (d.sub.j.sup.j) to output data (d.sub.j.sup.o)

    [0183] Data (d.sub.j) can monitored over a plurality of time periods (T). The values of data monitored can be reset to zero when a new period T begins. Data generated within node j contributes to output data (d.sub.j.sup.o).

    [0184] The time period (T) can be used to determine the maximum amount of data (D.sub.j) processable by node j, such that


    D.sub.j=D.sub.j.sup.i+D.sub.j.sup.o, [0185] wherein D.sub.j.sup.j and D.sub.j.sup.o are the maximum amount of incoming and outgoing data supported within the time period.

    [0186] The time period T can be a flow control time window T.sub.flow. Considering the variables at node j then,

    [00010] B j = ( k j + 1 ) .Math. D j o T B j o = D j o T

    [0187] Therefore, D.sub.j.sup.j and D.sub.j.sup.o values are system parameters which depend on k.sub.j, T.sub.flow and B.sub.j.

    [00011] D j o = B j .Math. T ( k j + 1 ) D j i = k j .Math. D j o = k j .Math. B j .Math. T ( k j + 1 )

    [0188] Current Factors

    [0189] For optimising the performance of a blockchain network Bitcoin is used by way of example. The specifications of the Bitcoin core allow up to 125 connections to different peers, 8 of which are outbound, by default. Therefore, it will be possible to have at most 117 inbound connections.

    [0190] The number m of peers P connectable to a node is limited thus m.sub.min is 1 and m.sub.max is 125 for the current system. Without current restrictions, the number m of peers P connectable to a node is thus m.sub.min≥1 and m.sub.max≤P.

    [0191] The minimum number of peer nodes (m.sub.min) and a maximum number of peer nodes (m.sub.max) connectable to the node j can be set according to said monitored data (d) and the maximum number of peers (P) connectable to the node j.

    [0192] Over two or more time periods (T) the monitored data (d) can be used to determine a ratio and set the minimum number of peer nodes (m.sub.min) and a maximum number of peer nodes (m.sub.max) connectable to the node j using said ratio. The ratio can be between actual data and the data limit. Other relationships, such as a non-linear relationship between data (d) to maximum data (D), can be used.

    [0193] In addition to the relationship between the data (d) and data limit (D), the type or function of the node j can be factored in to the determination of the number of peer nodes that a node j can connect to. The function is taken in to account using a profile factor (k) that is assigned a value according to the function of the node. The profile factor can be variable.

    [0194] By way of example, the profile factor value is one (1) for nodes configured to route data, greater than one (1) for nodes that collect or aggregate data, and less than one (1) for nodes that generate or provide data.

    [0195] To mitigate fluctuations, or rogue data, averaged data (d*) comprising an average of data (d) over the plurality of time periods (T) can be used to determine the minimum and maximum number of peer nodes connectable to the node j by determining the ratio of averaged data (d*) to the maximum data (D).

    [0196] Further still, a buffer can be factored in to the determination of the minimum number of peer nodes (m.sub.min) and a maximum number of peer nodes (m.sub.max) connectable to the node. Such a buffer can include using a standard deviation (σ) or variance (σ.sup.2) of the data (d) over a plurality of time periods (T), applying a scaling factor to the variance, and determining the minimum and maximum number of peer nodes connectable to the node according to the scaled variance. The scaling factor can be a variance parameter t.

    [0197] Monitored data (d.sub.j) is a sum of the input data (d.sub.j.sup.j) and output data (d.sub.j.sup.o) passing through interfaces of the node j to and from peer nodes. It follows that the minimum and maximum number of peer nodes m connectable to a node j is set by:

    [00012] m min = [ d j * D j i + D j o .Math. P - t .Math. σ j 2 ] m max = [ d j * D j i + D j o .Math. P + t .Math. σ j 2 ] [0198] wherein [0199] m.sub.min and m.sub.max are integer numbers of peer node connections [0200] d.sub.j* is the averaged level of data (d) over a plurality of, [0201] D.sub.j.sup.j and D.sub.j.sup.o are the maximum amount of incoming and outgoing data supported within the time period (T), [0202] P is the maximum number of peers (P) connectable to the node, [0203] σ.sup.2 is the variance of said data (d) for a given number of time periods (T), [0204] t is a scaling factor or variance parameter, and [0205] m.sub.max≤P

    [0206] There can be a direct and proportional relationship between the linear range [1, P] and [m.sub.min, m.sub.max] values.

    [0207] The number of time periods (T), or T.sub.flow time windows, considered for averaging d.sub.j and updating t can be constant. The variance parameter t can be a multiplicative factor.

    [0208] By imposing m.sub.min≥1 and m.sub.max≤P in the previous equations then said equations can be rearranged to determine the range of values of the variance parameter t, which can be written as:

    [00013] d j * D j .Math. P - 1 σ j 2 t P ( 1 - d j * D j ) σ j 2

    [0209] During a node bootstrap, or initial set-up, m.sub.min=1 and m.sub.max=P.

    [0210] The operational range of variables P, d* and D.sub.j, is limited and, therefore, the operational range of t depending on σ.sub.j.sup.2 is limited. By way of example, given P=125 and d.sub.j*/D.sub.j=0.5, then

    [00014] 61.5 σ j 2 t = 6 .Math. 2 . 5 σ j 2

    [0211] The variance parameter can be determined using an approximation of the mean value of the range, representable by:

    [00015] t P - 1 2 .Math. σ j 2

    [0212] Once the m.sub.min and m.sub.max (integer numbers of peer node connections) are determined, the node can further determine a correlation matrix between the interfaces and peer nodes to which it is connected. The matrix is compiled with correlation coefficients representing the correlation between data processed at each interface of said node.

    [0213] Time Windows (T)

    [0214] The time period (T) is determined from the length of time between changes to the correlation matrix.

    [0215] The node can additionally or alternatively receive information related to the activity of nodes on the blockchain network and set the time period (T) according to the length of time between a change to the correlation matrix and/or the change in activity level on the network.

    [0216] The time window between two consecutive local correlation matrix updates can be used to determine T.

    [0217] The time T can be determined, or at least influenced by, the average number of active nodes was 7561.sup.1 (at June 2017), with an average increment of 0.85% each 24 hours. At these conditions, an initial value of T can be fixed at 60 mins considering that an average of 2.5 new nodes joins the network every hour. During normal relay operations of the node, T can be considered inversely proportional to the current cumulative amount of incoming data (d.sub.j.sup.j).

    [0218] When a new peer node is connected then data can be relayed to said new peer for a dedicated period of time T.sub.join. According to the state of the network this can be considered inversely proportional to the current cumulative amount of outgoing data d.sub.j.sup.o.

    [0219] T.sub.bootup is the time required for a new node to build a stable correlation matrix, this value is proportional to a function ƒ(m), i.e., a higher number of connections requires a longer amount of time to build an accurate correlation matrix.

    [0220] For the purpose of the invention, the time period T is the same as T.sub.flow, which is the time window between two consecutive updates of the cumulative amount of incoming and outgoing data. This value can be proportional to T.sub.change and in the simplest case, T.sub.flow=T.sub.change=T.

    [0221] In other words, the time period (T) can be a flow control parameter (T.sub.flow), and said parameter is adjusted to modify the traffic passing through the node. The time period (T) can be adjusted by the node to a predetermined value temporarily. This can be based on local changes to the node that influence the bandwidth and, subsequently the data flow therethrough. In these circumstances an increased time period, such as a period twice the length of the preceding period i.e. 2T can be used for a limited period of time. Similarly, the period T can be set for an initiation period, wherein the node accumulates data. Either a local change or an initiation period can result in temporary relay of data to all peer nodes for a period of time.

    [0222] It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be capable of designing many alternative embodiments without departing from the scope of the invention as defined by the appended claims. In the claims, any reference signs placed in parentheses shall not be construed as limiting the claims. The word “comprising” and “comprises”, and the like, does not exclude the presence of elements or steps other than those listed in any claim or the specification as a whole. In the present specification, “comprises” means “includes or consists of” and “comprising” means “including or consisting of”. The singular reference of an element does not exclude the plural reference of such elements and vice-versa. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a device claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.

    SUMMARY

    [0223] Today's Bitcoin network is heavily centred around mining, in terms of computational effort. With vastly increased volumes of transactions this is not necessarily going to be feasible. The solution described in this specification leads to adjustment or dimensioning of the number of peer connections according to the data to be processed and bandwidth limits. While a probabilistic relay improves the handling and propagation of massive amount of transactions it does not take in to account the performance capabilities or bandwidth available at each node.

    [0224] The invention can provide a method of dimensioning the number of connected peers and adjust the timing of actions taken by a node. Such timings include the reset or re-determination of the correlation matrix, boot-up time and the like.

    [0225] Therefore, in addition to adjusting the connectivity according to the performance to optimise flow across the network by avoiding bottlenecks, the invention compliments the dissemination of data packets by optimising the communications between nodes.

    [0226] Long queues caused by bottlenecks at nodes are inhibited by controlling the connectivity and, optionally, selectively relaying data packets according to the correlation between interfaces.

    [0227] The overall number of peer connections is managed and can be complimented with a reduction in packet transmission while keeping a safe level of information redundancy.

    REFERENCES

    [0228] [1] Bitcoin transaction levels are made available via Blockchain Luxembourg S.A.R.L. and retrievable from http.//blockchain.info. [0229] [2] VISA transaction levels are summarised in a document published by Visa Inc. Visa Inc at Glance. June 2015. It is retrievable from http.//usa_visa.com/dam/VCOM/download/corporat/media/visa-fact-sheet-Jun2015_pdf. [0230] [3] Decker, Christian and Wattenhofer, Roger (2013). Information propagation in the bitcoin network. IEEE Thirteenth International Conference on Peer-to-Peer Computing (P2P), 2013. [0231] [4] A Bitcoin Developer Reference is retrievable from http.//bitcoin.org/en/developer-reference.