METHOD OF PERFORMING A DISTRIBUTED TASK OVER A NETWORK
20220416944 · 2022-12-29
Inventors
CPC classification
H04W84/18
ELECTRICITY
H04L1/0078
ELECTRICITY
H03M13/3761
ELECTRICITY
H03M13/19
ELECTRICITY
H03M13/611
ELECTRICITY
H04L67/10
ELECTRICITY
International classification
H04L1/00
ELECTRICITY
H03M13/00
ELECTRICITY
H03M13/19
ELECTRICITY
Abstract
An aspect of the invention provides a method of performing a distributed task over a network comprising a plurality of nodes. The method comprises: a plurality of network nodes observing (300) data; applying a first linear code function to the data observed by at least one network node of the plurality of network nodes to obtain (302) at least one function output; applying errors (304) to the at least one function output; a query node selected from the network nodes performing (308) a mixing procedure to aggregate node observations to obtain a first set of aggregated values until a stopping criteria (306) is satisfied; applying (312) a second linear code function to the set of aggregated values to obtain a second set of aggregated values returned to their observed domain; and the query node outputting (314) the second set of aggregated values.
Claims
1. A method of performing a distributed task over a network comprising a plurality of nodes, the method comprising: a plurality of network nodes observing data; applying a first linear code function to the data observed by at least one network node of the plurality of network nodes to obtain at least one function output; applying errors to the at least one function output by the at least one network node of the plurality of network nodes to obtain encoded node observations with errors; a query node selected from the network nodes performing a mixing procedure to aggregate encoded node observations with errors to obtain a first set of aggregated values until a stopping criteria is satisfied; applying a second linear code function to the set of aggregated values to obtain a second set of aggregated values returned to their observed domain; and the query node outputting the second set of aggregated values.
2. The method of claim 1 wherein the stopping criteria is satisfied when the query node has aggregated encoded node observations with errors from a maximum number of nodes.
3. The method of claim 2 further comprising determining the maximum number of nodes at least partly from a Hamming distance.
4. The method of claim 2 further comprising determining the maximum number of nodes at least partly by detecting a threshold number of symbol errors in the aggregated encoded node observations with errors.
5. The method of claim 1 further comprising performing a routing procedure to ensure that the aggregated encoded node observations do not include multiple observations with errors from a single node.
6. The method of claim 5 wherein the routing procedure includes a tree protocol.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0025] Preferred forms of the method of performing a distributed task will now be described by way of example only with reference to the accompanying figures in which:
[0026]
[0027]
[0028]
[0029]
[0030]
[0031]
DETAILED DESCRIPTION
[0032]
[0033] Nodes within the network 100, or WSN, are configured to communicate with each other using their respective two-way communication systems.
[0034] Those nodes that are equipped with a sensor each hold observation data. When a user wishes to initiate a distributed task, the user selects a node that will henceforth be considered a query node for the distributed task. For both practical and privacy reasons, the query node spreads the task to a subset of nearby nodes.
[0035] In an embodiment a distributed task includes query node 108. The query node 108 is shown in the accompanying figures.
[0036] In the notation below the set of nodes forming the WSN 100 is referred to as 𝒱, and the set of nodes forming the subnetwork 120 is referred to as 𝒱_q, where 𝒱_q ⊂ 𝒱.
[0037] As will be further described below, the subnetwork 120 collaboratively solves a distributed task while also limiting the ability of nodes to join the distributed task that are not within the subnetwork 120. Examples of nodes outside the subnetwork 120 include nodes 102, 104 and 106.
[0038] Practically, a task subnet 120 that is smaller than network 100 allows more efficient computation, since information is not required to propagate through the entire public network 100. For privacy purposes, this reduced information travel distance means that expected levels of privacy are more easily retained by excluding nodes that are very distant from the query node.
[0039]
[0040] In an embodiment the sensor 202 is configured to perform a specific signal processing task. One example of a task performed by the sensor 202 is to monitor an acoustic signal.
[0041] It will be appreciated that the sensor 202 is one example of a device configured for data acquisition. In other embodiments, data is computed by data centre(s) and/or computer(s) in locations remote from the query node 108. In such cases the network 100 processes the data obtained from the data centre(s) and/or computer(s) as an alternative to the sensor 202 obtaining the data.
[0042] The receiver 204 is configured to receive data from other nodes within the network 100. In an embodiment the receiver 204 is configured to receive data from other nodes within the network 100 that are physically close to node 108. In an embodiment the nodes physically close to node 108 are referred to as a neighbourhood of nodes in relation to node 108. In an embodiment the neighbourhood of nodes includes nodes selected from network 100 and/or subnetwork 120.
[0043] The transmitter 206 is configured to transmit data to other nodes within the network 100. The receiver 204 and transmitter 206 are shown as separate modules for clarity. It will be appreciated that receiver 204 and transmitter 206 could be provided as either separate modules or a single combined module.
[0044] In an embodiment the output module 208 is configured to display or otherwise output the result of a query assigned to query node 108.
[0045] The query node 108 includes message quantizer 210. In an embodiment the message quantizer 210 is configured to receive observed data from the sensor 202. The observed data is quantized to produce messages m_i ∀i ∈ 𝒱_q.
[0046] An encoder 212 receives the messages from the message quantizer 210. The encoder 212 encodes the messages using a linear code to produce codewords c_i ∀i ∈ 𝒱_q.
[0047] An error engine 214 receives the codewords from the encoder 212 and applies random symbol errors. For example, the error indices may be drawn from U(n, λ), a λ-dimensional discrete error vector with, for each dimension, integers drawn uniformly from {0, . . . , n−1}.
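The behaviour of the error engine can be sketched as follows. This is a minimal illustration under assumed parameters (the function names, the use of Python's `random` module, and the choice of overwriting corrupted symbols with fresh uniform field values are all assumptions for illustration, not details from the specification):

```python
import random

def draw_error_indices(n, lam, rng):
    # e_i = a ~ U(n, lam): lam indices, each drawn uniformly from {0, ..., n-1}.
    return [rng.randrange(n) for _ in range(lam)]

def apply_symbol_errors(codeword, indices, p, rng):
    # Overwrite the symbols at the drawn indices with fresh uniform field values,
    # leaving all other codeword symbols untouched.
    corrupted = list(codeword)
    for i in indices:
        corrupted[i] = rng.randrange(p)
    return corrupted
```

Because the indices are sampled with replacement, at most λ symbols differ from the original codeword after corruption.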
[0048] An aggregator 218 receives codewords containing errors from the error engine 214.
[0049] Aggregator 218 also receives codewords from receiver 204, which is configured to receive data transmitted from other nodes in the network 100, for example node 110 and node 102.
[0050] In an embodiment aggregator 218 is configured to perform a mixing procedure taking as input the codewords obtained from the observations of sensor 202 and codewords received from other nodes in network 100 by receiver 204.
[0051] Aggregator 218 tests for at least one stopping criteria. In an embodiment there is only feed-forward flow of data towards the query node 108. In such cases the at least one stopping criteria includes a determination that the query node has received a response to the distributed task, such as the completion of an aggregation of values over a tree of nodes with query node 108 at its root.
[0052] In an embodiment the at least one stopping criteria is based at least partly on the number of errors detected in the codewords. While the number of errors remains below a threshold the codewords are passed to transmitter 206 for transmission to other nodes in network 100.
[0053] It will be appreciated that the number of errors detected by aggregator 218 increases as other nodes in network 100 introduce additional errors using their own error engines.
[0054] Query node 108 uses output module 208 to output an estimate from aggregator 218.
[0055] Decoder 222 receives codewords from aggregator 218. Decoder 222 decodes the codewords to produce messages.
[0056] Message dequantizer 224 receives the messages from decoder 222. Message dequantizer 224 dequantizes the messages to produce data. The data is received by the output module 208 and presented to a user as an output of a task or query.
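The quantizer/dequantizer pair described above can be sketched as a simple scale-round-reduce mapping into a prime field and back. The field size, the scale factor, and the convention of reading field values above p/2 as negative reals are all illustrative assumptions, not parameters given in the specification:

```python
P = 101       # assumed prime field size, illustrative only
SCALE = 10.0  # assumed quantization scale, illustrative only

def quantize(u):
    # Q: scale each real value, round to an integer, and reduce into Z_P.
    return [round(x * SCALE) % P for x in u]

def dequantize(m):
    # R: map field values back to reals, reading values above P/2 as negative.
    return [((v - P) if v > P // 2 else v) / SCALE for v in m]
```

With these choices, a round trip returns the observed values exactly whenever they are multiples of 1/SCALE and small enough to fit in the field.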
[0057] In an embodiment the message dequantizer 224 and/or the output module 208 are present only in the query node 108. In an embodiment the message dequantizer 224 and/or the output module 208 are present in other nodes in the neighbourhood of nodes, but are configured to operate only when the node they are associated with is a query node.
[0058] In an embodiment message dequantizer 224 and/or output module 208 are present in a special class of nodes. In an embodiment, the network comprises nodes where each node can function as a query node, but where message dequantizer 224 and output module 208 are active only in the query node. In an embodiment, only a subset of the nodes of network 100 can function as a query node.
[0059]
[0060] In an embodiment, network 120 is defined for the processing task. Once network 120 is set up, the processing task is typically performed in a distributed manner without central co-ordination.
[0061] A formal notation for the method is set out below as:
Require: Task subnet 𝒱_q; message quantizer Q_{l,r}; message dequantizer R_{l,r}; linear encoder E_{l,n}^r; linear decoder D_{n,l}^r; code length n; number of errors λ
Symbol error index vectors e_i = a ~ U(n, λ) ∀i ∈ 𝒱_q
Nodes observe u_i ∈ ℝ^l ∀i ∈ 𝒱
Aggregation defined over task subnet (𝒱_q, ℰ_q) ⊂ (𝒱, ℰ)
Encode c_i^0 = E_{l,n}^r(Q_{l,r}(u_i)) ∈ ℤ_r^n ∀i ∈ 𝒱_q
Apply errors [c_i^0]_{e_i} ∀i ∈ 𝒱_q
k = 0
while STOPPING CRITERIA == FALSE do
    Determine current mixing matrix P^{k+1} ∈ ℤ_r^{|𝒱_q| × |𝒱_q|}
    c_i^{k+1} ← Σ_j [P^{k+1}]_{i,j} c_j^k ∀i ∈ 𝒱_q
    if ERRORS EACH ITERATION == TRUE then
        Apply errors [c_i^{k+1}]_{e_i} ∀i ∈ 𝒱_q
    end if
    k ← k + 1
end while
Decode u_i^k = R_{l,r}(D_{n,l}^r(c_i^k)) ∀i ∈ 𝒱_q
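The encode-corrupt-aggregate-decode pipeline of the procedure above can be sketched end to end with a deliberately simple stand-in for the linear code. Everything here is an illustrative assumption rather than the code construction used in the specification: a length-5 repetition code over an assumed field ℤ_101, with a majority-vote decoder playing the role of the second code function:

```python
from collections import Counter

P_FIELD = 101  # illustrative prime field, chosen large enough that sums do not wrap
REP = 5        # repetition factor; this repetition code has Hamming distance 5

def rep_encode(m):
    # First linear code function: repeat every message symbol REP times.
    return [s % P_FIELD for s in m for _ in range(REP)]

def apply_errors(c, indices):
    # Corrupt the chosen symbol indices (any wrong field value would do).
    out = list(c)
    for i in indices:
        out[i] = (out[i] + 1) % P_FIELD
    return out

def aggregate(codewords):
    # Mixing step: symbol-wise sum over the field. Because the code is linear,
    # the sum of valid codewords is the codeword of the summed messages.
    return [sum(col) % P_FIELD for col in zip(*codewords)]

def rep_decode(c):
    # Second linear code function: majority vote per block corrects the
    # injected symbol errors and returns values to the message domain.
    return [Counter(c[i:i + REP]).most_common(1)[0][0]
            for i in range(0, len(c), REP)]
```

For example, two nodes holding messages [4] and [9], each corrupting one symbol at a distinct index, still let the query node decode the correct aggregate [13] after summation, because only two of the five repeated symbols are wrong.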
[0062] Query node 108 observes 300 data using sensor 202. Other nodes in network 100 and/or network 120 also observe data. Query node 108 and the other nodes in network 100 comprise a plurality of network nodes observing data.
[0063] In an embodiment the observed data at each node i is quantized to produce messages m_i ∀i ∈ 𝒱_q.
[0064] These messages are encoded 302 using a linear code to produce codewords c_i ∀i ∈ 𝒱_q. In an embodiment the linear code is defined in advance. For example, a first linear code function may be applied to the data observed by at least one network node of the plurality of network nodes to obtain at least one function output.
[0065] After encoding, errors are applied 304 to each node independently. In an embodiment these errors comprise random symbol errors. In an embodiment, U(n, λ) is a λ-dimensional discrete multivariate uniform distribution with integers drawn from {0, 1, . . . , n−1} independently for each dimension. In an embodiment the errors are applied to the at least one function output obtained by the first linear code function.
[0066] In an embodiment a general mixing procedure is performed by aggregator 218, taking as input the codewords with errors.
[0067] In an example the method uses mixing matrices P^{k+1} ∈ ℤ_p^{n×n}. The update c_i^{k+1} ← Σ_j [P^{k+1}]_{i,j} c_j^k ∀i ∈ 𝒱_q is performed using integer arithmetic modulo p, guaranteeing no overflow.
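The modular mixing update can be sketched directly. This is a minimal illustration (the function name and the use of plain nested lists for the matrix and codewords are assumptions for clarity):

```python
def mix_step(codewords, P, p):
    # One mixing iteration: c_i^{k+1} = sum_j P[i][j] * c_j^k, with every
    # multiply-accumulate reduced modulo p, so values never overflow the field.
    n_nodes, n_sym = len(codewords), len(codewords[0])
    return [[sum(P[i][j] * codewords[j][s] for j in range(n_nodes)) % p
             for s in range(n_sym)]
            for i in range(n_nodes)]
```

With the identity matrix the codewords pass through unchanged; with any other row pattern a node's new codeword is the field-valued combination of its neighbours' codewords.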
[0068] As more observations are included in the mixture, the number of symbol errors in the current aggregate instance increases. In an embodiment, additional errors are applied 310 at the same indices e_i if so desired.
[0069] In an embodiment, the maximum number of nodes that may join a task before erroneous decoding occurs is determined by the Hamming distance d of the linear code used and/or by the number of symbol errors introduced at each node.
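The relationship between the Hamming distance and the number of joining nodes can be illustrated with a worst-case back-of-the-envelope bound. This is a hypothetical sketch, not a formula from the specification: it assumes each joining node corrupts λ symbols at distinct positions, so errors accumulate additively:

```python
def max_joining_nodes(d, lam):
    # A linear code with Hamming distance d corrects up to floor((d - 1) / 2)
    # symbol errors. If each joining node contributes lam worst-case (distinct)
    # symbol errors, decoding remains guaranteed only while the number of
    # joined nodes stays within this bound.
    return ((d - 1) // 2) // lam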
[0070] In an embodiment the value of d and/or the number of symbol errors introduced at each node is/are predefined.
[0071] Once a stopping criteria is met, the codeword resulting from the mixing performed by the aggregator 218 is decoded 312. An example formal notation of the decoding process is u.sub.i.sup.k=R.sub.l,r(D.sub.n,l.sup.r(c.sub.i.sup.k)) ∀i∈.sub.q.
[0072] In embodiment, decoding 312 includes applying a second linear code function to the set of aggregated values to obtain a second set of aggregated values returned to their observed domain. Query node 108 then outputs 314 an estimate of the private aggregation procedure. In an embodiment query node 108 outputs the second set of aggregated values.
[0073]
[0074] In many cases it may be desirable to perform a simple routed summation of node values over the query subnet 𝒱_q 120.
[0075] Some considerations come with performing a distributed summation. First, a routing procedure needs to be performed across the subnet so that messages are only included once in the summation as values are accumulated at the query node.
[0076] In an embodiment a tree protocol is used to remove certain subnet edges for this purpose, resulting in a tree graph rooted on the query node.
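One way to realise such a tree protocol is a breadth-first search from the query node, keeping only each node's first-discovered parent edge. This is an illustrative sketch (BFS is one common choice; the specification does not mandate a particular tree-construction algorithm):

```python
from collections import deque

def tree_edges(adj, root):
    # Breadth-first search from the query node: each node keeps exactly one
    # parent edge, so the surviving edge set forms a tree rooted at `root`
    # and every observation reaches the root along a unique path.
    parent = {root: None}
    queue = deque([root])
    edges = set()
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                edges.add((u, v))
                queue.append(v)
    return edges
```

On a four-node subnet containing a cycle, the result is the three edges of a spanning tree, so no node's value can be counted twice during accumulation.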
[0077] A formal notation for method 400 is set out below as:
Require: Task subnet 𝒱_q; prime field characteristic p; message quantizer Q_{l,p}; message dequantizer R_{l,p}; linear encoder E_{l,n}^p; linear decoder D_{n,l}^p; code length n; number of errors λ
Symbol error index vectors e_i = a ~ U(n, λ) ∀i ∈ 𝒱_q
Nodes observe u_i ∈ ℝ^l ∀i ∈ 𝒱
Summation defined over task subnet (𝒱_q, ℰ_q) ⊂ (𝒱, ℰ)
Construct tree edge set ℰ_q^0 ⊂ ℰ_q rooted at query node q
m_i = Q_{l,p}(u_i), with |𝒱_q| max(‖m_0‖_∞, . . . , ‖m_n‖_∞) ≤ p
Encode c_i^0 = E_{l,n}^p(m_i) ∈ ℤ_p^n ∀i ∈ 𝒱_q
Apply errors [c_i^0]_{e_i} ∀i ∈ 𝒱_q
k = 0
while ℰ_q^k ≠ ∅ do
    Define leaf nodes ℒ_q^k, leaf parents 𝒫_q^k, and leaf edges ℰ_{ℒ,q}^k = {(i, j) | j ∈ 𝒩_i ∀i ∈ ℒ_q^k} using ℰ_q^k
    [P^{k+1}]_{i,j} = 1 for i == j; [P^{k+1}]_{i,j} = 1 for j ∈ 𝒩_i ∩ ℒ_q^k ∀i ∈ 𝒫_q^k; [P^{k+1}]_{i,j} = 0 otherwise
    ℰ_q^{k+1} ← ℰ_q^k \ ℰ_{ℒ,q}^k
    c_i^{k+1} ← Σ_j [P^{k+1}]_{i,j} c_j^k ∀i ∈ 𝒱_q
    k ← k + 1
end while
u_q^Sum = R_{l,p}(D_{n,l}^p(c_q^k))
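The leaf-to-root accumulation at the heart of the summation procedure can be sketched as follows. This is a simplified illustration (it represents the routing tree as explicit parent-child pairs and omits quantization, encoding, and error injection, which the surrounding description covers separately):

```python
def distributed_sum(edges, codewords, root, p):
    # edges: set of (parent, child) pairs forming the routing tree rooted at
    # `root`; codewords: dict node -> encoded observation (list of field symbols).
    c = {i: list(v) for i, v in codewords.items()}
    remaining = set(edges)
    while remaining:
        parents = {a for a, _ in remaining}
        # Leaves are children that are no longer parents of any remaining edge.
        leaf_edges = [(a, b) for a, b in remaining if b not in parents]
        for par, leaf in leaf_edges:
            # Each leaf parent accumulates its leaf's codeword, symbol-wise
            # modulo p, then the consumed leaf edge is removed from the tree.
            c[par] = [(x + y) % p for x, y in zip(c[par], c[leaf])]
            remaining.discard((par, leaf))
    return c[root]
```

On a three-node chain 0 ← 1 ← 2 holding codewords [1], [2], and [3], two iterations shrink the tree to the root, which then holds the field sum [6].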
[0078] Method 400 describes a Distributed Private Summation (DPS) procedure for the specific case of prime fields r=p.
[0079] A query node 108 observes 402 data using sensor 202. Other nodes in the network 100 and/or network 120 also observe data. In an embodiment the observed data at each node i is quantized to map between ℝ^l and ℤ_p^l.
[0080] Sensor data is encoded 404 by linear encoder 212, which maps between ℤ_p^l and ℤ_p^n. A predefined code length n and a predefined number of symbol errors λ are also necessary.
[0081] Each node determines the number, λ, of codeword symbol indices that will be corrupted by error. These errors are applied 406 to the encoded sensor data.
[0082] In an embodiment a summation procedure is performed by aggregator 218 until a stopping criteria 408 is met. In an embodiment, a stopping criteria is satisfied or met when the tree has been reduced to consist of the query node only.
[0083] Nodes continue to observe signals u_i ∀i ∈ 𝒱 and receive data from other nodes. A summation task is defined over a task subnet 𝒱_q ⊂ 𝒱. In an embodiment the summation task includes determining 410 a sum of leaf nodes.
[0084] A set of edges ℰ_q^0 is determined that converts the general task graph into a tree graph rooted at the query node q. Initial codewords c_i^0 are computed by first quantizing and then encoding node observations u_i. In an embodiment, aggregator 218 is configured to iteratively sum through the tree, from the leaf nodes to the root.
[0085] At each iteration k the tree edges ℰ_q^k are used to define a leaf node set ℒ_q^k, a set of direct leaf parent nodes 𝒫_q^k, and/or a set of all edges connected to leaf nodes denoted ℰ_{ℒ,q}^k.
[0086] Each leaf parent stores the sum of its previous codeword and the codewords of all its leaf neighbours (defined as the intersection of the leaf parent's neighbours and the current leaf nodes) as c.sub.i.sup.k+1.
[0087] The current tree edge set is then updated by removing the current leaf edge set ℰ_{ℒ,q}^k from the previous tree edge set.
[0088] Once the stopping criteria is met, the final output at the query node is the decoded and dequantized codeword after summation termination. The codeword is decoded 412 and output 414.
[0089] A set of edges ℰ_q^0 is determined that converts a general task graph into a tree graph rooted at the query node q. The method 500 iteratively sums through the tree until a stopping criteria 502 is met. One example of a stopping criteria is a determination that there are no unsummed leaf nodes remaining.
[0090] While the stopping criteria remains unmet, at each iteration k the tree edges ℰ_q^k are used to define 504 tree parameters including a leaf node set ℒ_q^k, a set of direct leaf parent nodes 𝒫_q^k, and a set of all edges connected to leaf nodes denoted ℰ_{ℒ,q}^k.
[0091] Each leaf parent stores 506 the sum of its coded observed data and the codewords of all its leaf neighbours as c_i^{k+1}. This sum comprises the codewords of the nodes in the intersection of the leaf parent's neighbours and the current leaf nodes, together with the leaf parent's own coded observation.
[0092] The current tree edge set is then updated 508 by removing the current leaf edge set ℰ_{ℒ,q}^k from the previous tree edge set. The sums now present in each of the newly defined leaves, which are in the form of a codeword, are the result of method 500 for subsequent processing by method 400.
[0093]
[0094] A formal notation for the method is set out below as:
Require: Task subnet 𝒱_q; prime field characteristic p satisfying (7); message quantizer Q_{l,p}; message dequantizer R_{l,p}; linear encoder E_{l,n}^p; linear decoder D_{n,l}^p; code length n; number of errors λ
Symbol error index vectors e_i = a ~ U(n, λ) ∀i ∈ 𝒱_q
Nodes observe u_i ∈ ℝ^l ∀i ∈ 𝒱
Consensus defined over task subnet (𝒱_q, ℰ_q) ⊂ (𝒱, ℰ)
Determine consensus matrix P ∈ ℤ_p^{|𝒱_q| × |𝒱_q|}
m_i = Q_{l,p}(u_i), with |𝒱_q| max(‖m_0‖_∞, . . . , ‖m_n‖_∞) ≤ p
Encode c_i^0 = E_{l,n}^p(m_i) ∈ ℤ_p^n ∀i ∈ 𝒱_q
Apply errors [c_i^0]_{e_i} ∀i ∈ 𝒱_q
k = 0
while STOPPING CRITERIA == FALSE do
    c_i^{k+1} ← Σ_{j∈𝒩_i} [P]_{i,j} c_j^k ∀i ∈ 𝒱_q
    k ← k + 1
end while
m_i^Avg = D_{n,l}^p(c_i^k) ∀i ∈ 𝒱_q
u_i^Avg = mod(|𝒱_q| R_{l,p}(m_i^Avg), p) / |𝒱_q| ∀i ∈ 𝒱_q
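The final unscaling step of the consensus procedure, which maps the field-valued average back to a real average, can be sketched on its own. This is an illustrative reading of the final line of the notation above, assuming the consensus fixed point equals (1/n) times the message sum inside the field and that the true integer sum is smaller than p:

```python
def recover_average(field_avg, n_nodes, p):
    # Multiplying by n undoes the field inverse 1/n, the mod-p reduction then
    # recovers the true integer sum (valid while the sum is below p), and a
    # real division yields the average in the observed domain.
    return (field_avg * n_nodes % p) / n_nodes
```

For example, with p = 101 and four messages summing to 24, the in-field average is 24 · 4⁻¹ mod 101; feeding that value back through the recovery step returns the real average 6.0.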
[0095] The method 600 has the potential to allow distributed consensus to be performed in such a way that information travel through the network is limited.
[0096] A static mixing matrix P is determined 602. In an embodiment method 600 requires a predefined code length n and a predefined number of symbol errors λ. Since consensus is being performed over the finite field ℤ_p, particular care must be taken.
[0097] For example, the task subnetwork cardinality may not be an integer multiple of the field characteristic, i.e.: |𝒱_q| ≠ αp, α ∈ ℕ.
[0098] In practice, the above equation can be guaranteed by making the field size larger than the expected maximum task subnet cardinality, which may often be the case. Next, when determining 602 the mixing matrix P the entries of P may be determined by simultaneously satisfying the following equations:
P ∈ ℤ_p^{|𝒱_q| × |𝒱_q|},
[P]_{i,j} = 0 ∀(i, j) ∉ ℰ_q,
𝟙 = P𝟙, 𝟙^T = 𝟙^T P,
C_P(s) = s^{n−1}(s − 1),
where C_P(s) is the characteristic polynomial of matrix P, with indeterminates given by powers of s. The first three of the above equations are easy to satisfy, simply requiring P to share the sparsity pattern of the underlying physical network and be doubly stochastic. The requirement on the characteristic polynomial given by the fourth equation is less straightforward, and requires distributed computation of the determinant of P.
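The first three conditions lend themselves to a direct check. The sketch below is a real-valued illustration only (the field-valued construction and the characteristic-polynomial condition are deliberately omitted, and the function name and adjacency representation are assumptions):

```python
def satisfies_constraints(P, adj):
    # Checks that P shares the sparsity pattern of the network (off-graph
    # entries are zero) and is doubly stochastic (rows and columns sum to 1).
    # The characteristic-polynomial condition is not checked here.
    n = len(P)
    for i in range(n):
        for j in range(n):
            if i != j and j not in adj[i] and P[i][j] != 0:
                return False
    rows_ok = all(abs(sum(P[i]) - 1.0) < 1e-9 for i in range(n))
    cols_ok = all(abs(sum(P[i][j] for i in range(n)) - 1.0) < 1e-9
                  for j in range(n))
    return rows_ok and cols_ok
```

A uniform two-node averaging matrix passes the check, while a matrix whose columns do not sum to one fails it.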
[0099] Given that the above equations are satisfied, each node determines λ codeword symbol indices that will be corrupted by error for the consensus duration.
[0100] A query node 108 observes data using sensor 202. In an embodiment the observed data is quantized to map between ℝ^l and ℤ_p^l to produce messages.
[0101] These messages are encoded 606 to map between ℤ_p^l and ℤ_p^n.
[0102] Given these requirements, each node applies 608 errors by determining λ codeword symbol indices that will be corrupted by error for the consensus duration.
[0103] Nodes continue to observe signals u_i ∀i ∈ 𝒱, and a consensus task is defined over a task subnet 𝒱_q ⊂ 𝒱. Initial codewords c_i^0 are computed by first quantizing and then encoding node observations u_i.
[0104] At each stage of iterative consensus, each node i ∈ 𝒱_q applies 608 errors to its predetermined symbol error indices by sampling new symbol values from U(r, s). The notation [·]_{e_i} is used to select the elements of vector (·) at the indices contained in the vector e_i.
[0105] Each node shares its corrupted codewords with the local neighbourhood N.sub.i, and then takes a weighted average. In an embodiment, determining the weighted average includes performing 610 a mixing step. The mixing step 610 involves codeword mixing using the static mixing matrix P.
[0106] The dequantized local weighted averages are then quantized to give new codewords c.sub.i.sup.k+1∀i∈.sub.q.
[0107] When the iterative consensus terminates 612 at the stopping criterion, such as after a number of iterations or when local update changes fall below some value, a codeword c.sub.i.sup.k may be decoded 614 and then dequantized to give an estimate u.sub.i.sup.k of the subnet average. In an embodiment, all nodes in the task subnet arrive at the mixed subnet value. The result is then output 616.
[0108] The foregoing description of the invention includes preferred forms thereof. Modifications may be made thereto without departing from the scope of the invention, as defined by the accompanying claims.