Providing security against user collusion in data analytics using random group selection
11341269 · 2022-05-24
Assignee
Inventors
Cpc classification
H04L9/085
ELECTRICITY
H04L2209/46
ELECTRICITY
G06F21/6209
PHYSICS
G06F21/6254
PHYSICS
International classification
G06F21/62
PHYSICS
H04L9/00
ELECTRICITY
H04L9/08
ELECTRICITY
Abstract
Methods for secure random selection of t client devices from a set of N client devices and methods for secure computation of inputs of t client devices randomly selected from N client devices are described. Such random selection method may include determining an initial binary vector b of weight t by setting the first t bits to one: b.sub.i=1, 1≤i≤t, and all further bits to zero: b.sub.i=0, t<i≤N; each client device i (i=1, . . . , N) of the set of N client devices jointly generating a random binary vector b of weight t in an obfuscated domain on the basis of the initial binary vector b including: determining a position n in the binary vector; determining a random number r in {n, n+1, . . . N}; and, using the random number to swap binary values at positions n and r of the binary vector b.
Claims
1. A method for privacy-preserving computation based on inputs of t client devices, the t client devices being randomly selected from a set of N client devices, the method comprising: determining, for use by each client device of the set of N client devices, an initial binary vector v of weight t and with length N; jointly generating, by the client devices of the set of N client devices, a random binary vector b of weight t and with length N based on the initial binary vector v, the random binary vector b being generated in an obfuscated domain, wherein a bit value b.sub.i at position i in the random binary vector b signals whether a client device i (i=1, . . . , N) of the set of N client devices is selected as one of the t client devices or not and wherein, during the generation of the random binary vector b in the obfuscated domain, random numbers are used to randomly swap bit values at different positions in the initial binary vector v; determining, by each client device i of the set of N client devices, a product b.sub.i.Math.x.sub.i of the bit value b.sub.i and an input value x.sub.i, the product b.sub.i.Math.x.sub.i being determined in the obfuscated domain; transmitting, by each client device i of the set of N client devices, the product b.sub.i.Math.x.sub.i to a server system or to a further client device of the set of N client devices; and, performing, by the server system or the further client device, the privacy-preserving computation in the obfuscated domain based on the products b.sub.i.Math.x.sub.i (i=1, . . . , N).
2. The method according to claim 1: wherein determining the initial binary vector v of weight t and with length N includes setting the first t bits to one: v.sub.i=1, 1≤i≤t, and all further bits to zero: v.sub.j=0, t<i≤N, where v.sub.i is a bit value at position i of the initial binary vector v; and wherein generating the random binary vector b of weight t and with length N in the obfuscated domain includes, for each n in in the range {1, . . . , t}: jointly determining, by the client devices of the set of N client devices, a random number r.sub.n in the range {n, n+1, . . . , N}; and, using, by each of the client devices of the set of N client devices, the random number r.sub.n to swap bit values at position n and position r.sub.n of the initial binary vector v.
3. The method according to claim 1 wherein the swapping of the bits is based on a delta function δ.sub.i.sup.n, wherein δ.sub.i.sup.n=1 if a value of the random number r.sub.n=i, and δ.sub.i.sup.n=0 for all other positions i (i=1, . . . , N) in the initial binary vector v.
4. The method of claim 3 wherein the delta function δ.sub.i.sup.n is defined based on a Lagrange polynomial function D.sub.i.sup.n(x):
5. The method according to claim 1 wherein the obfuscated domain is based on a homomorphic cryptosystem.
6. The method according to claim 5, wherein the random binary vector b includes encrypted random bits [b.sub.i], 1≤i≤N, such that Σ.sub.ib.sub.i=t; wherein each client device i of the set of N client devices processes its input x.sub.i on the basis of the random binary vector b of encrypted bits by computing an encrypted value [x.sub.i.Math.b.sub.i]=[b.sub.i].sup.x.sup.
7. The method according to claim 6, further including: transmitting, by each client device i of the set of N client devices, the computed encrypted value [x.sub.i.Math.b.sub.i] to the server system; and, performing, by the server system, the privacy-preserving computation on the basis of the computed encrypted values [x.sub.i.Math.b.sub.i] (i=1, . . . , N).
8. The method according to claim 6, further including: transmitting, by each client device i of the set of N client devices, the computed encrypted value [x.sub.i.Math.b.sub.i] to the further client device of the set of N client devices; and, performing, by the further client devices, the privacy-preserving computation on the basis of the computed encrypted values [x.sub.i.Math.b.sub.i] (i=1, . . . , N).
9. The method according to claim 1 wherein the obfuscated domain is based on a secret-sharing system.
10. The method according to claim 9 wherein the secret-sharing system is based on a modulo computation using a fixed prime p.
11. The method according to claim 10 wherein the random binary vector b includes secret-shared random bits (b.sub.i), 1≤i≤N, such that Σ.sub.ib.sub.i=t; wherein each client device i of the set of N client devices determines a secret-shared input value (x.sub.i) using the fixed prime p, and uses secret-shared input value (x.sub.i) to compute a secret-shared value (x.sub.i.Math.b.sub.i); and wherein the privacy-preserving computation in the obfuscated domain is based on the computed secret secret-shared values (b.sub.i.Math.x.sub.i) (i=1, . . . , N).
12. A system for privacy-preserving computation of inputs of t client devices, the t client devices being randomly selected from a set of N client devices, the system comprising: a set of client devices i (i=1, . . . , N), each client device comprising a computer readable storage medium having computer readable program code embodied therewith, the program code including an obfuscation function for performing computations in an obfuscated domain, and a processor coupled to the computer readable storage medium, wherein, responsive to executing the computer readable program code the processor is configured to perform executable operations, the executable operations comprising: determining, for use by each of the client devices of the set of N client devices, an initial binary vector v of weight t and length N; jointly generating a random binary vector b of weight t and with length N in the obfuscated domain, wherein a bit value b.sub.i at position i in the random binary vector b signals whether the client device i (i=1, . . . , N) of the set of N client devices is selected as one of the t client devices or not, the generating including using random numbers to randomly swap bit values at different positions in the initial binary vector v; determining a product b.sub.i.Math.x.sub.i of the bit value b.sub.i, and an input value x.sub.i, the product b.sub.i.Math.x.sub.i being determined in the obfuscated domain; and, transmitting the determined product b.sub.i.Math.x.sub.i to a server system or to a further client device of the set of N client devices, wherein the server system or the further client device is configured to perform the privacy-preserving computation in the obfuscated domain based on the products b.sub.i.Math.x.sub.i (i=1, . . . , N) determined by each client device i (i=1, . . . , N).
13. A client device for use in a system for privacy-preserving computation of inputs of t client devices, the t client devices being randomly selected from a set of N client devices, the client device being associated with an index i, (i=1, . . . , N), the client device comprising: a computer readable storage medium having computer readable program code embodied therewith, the program code including an obfuscation function for performing computations in an obfuscated domain, and a processor coupled to the computer readable storage medium, wherein, responsive to executing the computer readable program code, the processor is configured to perform executable operations, the executable operations comprising: determining an initial binary vector v of weight t and with length N; jointly generating a random binary vector b of weight t with length N in the obfuscated domain based on the initial binary vector v, wherein a bit value b.sub.i at position i in the random binary vector b signals whether the client device associated with the index i is selected as one of the t client devices or not, the generating including using random numbers to randomly swap bit values at different positions in the initial binary vector v; and determining a product b.sub.i.Math.x.sub.i of the bit value b.sub.i and an input value x.sub.i in the obfuscated domain.
14. The client device of claim 13 wherein the processor is configured to perform executable operations, the executable operations comprising: transmitting the determined product b.sub.i.Math.x.sub.i to a server system or to a further client device, wherein the server system or the further client device is configured to perform the privacy-preserving computation in the obfuscated domain on the basis of the products b.sub.i.Math.x.sub.i (i=1, . . . , N) determined by each client device i (i=1, . . . , N).
15. A method for secure random selection of t client devices from a set of N client devices, the method comprising: determining an initial binary vector v of weight t and with length N by setting the first t bits to one: v.sub.i=1, 1≤i≤t, and all further bits to zero: v.sub.i=0, t<i≤N, where v.sub.i is a bit value at position i of the initial binary vector v; the client devices of the set of N client devices jointly generating a random binary vector b of weight t in an obfuscated domain based on the initial binary vector v, wherein the bit value b.sub.i signals whether a client device i of the set of N client devices is selected as one of the t client devices or not, the generation of the random binary vector b including, for each position n (1, . . . , t) in the initial binary vector v: the client devices jointly determining a random number r.sub.n in the range {n, n+1, . . . , N}; and, each client device using the random number r.sub.n to swap bit values at position n and position r.sub.n of the initial binary vector v.
16. The method according to claim 15 wherein the swapping of the binary values at positions n and r.sub.n is based on a delta function δ.sub.i.sup.n, wherein δ.sub.i.sup.n=1 if a value of the random number r.sub.n=i, and δ.sub.i.sup.n=0 for all other i positions in the initial binary vector v, preferably the delta function Δ.sub.i.sup.n being defined on the basis of a polynomial function D.sub.i.sup.n(x):
17. The method according to claim 15 wherein the obfuscated domain is based on a homomorphic cryptosystem and wherein determining a random number r.sub.n further includes: each client device i generating I random bits α.sub.1.sup.j, 0≤j<l, where l is such that 2.sup.l−1≤N−n<2.sup.l, each client device i encrypting the random bits α.sub.1.sup.j using the homomorphic cryptosystem; each client device i computing l encrypted random bits [α.sub.1.sup.j], 0≤j<l, wherein [α.sup.j]=[α.sub.1.sup.j⊕α.sub.2.sup.j⊕ . . . ⊕α.sub.N.sup.j]); and each client device i computing an encrypted random number [r.sub.n] on the basis of the encrypted random bits [α.sup.j], preferably the computing including [r.sub.n]=[Σ.sub.j=0.sup.l−1α.sup.j.Math.2.sup.j]=Π.sub.j=0.sup.l−1[α.sub.j].sup.2.sup.
18. The method according to claim 15 wherein the obfuscated domain is based on a secret-sharing system, and wherein determining a random number r.sub.n further includes: each client device i generating I random bits α.sub.1.sup.j, 0≤j<l, where l is such that 2.sup.l−1≤N−n<2.sup.l, transforming the random bits α.sub.1.sup.j to a secret-shared domain; the client devices jointly computing l secret-shared random bits α.sup.j
, 0≤j<1 wherein
α.sup.j
=
α.sub.1.sup.j⊕α.sub.2.sup.j⊕ . . . ⊕α.sub.N.sup.j
; and each client device i computing the secret-shared random number
r.sub.n
on the basis of the secret-shared random bits
α.sup.j
; and, determining whether the secret-shared random number
r.sub.n
is smaller than or equal to N−n.
19. The method according to claim 18 wherein computing the secret-shared random number r.sub.n
on the basis of the secret-shared random bits
α.sup.j
, includes computing
r.sub.n
=Σ.sub.j=0.sup.l−1
α.sup.j
.Math.2.sup.j.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) Aspects of the invention will be further illustrated with reference to the attached drawings, which schematically will show embodiments. It will be understood that the invention is not in any way restricted to these specific embodiments.
(2)
(3)
(4)
(5)
(6)
(7)
DETAILED DESCRIPTION
(8) The embodiments in this disclosure generally relate privacy-preserving random group selection schemes and multi-party computation schemes, such as used in data-analytics, using such privacy-preserving random group selection, in particular privacy-preserving multi-party computation schemes wherein a random securely selected group of client devices may jointly compute a function on the basis of privacy sensitive user data.
(9) A client device may be implemented on a user apparatus, e.g. a smart phone, a computer, a server, a (smart) television, a vehicle or any consumer or industrial electronic apparatus that is configured to process and/or generate private data. Client devices may typically comprise a wireless or a hardwired communication interface allowing client devices to communicate with each other or with network elements, e.g. servers, in the network. The client devices and the server may use a secure protocol, in particular a secure multi-party computation protocol based on an obfuscation scheme, e.g. an encryption scheme or secret sharing scheme, which ensures secure communication between the client devices and the aggregation server 112. The server may comprise a processor 116 adapted to process the private data in the obfuscated domain. Communication between the client devices and the server device in the network may be based on a suitable client server protocol 120.sub.1,2 e.g. an HTTP protocol or the like. Further, during the execution of the secure multi-party computation protocols described in this application, client devices may exchange information with each other using a central model, e.g. a trusted server or using a decentralized communication model on the basis of a suitable peer-to-peer protocol 121, e.g. the well-known BitTorrent protocol or a derivative thereof.
(10) In an embodiment, when the system of
(11) In an embodiment, the homomorphic cryptosystem may be configured as a so-called homomorphic threshold cryptosystem. Examples of homomorphic threshold cryptosystems are described in the article by Y. Desmedt and Y. Frankel, with title “threshold cryptosystems”, CRYPTO 1989, P. 308-315, Springer-Verlag. The threshold homomorphic cryptosystem is associated with key information 110.sub.1,2, including public key information, e.g. a public encryption key e that is shared by the client devices and the aggregator server, and secret key information including a secret decryption key d, which may be secret-shared among the client devices. Each client device i may be provided with a decryption share d.sub.i of the secret decryption key d, wherein the decryption operation needs to be jointly performed by sufficiently many clients.
(12) An example of a privacy-preserving computing system that is based on a threshold homomorphic cryptosystem is described in a related. European patent application EP 17177653.7 with title “Privacy preserving computation protocol for data analytics”, which is herewith incorporated by reference into the description of this application.
(13) In another embodiment, when the system of x
means that each client device i has a (meaningless) share x.sub.i, such that Σ.sub.ix.sub.i, mod p=x. Adding two secret-shared values is simply locally adding the shares. Multiplying two secret-shared values
x
and
y
however, requires a protocol and communication between the parties. As a prerequisite, the parties should hold secret-sharings of three random numbers a, b, and c, such that c=(a.Math.b) mod p.
(14) In the system of
(15) The secure protocol is configured as a privacy-preserving protocol allowing a server processor 116 to securely process, e.g. aggregate, private data m.sub.i of a group of (at least) t client devices randomly selected from N client devices without the risk that the server and/or client devices learn the individual private data m.sub.i and without leaking information of user inputs to sources. Selection of the client devices by the aggregation server may be realized by a selector 114.
(16) In many big data applications, the aggregation process is repeated multiple times wherein each time the selector 114 may select a different group of client devices from the N client devices. However, as shown by Kononchuk et al., when the computing system repeats the aggregation process a number of times for different groups of users, information about the private data may leak away. For example, when the aggregator server would collude with t−1 users, it is possible to obtain the private input of any other user in this way.
(17) This information leakage problem may be addressed by implementing the selector 114 as a secure random group selection processor. This selector may trigger the N client devices to execute a random group selection protocol. As shown in
(18) Prior art solutions include a random group selection process that involves the execution of permutations in the encrypted domain resulting in a computation-heavy protocol, which renders the protocol not, or at least less, suitable for practical applications. It is an aim of the embodiments in this application to provide an efficient privacy-preserving random group selection process which eliminates, or at least reduces, the risk of leakage of privacy sensitive information in multi-party computation schemes, which are often used in data analytics applications. In an embodiment, such random group selection process may be used in a privacy-preserving computing system as described with reference to
(19) The random group selection processes used in the embodiment of this application may be based on a set of bits, e.g. a binary vector b.sub.i, 1≤i≤N, which is used to identify which client device in a set of N client devices is selected. For example, setting the k-th bit of the vector to one (i.e. b.sub.k=1) may indicate that client device with client index i=k has been selected to participate in a multi-party computation scheme, e.g. secure data aggregation process. Hence, randomly selecting a group of client devices includes securely and randomly setting t bits in the vector to one. The random group selection process according to the invention thus includes the generation of a binary vector of weight t, using a random bit generation method in a domain in which the bits are obfuscated, e.g. the encrypted domain or the secret-sharing domain. Here, the (Hamming) weight t of a binary vector may be defined as the sum of the bit values of the binary vector.
(20) The random binary vector generation may include a swapping process, wherein random numbers are used to swap two bits in the binary vector. This process may include an initial step in which a binary vector of weight t is initialized by setting the first t bits to one: b.sub.i=1, 1≤i≤t, and all further bits to zero: b.sub.i=0, t<i≤N. Thereafter, a bit swapping process may be repeated t times. An example of such bit swapping protocol may look as follows.
(21) For n=1 to t do: 1. Client devices jointly generate a random number r in (n, n+1, . . . N); 2. Each client device swaps the binary values at positions n and r of vector b.
(22) After execution of this bit swapping protocol, each bit b.sub.i is one with probability t/N. It only requires the generation of t random numbers and performing t swaps instead of generating N random permutations in cleartext and N times permuting all items in the encrypted domain as known from the prior art. The randomly generated vector represents a randomly selected group of client devices, wherein a bit at the k-th position in the vector signal indicates whether client device with index k is part of the selected group. In order to avoid leakage of the selection process to unauthorized users, the swapping protocol is executed in an obfuscated domain. Transforming the swapping protocol to such obfuscated domain, e.g. the encrypted domain or the secret-sharing domain, is however not a trivial exercise. This will be illustrated in more detail in the embodiments hereunder.
(23) In an embodiment, the bit swapping protocol may be executed in the encrypted domain using e.g. a homomorphic encryption system. In that case, a binary vector b of encrypted bits may be generated in which the first t encrypted bits may be set to one: [b.sub.i]=[1], 1≤i≤t and the rest of the encrypted bits may be set to zero: [b.sub.i]=[0], t<i≤N. Further, t encrypted random numbers [r.sub.n], 1≤n≤t, need to be generated by the system such that r.sub.n is uniformly drawn from the set (n, n+1, . . . N).
(24) Thereafter, a bit swapping process may be executed in which the encrypted random numbers [r.sub.n] are used to swap encrypted bits in the binary vector.
(25) In order to process bits of the binary vector, a suitable bit manipulation function may be used. In an embodiment, a delta function δ.sub.i.sup.n may be used for processing bits in the binary vector, wherein δ.sub.i.sup.n=1, if r.sub.n=i, and 0, for all other i positions in the vector. Hence, the delta function sets the bit at position r.sub.n in the binary vector to one, and to 0 for all other positions.
(26) In order to transform the delta function to an obfuscated domain, the function may be defined on the basis of a Lagrange interpolation scheme wherein a function D.sub.i.sup.n(x) may define the following polynomial:
(27)
Here, the coefficients d.sub.j represent coefficients of the Lagrange polynomial D.sub.i.sup.n(x). Based on this function, δ.sub.i.sup.n can be constructed as follows: δ.sub.i.sup.n=D.sub.i.sup.n(r.sub.n), n≤i≤N, which represents a bit on position r.sub.n in the binary vector. This function can be transformed into the encrypted domain using an additively homomorphic cryptosystem and the Lagrange coefficients d.sub.j:
[δ.sub.i.sup.n]=Π.sub.j=0.sup.N−n[r.sup.j].sup.d.sup.
(28) As shown by this expression, the homomorphic encryption transforms the summation into a product. This encrypted delta function may be used to process, e.g. swap, bits in the encrypted domain. This way, a bit swapping process may be executed in the encrypted domain, which includes swapping t encrypted bits of the binary vector based on encrypted random numbers.
(29) A process of securely generating random bits, e.g. in the form of a random binary vector, according to an embodiment of the invention may look as follows:
(30) For n=1 to t do
(31) 1. a set of N client devices jointly generate an encrypted random number [r], such that r∈{n, n+1, . . . , N}; 2. the client devices jointly compute [r.sup.j], for each j, 1≤j≤N−n, using secure multiplications; a. for i=n to N do: i. each user computes the coefficients d.sub.j, 0≤j≤N−n, of the polynomial D.sub.i.sup.n(x); ii. each client device computes [δ.sub.i.sup.n]=Π.sub.j=0.sup.N−n[r.sup.j].sup.d.sup.
(32) In step 2.d of the protocol above, the new value of b.sub.i becomes δ.sub.i.sup.n+b.sub.i−δ.sub.i.sup.n.Math.b.sub.i, which equals 1, if δ.sub.i.sup.n=1, and b.sub.i, it δ.sub.i.sup.n=0. Since b.sub.n=1, and δ.sub.i.sup.n=1 exactly when i=r, we get b.sub.r←b.sub.n, where the other b.sub.i remain unaltered. This secure random bit generation protocol takes not more than 3 tN secure multiplications and t random number generations. In contrast, the generation by means of permutation matrices as described by Kononchuk et al. requires N matrix multiplications, which sums up to N.sup.4 secure multiplications.
(33) In the above-described random bit generation protocol, the N client devices need to jointly generate an encrypted random number r, r∈(n, n+1, . . . , N).
(34) In an embodiment, such encrypted random number may be generated by the client devices as follows: 1. each client device i may generate l random bits α.sub.i.sup.j, 0≤j<l, where l is such that 2.sup.l−1≤N≤n<2.sup.l, and encrypts the random bits using an homomorphic cryptosystem; 2. the client devices jointly compute l random bits [α.sup.j], 0≤j<l, which can be computed using N secure multiplications, since the exclusive-or α.sub.1⊕α.sub.2=α.sub.1+α.sub.2−2α.sub.1.Math.α.sub.2 (here α.sup.j are truly random as long as the user of one client device is honest: [α.sup.j]=[α.sub.1.sup.j⊕α.sub.2.sup.j⊕ . . . ⊕α.sub.N.sup.j]); 3. the client devices jointly compute [r]=[Σ.sub.j=0.sup.l−1α.sup.j.Math.2.sup.j]=Π.sub.j=0.sup.l−1[α.sub.j].sup.2.sup.
(35) If γ−1, this protocol securely generates an encrypted random number r from the set {n, n+1, . . . , N}. The computation of the comparison is known per se. An example of such computation is described in B. Schoenmakers and P. Tuyls, Practical Two-Party Computation based on the Conditional Gate, ASIACRYPT 2004, Lecture Notes in Computer Science 3329 (2004) 119136. Springer-Verlag. Typically, to generate many bounded random numbers, per random number four executions of the protocol above is sufficient.
(36) In the above-described protocol, the client devices use a secure multiplication protocol, which is described hereunder in greater detail, in this scheme, initially, the client devices hold encryptions of additively homomorphically encrypted numbers [x] and [y], and would like to securely compute an encryption [x.Math.y] of the product. To that end, the client devices may execute the following steps: 1. each client device i may generate two large random numbers r.sub.i.sup.x and r.sub.i.sup.y, encrypt both numbers [r.sub.i.sup.x] and [r.sub.i.sup.y], and broadcast the two encryptions to all other client devices; 2. each client device may compute two encrypted random numbers [r.sup.x]=[Σ.sub.ir.sub.i.sup.x]=Π.sub.i[r.sub.i.sup.x] and [r.sup.y]=[Σ.sub.ir.sub.i.sup.y]=Π.sub.i[r.sub.i.sup.y]; 3. the client devices may then compute [x+r.sup.x]=[x].Math.[r.sup.x] and [y+r.sup.y]=[y].Math.[r.sup.y], and decrypt both values into x+r.sup.x and y+r.sup.y; 4. client device i may further compute [z.sub.i]=([r.sup.y].Math.[y]).sup.r.sup.
(37) In the scheme above, the integers x+r.sup.x and y+r.sup.y may be safely decrypted, since the values of x and y are additively blinded by a large random number, which is unknown to each party. It can be shown that z=x.Math.y: (x+r.sup.x).Math.(y+r.sup.y)=xy+xr.sup.y+r.sup.xy+r.sup.xr.sup.y=xy+Σ.sub.iz.sub.i.
(38) In an embodiment, the above-described privacy-preserving random group selection process may be used for secure aggregation of private data.
(39) The method allows secure computation of user inputs of t client devices, which are randomly selected from a group of N client device. The process includes a secure selection of client devices in the obfuscated domain, so that neither the client devices, nor the server, know which inputs are used in the computation. Here, the number t may be selected based on various considerations including e.g. the computation and communication time that is needed in order to process data of the selected client devices, the trust in the network, etc. The selection includes an efficient bit swapping process. The process may be implemented using different obfuscation schemes.
(40)
(41) In case threshold decryption is used, the aggregator may request t users to decrypt. [y]. In an embodiment, this may be combined with verifiable aggregation to assure that the aggregator added all inputs. Such scheme is described in a related European patent application EP 17177653.7 with title “Privacy preserving computation protocol for data analytics”. This way, the aggregator obtains y=Σ.sub.i∈Hx.sub.i, where the set H={i|b.sub.i=1} is randomly chosen, and hidden from all parties.
(42) In a further embodiment, a secret-sharing based secure multi-party computation system may be used to generate the random bits. The random bits b.sub.i may be computed and are secret-shared between N client devices. If desired, it is possible to transform such a shared secret to a homomorphically encrypted bit [b.sub.i]. This way, an aggregation server may add up encrypted inputs (as described hereunder in more detail).
(43) When using secret-sharing, as e.g. described in the article by Damgard et al., “Unconditionally secure constant-rounds multi-party computation for equality, comparison, bits and exponentiation”, each secret value x is secret-shared which is indicated by the notation x
.
(44) Similar to the above-described embodiment related to homomorphic encryption, the main problem is how to securely generate N secret-shared bits b.sub.i
, such that Σ.sub.ib.sub.i=t. Also in this case, prior art solution such as Kononchuk et al. suggest to generate N random permutations of length N by client devices, encrypting these permutations and concatenating these encrypted permutations into one random permutation in the encrypted domain, which will lead to very inefficient and computational heavy processes. Translating the random bit generation process of the invention into the secret-sharing domain will provide a very efficient and secure random group selection process.
(45) The protocol for securely multiplying two secret-shared numbers however differs from a secure multiplication based on (threshold) homomorphic encryption. In particular, in a secret-sharing scheme, values are computed on the basis of a modulo of a fixed prime p. Hence, having a secret share x
means that each party i has a (meaningless) share x.sub.i, such that Σ.sub.ix.sub.i mod p=x. Adding two secret-shared values is simply a client device locally adding the shares. Multiplying two secret-shared values
x
and
y
however, requires a protocol and communication between client devices. As a prerequisite, the client devices should hold secret-sharings of three random numbers a, b, and c, such that c=(a.Math.b) mod p. Then, a multiplication may look as follows: 1. the client devices locally compute
d
=
x
+
a
, and
e
=
y
b
; 2. the client devices reveal the secrets d and e by broadcasting their shares of these two numbers; 3. the client devices compute
z
=de−d.Math.
b
−e.Math.
a
+
c
.
wherein z=x.Math.y, since x.Math.y=(d−a)(e−b)=de−db−ae+ab.
(46) Given the secure multiplication protocol for secret-sharings, the secure random bit generation protocol in the encrypted domain can be easily translated into the secret-share domain by using .
instead of [.].
(47) The secure random bit generation protocol in the secret-share domain may include an initialization process wherein a binary vector of secret-shared bits may be generated, in which the first t secret-shared bits may be set to one: b.sub.i
=
1
, 1≤i≤t and the rest of the secretly shared bits may be set to zero:
b.sub.i
=
0
, t<i≤N.
(48) Thereafter, a bit swapping process may be executed in the secret-shared domain, which includes swapping t secret-shared bits, e.g. the t secret shared bits of bit value one, of the binary vector in a similar way as described above with reference to the swapping process in the encrypted domain. In an embodiment, the process may look as follows:
(49) For n=1 to t do 1. a set of N client devices jointly generate a secret-shared random number r
, such that r∈{n, n+1, . . . , N}; 2. the client devices jointly compute
r.sup.j
, for each j, 1≤j≤N−n, using secure multiplications; a. for i=n to N do: i. each client device computes the coefficients d.sub.j, 0≤j≤N−n, of the polynomial D.sub.i.sup.n(x); ii. each client device computes
δ.sub.i.sup.n
=Σ.sub.j=0.sup.N−n
r.sup.j
; b. each client device computes
b
=Σ.sub.i=n . . . N
δ.sub.i.sup.n.Math.b.sub.i
, after N−n+1 joint secure multiplications, such that b=b.sub.r; c. set
b.sub.n
=
b
; d. for i=n to N do each client device computes
b.sub.i
=
δ.sub.i.sup.n
+
b.sub.i
−
δ.sub.i.sup.n.Math.b.sub.i
after a joint secure multiplication of (δ.sub.i.sup.n) and
b.sub.i
.
(50) In the above-described random bit generation protocol, the client device need to jointly generate an encrypted random number r, r∈{n, n+1, . . . , N}. In an embodiment, such encrypted random number may be generated by the client devices as follows: 1. each client device i may generate l random bits α.sub.i.sup.j, 0≤j<l, where l is such that 2.sup.l−1≤N−n<2.sup.1, and transform the random bits to the secretly-shared domain; 2. the client devices jointly compute l random bits α.sup.j
, 0≤j<l;
α.sup.j
=
α.sub.l.sup.j⊕α.sub.2.sup.j⊕ . . . ⊕α.sub.N.sup.j
, which can be computed within N secure multiplications, since the exclusive-or α.sub.1⊕α.sub.2=α.sub.1+α.sub.2−2α.sub.1.Math.α.sub.2 (here α.sup.j are truly random as long as the user of one client device is honest); 3. the client devices jointly compute
r
=Σ.sub.j=0.sup.l−1
α.sub.j
.Math.2.sup.j; 4. the client devices check whether r≤N−n, by securely computing the comparison bit
γ
, where γ=1, if r≤N−n, and 0, otherwise. This can be done within l secure multiplications, since the secret-shared bits of
α.sup.j
of r are given; 5. the client devices de-obfuscates γ and if γ=1,
r
←
r+
=
r
+
n
.
(51) If γ=1, this protocol securely generates a secret-shared random number r from the set {n, n+1, . . . , N}. Typically, to generate many bounded random numbers, per random number four executions of the protocol above should suffice. In general, an addition of two encrypted values [x] and [y] is computed by [x+y]=[x].Math.[y], whereas secret-shared values are just (locally) added: x+y
=
x
+
y
. Similarly, [c.Math.x]=[x].sup.c becomes
c.Math.x
=
x
.Math.c, which is locally multiplying (modulo p) each share with the known number c.
(52) b.sub.i
, 1≤i≤N, such that Σ.sub.ib.sub.i=t (step 502); 2. client device i may determine a secret-sharing (x.sub.i) using the fixed prime p and private share x.sub.i, jointly with the other client devices (in the semi-honest model, which is easily achieved by setting the local shares of the other players to zero), and jointly compute
b.sub.i.Math.x.sub.i
by means of a secure multiplication (step 504); 3. the secret-shared result
y
may be determined based on the shares, e.g. by locally adding the corresponding shares: y=Σ.sub.ix.sub.ib.sub.i (step 506).
Such decentralized scheme may also be implemented in the encrypted domain wherein the aggregation calculation [y]=[Σ.sub.ix.sub.i.Math.b.sub.i]=Π.sub.i[x.sub.i.Math.b.sub.i] may be performed by one or more client devices, or all client devices of the N client devices. In that case, each client device is configured to transmit or broadcast the securely computed products [bi.Math.xi] to the other N−1 client devices in the set. Hence, in case an external device (or one of the client devices) needs the result of the aggregation, the one or more client devices can perform the aggregation calculation, and decrypt the aggregation result.
(53) An advantage of such scheme is that it is no longer necessary to verify the aggregation step performed by the aggregator server. Hence, this way, computations on the secret (and randomised) user data, which is usually done by a central server, may be performed by the client devices. This enables schemes performing all kinds of computations on the user data, without restricting to the homomorphic property of the encryption system.
(54) The systems and methods for privacy-preserving computation of private data as described in this application are particularly useful for implementation in data analytics applications, which require the processing of large amounts of privacy sensitive data. An example of such data analytics application may be a secure privacy-preserving recommender system, wherein a group of participants (voters) may register with the platform, and install an electronic voting application on their computer or mobile device.
(55) The problem of preserving data privacy of users, who repeatedly join the execution of a service with different groups, is present in many data analytics applications requiring multi-party computation, including for example recommendation systems (see Z Erkin, et al, Generating private recommendations efficiently using homomorphic encryption and data packing, IEEE Trans. Inform. Forensics Secur. 7(3), 1053-1066, 2012), aggregating data in smart grids, reputation systems (see e.g. J Kacprzyk, Group decision making with a fuzzy linguistic majority, Fuzzy Sets Syst, 18, 105-118, 1986), unsupervised machine learning (see e.g. F Anselmi et al., Unsupervised learning of invariant representations with low sample complexity: the magic of sensory cortex or a new framework for machine learning? J. Theor. Comput. Sci. 633(C), 112-121 (2016). CBMM, MIT, arXiv:1311.4158v5), collective decision-making (see e.g. A Mathes, Folksonomies—cooperative classification and communication through shared metadata, Comput. Mediated Commun. 47, 1-28, 2004). nr. 10), data clustering (see e.g. Z Erkin et al., in IEEE International Workshop on Information Forensics and Security, Privacy-preserving user clustering in a social network, IEEE, Washington, 2009), and social classification (see e.g. H Kargupta et. al, in ICDM, IEEE Computer Society, On the privacy preserving properties of random data perturbation techniques, IEEE, Washington, 2003, pp. 99-106). The embodiments in this disclosure can be advantageously used in such data analytics applications to eliminate or at least reduce undesired information leakage and to secure the privacy of users against collusion by malicious users during the execution of such services.
(56) The systems and methods described in this application may be used in any data analytics application that requires an aggregated result (e.g. a sum and/or product) of private data of individual users. For example, possible data analytics applications may include the processing of medical data, e.g. online processing of clinical data or medical records of patients, processing data of voting and secret ballot schemes, processing metrics in television and multimedia applications (e.g. audience ratings of channel usage in television broadcast or streaming applications), processing of financial data of commercial and/or institutional organisations, etc.
(57)
(58) Memory elements 604 may include one or more physical memory devices such as, for example, local memory 608 and one or more hulk storage devices 610. Local memory may refer to random access memory, or other non-persistent memory device(s) generally used during actual execution of the program code. A bulk storage device may be implemented as a hard drive or other persistent data storage device. The processing system 600 may also include one or more cache memories (not shown) that provide temporary storage of at least some program code in order to reduce the number of times program code must be retrieved from bulk storage device 610 during execution.
(59) Input/output (I/O) devices depicted as input device 612 and output device 614 optionally can be coupled to the data processing system. Examples of input device may include, but are not limited to, for example, a keyboard, a pointing device such as a mouse, or the like. Examples of output device may include, but are not limited to, for example, a monitor or display, speakers, or the like. Input device and/or output device may be coupled to data processing system either directly or through intervening I/O controllers. A network adapter 616 may also be coupled to data processing system to enable it to become coupled to other systems, computer systems, remote network devices, and/or remote storage devices through intervening private or public networks. The network adapter may comprise a data receiver for receiving data that is transmitted by said systems, devices and/or networks to said data and a data transmitter for transmitting data to said systems, devices and/or networks. Modems, cable modems, and Ethernet cards are examples of different types of network adapter that may be used with data processing system 650.
(60) As pictured in
(61) In one aspect, for example, data processing system 600 may represent a client data processing system. In that case, application 618 may represent a client application that, when executed, configures data processing system 600 to perform the various functions described herein with reference to a “client”. Examples of a client can include, but are not limited to, a personal computer, a portable computer, a mobile phone, or the like.
(62) In another aspect, data processing system may represent a server. For example, data processing system may represent an (HTTP) server in which case application 618, when executed, may configure data processing system to perform (HTTP) server operations. In another aspect, data processing system may represent a module, unit or function as referred to in this specification.
(63) The terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting of the invention. As used herein, the singular forms “a”, “an”, and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising”, when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
(64) The corresponding structures, materials, acts, and equivalents of all means or step plus function elements in the claims below are intended to include any structure, material, or act for performing the function in combination with other claimed elements as specifically claimed. The description of the present invention has been presented for purposes of illustration and description, but is not intended to be exhaustive or limited to the invention in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art, without departing from the scope and spirit of the invention. The embodiment was chosen and described in order to best explain the principles of the invention and the practical application, and to enable others of ordinary skill in the art to understand the invention for various embodiments with various modifications as are suited to the particular use contemplated.