Efficient Three-Party Private Set Intersection (PSI)

20230102423 · 2023-03-30

    Inventors

    Cpc classification

    International classification

    Abstract

    Techniques for implementing efficient three-party private set intersection (PSI) are provided. In one set of embodiments these techniques make use of an oblivious key-value store (OKVS), which is a cryptographic data structure that encodes a set of key-value pairs ({k.sub.i, v.sub.i}) and exhibits the following properties: (A) if a receiver decodes the OKVS on some input q=k.sub.j, the output will be v.sub.j, and (B) the receiver cannot tell, from the outputs generated by the OKVS, what keys (i.e., k.sub.i's) are encoded. By using an OKVS, the techniques of the present disclosure can achieve three-party PSI in a manner that is more efficient and scalable than existing protocols.

    Claims

    1. A method for implementing efficient three-party private set intersection (PSI), the method comprising: generating, by a first computer system operated by a first party, a random key k; transmitting, by the first computer system, the random key k to a second computer system operated by a second party; creating, by the first computer system, a pseudorandom function F.sub.k using the random key k; creating, by the first computer system, an oblivious key-value store (OKVS) using a set of key-value pairs ({x.sub.i, F.sub.k(x.sub.i)}) for each item x.sub.i in an item set X that is private to the first party; and transmitting, by the first computer system, the OKVS to a third computer system operated by a third party.

    2. The method of claim 1 wherein upon receiving the random key k, the second computer system: creates the pseudorandom function F.sub.k using the random key k; and computes y′.sub.i=F.sub.k(y.sub.i) for each item y.sub.i in an item set Y that is private to the second party.

    3. The method of claim 2 wherein upon receiving the OKVS, the third computer system: decodes the OKVS using each item z.sub.i in an item set Z that is private to the third party, resulting in a set of values z′.sub.j.

    4. The method of claim 3 wherein the second and third computer systems execute a two-party PSI protocol using y′.sub.i and z′.sub.i as inputs in order to determine an intersection of the item sets X, Y, and Z.

    5. The method of claim 4 wherein the two-party PSI protocol executed by the second and third computer systems is a server-aided two-party PSI protocol.

    6. The method of claim 5 wherein the first computer system acts as a server in the server-aided two-party PSI protocol.

    7. The method of claim 1 wherein the OKVS is configured to: output F.sub.k(x.sub.j) when decoded on an input x.sub.j in the item set X; and output a random value when decoded on an input that is not in the item set X.

    8. A non-transitory computer readable storage medium having stored thereon program code executable by a first computer system operated by a first party, the program code embodying a method for implementing efficient three-party private set intersection (PSI), the method comprising: generating a random key k; transmitting the random key k to a second computer system operated by a second party; creating a pseudorandom function F.sub.k using the random key k; creating an oblivious key-value store (OKVS) using a set of key-value pairs ({x.sub.i, F.sub.k(x.sub.i)}) for each item x.sub.i in an item set X that is private to the first party; and transmitting the OKVS to a third computer system operated by a third party.

    9. The non-transitory computer readable storage medium of claim 8 wherein upon receiving the random key k, the second computer system: creates the pseudorandom function F.sub.k using the random key k; and computes y′.sub.i=F.sub.k(y.sub.i) for each item y.sub.i in an item set Y that is private to the second party.

    10. The non-transitory computer readable storage medium of claim 9 wherein upon receiving the OKVS, the third computer system: decodes the OKVS using each item z.sub.i in an item set Z that is private to the third party, resulting in a set of values z′.sub.i.

    11. The non-transitory computer readable storage medium of claim 10 wherein the second and third computer systems execute a two-party PSI protocol using y′.sub.i and z′.sub.i as inputs in order to determine an intersection of the item sets X, Y, and Z.

    12. The non-transitory computer readable storage medium of claim 11 wherein the two-party PSI protocol executed by the second and third computer systems is a server-aided two-party PSI protocol.

    13. The non-transitory computer readable storage medium of claim 12 wherein the first computer system acts as a server in the server-aided two-party PSI protocol.

    14. The non-transitory computer readable storage medium of claim 8 wherein the OKVS is configured to: output F.sub.k (x.sub.j) when decoded on an input x.sub.j in the item set X; and output a random value when decoded on an input that is not in the item set X.

    15. A first computer system operated by a first party, the first computer system comprising: a processor; and a non-transitory computer readable medium having stored thereon program code that, when executed, causes the processor to: generate a random key k; transmit the random key k to a second computer system operated by a second party; create a pseudorandom function F.sub.k using the random key k; create an oblivious key-value store (OKVS) using a set of key-value pairs ({x.sub.i, F.sub.k(x.sub.i)}) for each item x.sub.i in an item set X that is private to the first party; and transmit the OKVS to a third computer system operated by a third party.

    16. The first computer system of claim 15 wherein upon receiving the random key k, the second computer system: creates the pseudorandom function F.sub.k using the random key k; and computes y′.sub.i=F.sub.k(y.sub.i) for each item y.sub.i in an item set Y that is private to the second party.

    17. The first computer system of claim 16 wherein upon receiving the OKVS, the third computer system: decodes the OKVS using each item z.sub.i in an item set Z that is private to the third party, resulting in a set of values z′.sub.i.

    18. The first computer system of claim 17 wherein the second and third computer systems execute a two-party private set intersection (PSI) protocol using y′.sub.i and z′.sub.i as inputs in order to determine an intersection of the item sets X, Y, and Z.

    19. The first computer system of claim 18 wherein the two-party PSI protocol executed by the second and third computer systems is a server-aided two-party PSI protocol.

    20. The first computer system of claim 19 wherein the first computer system acts as a server in the server-aided two-party PSI protocol.

    21. The first computer system of claim 15 wherein the OKVS is configured to: output F.sub.k(x.sub.j) when decoded on an input x.sub.j in the item set X; and output a random value when decoded on an input that is not in the item set X.

    Description

    BRIEF DESCRIPTION OF THE DRAWINGS

    [0003] FIG. 1 depicts an example environment according to certain embodiments.

    [0004] FIG. 2 depicts the high-level design of a three-party PSI protocol according to certain embodiments.

    [0005] FIG. 3 depicts a protocol workflow according to certain embodiments.

    DETAILED DESCRIPTION

    [0006] In the following description, for purposes of explanation, numerous examples and details are set forth in order to provide an understanding of various embodiments. It will be evident, however, to one skilled in the art that certain embodiments can be practiced without some of these details or can be practiced with modifications or equivalents thereof.

    1. Overview

    [0007] Embodiments of the present disclosure are directed to an efficient and secure protocol for implementing private set intersection in the context of three parties (i.e., three-party PSI). Unlike existing three-party PSI protocols, the protocol of the present disclosure makes use of an oblivious key-value store (OKVS), which is a cryptographic data structure that encodes a set of key-value pairs ({k.sub.i, v.sub.i}) having (pseudo)random v.sub.i values. When an OKVS is provided to a receiver, the receiver can evaluate (i.e., decode) the OKVS to generate an output for any input q.

    [0008] The characteristics of the OKVS ensure that (A) if the receiver decodes the OKVS on some input q=k.sub.j, the output will be v.sub.j, and (B) the receiver cannot tell, from the outputs generated by the OKVS, what keys (i.e., k.sub.i's) are encoded.

    [0009] By leveraging an OKVS, the protocol of the present disclosure can achieve three-party PSI in a manner that is more efficient and scalable than existing techniques. The foregoing and other aspects are described in further detail in the sections that follow.

    2. Example Environment and High-Level Protocol Design

    [0010] FIG. 1 depicts an example environment 100 in which embodiments of the present disclosure may be implemented. As shown, environment 100 includes three computer systems 102(1), 102(2), and 102(3) that are operated by three parties (e.g., organizations, individuals, etc.) P.sub.1, P.sub.2, and P.sub.3 respectively. Each party maintains on its computer system a set of items 104 that is private to (i.e., only known by) that party. In particular, party P.sub.1 maintains an item set X (reference numeral 104(1)) comprising items {x.sub.1, . . . , x.sub.m}, party P.sub.2 maintains an item set Y (reference numeral 104(2)) comprising items {y.sub.1, . . . , y.sub.m}, and party P.sub.3 maintains an item set Z (reference numeral 104(3)) comprising items {z.sub.1, . . . , z.sub.m}. These items may include any type of data instance that is of interest to the parties, such as user identifiers (IDs), device serial numbers, data checksums, etc.

    [0011] It is assumed that parties P.sub.1, P.sub.2, and P.sub.3 wish to execute three-party PSI with respect to their item sets X, Y, and Z and thus learn the intersection of these three sets, without revealing any other information to each other. For example, if P.sub.1, P.sub.2, and P.sub.3 are different companies and item sets X, Y, and Z are customer datasets maintained by the companies, P.sub.1, P.sub.2, and P.sub.3 may wish to identify the customers that they have in common for cross-marketing purposes, without letting each other know the customers that are unique to each company.

    [0012] However, as noted the Background section, existing protocols for implementing three-party PSI generally suffer from poor performance and scalability. This is largely due to the fact that these existing protocols employ a particular type of cryptographic primitive, known as an oblivious programmable pseudorandom function (OPPRF), which requires, among other things, (1) multiple rounds of communication between every pair of parties {P.sub.1, P.sub.2}, {P.sub.1, P.sub.3}, and {P.sub.2, P.sub.3}, and (2) the use of asymmetric (i.e., public-private) key operations. The combination of (1) and (2) results in a degree of communication and resource overhead that can quickly render these existing protocols impractical as the number of items in each party's item set is scaled upward.

    [0013] To address the foregoing and other similar problems, FIG. 2 depicts the high-level design of an efficient three-party PSI protocol that may be implemented by parties P.sub.1, P.sub.2, and P.sub.3 of FIG. 1 according certain embodiments. This protocol leverages two types of cryptographic primitives (both of which are distinct and different from OPPRF): a pseudorandom function, which is a deterministic function that is created via a key and, for every possible input, generates an output that is indistinguishable from the output of a truly random function; and an oblivious key-value store (OKVS), which is a key-value data structure that supports decode and encode operations. The decode operation causes the OKVS to output a value in response to a query q.

    [0014] The encode operation takes as input a set of key-value pairs ({k.sub.i, v.sub.i}) for i=1, . . . , n comprising (pseudo)random values {v.sub.1, . . . , v.sub.n} and generates an OKVS that (A) outputs v.sub.j when decoded on a query q=k.sub.j and (B) outputs a random value when decoded on a query q .Math.{k.sub.1, . . . , k.sub.n}. Property (B) ensures that the receiver of the OKVS (i.e., the party performing decode operations) cannot tell what k.sub.i's are encoded.

    [0015] Starting with step (1) of the protocol (reference numeral 200), a first party (e.g., P.sub.1) can generate a random key k and transmit k to one of the other two parties (e.g., P.sub.2).

    [0016] At step (2) (reference numeral 202), Party P.sub.1 can create a pseudorandom function F.sub.k using random key k. Party P.sub.1 can further create an OKVS S by invoking the encode operation using the key-value pairs ({x.sub.i, F.sub.k(x.sub.i)}) for every x.sub.i in its item set X and can transmit S to the other party that did not receive random key k (i.e., P.sub.3) (step (3); reference numeral 204). Note that due to the two OKVS properties mentioned above, OKVS S will output F.sub.k(x.sub.j) when decoded on a query q=x.sub.j and output a random value when decoded on a query q .Math. X.

    [0017] At steps (4) and (5) (reference numerals 206 and 208), party P.sub.2 can create the same pseudorandom function F.sub.k created by party P.sub.1 at step (2) using random key k and can compute y′.sub.i=F.sub.k(y.sub.i) for every y.sub.i in its item set Y. In addition, at step (6) (reference numeral 210), party P.sub.3 can invoke the decode operation of OKVS S on every z.sub.i in its item set Z, resulting in outputs z′.sub.i.

    [0018] Finally, at step (7) (reference numeral 212), parties P.sub.2 and P.sub.3 can execute a two-party PSI protocol over inputs {y′.sub.i}.sub.i∈[m] and {z′.sub.i}.sub.i∈[m], thereby allowing the parties to obtain the intersection of item sets X, Y, and Z (i.e., X ∩ Y ∩ Z). Note that at the end of step (6), if an item a is in the intersection, both P.sub.2 and P.sub.3 will have y′.sub.j=z′.sub.j, =F.sub.k(a) for some j and j′. For party P.sub.2, this follows directly from the computation performed at step (5). For party P.sub.3, this follows from property (A) of OKVS S mentioned earlier, which ensures that S outputs F.sub.k(x.sub.j) when decoded on a query q=x.sub.j (and thus will output F.sub.k(a) on an item a that exists in both item set X of P.sub.1 and item set Z of P.sub.3). On the other hand, if item a is not in the intersection, then either P.sub.2 or P.sub.3 (or both) will not have F.sub.k(a). Thus, the execution of the two-party PSI protocol at step (7) finds exactly those items that are in the item sets of all three parties.

    [0019] With the protocol shown in FIG. 2, a number of advantages are realized over existing three-party PSI protocols. First, because OKVS S is communicated via a single message between only two parties (i.e., P.sub.1 and P.sub.3) and because P.sub.3 can decode S on each of the items in its item set Z without any further message exchanges with P.sub.1, the communication and resource overhead of this protocol is significantly lower, resulting in improved performance and scalability.

    [0020] Second, this protocol can be flexibly modified to make use of any two-party PSI protocol known in the art at step (7). In certain embodiments, a server-aided two-party PSI protocol can be employed, which is generally faster than conventional two-party PSI. In these embodiments, the protocol of the present disclosure can rely solely on symmetric key primitives and thus can completely avoid public key operations. A particular implementation of this approach is detailed in section (3) below.

    [0021] It should be appreciated that FIGS. 1 and 2 are illustrative and not intended to limit embodiments of the present disclosure. For example, although these figures indicate that item sets X, Y, and Z of parties P.sub.1, P.sub.2, and P.sub.3 each include the same number of items (i.e., m), this is not necessary; each item set can include a different number of items. Further, although FIGS. 1 and 2 depict a particular arrangement of entities within environment 100, other arrangements are possible (e.g., the functionality attributed to a particular entity may be split into multiple entities, entities may be combined, etc.). One of ordinary skill in the art will recognize other variations, modifications, and alternatives.

    3. Protocol Workflow

    [0022] FIG. 3 depicts a workflow 300 that provides additional details regarding the processing that may be performed by parties P.sub.1, P.sub.2, and P.sub.3 as part of the three-party PSI protocol shown in FIG. 2 according to certain embodiments. Workflow 300 assumes that parties P.sub.2 and P.sub.3 use a server-aided two-party PSI protocol in order to obtain the intersection of their y′.sub.t and z′.sub.t values per step (7) of FIG. 2, with party P.sub.1 acting as the two-party protocol server.

    [0023] Starting with block 302, party P.sub.1 can generate a random key k and transmit k to party P.sub.2. This random key may be a number of length s bits, such as 128 or 256 bits.

    [0024] At block 304, party P.sub.1 can create a pseudorandom function F.sub.k using random key k as input. In various embodiments, pseudorandom function F.sub.k can be computed in polynomial time with respect to key length s and cannot be distinguished from a truly random function in polynomial time.

    [0025] Party P.sub.1 can then evaluate F.sub.k(x.sub.i) for every x.sub.i ∈ X (block 306) and can create, using an OVKS encode operation, an OKVS S using key-value pairs ({x.sub.1, F.sub.k(x.sub.1)}, . . . , {x.sub.m, F.sub.k(x.sub.m)}) (block 308). As noted previously, the creation of OKVS S in this manner will cause S to output F.sub.k (x.sub.j) when decoded on a query q=x.sub.j and output a random value when decoded on a query q .Math. X. Upon creating OKVS S, party P.sub.1 can transmit a representation of S to party P.sub.3 (block 310). The specific nature of this representation can vary depending on the implementation of the encode operation. For example, in a particular embodiment, OKVS S may be represented as an (m−1) degree polynomial p where p is interpolated over points ({x.sub.1, F.sub.k(x.sub.1)}, . . . , {x.sub.m, F.sub.k(x.sub.m)}). In other embodiments OKVS S may be represented using a more complex data structure, such as a PaXoS data structure.

    [0026] At blocks 312 and 314, party P.sub.2 can generate pseudorandom function F.sub.k using random key k received from party P.sub.1 and can compute y′.sub.i=F.sub.k(y.sub.i) for every y.sub.i in its item set Y (i.e., for i=1, . . . , m). In addition, at block 316, party P.sub.2 can compute z′.sub.i=decode(S, z.sub.i) for every z.sub.i in its item set Z (i.e., for i=1, . . . , m).

    [0027] Parties P.sub.2 and P.sub.3 can thereafter carry out a server-aided two-party PSI protocol with respect to their computed values {y′.sub.i}.sub.i∈[m] and {z′.sub.i}.sub.i∈[m], with party P.sub.1 acting as the server. For example, at block 318, parties P.sub.2 and P.sub.3 can agree on a random key g that is different from previously discussed random key k.

    [0028] Upon agreeing upon random key g, party P.sub.2 can create a new pseudorandom function F.sub.g using g (block 320), compute y″.sub.i=F.sub.g (y′.sub.i) for every y.sub.i in its item set Y (i.e., for i=1, . . . , m) (block 322), and transmit {y″.sub.i}.sub.i.Math.[m] to party P.sub.1 (block 324). Similarly, party P.sub.3 can generate pseudorandom function F.sub.g using g (block 326), compute z″.sub.i=F.sub.g(z′.sub.i) for every z.sub.i in its item set Z (i.e., for i=1, . . . , m) (block 328), and transmit {z″.sub.i}.sub.i∈[m] to party P.sub.1 (block 330).

    [0029] At block 332, party P.sub.1 can receive all of the y″.sub.i and z″.sub.i values from parties P.sub.2 and P.sub.3 respectively. Note that because party P.sub.1 does not know random key g, all of these values look like random values to P.sub.1. Party P.sub.1 can then compare the set of y″.sub.i's to the set of z″.sub.i's, identify the intersection of these two sets, and send the intersection back to parties P.sub.2 and P.sub.3.

    [0030] Finally, parties P.sub.2 and P.sub.3 can determine, from the intersection of {y″.sub.i}.sub.i.Math.[m] and {z″.sub.i}.sub.i.Math.[m] received from party P.sub.1, the intersection of sets X, Y, and Z (block 334) and workflow 300 can end.

    [0031] Certain embodiments described herein can employ various computer-implemented operations involving data stored in computer systems. For example, these operations can require physical manipulation of physical quantities-usually, though not necessarily, these quantities take the form of electrical or magnetic signals, where they (or representations of them) are capable of being stored, transferred, combined, compared, or otherwise manipulated. Such manipulations are often referred to in terms such as producing, identifying, determining, comparing, etc. Any operations described herein that form part of one or more embodiments can be useful machine operations.

    [0032] Further, one or more embodiments can relate to a device or an apparatus for performing the foregoing operations. The apparatus can be specially constructed for specific required purposes, or it can be a generic computer system comprising one or more general purpose processors (e.g., Intel or AMD x86 processors) selectively activated or configured by program code stored in the computer system. In particular, various generic computer systems may be used with computer programs written in accordance with the teachings herein, or it may be more convenient to construct a more specialized apparatus to perform the required operations. The various embodiments described herein can be practiced with other computer system configurations including handheld devices, microprocessor systems, microprocessor-based or programmable consumer electronics, minicomputers, mainframe computers, and the like.

    [0033] Yet further, one or more embodiments can be implemented as one or more computer programs or as one or more computer program modules embodied in one or more non-transitory computer readable storage media. The term non-transitory computer readable storage medium refers to any storage device, based on any existing or subsequently developed technology, that can store data and/or computer programs in a non-transitory state for access by a computer system. Examples of non-transitory computer readable media include a hard drive, network attached storage (NAS), read-only memory, random-access memory, flash-based nonvolatile memory (e.g., a flash memory card or a solid state disk), persistent memory, NVMe device, a CD (Compact Disc) (e.g., CD-ROM, CD-R, CD-RW, etc.), a DVD (Digital Versatile Disc), a magnetic tape, and other optical and non-optical data storage devices. The non-transitory computer readable media can also be distributed over a network coupled computer system so that the computer readable code is stored and executed in a distributed fashion.

    [0034] Finally, boundaries between various components, operations, and data stores are somewhat arbitrary, and particular operations are illustrated in the context of specific illustrative configurations. Other allocations of functionality are envisioned and may fall within the scope of the invention(s). In general, structures and functionality presented as separate components in exemplary configurations can be implemented as a combined structure or component. Similarly, structures and functionality presented as a single component can be implemented as separate components.

    [0035] As used in the description herein and throughout the claims that follow, “a,” “an,” and “the” includes plural references unless the context clearly dictates otherwise. Also, as used in the description herein and throughout the claims that follow, the meaning of “in” includes “in” and “on” unless the context clearly dictates otherwise.

    [0036] The above description illustrates various embodiments along with examples of how aspects of particular embodiments may be implemented. These examples and embodiments should not be deemed to be the only embodiments and are presented to illustrate the flexibility and advantages of particular embodiments as defined by the following claims. Other arrangements, embodiments, implementations and equivalents can be employed without departing from the scope hereof as defined by the claims.