Methods and systems for somewhat homomorphic encryption and key updates based on geometric algebra for distributed ledger/blockchain technology
11764943 · 2023-09-19
Assignee
Inventors
- David W. Honorio Araujo da Silva (Colorado Springs, CO, US)
- Carlos A. Paz de Araujo (Colorado Springs, CO)
- Hanes Barbosa Marques de Oliveira (Colorado Springs, CO, US)
- Marcelo Araujo Xavier (Dearborn, MI, US)
Cpc classification
H04L9/3239
ELECTRICITY
H04L9/0861
ELECTRICITY
H04L9/3033
ELECTRICITY
H04L9/0891
ELECTRICITY
H04L9/3073
ELECTRICITY
International classification
H04L9/00
ELECTRICITY
Abstract
Disclosed are methods and systems to encrypt data with SomeWhat Homomorphic Encryption (SWHE) properties for submission to a distributed ledger/blockchain that allows further open operations retained in the distributed ledger/blockchain on the encrypted data that will be properly reflected when the encrypted result is decrypted by the data owner. The somewhat homomorphic properties include addition and scalar division. Also disclosed is an ability to update a secret key applied for a ciphertext such that a single piece of data may be provided on the distributed ledger/blockchain by a data owner to a new data owner without also exposing other data encrypted with the original secret key of the original data owner.
Claims
1. A method for performing somewhat homomorphic operations on encrypted data in a distributed ledger/blockchain system without decrypting said encrypted data and where data resulting from said somewhat homomorphic operations remains encrypted, the method comprising: generating off-chain by a data owner node device a secret/private key sk and a public evaluation key pk, wherein said secret/private key sk is comprised of a first key multivector (
2. The method of claim 1: wherein said process of generating off-chain by said data owner node device a secret/private key sk and a public evaluation key pk further comprises: setting said prime number q equal to a prime number; randomly generating 16 integer values; setting a first 8 of said 16 integer values as coefficient values of said first key multivector (
3. The method of claim 1 wherein said process of performing on-chain by a calculation node device at least one somewhat homomorphic operation further comprises calculating said result ciphertext multivector (
4. The method of claim 1 wherein said process of performing on-chain by a calculation node device at least one somewhat homomorphic operation further comprises calculating said result ciphertext multivector (
5. The method of claim 1 wherein said distributed ledger/blockchain system is a private/permissioned system.
6. The method of claim 1 further comprising: generating off-chain by said data owner node device a second secret/private key sk.sub.2, wherein said second secret/private key sk.sub.2 is comprised of a first key multivector (
7. The method of claim 6: wherein said process of calculating off-chain by said data owner node device said token t further comprises calculating said first token multivector (
8. The method of claim 7 wherein said result ciphertext multivector (
9. The method of claim 6 wherein operations of each of said data owner node device, said calculation node device, and said new data owner node device are performed by one or more hardware devices as desired by a user.
10. A distributed ledger/blockchain system that performs somewhat homomorphic operations on encrypted data without decrypting said encrypted data and where data resulting from said somewhat homomorphic operations remains encrypted, the distributed ledger/blockchain system comprising: a data owner node device, comprising a memory comprising computer program instructions for: generating, off-chain, a secret/private key sk and a public evaluation key pk, wherein said secret/private key sk is comprised of a first key multivector (
11. The distributed ledger/blockchain system of claim 10: wherein said data owner node device further sets said prime number q equal to a prime number, randomly generates 16 integer values, sets a first 8 of said 16 integer values as coefficient values of said first key multivector (
12. The distributed ledger/blockchain system of claim 10 wherein said calculation node device further calculates said result ciphertext multivector (
13. The distributed ledger/blockchain system of claim 10 wherein said calculation node device further calculates said result ciphertext multivector (
14. The distributed ledger/blockchain system of claim 10 wherein said distributed ledger/blockchain system is a private/permissioned system.
15. The distributed ledger/blockchain system of claim 10: wherein said data owner node device generates, off-chain, a second secret/private key sk.sub.2, wherein said second secret/private key sk.sub.2 is comprised of a first key multivector (
16. The distributed ledger/blockchain system of claim 15: wherein said data owner node device further calculates said first token multivector (
17. The distributed ledger/blockchain system of claim 16 wherein said result ciphertext multivector (
18. The distributed ledger/blockchain system of claim 15 wherein operations of each of said data owner node device, said calculation node device, and said new data owner node device are performed by one or more hardware devices as desired by a user.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) In the drawings,
(2)
(3)
(4)
DETAILED DESCRIPTION OF THE EMBODIMENTS
(5) The various embodiments aim to address the challenge of expanding Blockchain Technologies (BT) by implementing a somewhat homomorphic encryption scheme that not only enables computation on encrypted data but also yields a key update protocol with which one can selectively reveal consolidated data from a blockchain application. Constructions of the various embodiments are meant to be compliant with the fundamental requirements of BT, including ownership control and non-repudiation. In isolation, BT and homomorphic encryption (HE) can both suffer from performance issues. Combining the two only escalates that risk. We rely on Clifford Geometric Algebra as the single algebraic structure for introducing efficient solutions for merging BT with HE. One target application considers a trusted environment with pre-screened parties, which allows the various embodiments to consider cryptographic solutions based on relaxed notions of security. One possible means of implementation is to encode the various embodiments using the Ruby language.
(6) I. Introduction
(7) Blockchain may be offered as a virtualized resource—Blockchain as a Service (BaaS). Blockchain has been used to reduce costs and the complexity of management, but blockchain raises concerns about the custody of data and the classical Trust Model.
(8) Blockchain is a distributed ledger where the state lies on a linked list of interdependent blocks, persisted under consensus. Blockchain defines a conjunction of technologies behind Bitcoin, where anonymous parties would join a network without permission, so blockchain was initially considered permissionless. The replication of the information amongst participants was onerous. So new initiatives approached the state replication paradigm in a more efficient way, increasing throughput by different consensus mechanisms. Companies then employed the new Distributed Ledger Technology (DLT) concept to promote cooperation, on a premise of identified nodes (i.e., permissioned).
(9) Private blockchains provide immutability and nonrepudiation, but blockchain has restrictive analysis over encrypted data when it is done without segregating information, participants or third-party trusted architectures. Public key cryptography is used in blockchain operations, but its limitations towards computability spawned a new race for Homomorphic Encryption (HE) schemes. Additionally, adversarial behaviors can arise in DLT environments from semi-honest partners, such as the use of shared data to leverage a commercial advantage. HE offers the ability to correctly evaluate encrypted data allowing the outsourcing of computation without loss of privacy. Thus, parties can agree on common scripts implementing HE prior to the operation over data assets.
(10) A. The Problems
(11) The various embodiments may be comprised of functional blocks, each of which may be tailored as described in more detail below according to objectives for scope, capability and security. The following sections provide a mathematical and numerical description of these functional blocks.
(12) Exclusivity is ingrained in the meaning of data property and no repossession lawsuit can restore ownership of digital data once the digital data is shared. Moreover, a legal contract cannot prevent misbehavior, being only a prior agreement on posterior punitive actions. Consequently, under the expectancy of misconduct, companies may avoid cooperation even when contracts provide legal protection.
(13) Conversely, a smart contract defines the behavior prior to an accordance between parties, resembling a legal preventive action that avoids undesired executions. If combined with HE, it can realize blind computations (i.e., big data analytics) where the entity performing the calculations does not know the results of the calculations without gaining permission from the data owner. Additionally, if improved with a homomorphic key update, it can offer a mechanism to transfer ownership without leaving the trusted environment.
(14) Cloud computing became a very expensive structure to reproduce in-house and the market made providing cloud computing a premise to stay competitive. Therefore, blockchain, with its ever-growing database and inherent complexity, has an appeal to be used as a third-party service for storage and management. However, blockchain's philosophy is built on the concept of cryptographic proof instead of trust, which creates a conundrum and weakens the ability of the technology to remain trustworthy since cloud suppliers work under the assumption of reputation and legal agreements. Another competence of blockchains is the capability to share digital assets avoiding power imbalance between parties. Nevertheless, fearing a loss of ownership, companies may restrain themselves from sharing sensitive data that could favor analysis and leverage a commercial relationship or research effort.
(15) In summary, cloud services bring uncertainty in the treatment of Confidentiality, Integrity, and Availability (CIA), whereas partnerships can be restrictive due to lack of trust or regulatory concerns. Therefore, the problems that the various embodiments are addressing in this disclosure includes the following issues.
(16) Problem 1: Given an immutable ledger on a permissioned blockchain setting, provide an efficient privacy preserving smart contract for computing arithmetic functions over encrypted data without violating the principles of ownership (the legal right of data access) and non-repudiation (the assurance that one cannot deny the validity of the data).
(17) Problem 2: Given a smart contract that solves Problem 1, provide the ability to transfer ownership of homomorphically encrypted data without revealing to non-owners anything that is supposed to be known only by data owners.
(18) One of the motivations for Bitcoin was the avoidance of a single point of failure. Avoiding a single point of failure may mean the absence of a computational node due to censorship or a dishonest behavior from a participant. Therefore, publicizing transactions was a means to verify integrity, although secrecy was not a concern. User's anonymity was provided by public key encryption and cryptography took care of the trust for the model.
(19) Furthermore, companies realized that their commercial relationships would benefit from the provenance given by the immutable tracking of events. Also, company partnerships are mediated by legal contracts, facilitating a transition to electronic scripts—smart contracts—when transacting digital assets. Therefore, the applicability of cryptography expands and problems such as data ownership become the main concern. On-chain solutions sometimes are based on the segregation of information or just under-covered data or scripts, not allowing computation or any kind of analytical result over encrypted assets. On the other hand, off-chain implementations can violate the credibility that is only earned when every operation is performed within public sight.
(20) For the various embodiments, two main guidelines must be preserved in order to favor CIA and DLT core principles. First, any sensitive data must be encrypted or decrypted off-chain, in possession of the owner. Second, any operation over an encrypted asset must occur on-chain and the algorithm must be known by the parties and consequently agreed upon before execution. Finally, the script implementing the mathematical operations must be sufficiently efficient to not overload the performance of the consensus mechanism at hand.
(21) C. Clifford Geometric Algebra
(22) Clifford geometric algebra is known by the richness, robustness and flexibility of its algebraic structure, which allows us to take advantage of concepts from several different branches of mathematics such as vector and matrix spaces, integer, rational and complex arithmetic, all in a single compact system. Clifford Geometric Algebra (herein simplified to GA) is a very powerful mathematical system. Some advantages commonly associated with GA computing include compactness of algorithms, implicit use of parallelism and high runtime performance and robustness. In working on the various embodiments it was further noted that three major benefits of working with GA based would be: (1) the ability of working with notions from several different branches of mathematics in a single framework (i.e., modular arithmetic, complex arithmetic, matrix algebra, etc.); (2) how much may be accomplished by even a very small set of computationally inexpensive algebraic tools; and (3) the simplicity of the construction itself, which favors understanding, maintenance and analysis.
(23) An embodiment may advantageously utilize Geometric Algebra to provide the encryption and decryption of numeric messages that may be stored and/operated on within the distributed ledger/blockchain. The use of Clifford Geometric Algebra (aka. Geometric Algebra) to provide the encryption and decryption provides the mathematical basis for the homomorphic operations of an embodiment.
(24) Geometric Algebra is an area of mathematics that describes the geometric interaction of vectors and other objects in a context intended to mathematically represent physical interactions of objects in the physical world. As used herein, this area of mathematics encompasses Geometric Algebra, Conformal Geometric Algebra and Clifford Algebra (referred to collectively herein as “Geometric Algebra” or “GA”). Generally, Geometric Algebra defines the operations, such as geometric product, inverses and identities, which facilitate many features of the various embodiments disclosed herein. Further, Geometric Algebra allows for the organization and representation of data into the “payload” of a multivector where the data in the payload may represent, for example, plaintext, ciphertext, or identifying signatures. Consequently, the various embodiments make beneficial use of Geometric Algebra properties to provide encryption, decryption, and homomorphic operations in a relatively computationally simplistic manner while still providing robust security for both data in motion and data at rest (e.g., data stored in the Cloud).
(25) It may be demonstrated that through multivector decompositions and a small subset of operations in the Clifford Geometric algebra it is possible to propose new methods for general-purpose data representation and data encryption with multivectors. The methods of the various embodiments may be used as part of the necessary reconciliation of data availability and privacy preservation. This is important because once data is encrypted, one cannot meaningfully process it, unless the encryption function is homomorphic with respect to one or more operations. Therefore, homomorphism is a key concern in constructions of the various embodiments since there is particular interest in encryption schemes that allow homomorphic computations over concealed data.
(26) Some fields of applications are inherently complex, as is the case for blockchain technologies and cryptography. The combination of blockchain and cryptography could easily increase the associated complexity exponentially should one fail to take into account the additional complexity from any particular tool or approach. In scenarios like combining blockchain and cryptography, it seems critical to consider solutions that are simple to implement, but are still powerful, so one can achieve much without necessarily adding complexity. For the various embodiments GA seems to be an appealing candidate for providing an efficient cryptographic protocol that aims to expand blockchain capabilities without violating its rigid, but necessary, constraints.
(27) Favoring the quick distinction of a multivector from any other data structure, we use capital letters with an overbar as in (3, 0) generate a geometric product space here denoted by
.sub.q.sup.3. A multivector is given by
.sub.0=m.sub.0ē.sub.0, the vector part
.sub.1=m.sub.1ē.sub.1+m.sub.2ē.sub.2+m.sub.3ē.sub.3, the bivector part
.sub.2=m.sub.12ē.sub.12+m.sub.13ē.sub.13+m.sub.23ē.sub.23, and the trivector or pseudoscalar part
.sub.3=m.sub.123ē.sub.123. An example of a three-dimension (3D) multivector Ā that includes a scalar, a vector, a bivector, and a trivector is:
Ā=a.sub.0+a.sub.1ē.sub.1+a.sub.2ē.sub.2+a.sub.3ē.sub.3+a.sub.12ē.sup.12+a.sub.13ē.sub.13+a.sub.23ē.sub.23+a.sub.123ē.sub.123
where ē.sub.i is a unit vector along the i-axis and ē.sub.12 represents the orientation of the area created by a.sub.12. Notably, a Geometric Algebra multivector in N-space (i.e., a N-dimension multivector) has 2.sup.N coefficients whereas a standard N-dimension vector has only N coefficients. Accordingly, the Geometric Algebra multivectors provide a sense of size, direction, and volume while a standard vector would only provide a sense of size and direction. As the concepts involved in Geometric Algebra are part of a deep and rich mathematical file, some general observations may be helpful to the description of the various embodiments disclosed herein, below. First, each of the a.sub.i values in the multivector Ā above may be “packed” with information and each a.sub.i value may range from zero to very large (e.g., >256,000 bits or an entire message). Secondly, the inverse of Ā when multiplied by Ā yields unity, or:
ĀĀ.sup.−1=1
Thus, if a second multivector
ĀĀ.sup.−1
(28) Computations on the coefficients of .sub.q.sup.3 and, thus, we write
.sub.q.sup.3. The multivector involutions reverse and Clifford conjugation are denoted by
(29)
such that .sub.q.sup.3 follow the standard geometric product definition in Cl(3, 0) and have added that the computation for all coefficients is now reduced modulo q. The scalar multiplication for Āα for α ∈
.sub.q is computed by multiplying each coefficient of Ā by α modulo q. The scalar division A/a is computed by multiplying each coefficient of Ā by x=α.sup.−1 mod q where x is the modular multiplicative inverse of α with respect to q such that αx=1 mod q. Thus, the following notations are equivalent: α.sup.−1 mod q=|a.sup.−1|.sub.q=|1/α|.sub.q.
(30) As for the basic operations in .sub.q.sup.3, similar to the operations of a vector space, one can add, subtract, scalar multiply and scalar divide multivectors component-wise. Multiplication of multivectors is achieved with the geometric product, which is given by Ā
(31) D. Homomorphisms
(32) Given two messages a, b, a function f is homomorphic with respect to a given operation ∘ if f(a∘b)=f(a)∘f(b). When we represent the messages a, b as the multivectors Ā,
(33) The essential purpose of homomorphic encryption is to allow computation on encrypted data without decrypting the data in order to perform the computation. In this way, the encrypted data can remain confidential and secure while the encrypted data is processed for the desired computation. Accordingly, useful tasks may be accomplished on encrypted (i.e., confidential and secure) data residing in untrusted environments. In a world of distributed computation and heterogeneous networking, the ability to perform computations on encrypted data may be a highly desirable capability. Hence, finding a general method for computing on encrypted data is likely a highly desirable goal for cryptography.
(34) A sought-after application of homomorphic encryption may be for distributed ledger/blockchain systems. Encrypting blockchain stored data may mitigate the threat of data being compromised by a breach, but then the owners of the data would not then be able to perform operations (i.e., add, scalar divide, etc.) on the blockchain stored data. In order to perform operations on encrypted data stored in the blockchain, it would be necessary to download the encrypted blockchain stored data, recover/decrypt the data, perform all desired operations on the data locally, encrypt the resulting data and send the resulting data back to the blockchain. Alternatively, if a user wants another blockchain node to perform the computations, the other node would require access to the user's encryption/security keys. It is becoming increasing undesirable to provide others access to a user's security keys as the more entities that have access to the security keys inherently increases the susceptibility of the security keys to being breached, or even stolen by an unscrupulous user. Homomorphic encryption would allow the blockchain to operate on encrypted data without decryption, and without access to the client's security keys.
(35) For the various embodiments, the “payload” may be packed in the values of the scalars and coefficients of the multivector elements. The packing method may define, among many things, the Geometric Algebra operations permissible for an embodiment. For example, the Rationalize operation on multivectors yields zero when all multivector coefficients are equal. Such multivectors having all equal coefficients have no inverse and the geometric product of such multivectors having all equal coefficients with another multivector has no inverse. Different aspects of the various embodiments, including the decryption methodology that utilizes the inverse of the security key(s) multivector to perform the decryption. Therefore, to avoid problems when performing an inverse operation, the various multivectors being utilized in the various embodiments should not have all equal value coefficients, unless specifically identified as being meant to be non-invertible.
(36) II. Target Definitions
(37) Before introducing our specifics of the constructions of the various embodiments to address the problems discussed in Section I-A, a definition of the general syntax and notions aimed to be achieved is presented. This is useful for many reasons including the ability if the desired goals are achieved, but also how well the goals are achieved.
(38) A. SWHE Scheme
(39) The syntax of a SomeWhat Homomorphic Encryption (SWHE) scheme is defined as follows:
(40) Definition 1: A SWHE scheme denoted as:
Π=(Gen, Enc, Dec, Add, SDiv) Eq. 2
is a tuple of efficient (i.e., probabilistic polynomial-time) algorithms with the syntax given by the following paragraphs.
(41) Gen is a probabilistic polynomial-time key-generation algorithm that takes as input the security parameter 1.sup.λ and outputs a private-key sk and a public evaluation key pk. The secret key implicitly defines a ring that will serve as the message space. We write the syntax as (sk, pk)←Gen(1.sup.λ). The security parameter is usually given in unary notation which indicates a λ-bit string of 1s so the efficiency of the algorithm is expected to be polynomial-time in λ.
(42) Enc is a probabilistic polynomial-time encryption algorithm that takes as input a secret key sk and message m and outputs a ciphertext c as a n-dimensional tuple. We write the syntax as c←Enc(sk,c).
(43) Dec is a deterministic polynomial-time encryption algorithm that takes as input a secret key sk and a ciphertext c and outputs a message m. We write the syntax as m=Dec(sk,c).
(44) Add is a deterministic polynomial-time addition algorithm that takes two ciphertexts c.sub.1 and c.sub.2 and outputs a ciphertext c which corresponds to the component-wise addition of c.sub.1 and c.sub.2 reduced modulo pk. We write the syntax as c=Add(pk, c.sub.1, c.sub.2).
(45) SDiv is a deterministic polynomial-time scalar division algorithm that takes a ciphertext c.sub.1 and a scalar α and outputs a ciphertext c which corresponds to the scalar division of all elements of c by α reduced modulo pk. We write the syntax as c=SDiv(pk, c.sub.1, α).
(46) Correctness requires the following:
(47) 1) For all sk, pk output by Gen, and all m ∈ we have Dec(sk,Enc(sk, m)=m.
(48) 2) For all c.sub.i←Enc(sk, m.sub.i), i=1, 2 and all α∈ , the following holds:
Dec(sk,Enc(sk, Add(pk, c.sub.1, c.sub.2)))=m.sub.1+m.sub.2,
Dec(sk,Enc(sk, SDiv(pk, c.sub.1, α)))=m.sub.1/α. Eq. 3
(49) Definition 2: A SWHE scheme Π is secure if for a uniform m ∈ , all (sk, pk)←Gen(1.sup.λ) and all c←Enc(sk,c), no efficient adversary A can recover m by knowing only pk and c.
(50) B. Key Update Protocol
(51) Definition 3: A key update protocol denoted as: Eq. 4
Σ=(TokGen, KeyUpd) Eq. 3
is a tuple of efficient algorithms with the syntax given by the following paragraphs.
(52) TokGen is a deterministic polynomial-time token generation algorithm that takes an old secret key sk.sub.old and a new secret key sk.sub.new and outputs a token t. We write the syntax as t=TokGen (sk.sub.old, sk.sub.new).
(53) KeyUpd is a deterministic polynomial-time key update algorithm that takes a token t and a ciphertext c.sub.old, previously encrypted with sk.sub.old, and outputs a ciphertext c.sub.new that is encrypted with sk.sub.new. We write the syntax as c.sub.new=KeyUpd(t, c.sub.old).
(54) Definition 4: The key update protocol Σ is secure if for all uniform sk.sub.old and sk.sub.new output by Gen (1.sup.λ) and t output by TokGen, the probability of any efficient adversary A to recover either sk.sub.old or sk.sub.new by knowing t, c.sub.old and c.sub.new is negligible.
(55) III. Description of the SWHE Scheme
(56) In this section we propose a construction for an embodiment that aims to satisfy the definitions in Section II-A. But first, let us introduce our motivation and a couple of useful remarks and definitions.
(57) Motivation 1: We want to design a SWHE scheme that is secure based on the assumption that solving an underdetermined system of equations is computationally hard. In order to achieve this goal, we propose a design of an encryption function based on randomness and underdeterminancy. We want to transform a message m into a random multivector
(58) Motivation 2: We want to build an encryption scheme to be applied in a private (permissioned) blockchain among trusted parties. Thus, we are providing privacy in a trusted environment assuming that all the parties must follow a given protocol.
(59) Remark 1: Due to Motivation 2, we assume that a relaxed threat model is in place where the adversary is not supposed to have any knowledge about the message that originated a given ciphertext. This allows us to propose an experimental and compact solution to solve Problems 1 and 2, as well as allowing us to introduce and discuss instances of a new approach for expanding BT capabilities with HE.
(60) In Definition 1, the algorithm SDiv, for any useful result, might imply a fractional output. We will introduce a construction in which the encryption function receives positive integers as inputs and generates ciphertexts where the underlying computation is performed over the integers modulo a prime. Since Enc takes integers in Z.sub.q as input and generates ciphertexts also over integers in .sub.q, the decryption function is expected to output integers in
.sub.q. The algorithm Add performs homomorphic addition of ciphertexts and the decryption of the results is also an integer. However, in the specific case of SDiv, a ciphertext is divided by a scalar which might result in a non-integer rational number. The scalar division is performed over the integers, with the modular multiplicative inverse. In order to map the integer result of a scalar division to its corresponding rational representation we will use the Extended Euclidean Algorithm (EEA) according to Definition 5 below.
(61) Definition 5: Given a prime p and a positive integer c ∈ .sub.p,
(62) let the EEA be computed as follows:
(63) 1) Set a.sub.0=p, a.sub.1=c; b.sub.0=0, b.sub.1=1; i=1.
(64) 2) While a.sub.i>└√{square root over (p/2)}┘ compute q=└a.sub.i−1/a.sub.i┘ a.sub.i+1=a.sub.i−1−qa.sub.i b.sub.i+1=b.sub.i−1−qb.sub.i i=i+1.
(65) 3) a/b=a.sub.i/b.sub.i
(66) 4) Return a/b. We write the syntax as a/b=EEA (p, c).
(67) Now we are ready to introduce constructions that satisfy the definitions in II-A. Note that the following constructions take into account the bit size concerns of a computer program. Conceptually, the various embodiments are not limited by the bit size as the conceptual model may theoretically encompass infinite bit size values.
(68) Gen takes as input 1.sup.λ and proceeds as follows: (1) set b=λ/8; (2) let q be the smallest prime greater than 2b; (3) choose uniform 16 b-bit integers and define .sub.q.sup.3 such that the first 8 integers are the coefficients of
=
.sub.q.
(69) Enc takes as input sk=(
m.sub.12=|−m.sub.0−m.sub.1+m.sub.2+m.sub.3−m.sub.13+m.sub.23+m.sub.123+m|.sub.q. Eq. 5 2) For j ∈ {0, 1, 2, 3, 12, 13, 23, 123}, define
(70) Dec takes as input sk=(.sub.q.sup.3 and proceeds as follows: 1) Retrieve
m=|m.sub.0+m.sub.1−m.sub.2−m.sub.3+m.sub.12+m.sub.13−m.sub.23−m.sub.123|.sub.q. Eq. 7 4) Update the value m by mapping it to a rational format such that m=a/b=EEA(q, m). Output m.
(71) Add takes as inputpk and .sub.q.sup.3 and computes and outputs
(72) SDiv takes as inputpk, .sub.q.sup.3 and a scalar α in
.sub.q, and computes and outputs
(73) Lemma 1: For all uniformly generated coefficients of m.sub.j ∈ .sub.q, where j ∈ {0, 1, 2, 3, 12, 13, 23, 123}, q is prime, and for all m.sub.12 as defined in Eq. 5, the result in Eq. 7 holds.
(74) Proof Given the definition of m.sub.12 in Eq. 5, let's re-write Eq. 7 as m=m.sub.a+m.sub.b such that:
m.sub.a=m.sub.0+m.sub.1−m.sub.2−m.sub.3+m.sub.12, Eq. 8
m.sub.b=m.sub.13−m.sub.23−m.sub.123. Eq. 9
If we substitute for mu in Eq. 8 we have:
m.sub.a=m−m.sub.13+m.sub.23−m.sub.123, Eq. 10
so, when we compute m.sub.a+m.sub.b adding Eqs. 9 and 10 we obtain:
m.sub.a+m.sub.b=m−m.sub.13+m.sub.23+m.sub.123+m.sub.13−m.sub.23−m.sub.123=m, Eq. 11
(75) Lemma 2: For any prime q, any non-zero g ∈ .sub.q and any
.sub.q.sup.3, we have
(76) Proof. For any prime q, all non-zero elements g ∈ .sub.q have a unique modular multiplicative x=|g.sup.−1|.sub.q such that |gx|.sub.q=1. When we compute
.sub.q. According to the Bézout's identity, if gcd(g, q)=1, then we can write:
gx+gy=gcd(g, q)=1, Eq. 12
where x, y have integer solutions. We can then rewrite Eq. 12 as:
gx−1=(−y)q and gx≡1 mod q, Eq. 13
and, thus, x is the modular multiplicative inverse of g with respect to q.
(77) For small values of q one can naively compute x by iterating x from 1 to q−1 until finding the result that satisfies |gx|.sub.q=1. However, a better way is to use the Extended Euclidean Algorithm (EEA) which can efficiently compute modular multiplicative inverses for large values of g and q as long as gcd(g, q)=1.
(78) Theorem 1: For all sk output by Gen and m ∈ .sub.q, we have Dec(sk, Enc(sk, m))=m.
(79) Proof: Recall that in the definition of Gen, .sub.q, we obtain
(80) Lemma 3: For all a, b ∈ .sub.q that is transformed into Ā,
.sub.q.sup.3 according the first two steps in the Enc algorithm, decoding Ā+
.sub.q results in a+b and therefore the transformation of a, b into Ā,
(81) Proof: For all a, b ∈ .sub.q that are represented by Ā,
.sub.q.sup.3, respectively, where the coefficients of Ā,
.sub.q with the exception of a.sub.12 in Ā and b.sub.12 in
s.sub.12=a−a.sub.0−a.sub.1+a.sub.2+a.sub.3−a.sub.13+a.sub.23+a.sub.123+b−b.sub.0−b.sub.1+b.sub.2−b.sub.3−b.sub.13+b.sub.23+b.sub.123. Eq. 14
If we organize the coefficients of
s.sub.a=a.sub.0+b.sub.0−a.sub.2−a.sub.3+b.sub.0+b.sub.1−b.sub.2−b.sub.3 Eq. 15
s.sub.b=s.sub.12
s.sub.c=a.sub.13−a.sub.23−a.sub.123+b.sub.13−b.sub.23−b.sub.123,
we compute s.sub.a+s.sub.b to obtain:
(82)
so, essentially, computing s.sub.a+s.sub.b+s.sub.c gives s.sub.a+s.sub.b+s.sub.c=a+b.
(83) Lemma 4: Lemma 2 also applies to scalar multiplication and scalar division of all Ā, .sub.q.sup.3 by all scalar g ∈
.sub.q. Recall that scalar division by g mod q is a scalar multiplication by g.sup.−1 mod q. A multivector scalar multiplication holds the properties of additivity in the scalar and additivity in the (multi)vector [44], and, therefore, we have:
Āg+
(84) Lemma 5: For all prime q, .sub.u where u=└√{square root over (q/2)}┘, the following holds:
m/α=Dec(sk, SDiv(pk,
(85) Proof. On the encrypted domain, where computation is performed modulo q, for q is a prime, the scalar division of
(86) Due to Lemma 5, and since we assume that homomorphic scalar divisions will always occur, in order to guarantee the desired result of scalar divisions over encrypted data, we reduce the message space originally defined as .sub.q in Gen by
.sub.u, for u=└√{square root over (q/2)}┘.
(87) Theorem 2: For all (sk, pk) output Gen, .sub.q.sup.3, and m.sub.1, m.sub.2, α ∈
.sub.q, the following holds:
Dec(sk, Add(pk, Enc(sk, m.sub.1), α))=m.sub.1.Math.α. Eq. 19
and
Dec (sk, Add(pk, Enc(sk, m.sub.1), Enc(sk, m.sub.2)))=m.sub.1+m.sub.2, Eq. 20
(88) Proof Given m.sub.1, m.sub.2, α ∈ .sub.q, sk=(
.sub.q.sup.3 as follows:
We compute
(89)
By applying Lemma 3 ad 4, we obtain m.sub.1+m.sub.2. Similarly, let
(90)
By applying Lemma 3, 4 and 5, we obtain m.sub.1/α.
(91) Theorem 3: If an adversary can efficiently solve a system of equations with 8 non-redundant equations and 24 unknowns then
can efficiently recover m from
(92) Proof: Let a multivector Ā ∈ .sub.q.sup.3 be written as:
Ā=A
.sub.0+
A
.sub.1+
A
.sub.2+
A
.sub.3 Eq. 24
where <.Math.>.sub.i, for i ∈ {0,1,2,3}, is called a multivector grade. Grades 0 and 3 contain a single element each and grades 1 and 2 contain three elements each, for a total of 8 elements.
(93) Given .sub.q.sup.3 such that
.sub.i. Similarly, if one wants to recover
.
(94) Assuming the adversary only knows
.sub.i=
.sub.i, Eq. 25
where each element of is faced with a total of 24 unknown variables. This means that the system of equations the adversary needs to solve in order to recover
(95) Lemma 6: The proposed SWHE Scheme is secure assuming that no adversary can efficiently solve (that is, solve under polynomial-time) an underdetermined system of equations which its underdeterminancy is not affected by the number of ciphertexts samples under consideration.
(96) Proof: Given .sub.i=
.sub.i Eq. 26
.sub.i=
.sub.i
.sub.i=
.sub.i.
The system would then have a total of 24 equations (8 for each cyphertext) and 32 unknowns if solving for both .sub.i=
.sub.i=
.sub.i+
.sub.i=
.sub.i, Eq. 27
i.e., the 8 equations with respect to the elements of .sub.i=
.sub.i=
.sub.i=α.sup.−1
.sub.i. Eq. 28
Notice that the equations for the elements of
(97) IV. Description of the Key Update Protocol
(98) In this section we propose a construction that aims to satisfy the definitions presented in Section II-B.
(99) Motivation 3: We want to design a key update protocol that securely allows one to update the secret key of an existing ciphertext without revealing the corresponding message, the old key or the new key, also based on the assumption that solving a non-redundant underdetermined system of equation is computationally hard. In order to achieve this goal, we propose a design for a protocol based on underdeterminancy. From the old and the new key, we want to generate a token that is expected to not reveal information about either the old or the new key. Once the token is generated, one should be able to use it for changing the keys on an existing ciphertext under the old key, generating a new ciphertext under the new key. In this process, one should not be able to derive the underlying plaintext message.
(100) TokGen takes as input two secret keys sk.sub.1=(
(101) KeyUpd takes as input the token t=(
(102) Theorem 4: For all sk.sub.1 and sk.sub.2 output by Gen, and all
it holds:
(103) Proof Given the setup in Theorem 4, we verity that:
(104)
(105) Theorem 5: If an adversary A can efficiently solve a system of equations with more unknowns than non-redundant equations then A can efficiently recover m from
(106) Proof Given a token t=(, with knowledge of
.sub.i=
.sub.i Eq. 31
.sub.i=
.sub.i
.sub.i=
.sub.i
.sub.i=
K.sub.12.sup.−1
.sub.i
Notice that this system of equations contains a total of 32 equations (8 equations for each of the multivectors
(107) Theorem 6: If an adversary can efficiently solve a system of equations with 8 non-redundant equations and 16 unknowns then
can efficiently recover sk.sub.1 or sk.sub.2 from t.
(108) Proof: The proof of Theorem 6 can be borrowed from the proof of Theorem 5, as the same system of equations and its characteristics apply in this case.
(109) V. Application
(110) In order to provide practical insights on how to connect the proposed constructions to a real-world DLT-based system, we introduce an illustrative design where we apply our SWHE scheme and key update protocol. In our example we describe an instance of the data ownership problem, where regulatory restrictions reduce the solution space for data computation. Due to space limitations, we cannot fully describe the system internals in all its details (i.e., consensus mechanism for persisting data), so we will provide a minimally required high level description of its building blocks.
(111) Motivation 4: $300 billion out of more than $1.7 trillion are spent annually on medical research alone [47] and advancements depend on the reproducibility of experiments and the scientific correctness underlying it. Moreover, healthcare systems operate under strict regulations [48] in order to protect the secrecy of patients, resulting in a very siloed industry [18]. In such scenario, blockchain technologies have the potential to mediate the access to healthcare data [49], avoiding power imbalance over digital assets. With the addition of HE, a DLT system can protect the privacy of individuals' Electronic medical records (EMRs) while offering compliant analysis over their data.
(112) Definition 6: This blockchain application is composed by the building blocks described in the following paragraphs.
(113) User .sub.A: The original data owner. Responsible for persisting information on-chain and the one that decides when and to whom the ownership is transferred.
(114) User .sub.B: An existing user from the same consortium of User
.sub.A. User
.sub.B has access to the off-chain cryptographic library and can perform homomorphic computations on-chain at any time. User
.sub.B is interested in getting insights of data processed at the blockchain.
(115) App component .sub.c: Software that works as an interface between the user and the SWHE scheme and the key update protocol. The App component
.sub.C imports the algorithms Gen, Enc, and Dec from the SWHE scheme and the TokGen from the key update protocol.
(116) Blockchain component .sub.C: A system composed by the ledger (the blockchain database) and a smart contract that controls the access to the ledger. The smart contract imports Add, SDiv from the SWHE scheme and KeyUpd from the key update protocol.
(117) Definition 7: .sub.C is a tuple with the following efficient algorithms: NewRecord, GetRecords, GenReport, GenResult, GetReport and GetResult, as described in the following paragraphs.
(118) GenReport generates a report calculating the median of a given number of records. We write the syntax as GenReport (id.sub.SLedger), and operates as follows: 1) First, GetRecords is called, retrieving the records represented by ids.sub.Ledger; 2) Then, Add operates the addition of multivectors inside the records returned by GetRecords; 3) SDiv takes all summed multivectors given by Add and divides by the number of records returned by GetRecords; and, 4) Finally NewRecord is used to persist the aggregated data.
(119) GenResult takes as input an id, id.sub.Ledger, and the generated tokens t to update the keys of a report. We write the syntax as GenResult (id.sub.Ledger; t), and operates as follows: 1) First, GetReport is called, retrieving the report of id.sub.Ledger; 2) Second, KeyUpd is used to change the keys of report id.sub.Ledger; and, 3) Finally, NewRecord is used to persist the resulting data.
(120) GetResult takes as input id.sub.Ledger and retrieves a report that had its keys updated. We use the syntax GetResult (id.sub.Ledger).
(121) Example 1: In our example, .sub.A represents a hospital that owns patients' records.
.sub.8 stands for a research institution that wants to make analysis over patients' data. The medical industry runs under strict regulation and health institutions are forbidden to share personal information from individuals. However, a disease outbreak urged the aforementioned organizations to cooperate. Therefore, the hospital agreed to share information under a security protocol, that could lead to a better triage of patients and, perhaps, a path to a cure.
(122) In the DLT environment, both institutions will have a copy of the data, but their ownership is tied to their keys. Since the smart contract is using a SWHE scheme, computations can be performed homomorphically by .sub.B and the property over the resulting analysis can be transferred through the key update protocol by
.sub.A.
(123) .sub.8 wants to calculate the average number of pre-existing conditions of every patient that died from the new illness. Therefore,
.sub.8 generates a report over a selection of expired patients. Then,
.sub.A analyzes the result and decides to grant permission. To do so, a symmetric key is shared with
.sub.8 through a traditional key exchange protocol. Now,
.sub.A updates the keys of the report, allowing
.sub.8 to finally detect a high number of pre-existing conditions in patients that did not recover.
(124) VI. Conclusions
(125) Through practical constructions of the various embodiments the realization of a somewhat homomorphic encryption (SWHE) scheme is demonstrated and a key update protocol as a strategy for expanding the current capabilities of blockchain technologies (BT) is also demonstrated. With a very small set of elementary functions found in Clifford geometric algebra, the various embodiments are able to provide simple and yet efficient cryptographic protocols to equip BT with a homomorphic smart contract. Without violating current business logic constraints in BT, one can use constructions of the various embodiments to homomorphically analyze encrypted data, generate reports and transfer the data ownership without compromising the original key holder's and/or third parties' privacy. The disclosure further provides evidence of the various embodiments' proposed algorithms' correctness as well as the security properties the algorithms carry, under some strong assumptions such as the attacker's knowledge restricted to public information.
(126) Hardware Implementation for Data Concealment Embodiments (
(127)
(128) Further, generally, any computing device capable of communication over any form of electronic network or bus communication platform may be one, two or all three of the node devices 104-108 shown in
(129) Various embodiments may implement the network/bus communications channel for the blockchain system 102 using any communications channel capable of transferring electronic data. For instance, the network/bus communication connection may be an Internet connection routed over one or more different communications channels during transmission between the node devices 104-108 and the blockchain system 102. Likewise, the network/bus communication connection may be an internal communications bus of a computing device, or even the internal bus of a processing or memory storage Integrated Circuit (IC) chip, such as a memory chip or a Central Processing Unit (CPU) chip. The network/bus communication channel may utilize any medium capable of transmitting electronic data communications, including, but not limited to: wired communications, wireless electro-magnetic communications, fiber-optic cable communications, light/laser communications, sonic/sound communications, etc., and any combination thereof of the various communication channels.
(130) The various embodiments may provide the control and management functions detailed herein via an application operating on the node computing devices 104-108. The node computing devices 104-108 may each be a computer or computer system, or any other electronic devices device capable of performing the communications and computations of an embodiment. The node computing devices 104-108 may include, but are not limited to: a general purpose computer, a laptop/portable computer, a tablet device, a smart phone, an industrial control computer, a data storage system controller, a CPU, a Graphical Processing Unit (GPU), an Application Specific Integrated Circuit (ASI), and/or a Field Programmable Gate Array (FPGA). Notably, the first 102 and/or second 104 computing devices may be the storage controller of a data storage media (e.g., the controller for a hard disk drive) such that data delivered to/from the data storage media is always encrypted so as to limit the ability of an attacker to ever have access to unencrypted data. Embodiments may be provided as a computer program product which may include a computer-readable, or machine-readable, medium having stored thereon instructions which may be used to program/operate a computer (or other electronic devices) or computer system to perform a process or processes in accordance with the various embodiments. The computer-readable medium may include, but is not limited to, hard disk drives, floppy diskettes, optical disks, Compact Disc Read-Only Memories (CD-ROMs), Digital Versatile Disc ROMS (DVD-ROMs), Universal Serial Bus (USB) memory sticks, magneto-optical disks, ROMs, random access memories (RAMs), Erasable Programmable ROMs (EPROMs), Electrically Erasable Programmable ROMs (EEPROMs), magnetic optical cards, flash memory, or other types of media/machine-readable medium suitable for storing electronic instructions. The computer program instructions may reside and operate on a single computer/electronic device or various portions may be spread over multiple computers/devices that comprise a computer system. Moreover, embodiments may also be downloaded as a computer program product, wherein the program may be transferred from a remote computer to a requesting computer by way of data signals embodied in a carrier wave or other propagation medium via a communication link (e.g., a modem or network connection, including both wired/cabled and wireless connections).
(131) Operational Flow Chart for Encryption/Decryption and SWHE Calculations for an Embodiment (
(132)
(133) Operational Flow Chart for Key Update Operation for an Embodiment (
(134)
(135) The foregoing description of the invention has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise form disclosed, and other modifications and variations may be possible in light of the above teachings. The embodiments were chosen and described in order to best explain the principles of the invention and its practical application to thereby enable others skilled in the art to best utilize the invention in various embodiments and various modifications as are suited to the particular use contemplated.