Cryptosystem and method using isogeny-based computations to reduce a memory footprint

Abstract

A computer processing system and method for reducing memory footprint that includes initiating, through at least one computer processor, a cryptography session utilizing an i-degree isogeny arithmetic computation having chained computations therein. The cryptography session includes implementing a first iteration cycle, of a plurality of iteration cycles, and a implementing a remaining amount of the plurality of iteration cycles, each of the plurality iteration cycles computing custom character isogenies using a compressed Z value to complete the -degree isogeny arithmetic computation. The first iteration cycle includes individually computing a plurality of sequentially occurring pivot points within the chained computations, implementing a Co—Z algorithm within the plurality of sequentially occurring pivot points to compute and store the compressed Z value on one of the plurality of temporary registers and computing a first custom character isogeny of the -degree isogeny arithmetic computations using the compressed Z value.

Claims

1. A computer-implemented cryptography method using isogeny-based computations for reducing a memory footprint comprising the steps of: providing at least one computer processor resident on an electronic computer device and having at least one register file with a plurality of temporary registers resident thereon; and initiating, through the least one computer processor, a cryptography session utilizing an custom character -degree isogeny arithmetic computation having chained computations therein, wherein the cryptography session includes: implementing a first iteration cycle, of a plurality of iteration cycles, that includes: individually computing a plurality of sequentially occurring pivot points within the chained computations; implementing a Co—Z algorithm within the plurality of sequentially occurring pivot points to compute a compressed Z value; storing the compressed Z value generated from the Co—Z algorithm on one of the plurality of temporary registers; and computing a first custom character isogeny of the -degree isogeny arithmetic computations using the compressed Z value; and implementing a remaining amount of the plurality of iteration cycles, each of the plurality iteration cycles computing isogenies using the compressed Z value to complete the -degree isogeny arithmetic computation.

2. The method according to claim 1, wherein the first iteration cycle further comprises: implementing the Co—Z algorithm alternatively within the plurality of sequentially occurring pivot points to compute the compressed Z value.

3. The method according to claim 2, wherein: the custom character -degree isogeny arithmetic computation is carried out with a multiplication-based complexity of O(e log e).

4. The method according to claim 1, wherein the first iteration cycle further comprises: exclusively storing the compressed Z value generated from the Co—Z algorithm on the one of the plurality of temporary registers throughout the plurality of the iteration cycles.

5. The method according to claim 1, further comprising: updating the compressed Z value generated by the Co—Z algorithm for each of the plurality of sequentially occurring pivot points after computing the first custom character isogeny of the -degree isogeny arithmetic computations.

6. The method according to claim 5, wherein the first iteration cycle further comprises: utilizing a single compressed Z value generated by the Co—Z algorithm and stored by the one of the plurality of temporary registers, wherein the Z value is updated sequentially after individually computing a plurality of sequentially occurring pivot points within the chained computations.

7. The method according to claim 6, wherein: the Z value is updated after computing a first custom character isogeny of the -degree isogeny arithmetic computations.

8. The method according to claim 6, wherein: the compressed Z value, Z, in Co—Z algorithm is at least partially defined by a mathematical formula, Z=Z×Z.sub.tmp, wherein X.sub.k=X.sub.k×Z.sub.tmp.

9. The method according to claim 8, wherein: the first custom character isogeny is at least partially defined by a mathematical formula, X.sub.k=X.sub.k×Z.sub.new.

10. A computer processing cryptosystem using isogeny-based computations to reduce a memory footprint comprising: at least one computer processor resident on an electronic computer device and having a register file with at least two temporary registers resident therein, the at least one computer processor operably configured to implement an custom character -degree isogeny arithmetic computation, utilizing a Co—Z algorithm, having chained computations that include a plurality of sequentially occurring pivot points, the plurality of sequentially occurring pivot points including the at least one computer processor operably configured to implement the Co—Z algorithm to perform a computation generating a compressed Z value that is operably configured to be stored within one of the at least two temporary registers for computing within the plurality of sequentially occurring pivot points to generate a custom character -degree isogeny arithmetic computation isogeny computation output.

11. The computer processing cryptosystem according to claim 10, wherein: the at least one computer processor is operably configured to implement the Co—Z algorithm over a first iteration cycle of a plurality of iteration cycles, the first iteration cycle including the processor operably configured to compute a first custom character isogeny of the -degree isogeny arithmetic computations using the compressed Z value.

12. The computer processing cryptosystem according to claim 11, wherein: the at least one computer processor is operably configured, after first iteration cycle, to implement a remaining amount of the plurality of iteration cycles, each of the plurality iteration cycles configured to compute another custom character isogeny.

13. The computer processing cryptosystem according to claim 11, wherein the first iteration cycle further comprises: the at least one computer processor operably configured to implement the Co—Z algorithm alternatively within the plurality of sequentially occurring pivot points to compute the compressed Z value.

14. The computer processing cryptosystem according to claim 11, wherein: the custom character -degree isogeny arithmetic computation is carried out with a multiplication-based complexity of O(e log e).

15. The computer processing cryptosystem according to claim 10, wherein: the compressed Z value, Z, in Co—Z algorithm is at least partially defined by a mathematical formula, Z=Z×Z.sub.tmp, wherein X.sub.k=X.sub.k×Z.sub.tmp.

16. The computer processing cryptosystem according to claim 10, wherein: the first custom character isogeny is at least partially defined by a mathematical formula, X.sub.k=X.sub.k×Z.sub.new.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

(1) The accompanying figures, where like reference numerals refer to identical or functionally similar elements throughout the separate views and which together with the detailed description below are incorporated in and form part of the specification, serve to further illustrate various embodiments and explain various principles and advantages all in accordance with the present invention.

(2) FIG. 1 is a process flow diagram depicting a method of using isogeny-based computations to reduce a memory footprint in a processing cryptosystem in accordance with one embodiment of the present invention;

(3) FIG. 2 is a schematic diagram depicting a large degree isogeny employing the use of the Co—Z algorithm in accordance with an embodiment of the present invention;

(4) FIG. 3 is a large-degree isogeny computation graph of custom character .sup.6 in accordance with an embodiment of the present invention; and

(5) FIG. 4 is a schematic diagram depicting a register file resident on a processor having a plurality of processor registers in accordance with an embodiment of the present invention.

DETAILED DESCRIPTION

(6) While the specification concludes with claims defining the features of the invention that are regarded as novel, it is believed that the invention will be better understood from a consideration of the following description in conjunction with the drawing figures, in which like reference numerals are carried forward. It is to be understood that the disclosed embodiments are merely exemplary of the invention, which can be embodied in various forms.

(7) The present invention provides a novel and efficient computer processing cryptosystem using isogeny-based computations to reduce a memory footprint. More specifically, the system and method is directed toward a cryptosystem that utilizes a Co—Z isogeny-based algorithm and computations to compress the required storage necessary for efficiently computing large-degree isogeny computations. This approach is slightly slower than the optimal speed method, but requires roughly half the amount of memory. One goal of the present invention is to achieve performance, yet reserve the memory footprint, when utilizing these large-scale computations in isogeny-based cryptosystems.

(8) More specifically, as shown in FIGS. 1-3, the present invention enables isogeny-based arithmetic to be performed in hardware and software implementations very quickly, yet significantly reduces the number of temporary registers needed to hold intermediate values in isogeny-based arithmetic computations. As such, the present invention enables low-power and energy-efficient implementations of isogeny-based cryptography to be achieved, which is significantly beneficial when applied to the Internet of Things (IoT) industry and devices.

(9) FIGS. 2 and 3 will be described in conjunction with the process flow chart of FIG. 1. Although FIG. 1 shows a specific order of executing the process steps, the order of executing the steps may be changed relative to the order shown in certain embodiments. Also, two or more blocks shown in succession may be executed concurrently or with partial concurrence in some embodiments. Certain steps may also be omitted in FIG. 1 for the sake of brevity. In some embodiments, some or all of the process steps included in FIG. 1 can be combined into a single process. FIG. 2 depicts a schematic diagram showing a large-degree isogeny employing the use of the Co—Z algorithm in accordance with an embodiment of the present invention. While the figures depicted herein show several advantageous features of the present invention, as will be described below, the invention can be implemented in other ways and by utilizing other components and functionality.

(10) As discussed above, the fastest known naive method is to store critical “pivot” points throughout the isogeny-based computation to avoid computing unnecessary information. This pivoting system is represented in FIGS. 2 and 3. However, the Co—Z algorithm carries out a compression of the Z coordinates of many points within the pivoting system into a single Z coordinate. Therefore, the Co—Z algorithm can use the fast-naive method with approximately half the memory overhead, thereby greatly aiding low-memory hardware and software implementations that seek to implement isogeny-based computations.

(11) An elliptic curve defined over a finite field, F.sub.q, can be written in its short Weierstrass form as: E.sub.(a,b)/F.sub.q:y.sup.2=x.sup.3+ax+b, where a,b∈F.sub.q. An elliptic curve is composed of all points (x,y) that satisfy the above equation as well as the point at infinity. This forms an abelian group over point addition, the underlying basis of the scalar point multiplication in elliptic curve Diffie-Hellman, Q=kP, where P,Q∈E and k is a scalar. By using abstract geometry to define point addition and doubling formulas, those of skill in the art will appreciate that a scalar point multiplication can be efficiently performed by performing a series of point doublings and additions. However, instead of performing affine point addition and affine point doubling for a scalar point multiplication, we define projective formulas over projective coordinates (X:Y:Z) such that x=X/Z and y=Y/Z. With this representation, only a single inversion is performed at the end of the scalar point multiplication.

(12) Furthermore, an isogeny is defined over a finite field, F.sub.q, φ:E.fwdarw.E′ as a non-constant rational map defined over F q such that φ satisfies group homomorphism from E(F.sub.q) to E′(F.sub.q). An isogeny can be thought of as a mapping from one elliptic curve class to another that preserves the point at infinity. Two curves are isogenous if an isogeny exists between them. Specifically, for two elliptic curves to be isogenous over a finite field, they must have the same number of points. The degree of an isogeny is its degree as a rational map. For every prime, custom character ≠p, there exist +1 isogenies of degree from a specific isomorphism class. Unique isogenies can be computed over a kernel, κ, such that φ:E.fwdarw.E/κ by using Vélu's formulas.

(13) With reference to FIG. 3, a large-degree isogeny computation graph of custom character .sup.6 is depicted. In particular, the left side of the graph depicts a multiplication-based strategy with complexity O(e.sup.2) and the right side of the graph depicts an optimal strategy with complexity O(e log e) where the cost of a point multiplication by is 1.5 times as expensive as an isogeny evaluation (1.5:1). Large-degree isogenies can be broken into a chain of smaller degree isogeny computations that are computed iteratively. From a base curve E.sub.0 and kernel point R.sub.0=R of order custom character .sup.e, a chain of -degree isogenies can be computed as follows:
E.sub.i+1=E.sub.i/.sup.e-i-1R.sub.1,φ.sub.i:E.sub.i.fwdarw.E.sub.i+1,R=φ.sub.i(R.sub.i).

(14) A problem with said computation, however, can be visualized as an acyclic graph, shown on the left side in FIG. 3. The naive (but memory-efficient) approach is on the left with quadratic time complexity. This method would only use two temporary registers in this example (point R.sub.k). By utilizing the relative cost to traverse the graph, an optimal strategy of traversal can be computed, with time complexity O(e log e). This faster method would then use 6 temporary registers (points R.sub.0, custom character .sup.2 R.sub.0, and .sup.4 R.sub.0 as we traverse the left side of the pyramid).

(15) To further elaborate on these costs, it may be emphasized within an elliptic curve, having elliptic curve points represented with an X and a Z coordinate in the “Montgomery Kummer” coordinates form. Each coordinate requires 2F.sub.q elements to be stored. The naive approach on the left in FIG. 3 requires only a single temporary point which is 2 temporary registers. The process would be carried out as follows:

(16) 1. Start at the top of the pyramid with R.sub.0

(17) 2. Store R.sub.0 as a temporary point

(18) 3. Compute custom character .sup.5 R.sub.0 with algebraic geometry (point arithmetic)

(19) 4. Perform an custom character -isogeny with .sup.5 R.sub.0 as the kernel (more point arithmetic)

(20) 5. Push the temporary point R.sub.0 through the isogeny to obtain R.sub.1

(21) The computation is iteratively performed with R.sub.0, then R.sub.1, all the way to R.sub.5, wherein each step only needs to store one temporary point.

(22) The optimal performance strategy on the right in FIG. 3, however, is different, in that multiple pivot points are stored along the way. The process would be carried out as follows:

(23) 1. Start at the top of the pyramid with R.sub.0

(24) 2. Store R.sub.0 as a temporary point

(25) 3. Compute custom character .sup.5 R.sub.0 with algebraic geometry (point arithmetic) while also storing temporary pivot points .sup.2 R.sub.0 and .sup.4 R.sub.0 for efficiency

(26) 4. Perform an custom character -isogeny with .sup.5 R.sub.0 as the kernel (more point arithmetic)

(27) 5. Push the all temporary points R.sub.0, custom character .sup.2 R.sub.0, and .sup.4 R.sub.0 through the isogeny to obtain R.sub.1, .sup.2 R.sub.1, and .sup.4 R.sub.i, respectively.

(28) The optimal strategy is much more efficient, but requires a lot more storage. For higher key sizes, the depth of the isogeny computation graph is deeper, resulting in the need to store more temporary points for efficient computations.

(29) In Algorithm 1, below, additional steps to implement these formulas in the present invention, and on hardware and software devices, are depicted.

(30) Algorithm 1 Computing a large degree isogeny using a strategy with Co—Z arithmetic. Our patent applications applies the boxed lines to greatly reduce the memory footprint, of this computation.

(31) Input: Isogeny degree e in custom character .sup.e,

(32) A lookup table of size e with

(33) the optimal strategy for S,

(34) Kernel point (X.sub.0, Z.sub.0),

(35) Montgomery carve: A.sub.0y.sup.2=x.sup.3+B.sub.0x.sup.2+x,

(36) Stack structure composed of (X.sub.i, s.sub.i)

(37) Output: Isogenous curve A.sub.ey.sup.2=x.sup.3+B.sub.ex.sup.2+x

(38) 1. pt_stack=┌┐. X.sub.R=X.sub.0, Z=Z.sub.0

(39) 2. for j=1 to e do

(40) 3. while i<e−j do

(41) 4. for P at index k in pt_stack do

(42) 5. X.sub.k=X.sub.k×Z.sub.tmp

(43) 6. Z=Z×Z.sub.tmp

(44) 7. end for

(45) 8. Push (X.sub.R,s.sub.i) to the top of the pt_stack

(46) 9. index S[e−1−j+1]

(47) 10. (X.sub.R,Z.sub.tmp)= custom character ×index×(X.sub.R,Z),

(48) 11. i=i+index

(49) 12. end while

(50) 13. A.sub.j, B.sub.j=gel_ custom character _isogeny (Z.sub.k,Z)

(51) 14. Z.sub.new=1

(52) 15. for X at index kin pt_stack do

(53) 16. X.sub.k,Z.sub.tmp2=eval_ custom character _isogeny (X.sub.k, Z)

(54) 17. X.sub.k=X.sub.k×Z.sub.new

(55) 18. for X at index s<kin pt_stack do

(56) 19. X.sub.s=X.sub.s×Z.sub.tmp2)

(57) 20. end for

(58) 21. Z.sub.new=Z.sub.new×Z.sub.tmp2

(59) 22. end for

(60) 23. Pop top element (X, i.sub.s) from pt_stack

(61) 24. X.sub.R=X, Z=Z.sub.new, i=i.sub.s

(62) 25. end for

(63) 26. return A.sub.ey.sup.2=x.sup.3+B.sub.ex.sup.2+x

(64) The above-referenced algorithm exemplifies and represents the computing of a large degree isogeny using the Co—Z arithmetic algorithm. The portion of the above-referenced algorithm depicted with rectangles in lines 4-7 of the above code represents the computations carried out in each of the triangles (i.e., R.sub.0, I.sup.2R.sub.0, etc.) shown in the right of FIG. 3, wherein the triangles represent where pivot points are stored. The portion of the above-referenced algorithm depicted with rectangles in lines 17-21 of the above code represents the computations carried out in each of the squares (i.e., I.sup.5R.sub.0, I.sup.4R.sub.1, etc.) shown in the right of FIG. 3, wherein the triangles represent where pivot points are stored. As such, the above-referenced algorithm applies the rectangular boxed lines of code to greatly reduce the memory footprint of the overall computation.

(65) The above process may also be represented in the process flow diagram of FIG. 1, in association with FIGS. 3-4, wherein the process may begin at step 100, and then immediately proceeds to step 102 of providing at least one computer processor 400 resident on an electronic computer device and having at least one register file 402 with a plurality of temporary registers 404a-n resident thereon, wherein “n” represents any number greater than one. The register file 402 is resident, housed on, and/or operably connected to at least one computer processor (represented schematically as element 400).

(66) As those of skill in the art will appreciate, a register file is an array of processor registers in a central processing unit (CPU). In one embodiment, the processor registers 404a-n are circuit-based register files and may be implemented by way of fast static random-access memories (RAMs) with multiple ports. Such RAMs are distinguished by having dedicated read and write ports, whereas ordinary multi-ported RAMs will usually read and write through the same ports. In other embodiments, the processor registers 404a-n may be implemented by way of fast dynamic RAMs. The instruction set architecture of a CPU will almost always define a set of registers which are used to stage data between memory and the functional units on the chip. In simpler CPUs, these architectural registers correspond one-for-one to the entries in a physical register file (PRF) within the CPU. More complicated CPUs use register renaming, so that the mapping of which physical entry stores a particular architectural register changes dynamically during execution.

(67) The process may continue to step 104 of initiating, through the least one computer processor 400, a cryptography session utilizing an custom character -degree isogeny arithmetic computation having chained computations therein. The cryptography session may include iteration cycles (wherein the first iteration cycle is represented schematically in FIG. 3 as numeral 300). The process may then proceed to step 106 of carrying out a first iteration cycle, of a plurality of iteration cycles, wherein the first iteration cycle may include sub-steps 106a-n (represented with dashed lines in FIG. 1).

(68) More specifically, step 106a may include individually computing a plurality of sequentially occurring pivot points within the chained computations. Step 106b may include implementing a Co—Z algorithm within the plurality of sequentially occurring pivot points to compute a compressed Z value. As discussed above, the process 400 may also be operably configured to implement the Co—Z algorithm alternatively within the plurality of sequentially occurring pivot points to compute the compressed Z value. Step 106c may include storing the compressed Z value generated from the Co—Z algorithm on one of the plurality of temporary registers 404a-n.

(69) Additionally, the compressed Z value generated from the Co—Z algorithm on the one of the plurality of temporary registers 404a-n may be exclusively stored throughout the plurality of the iteration cycles

(70) Step 106n may include computing a first custom character isogeny of the -degree isogeny arithmetic computations using the compressed Z value. Thereafter, step 108 will include implementing a remaining amount of the plurality of iteration cycles, wherein each of the plurality iteration cycles computing isogenies using the compressed Z value to complete the e-degree isogeny arithmetic computation. The process may terminate in step 110.

(71) It should be emphasized that the optimal strategy is necessary for any implementation of isogeny-based cryptography as it takes much longer (more than 10 times slower for an already slow algorithm). The number of temporary registers, e.g., registers 404a-n, are a result of the pivot points that are needed for the optimal strategy. In one such implementation, we used 6 pivot points. The naive storage of these 6 pivot points is of the form (X.sub.0, Z.sub.0), (X.sub.1, Z.sub.1) . . . (X.sub.5, Z.sub.5) where the actual point's x-coordinate is equal to x.sub.i=X.sub.i/Z.sub.i. We propose to merge each of these Z.sub.i values to a single Z so that the 6 pivot points become (X.sub.0′, Z), (X.sub.1′, Z) . . . (X.sub.5′, Z). We can transform to this representation with several multiplications: X.sub.0′=X.sub.0 Z.sub.1 Z.sub.2 Z.sub.3 Z.sub.4 Z.sub.5, Z=Z.sub.0 Z.sub.1 Z.sub.2 Z.sub.3 Z.sub.4 Z.sub.5. It is clear that x.sub.0=X.sub.0/Z.sub.0=X.sub.0′/Z. This does add several additional field multiplications throughout the large-degree isogeny, which will add 15% additional cycles for the large degree isogeny computation. This further reduces 12 registers held by 6 pivot points to only 7 registers

(72) In one exemplary Co—Z algorithmic example, in order to visualize the savings from the above-described process, a side-by-side comparison of the total memory units needed for a large-degree isogeny is depicted in Table 1 (below). In this toy example, only a maximum of three temporary points is stored. The naive method on the left stores all temporary point information requiring a total of 6 temporary registers. The present invention, however, “compresses” the point information by sharing a common denominator to reduce the total memory consumption to 4 registers. As the number of temporary points increases, the memory savings from this approach reaches 50%.

(73) TABLE-US-00001 TABLE 1 Naive Patent Operation R.sub.0 R.sub.1 R.sub.2 R.sub.3 R.sub.4 R.sub.5 R.sub.0 R.sub.1 R.sub.2 R.sub.3 Copy input point X.sub.0 Z.sub.0 X.sub.0 Z.sub.0 Store temp point 1 X.sub.0 Z.sub.0 X.sub.1 Z.sub.1 X.sub.0′ X.sub.1′ .sub. Z.sub.01 Store temp point 2 X.sub.0 Z.sub.0 X.sub.1 Z.sub.1 X.sub.2 Z.sub.2 X.sub.0″ X.sub.1″ X.sub.2″ .sub. Z.sub.012 Compute isogeny φ X.sub.0 Z.sub.0 X.sub.1 Z.sub.1 X.sub.2 Z.sub.2 X.sub.0″ X.sub.1″ X.sub.2″ .sub. Z.sub.012 Push temp points φ(X.sub.0) φ(Z.sub.0) φ(X.sub.1) φ(Z.sub.1) φ(X.sub.2) φ(Z.sub.2) φ(X.sub.0) φ(X.sub.1) φ(X.sub.2) φ(Z.sub.012) through isogeny φ Use (φ(X2), φ(Z2)) φ(X.sub.0) φ(Z.sub.0) φ(X.sub.1) φ(Z1) φ(X.sub.0) φ(X.sub.1) φ(Z.sub.012) as next input point

(74) Rather than using two registers per point in storing intermediate information, the present invention would compress the number of registers to one register per point plus an additional register as overhead.

Cryptosystem and method using isogeny-based computations to reduce a memory footprint

Assignee

Inventors

Cpc classification

Classification Explorer

H04L9/0838

ELECTRICITY

Classification Explorer

H04L9/3066

ELECTRICITY

Classification Explorer

H04L9/0637

ELECTRICITY

Classification Explorer

H04L9/0631

ELECTRICITY

Classification Explorer

H04L2209/30

ELECTRICITY

Classification Explorer

G06F7/725

PHYSICS

Classification Explorer

H04L9/50

ELECTRICITY

International classification

Classification Explorer

H04L9/30

ELECTRICITY

Classification Explorer

G06F7/72

PHYSICS

Classification Explorer

H04L9/06

ELECTRICITY

Abstract

Claims

Description