Optimized hardware architecture and method for ECC point doubling using Jacobian coordinates over short Weierstrass curves
09929862 ยท 2018-03-27
Assignee
Inventors
Cpc classification
G09C1/00
PHYSICS
H04L9/3066
ELECTRICITY
International classification
H04L9/00
ELECTRICITY
G09C1/00
PHYSICS
Abstract
An optimized hardware architecture and method introducing a simple arithmetic processor that allows efficient implementation of an Elliptical Curve Cryptography point doubling algorithm for Jacobian coordinates. The optimized architecture additionally reduces the required storage for intermediate values.
Claims
1. A data cryptographic apparatus comprising: computational logic configured to perform an elliptic curve cryptography (ECC) point doubling operation using Jacobian coordinates over a short Weierstrass curve of the form y=x.sup.3+ax+b where a=3; a register memory configured to store a point in the Jacobian coordinates, wherein the register memory is configured for two temporary storage variables, T.sub.1 and T.sub.2; a modular multiplier electrically coupled to the register memory, wherein the modular multiplier is configured to perform at most one modular multiplication for each step in a sequence of steps in the ECC point doubling operation; and a simple arithmetic processor configured; to perform modular subtraction, modular multiplication by two, and modular division by two in support of the ECC point doubling operation utilizing four logical one bit left shifters, an adder, and a subtractor to output either 3C, AB/2, AB, A2B, or 2AB for an input of variables A, B, and C, wherein the simple arithmetic processor is electrically coupled to the register memory and the modular multiplier, wherein the simple arithmetic processor outputs a result of the ECC point doubling operation in the Jacobian coordinates.
2. A mobile device comprising the apparatus of claim 1.
3. A smartcard comprising the apparatus of claim 1.
4. The mobile device of claim 2, wherein the mobile device is a smartphone.
5. A method for performing an elliptic curve cryptography (ECC) point doubling operation using Jacobian coordinates over a short Weierstrass curve of the form y=x.sup.3+ax+b where a=3, comprising: accepting, as variable input at a simple arithmetic processor of a computational device, a point in the Jacobian coordinates; configuring the simple arithmetic processor of the computational device for modular subtraction, modular division by two, and modular multiplication by two utilizing four logical one bit left shifters, an adder, and a subtractor, wherein an input of variables A, B, and C to the simple arithmetic processor results in an output of either 3C, AB/2, AB, A2B, or 2AB; enabling, a modular multiplier of the computational device to execute a sequence of steps to perform the ECC point doubling operation of the point in the Jacobian coordinates, wherein the modular multiplier performs at most one modular multiplication for each step in the sequence of steps, wherein the sequence of steps requires no more than two temporary variables; and outputting, by the computational device, a result of the ECC point doubling operation in the Jacobian coordinates.
6. The method of claim 5, wherein the computational device is part of a mobile device.
7. The method of claim 6, wherein the mobile device is a smartphone.
8. The method of claim 5, wherein the computational device is part of a smartcard.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1)
(2)
(3)
(4)
(5)
(6)
(7)
DETAILED DESCRIPTION
(8) PDBL algorithm 300 in accordance with the invention is shown in
(9) As input in step 301, PDBL algorithm 300 shown in
(10) The most computationally intensive operation in PDBL algorithm 300 in
(11) It is important to note that besides the modular multiplication steps performed in steps 303, 308 and 309 of PDBL algorithm 300, additional, comparatively simple operations are performed as well: modular subtraction and addition and modular multiplication and division by 2. Note that multiplication or division by a power of 2 in binary is merely a shift operation. In order to accelerate execution of PDBL algorithm 300 and eliminate the need for additional temporary registers, an embodiment in accordance with the invention of simple arithmetic unit (SAU) 400 with the inputs and outputs as shown in
(12)
(13)
(14) SAU 400 shown in
(15) Input A goes to both input 0 of MUX 720 and logical one bit left shifter 715 on line 671. Logical one bit left shifter 715 multiplies input A by two and outputs 2A on line 771 to the 1 input of MUX 720. Output line 776 of MUX 720 provides the minuend input for subtractor 710. Input B goes to logical one bit right shifter 716, logical one bit left shifter 717 and input 1 of MUX 725 on line 672. Logical one bit right shifter 716 divides input B by two and outputs B/2 on line 772 to input 0 of MUX 725. Logical one bit left shifter 717 multiplies input B by two and outputs 2B on line 774 to input 2 of MUX 725. Output line 777 of MUX 725 connects to the subtrahend input of subtractor 710. Input C connects to adder 722 and to logical one bit left shifter 718 on line 673. Logical one bit left shifter 718 multiplies input C by two and outputs 2C to adder 722 on line 775. Subtractor 710 outputs E (see
(16) Multi-cycle multiplier 610 functions by multiplying the values on lines 635 and 640 together and outputting the result on lines 650 and 650. Steps 301-302 of PDBL algorithm 300 are performed on the microprocessor (not shown) without using multi-cycle multiplier 610 and SAU 400.
(17) Step 303 utilizes both multi-cycle multiplier 610 and SAU 400. Register memory 695 provides X.sub.1 on line 665 to input 0 of MUX 620 with MUX 620 set to 0 and Z.sub.1 is provided from register memory 695 on both lines 635 and 640 to multi-cycle multiplier 610. Multi-cycle multiplier 610 computes Z.sub.1.sup.2 which is output on line 650 to input 1 of MUX 630 with MUX 630 set to 1. MUX 620 sends X.sub.1 to input A of SAU 400 on line 671 and MUX 630 sends Z.sub.1.sup.2 to input B of SAU 400 on line 672. MUX 720 in SAU 400 is set to 0 and MUX 720 sends A on line 776 from line 671 to the minuend input of subtractor 710 on line 776. MUX 725 in SAU 400 is set to 1 and MUX 725 sends on line 777 B from line 672 to the subtrahend input of subtractor 710 on line 777. Subtractor 710 computes E (which is AB=X.sub.1Z.sub.1.sup.2) of which is passed to register memory 695 on line 696 and stored in temporary register T.sub.2.
(18) Step 304 utilizes both multi-cycle multiplier 610 and SAU 400. Register memory 695 provides X.sub.1 on line 665 to input 0 of MUX 620 and MUX 620 is set to 0. MUX 620 sends X.sub.1 to input A of SAU 400 on line 671. Register memory 695 provides T.sub.2 on line 660 to input 0 of MUX 630 with MUX 630 set to 0 and register memory 695 also provides T.sub.2 to input C of SAU 400 on line 673. MUX 720 in SAU 400 is set to 1 and MUX 720 sends 2A from line 771 on line 776 to the minuend input of subtractor 710. MUX 725 in SAU 400 is set to 1 and MUX 725 sends B from input line 672 on line 777 to the subtrahend input of subtractor 710 on line 777. Input C (T.sub.2) of SAU 400 on line 673 is sent to both logical one bit left shifter 718 and adder 720. The output 2C on line 775 from logical one bit left shifter 718 goes to adder 720. Adder 720 outputs D (which is 3C=3T.sub.2) on line 690 and subtractor 710 computes E (which is 2AB=2X.sub.1T.sub.2) on line 696 to register memory 695 which passes E and D on lines 635 and 640, respectively, to multi-cycle multiplier 610 which computes E*D and sends the result on line 650 to register memory 695 where the result is stored in temporary register T.sub.2.
(19) Step 305 utilizes multi-cycle multiplier 610. T.sub.2 is provided from register memory 695 to both lines 635 and 640 to multi-cycle multiplier 610 which computes and outputs T.sub.2.sup.2 on line 650 to register memory 695 where the result is stored in X.sub.3.
(20) Step 306 utilizes both multi-cycle multiplier 610 and SAU 400. Register memory 695 provides Y.sub.1 on line 665 to input 0 of MUX 620 and MUX 620 is set to 0. MUX 620 sends Y.sub.1 to input A of SAU 400 on line 671. Logical one bit left shifter 718 takes input A on line 671, multiplies input A by two and outputs 2A on line 771 to MUX 720. MUX 720 in SAU 400 is set to 1 and MUX 720 sends 2A on line 776 to the minuend input of subtractor 710. Binary 0 is supplied on line 660 to input 0 of MUX 630 with MUX 630 set to 0. MUX 630 sends binary 0 from line 660 to input B of SAU 400 on line 672. MUX 725 in SAU 400 is set to 1 and MUX 725 sends binary 0 on line 777 to the subtrahend input of subtractor 710. Subtractor 710 computes 2AB on line 696 to register memory 695 as E (which is 2AB=2Y.sub.1) which passes the value through on line 635 to multi-cycle multiplier 610 and register memory 695 provides Z.sub.1 on line 640 to multi-cycle multiplier 610. Multi-cycle multiplier 610 computes E*Z.sub.1 (2Y.sub.1*Z.sub.1) and sends the result on line 650 to register memory 695 where it is stored in Z.sub.3.
(21) Step 307 utilizes both multi-cycle multiplier 610 and SAU 400. Register memory 695 provides Y.sub.1 on line 665 to input 0 of MUX 620 and MUX 620 is set to 0. MUX 620 sends Y.sub.1 to input A of SAU 400 on line 671. Logical one bit left shifter 715 takes input A on line 671, multiplies input A by two and outputs 2A on line 771 to input 1 of MUX 720. MUX 720 in SAU 400 is set to 1 and MUX 720 sends 2A on line 776 to the minuend input of subtractor 710. Binary 0 is supplied on line 660 to input 0 of MUX 630 with MUX 630 set to 0. MUX 630 sends binary 0 from line 660 to input B of SAU 400 on line 672. MUX 725 in SAU 400 is set to 1 and MUX 725 sends binary 0 on line 777 to the subtrahend input of subtractor 710. Subtractor 710 computes 2AB (which is 2Y.sub.1) as E on line 696 to register memory 695 which passes E through both on line 635 and on line 640 to multi-cycle multiplier 610. Multi-cycle multiplier 610 computes E.sup.2 (which is (2Y.sub.1).sup.2) and sends the result to register memory 695 on line 650 where it is stored in Y.sub.3.
(22) Step 308 utilizes both multi-cycle multiplier 610 and SAU 400. Register memory 695 provides X.sub.3 on line 665 to input 0 of MUX 620 and MUX 620 is set to 0. MUX 620 sends X.sub.3 to input A of SAU 400 on line 671 which connects to input 0 of MUX 720 with MUX 720 set to 0. MUX 720 sends A on line 776 to the minuend input of subtractor 710. Register memory 695 provides Y.sub.3 on line 635 to multi-cycle multiplier 610 and provides X.sub.1 on line 640 to multi-cycle multiplier 610. Multi-cycle multiplier 610 computes Y.sub.3*X.sub.1 and sends the result to input 1 of MUX 630 and MUX 630 is set to 1. MUX 630 sends Y.sub.3*X.sub.1 to input B of SAU 400 on line 672. Logical one bit left shifter 717 takes input B on line 672, multiplies input B by two and outputs 2B (2Y.sub.3*X.sub.1) on line 774 to input 2 of MUX 720. MUX 720 is set to 2 and sends 2B on line 777 to the subtrahend input of subtractor 710. Subtractor 710 computes E (which is A2B=X.sub.32Y.sub.3*X.sub.1) on line 696 to register memory 695 where it is stored in X.sub.3.
(23) Step 309 utilizes both multi-cycle multiplier 610 and SAU 400. In step 308, Y.sub.3*X.sub.1 was computed by multi-cycle multiplier 610. Hence, Y.sub.3*X.sub.1 is still present in the output register (not shown) of multi-cycle multiplier 610 and in Step 309 is sent on line 650 to input 1 of MUX 620 and MUX 620 is set to 1. MUX 620 provides Y.sub.3*X.sub.1 to input A of SAU 400 on line 671 which connects to input 0 of MUX 720. MUX 720 in SAU 400 is set to 0 and MUX 720 sends A (which is Y.sub.3*X.sub.1) on line 776 to the minuend input of subtractor 710. Register memory 695 provides X.sub.3 on line 660 to input 0 of MUX 630 and MUX 630 is set to 0. MUX 630 sends X.sub.3 to input B of SAU 400 on line 672 which connects to input 1 of MUX 725. MUX 725 is set to 1 and provides B on line 777 to the subtrahend input of subtractor 710. Subtractor 710 computes E (AB=Y.sub.3*X.sub.1) which is sent on line 696 to register memory 695 which passes the value through on line 635 to multi-cycle multiplier 610 and register memory 695 provides T.sub.2 on line 640 to multi-cycle multiplier 610. Multi-cycle multiplier 610 computes E*T.sub.2 (which is (Y.sub.3*X.sub.1X.sub.3)*T.sub.2) and sends the result on line 650 to register memory 695 where it is stored in temporary register T.sub.1.
(24) Step 310 utilizes both multi-cycle multiplier 610 and SAU 400. Register memory 695 provides T.sub.1 on line 665 to input 0 of MUX 620 and MUX 620 is set to 0. MUX 620 sends T.sub.1 to input A of SAU 400 on line 671 which connects to input 0 of MUX 720. MUX 720 in SAU 400 is set to 0 and MUX 720 sends A (T.sub.1) on line 776 to the minuend input of subtractor 710. Y.sub.3 is provided from register memory 695 to both lines 635 and 640 to multi-cycle multiplier 610 which computes Y.sub.3.sup.2 and which is output on line 650 to input 1 of MUX 630 with MUX 630 set to 1. MUX 630 provides Y.sub.3.sup.2 on line 672 to input B of SAU 400. Logical one bit right shifter 716 takes input B on line 672, divides input B by two and outputs B/2 (Y.sub.3.sup.2/2) to input 0 of MUX 725 and MUX 725 is set to 0. MUX 725 sends B/2 on line 777 to the subtrahend input of subtractor 710. Subtractor 710 computes E (AB/2=T.sub.1Y.sub.3.sup.2/2) which is sent on line 696 to register memory 695 where it is stored in Y.sub.3.
(25) Step 311 is performed in the microprocessor and returns the result of PDBL algorithm 300 which is (X.sub.3, Y.sub.3, Z.sub.3) for input (X.sub.1, Y.sub.1, Z.sub.1).