[US Patent & Trademark Office, Patent Full Text and Image Database] [Home] [Boolean Search] [Manual Search] [Number Search] [Help] [HIT_LIST] [Bottom] [View Shopping Cart] [Add to Shopping Cart] [Image] ( 1 of 1 ) ------------------------------------------------------------------------------- United States Patent 5,999,627 Lee , et al. December 7, 1999 ------------------------------------------------------------------------------- Method for exponentiation in a public-key cryptosystem Abstract The present invention relates to an improved method for performing modular exponentiation to a fixed base element. The method includes exponentiating a first digital input signal g by a second digital input signal R, where g is a fixed signal unique to a cryptographic system and R is a randomly generated digital signal, to compute a third digital signal g.sup.R. The exponentiating includes pre-computing and storing a plurality of values depending only upon the fixed signal g in a plurality of memory locations within a computing device and then speeding up the computation of g.sup.R using the stored values. The invented exponentiation method can substantially reduce the amount of computation required to compute the value for g.sup.R. Exponentiation methods according to embodiments of the present invention may be used in a variety of cryptographic systems, e.g., Schnorr identification scheme, Digital Signature Standard (DSS), and Diffie-Hellman key agreement scheme, etc. ------------------------------------------------------------------------------- Inventors: Lee; Pil-joong (Pohang, KR); Lim; Chae-hoon (Kyungsangnam-do, KR) Assignee: Samsung Electronics Co., Ltd. (Kyungki-do, KR) Appl. No.: 003875 Filed: January 7, 1998 Foreign Application Priority Data ----------------------- Jan 07, 1995[KR] 95-224 Current U.S. Class: 380/30; 380/28; 708/606 Intern'l Class: H04K 001/00; G06E 001/04 Field of Search: 380/28,30 708/620-632,606,491 ------------------------------------------------------------------------------- References Cited [Referenced By] ------------------------------------------------------------------------------- U.S. Patent Documents 5299262 Mar., 1994 Brickell et al. 380/28. 5870478 Feb., 1999 Kawamura 380/30. Other Advances in Cryptology-Eurocrypt '92, "Fast Exponentiation References with Precomputation," by Brickell et al., pp. 200-207, May 1992. Advances in Cryptology-Crypto '94, "More Flexible Exponentiation with Precomputatiuon," by Lim et al, Aug. 1994. Primary Examiner: Hayes; Gail O. Assistant Examiner: Sayadian; Hrayr A. Attorney, Agent or Firm: Sughrue, Mion, Zinn, Macpeak & Seas, PLLC ------------------------------------------------------------------------------- Parent Case Text ------------------------------------------------------------------------------- This disclosure is a continuation-in-part of U.S. patent application Ser. No. 08/467,310, filed Jun. 6, 1995, now abandoned. ------------------------------------------------------------------------------- Claims ------------------------------------------------------------------------------- We claim: 1. In a cryptographic system, a computer-implemented method for transforming a first signal into a second signal in a manner infeasible to invert, the method comprising: exponentiating a first digital input signal g by a second digital input signal R, wherein g is a fixed signal unique to a cryptographic system and R is a randomly generated digital signal of n bits, n being an integer specified by the cryptographic system, to obtain a third digital signal g.sup.R, wherein the exponentiating comprises: (a) determining integers h and v, and calculating a and b such that a is the least integer not less than a fractional number n/h and b is the least integer not less than a fractional number a/v, whereby an arbitrary n bit number can be divided into h blocks of size a and each divided block can be further subdivided into v smaller blocks of size b; (b) precomputing and storing, in a plurality of memory locations G[j][f] indexed by j and f, 0.ltoreq.j.ltoreq.v-1 and 0.ltoreq.f.ltoreq.2.sup.h -1, within a computing device, a plurality of values (g.sub.h-1.sup.e.sbsp.h-1 g.sub.h-2.sup.e.sbsp.h-2 . . . g.sub.1.sup.e.sbsp.1 g.sub.0.sup.e.sbsp.0).sup.2.spsp.jb where g.sub.1 =g.sup.2.spsp.ia for 0.ltoreq.i.ltoreq.h-1, and f=e.sub.h-1 e.sub.h-2 . . . e.sub.1 e.sub.0 with e.sub.i =0 or 1, wherein the index f corresponds to the decimal value of the binary representation e.sub.h-1 e.sub.h-2 . . . e.sub.1 e.sub.0 and ranges from 0 to 2.sup.h -1; (c) dividing the exponent R into h blocks R.sub.i of size a for i=0, 1, . . . , h-1 and subdividing each of the blocks R.sub.i into v sub-blocks R.sub.i,j of size b for j=0, 1, . . . , v-1; and (d) computing the second signal x=g.sup.R, using the stored values from the memory locations G[j][I.sub.j,k ], as (e) using the second signal x to encrypt or decrypt data. ##EQU26## where I.sub.j,k denotes an h-bit number formed by taking the k-th bits from the sub-blocks indexed by j in the second subscript, the sub-blocks including R.sub.h-1,j, . . . , R.sub.1,j, R.sub.0,j. 2. The computer-implemented method of claim 1, wherein computing g.sup.R according to step (d) includes utilizing p processors operating in parallel wherein the i-th processor, for 0.ltoreq.i.ltoreq.p, i and p being integers computes the value ##EQU27## using the stored values from the memory location G [j][I.sub.j,k ] for iw.ltoreq.iw+w and 0.ltoreq.I.sub.j,k <2.sup.h, to produce p computational results, where w is a least integer not less than a fractional number v/p and when v is not equal to a product pw, the index j in the (p-1)-th processor ranges from j=(p-1)w to v-1, and then multiplying together said p computational results to compute the value of g.sup.R. 3. In a cryptographic system, a computer-implemented method for transforming a set of input digital signals into an output digital signal, said method comprising: exponentiating each input signal x.sub.i by another input signal K.sub.i of b bits for i=0, 1, . . . , m-1, wherein x.sub.i and K.sub.i are variable values processed in a cryptographic system, and multiplying the exponentiated values together to produce an output digital signal ##EQU28## wherein the exponentiating comprises: (a) arranging m exponents K.sub.i into an h.times.v matrix form, h and v being integers such that a product hv is not less than the integer m, and z being expressed as ##EQU29## (b) computing and temporarily storing a plurality of values X[j][f] in a plurality of memory locations within a computing device, wherein X[j][f] denotes a value stored at a two dimensional array X indexed by row j and column f in a memory of the computing device defined by an expression: ##EQU30## for each j(0.ltoreq.j.ltoreq.v-1) and f (0.ltoreq.i.ltoreq.2.sup.h -1), where f=e.sub.h-1 e.sub.h-2 . . . e.sub.1 e.sub.0 with e.sub.i =0 or 1, and wherein f denotes the decimal value of the binary representation e.sub.h-1 e.sub.h-2 . . . e.sub.1 e.sub.0 ; and (c) computing the output signal z using the values stored in step (b) as ## EQU31## where I.sub.j,k denotes an h-bit integer formed by the k-th bit from each element of the j-th column in the h.times.v matrix consisting of b-bit sub-blocks (d) using the output signal z to encrypt or decrypt data. 4. The computer-implemented method of claim 3, wherein step (c) is performed in parallel using p processors wherein the i-th processor, for 0.ltoreq.i
b, the process can be proceeded as above or the computation can be done after partitioning E into smaller blocks. The first case yields the performance of a+2 t-2 multiplications in the worst case and a(2.sup.h -1)/2.sup.h +1.5 t-2 multiplications on average. However, if t is much larger than b, the performance can be further improved by dividing E into smaller blocks. Thus, for more general formula, suppose that E is partitioned into u blocks of almost equal size. (It is considered that the whole configuration for computing g.sup.R y.sup.E is u.times.1.vertline.h.times.v.) Let c be the bit-length of the partitioned blocks (i.e., c=.left brkt-top.t/ u.right brkt-top.). Then, y.sup.2.spsp.kc for k=1, 2, . . . , u-1, and each product of their possible combinations has to be first computed, which all together requires c(u-1)+2.sup.u -u-1 multiplications. For the range of interested t, e.g., up to t=80, u will be at most three. Now, if c.ltoreq.b, then at most c additional multiplications are sufficient in the worst case and c(2.sup.u -1)/2.sup.u are sufficient on average. Therefore, the total number of multiplications required in this case is a+b+uc+(2.sup.u -u-3) in the worst case and a(2.sup.h -1)/2.sup.h +b+c(u2.sup.u -1)/2.sup.u +2.sup.u -u-3 on average. Similarly, for the case of c>b, it can be easily shown that the number of multiplications is a+(u+1)c+2.sup.u -u-3 in the worst case and a(2.sup.h -1)/ 2.sup.h +c((u+1)2.sup.u -1)/2.sup.u +2.sup.u -u-3 on average. With the invented exponentiation method, the Schnorr-like identification and/or signature schemes can be made more practical for smart card implementations. For example, with 160-bit exponents and t=30, the verification condition can be checked in 80.5 multiplications on average, if 1,920 bytes of storage are available (for a 4.times.2 configuration). Similarly, a signature with t=80 can be verified in 144.13 multiplications on average using the same storage capacity. This is a considerable improvement with only a very small storage capacity, compared with the binary method requiring 246.5 multiplications for t =30 and 259.0 multiplications for t=80 on average. Moreover, identification or signature verifications are usually performed in much more powerful terminals capable of being equipped with a large amount of memory. In such an environment, the 8.times.2 configuration, for example, may be adopted. Then identity verifications can be done in 60.2 multiplications on average for t=30 and signature verifications can be done in 126.6 multiplications on average for t=80, using about 32 Kbytes of storage. Next, it will be explained that small additional communication can considerably reduce the number of multiplications for computing g.sup.R y.sup.E again. That is, the verifier can save the on-line computational load for preparing y.sub.k =y.sup.2.spsp.kc for k=1, 2, . . . , u-1, if they are pre-computed and stored by the signer (or the prover) and then transmitted to the verifier together with other data. For example, for the signature scheme with t=80, if the signer sends two additional values y.sub.1 and y.sub.2, where y.sub.1 =y.sup.2.spsp.27 and y.sub.2 =y.sub.1.sup.2.spsp.27, together with a signature for message, then the signature verification can be done in 90.13 multiplications on average with the 4.times.2 configuration. Therefore, 54 multiplications can be saved with the increase of a small increase of communication. This corresponds to about a 3-fold improvement on average over the binary method which requires 259 multiplications on average. It can be seen that the BGMW method is less efficient for the computation of the form g.sup.R y.sup.E in either case considered above. That is, in case of no additional communication, if the exponents are represented in non-binary power base, more computations are needed in performing the on-line pre-computation required for y.sup.E. When additional communication is allowed, more pre-computed values must be transmitted due to the use of a small base. The above method of combining pre-computation and additional communication can be used to speed up the verification of the digital signature standard (DSS) as well. In DSS, the computation of the type g.sup.R y.sup.E should be performed with .vertline.R.vertline.=.vertline.E.vertline.=160, and therefore, without additional communication, no advantage can be gained with pre-computation. However, if the signer sends three additional blocks {y.sub.1, y.sub.2 and y.sub.3 }, where y.sub.i =y.sub.i-1.sup.2.spsp.40 for i={1, 2, 3}, and if the verifier adopts the 4.times.2 configuration, then the signature can be verified in 124 multiplications on average. This is more than a 2-fold improvement over the binary method which requires 279 multiplications on average, with only 1,920 bytes of storage and 192 bytes of additional communication (for a 512-bit modulus). FIG. 8 shows the number of multiplications required for signature generation and verification in three signature schemes (Schnorr, DSS and Brickell-McCurley), under the assumption that the signer sends three additional pre-computed values for the public key together with a signature, as mentioned above. Here, only the number of multiplications for exponentiation operations is taken into account, which disregards some necessary operations such as reduction mod q and multiplicative inverse mod q where g is a prime of about 160 bits. Two configurations of 4.times.2 and 8.times.2 are taken as examples, since the former is suitable for smart card applications and the latter for more general applications with a relatively large storage capacity available. For comparison, the performance of the binary method is also presented. Another Embodiment of the present invention is to do the exponentiation g.sup.R in parallel with multiple processors. For example, suppose that v processors are available. Then, one can assign the j-th processor to compute the j-th column of the h.times.v configuration (see FIG. 1). That is, the j-th processor can be assigned to compute ##EQU24## in the expression of ##EQU25## where the pre-computed values are the same as before but each processor oily stores in its local memory 2.sup.h -1 pre-computed values needed for its computation. Then the computation of each processor can be completed in at most 2(b-1) multiplications. Thereafter, .left brkt-top.log.sub.2 v.right brkt-top. multiplications are needed in addition to produce the final result. Therefore, the total number of multiplications required is at most 2(b-1)+.left brkt-top.log.sub.2 v.right brkt-top.. FIG. 9 shows the number of multiplications required for parallel processing in the case of 160/512 bit exponents, according to the number of processors (denoted by np) and the storage needed per processor (denoted by sp). As shown in FIG. 9, with only a small number of processors, the performance can be greatly improved. For example, for a 512-bit exponent, the computation of g.sup.R can be done in 32 multiplications with four processors if each processor stores 255 pre-computed values in its local memory (about 16 Kbytes). With more processors, say sixteen, the exponentiation can be done in ten multiplications with the same storage capacity. Meanwhile, in the above, only the case of v processors being available for an h.times.v configuration was considered for the sake of convenient explanation. However, an h.times.v configuration can also be used with smaller processors. For example, if p processors are available for h.times.v configuration, each processor can be assigned to compute w=.left brkt-top.v/p.right brkt-top. columns in the h.times.v configuration. In this case, each processor should store w(2.sup.h -1) pre-computed values in its local memory. Then, it is easy to see that the number of required multiplications is given by (w+1)b+.left brkt-top.log.sub.2 p.right brkt-top.-2. Practically, in most cases, it is more advantageous to assign many columns to each processor in a time-storage tradeoff than one column assignment considered in FIG. 9. As described above, according to the present invention, a new method for fast exponentiation with pre-computation has been described. The invented method is very simple and easy to implement systematicaly, but achieves better performance than the BGMW method. Also, the method according to the present invention is flexibly applicable to various computing environments due to its wide range of time-storage tradeoffs. In particular, the computation by smart cards can be substantially speeded up with only a very small storage capacity. The present invention can also speed up the computation of the form g.sup.R y.sup.E with y variable. This can greatly improve the practicality of the Schnorr-type identification and signature scheme, since the verifier as well as the prover (signer) can gain great computational advantage with a moderate storage capacity. Finally, the parallel processing according to the present invention may be useful in high performance servers with multiple processors. The present invention provides a method for reducing the amount of computation to evaluate g.sup.R for a fixed g and a random R by storing a plurality of pre-computed values in memory within a computing device. The computation of g.sup.R is an essential operation in all discrete logarithm-based cryptographic systems, such as the Schnorr identification scheme, the Digital Signature Standard (DSS) and Diffie-Hellman key agreement or exchange scheme. For example, FIG. 10 is a block diagram of a cryptographic system constructed in accordance with a preferred embodiment of the present invention. The cryptographic system comprises computers 10 and 12, having network interfaces or modems 14 and 18, respectively, for communicating with each other over a communications network 16. The computer 10 includes a CPU 102, a memory 104, and an input/output port 106 and the computer 12 includes a CPU 122, a memory 124 and an input/output port 126. CPUs 102 and 122 perform various mathematical processes. Memories 104 and 124 store values utilized in the various mathematical processes, in particular pre-computed values needed for exponentiation in the present invention. Also, input/output ports 106 and 126 allow connection to the modems 14 and 18. The present invention may be used with the system of FIG. 10 in a variety of ways to transmit coded data between the computer 10 and the computer 12 through the network 16. In the Schnorr identification scheme, suppose that user A wants to prove its identity to user B. As system parameters, two primes p and q, and base g are made public such that q.vertline.p-1, g.noteq.1, g.sup.q =1 mod p. The prover A possesses a secret key s .epsilon.[0,q) and publishes the corresponding public key y=g.sup.x mod p, and wants to prove that he knows the secret key x to user B. The procedure for identification is as follows: ______________________________________ user A user B ______________________________________ 1) randomly selects k .epsilon. [0, q) and computes r = g.sup.k mod p ##STR2## randomly selects e .epsilon. [0, 2.sup.t) for some integer t 2) computes s = k - xe mod q ##STR3## 3) ##STR4## accepts proof of identity if g.sup.s y.sup.e = r mod p ______________________________________ In the above example, the exponentiation method according to the present invention can be used in calculating g.sup.k in 1) and g.sup.s y.sup.e in 3). In the Digital Signature Standard (DSS), there are three public system parameters p, q and g: primes p, q such that q.vertline.p-1, and base g such that g.noteq.1 and g.sup.q =1 mod p. And each signer A selects its secret key x .epsilon.[0,q) and publishes the public key y=g.sup.x mod p. The signer A can generate a signature for a message m using its secret key and sends the message including the signature to the verifier B. Then the verifier B can check the validity of the signature using A's public key. The signature generation and verification procedures are described below, where H denotes the secure hash algorithm (SHA). Signer A generates signature (r,s) for message m and sends (r,s) to Verifier B: 1) randomly selects k .epsilon.[0,q) and computes r=(g.sup.k mod p) mod q 2) computes s=k.sup.-1 (H(m)+rx) mod q Verifier B verifies the signature (r,s) by checking that (g.sup.s.spsp.-1.sup.H (m) y.sup.s.spsp.-1.sup.r mod p) mod q=r. In the above example, exponentiation methods according to embodiments of the present invention can be used to calculate g.sup.k and g.sup.s.spsp.-1.sup.H(m) y.sup.s.spsp.-1.sup.r. In the Diffie-Hellman key exchange scheme, user A and user B agree upon a common secret key K.sub.AB through communications over a public channel. Here, p, q and g are public parameters defined as before. 1) user A randomly selects R.sub.A .epsilon.[0,q), computes and sends X.sub.A = g.sup.R.sbsp.A mod p to user B; 2) user B randomly selects R.sub.B .epsilon.[0,q, computes and sends X.sub.B = g.sup.R.sbsp.B mod p to user A; 3) user A computes K.sub.AB =X.sub.B.sup.R.sbsp.A mod p= g.sup.R.sbsp.B.sup.R.sbsp.A mod p; 4) user B computes K.sub.AB =X.sub.A.sup.R.sbsp.B mod p= g.sup.R.sbsp.A.sup.R.sbsp.B mod p. In the above example, exponentiation methods according to embodiments of the present invention can be used to calculate g.sup.R.sbsp.A, g.sup.R.sbsp.B. * * * * * ------------------------------------------------------------------------------- [Image] [View Shopping Cart] [Add to Shopping Cart] [HIT_LIST] [Top] [Home] [Boolean Search] [Manual Search] [Number Search] [Help]