A Lesson on Elliptic Curves

By Nicholas O’Brien –


Elliptic Curves. Maybe you’ve heard of them. Maybe you haven’t. Either way if you are reading this blog you are either my professor or someone who cares to know about them. It is very tempting to just “jump in” to the subject, but if you were to do that you may become very confused very quickly. The purpose of this post is to guide you into the realm of Elliptic Curve Cryptography, we will focus on it being used for Diffie Hellman Key Exchange, and hopefully show you why it is a very good alternative and replacement for RSA. Also be aware that whenever I reference a group I am usually talking about finite groups, which means the order is finite (you’ll know what that is in a few minutes).

Groups

Lets start with a concept that is the baseline for most PKI systems. This would be groups. Now suppose we have a set:

G = {0, 1, 2, 3, 4}

Hopefully you are familiar with sets from your days in you Discrete Math class. If not just think of it as a collection of objects, here being numbers.

1. Closure

Now suppose I attach an operation onto this set. For our example I will have addition (mod 5) as my operation. Take note that this operation must be “closed”. This means for any two elements in the set if this operation is applied it must give another element in the set. For example:

3 + 4 (mod 5) = 2 (mod 5) ϵ G

Had I chosen an operation that was not closed like addition mod 10 then 3+4 would be 7 which is not in G, meaning it is not closed.

2. Associativity

My operation is pretty tame, and rigged to work ;), some would assume that the next situation is always true, but this is not the case. The operation must be associative. This means that:

a + (b + c) = (a + b) + c

We know that this holds for my simple operation, as this axiom is a byproduct of regular addition and mod themselves, but again for some more complicated operations this may not hold. For example, think of the positive real numbers as your set with an operation of division . Clearly we can observe the following

n / (m / p) != ( n / m ) / p

So as seen this simple idea does not always hold.

3. Identity

No this is not a introspective look on society… An identity element is an element where any element applied the operation with the identity gives itself. For example with our set the identity is 0. Clearly any element added 0 will give itself. It’s not hard to come up with an example where there is no identity element, simply think of our set without 0.

4. Inverses

Building off of our last axiom a pair of elements are inverses when, together through the operation, they equal the identity element. In our set we can see 2 and 3 are inverses and so are 1 and 4. To see an example of a set/group without inverses, called a semigroup, think of positive integers with multiplication. Going through our other axioms it seems to hold. But there is no integer multiplied by 2 that gives the identity in the set, 1.

Subgroups

So we have officially defined a group now. Fun note, the group I had chosen is well known and commonly used. It is denoted as Z5. Sometimes you can take a portion of the set of a group and still have all these axioms hold under the same operation. If this is true then we call this new group a subgroup of the original. For example say we have the group

U(12) = {1, 5, 7, 11}

This group is usually called the unit group of 12, I leave proving it a group as an exercise to the reader 🙂 its operation is multiplication mod 12. Observe the simple set of {1, 5}. It becomes quickly apparent that 5*5mod12 = 1 so this closed, is associative, has an identity, and all elements have inverses (they are both their own). We can say that this then forms a subgroup. A side note, notice how 5^2 = 1.

Order and Lagrange’s Theorem

With our example above, the number of times that an element must have the operation applied to it before it becomes the identity is called the order. We can say the order of 5 in U(12) is 2. This isn’t totally useful for our purposes but is a fundamental of groups. To add confusion we also use the word order to describe how many elements a group has. For example our first group we used (Z5) has order 5. U(12) has order 4. A very cool theorem called Lagrange’s Theorem tells us that the order of an element must divide the order of its group. In other words for any element in any group the group’s order must be a multiple of the order of that element. Let’s give an example with U(12), we had the element 5 which we know has order 2. Notice how U(12) has order 4. The usefulness of this will come up later, I will not be providing a proof of Lagrange’s Theorem as it is more involved than what we need.

Generators

There some groups that have special elements called generators. Which when have the operation applied to them continually they form a group/subgroup. For example in Z5 we have 1 being a generator for the entire group. Generators are hugely important in the PKI world, as we will see with Elliptic Curves.

Elliptic Curves

Okay so now that we know what a group is. Lets go over what an Elliptic Curve group is. As mentioned in the last section a group is a set and an operation with certain axioms. Now lets remind ourselves what it means to be elliptic. An elliptic curve takes the form:

y = sqrt(x^3 + ax + b)

For some a and b constants. So where does the group come in??? Good question. The set will be a collection of points that sit on the curve, AND the point at infinity. Obviously in coding this you have to come up with a value for infinity. In my code (Python) I simply used None to be infinity. The operation is something simply called “point addition”. It is shown in IEEE Std 1363-2000 which is decently hard to get your hands on. Later on I will have an implementation to show.

ECDHE (Elliptic Curve Diffie Hellman Exchange)

Okay, so now we start to see where this is going. As in all PKI it is important to establish what is private and what is public. As mentioned in the last section there are coefficients a and b that help define a curve. In addition for PKI purposes a p, a prime, is also defined which dictates a modulus for the points. All of these are includes/decided upon as part of the public section. Below I leave an implementation of the point addition for a generic curve:

def add(self, x_0, y_0, x_1, y_1):
      if x_0 is None:
          self.r_x = x_1
          self.r_y = y_1
          return
      if x_1 is None:
          self.r_x = x_0
          self.r_y = y_0
          return
      if x_0 is not x_1:
          lambduh = ((y_0 - y_1) % self.p)
          lambduh *= self.modInverse(x_0 - x_1)
          lambduh %= self.p
          self.r_x = (lambduh*lambduh - x_0 - x_1) % self.p
          self.r_y = ((x_1 - self.r_x)*lambduh - y_1) % self.p
          return
      if (y_0 is not y_1) or y_1 is 0:
          self.r_x = None
          self.r_y = None
          return
      lambduh = (3*x_1*x_1 + self.a) % self.p
      lambduh *= self.modInverse(y_0*2) % self.p
      self.r_x = (lambduh*lambduh - x_0 - x_1) % self.p
      self.r_y = ((x_1 - self.r_x)*lambduh - y_1) % self.p

In this code r_x and r_y are the parts of the resulting point of the point addition. Also in the public section a generator is agreed upon for a part of the curve. Thus the public part is a 4-Tuple of <a, b, p, g>, where g is the generator point. Now for the private portion. Each person will come up with a random exponent. Do not be thrown by the name which is mostly vestigial. The “exponent” means the number of times that the generator will be added to itself. In many ways a better name would be multiple, but alas it is not. Now in true Diffie Hellman fashion each side will will send the generator “raised” to the exponent to each other. For example say Bob chooses m for his exponent and Alice chooses n for hers then Bob would send mg to Alice and she would send Bob ng. Eve (the eavesdropper) will not be able to find each exponent as it is computationally complex, also called the discrete logarithm. Then Bob and Alice will apply their exponents to what they received. This means Bob will have mng and Alice will have nmg. These are equivalent. This new shared secret point is then used to derive a key, usually the x coordinate of the shared point.

A thing of note about the process is the exponentiation. There is a naive way to implement this and an efficient way. The naive way is to repeatedly call the add function n times, n being your exponent. This becomes a problem very quickly. For “small” exponents on the order of 1,000,000 takes seconds to complete. But no fear! Modular Exponentiation is here! Say an exponent you choose is 32 bits in length (we will talk about lengths soon) what you do is build a list of g, 2g, 4g, 8g, … for each bit. Then with analyzing the bit pattern of the exponent add the elements in the list that correspond with each bit set. The end result is very efficient and allows 1024 bit exponents to be raised in less than a second. As this can seem confusing I have provided an implementation of this below:

def raiseByExponent(self, exp):
      if exp is 1:
          self.r_x = self.g_x
          self.r_y = self.g_y
          return
      gArray = []
      gArray.append(self.g_x)
      gArray.append(self.g_y)
      self.add(self.g_x, self.g_y, self.g_x, self.g_y)
      gArray.append(self.r_x)
      gArray.append(self.r_y)

      for i in range(2, exp.bit_length()):
          self.add(self.r_x, self.r_y, self.r_x, self.r_y)
          gArray.append(self.r_x)
          gArray.append(self.r_y)

      lowestBit = 0
      for i in range(exp.bit_length()):
          if ((exp & (1 << i)) != 0):
              lowestBit = i
              break
      self.r_x = gArray[2*lowestBit]
      self.r_y = gArray[(2*lowestBit)+1]
      lowestBit += 1
      for i in range(lowestBit, exp.bit_length()):
          if ((exp & (1 << i)) != 0):
              self.add(self.r_x, self.r_y, gArray[2*i], gArray[(2*i)+1])

Similarly to the add function the resulting point is left in r_x, r_y for the corresponding x and y parts of the point. Also be aware x parts of points are on even indexes and y parts are on the odd indexes in the list.

I feel the need…

One of the key aspects that many people like about Elliptic curves is the speed in generating components and the space that it affords. That’s also why I chose to show a Diffie Hellman explanation as this goes along a with the need for speed motif, as symmetric cryptography is usually faster than asymmetric. However I do concede that overall the speed of RSA is better than ECC (Elliptic Curve Cryptography) theres still alot that is better. For example to encrypt solely with RSA at the same strength as AES-256 you would need a prime modulus of 15,360 bits in length. For same level of security with ECC you would only need a prime modulus of 512 bits in length. For more comparisons observe the below table:

Security Bits – Symmetric RSA ECC
80 – SkipJack 1024 160
112 – 3DES 2048 112
128 – AES 3072 256
192 – AES 7680 384
256 – AES 15360 512

* Table Taken from Atmel-89511

Do you want to build a curve? (WARNING: MATH)

So now we have to get a little more math centered. Choosing a curve is arguably the most dangerous part of ECC. There are a few types of curves that you wish to avoid. The first being “supersingular” elliptic curves. Do not get too bogged down on the wording, just know that with supersingular elliptic curves there is certain realms of the dark arts of mathematics that allow for a huge decrease in security. Fortunately we have a way to identify when a supersingular curve is chosen. First I need to do some clarification. In the section on ECDHE I said that the public component p is a prime number, this is not a requirement of elliptic curves but many believe it to be the most secure option. Also, maybe you realized this, observe that the points on the curve will be integers mod p. We have notation for that! This is typically written as Zp if you didn’t already guess but we wish to add an additional operation (multiplication) which leads to a new notation of Z/pZ or Fp. Again elliptic curves do not have to be over integers they can be over other objects also but for us we will keep it simple. Now typically the “Math way” to describe the curves I am talking about is E(Fp), read as “Elliptic Curves over the field Fp“. We have also have a sub-condition for us to be closed under the point addition operation. We must have 4a3 + 27b2 ≠ 0 (mod p), for those wondering this is the discriminant. But I digress. The number of points on the curve you choose can be denoted by #E(Fp), I’ll talk about this more later. Then a curve is not supersingular if #E(Fp) ≠ 1 mod p.

The second issue is “prime field anomalous elliptic curves”. Again try not to worry too much about the Math jargon and just know it is bad. Using the knowledge we just when over the check for this is very similar to the supersingular case. We have this security concern in the area where #E(Fp) = p, which is 0 mod p. Fortunately, since 1997 there has not been any other types of elliptic curves that have been found with security concerns.

The point I wish to make here is to use a known safe curve. There are plenty out there that have been studied way more than you or I can do by ourselves. Even some of the original NIST curves are believed to have no obvious security concerns.

Dual EC_DRBG

There’s feeling of distrust in the security community is centered around something that came out of the Snowden leaks. The tool EC_DRBG (Elliptic Curve Deterministic Random Bit Generator) had a backdoor placed in it by the NSA during the standardization of it. Since we are pros in ECC we can understand what happened. The tool was premised on generating two points on a curve. It generated those points by first choosing a random point on the curve (Q) and then “raising” (remember this is a silly name – think multiply) by a k value to give another point (P). The backdoor that was placed was that the NSA had known the k exponent and used it for key escrow. This effectively defeats the ECDLP (Discrete Logarithm Problem) by just storing the answer. This is NOT a defeat of ECC in total. The same type of backdoor could have been down with an RSA algorithm just as easily. Also there is no indication that the safe curves people use are compromised. In addition with the Snowden leaks we saw that the NSA is not much farther ahead than the private sector with trying to solve the ECDLP in an efficient way.

What curves should I use

Theres many options out there for curves. The ANSI X9F1 standard provides many curves that were generated randomly. The NIST curves also work! They have gotten a bad wrap over the years as they are from the US government and NIST has been known to work tightly with the NSA before. But the ECDLP still applies and the possibility of a new type of elliptic curves that has security concerns is largely dismissed. I personally recommend to have your field be a prime field of binary field, many people are unsettled by binary fields. As for size I would mostly recommend 384 bits but I also recommend 4096 bits for RSA, so take that as you wish. A good NIST curve would be P-384, this is a prime field with a modulus of 384 bits (I know almost a logical name), again if you are afraid of NIST then choose another curve.

What is that #E(Fp)?

The #E(Fp) of a curve is hugely important and this is where it all comes together. This value, as we have seen, is integral to the security of a curve. But we have seen this value before. This is analogous to order. We know stuff about order! Namely we know Lagrange’s Theorem. Now I’m not going to go into how you find #E(Fp), it gets very complicated very quickly, but I will say how easy it is to check. Given a value for the order of a group then by Lagrange’s Theorem you can take any element on the curve and raise it to that value and you should get the identity element. And in case you didn’t realize this at the time, the identity is infinity in these groups. I leave this as a fun exercise to show why this is true.

Conclusion

Hopefully this post was helpful to give you an idea about what elliptic curves are all about. I admit some bias on the front of what curves I prefer but realistically it does not matter. Just be sure to do your own research on a particular curve/ set of curves before deciding. I also wish to put a disclaimer, as most do when publishing any post on cryptography, that although I believe that all information contained in this post is correct and is backed up by research I have seen, I cannot attest that it will always be this way and as such I simply state that I do not claim any code or ideas in this post are secure. They are meant, purely, for educational purposes.


Sources

  1. Maletsky, Kerry. “RSA vs ECC Comparison for Embedded Systems.” Atmel (2015): Web. http://www.atmel.com/Images/Atmel-8951-CryptoAuth-RSA-ECC-Comparison-Embedded-Systems-WhitePaper.pdf.
  2. Galbraith, Steven D., and Surrey Egham. “Supersingular Curves in Cryptography.” Supersingular Curves in Cryptography (2001): Web. 17 Nov. 2016. https://www.iacr.org/archive/asiacrypt2001/22480497.pdf.
  3. Koblitz, Neal, and Alfred J. Menezes. “A Riddle Wrapped in an Enigma.” A Riddle Wrapped in an Enigma (2015): Web. https://eprint.iacr.org/2015/1018.pdf
Advertisements