Hello again! Since 2013 is now firmly ensconced in the record books, I thought I'd take some time out of your assorted hangover recoveries to revisit two of the Deadspin commentariat's favorite mathematical topics from the last year: prime numbers and modular arithmetic! Judging from the comments my last Mathspin! article received, it seems that I didn't spend enough time covering the mathematical foundation of current crypto or relating them to non-crypto-nerds with easily accessible examples; I hope that won't be the case this time around - I have pictures with colors and everything.

If you've even come into tangential contact with the Internet since June, you've likely read about, heard about and are sick to death of discussions about the National Security Agency's operations to ensure the concept of "Internet privacy" is no more than a fond memory in everyone's minds. From the almost-completely-confirmed forced government blessing of a random number generator highly suspected to be backdoored to the completely-confirmed backdooring of chips used to make secure web-surfing faster, the world has discovered that the NSA has done everything it can to make TLS stand for "Tough Luck, Shithead". Edward Snowden couldn't take any more of this secret government skullduggery and decided to become a whistleblower - hence why he is currently in some secret Moscow bar going shot-for-shot with Putin - but before he left, he decided to use a small Texas-based email company to send his secret correspondence to reporters, a choice which became the company's undoing.

One of the consequences of Snowden's decision to become the Winter Olympics' first American delegate was the closure of Snowden's secure email service Lavabit, a move that sent shockwaves through the security community but didn't get much recognition from mainstream news outlets. Later, we found out why Lavabit suddenly went out of business: the US government had forced its owner, Ladar Levinson, to turn over the company's private encryption keys - a move compromising the security of not just the user the government was trying to obtain information about, but of every user the service had - and Ladar closed up shop in hopes of protecting his remaining customers. While I certainly wouldn't have wanted to be in Ladar's shoes when he had to make that decision, I wondered if that decision had to be made at all - that is, if somehow Lavabit could communicate securely with its customers without having some master key available for the government to access - and surprisingly, there is a better way.

Forward secrecy (previously called perfect forward secrecy, a similar, but different concept) is a cryptographic concept that has become more relevant since Lavabit's shuttering. Essentially, forward security allows you to forget the past - if you initiate your conversations in a forward-secure manner and all sides discard their keys when their conversations end, if someone were to collect all your encrypted conversations (like they might if they were trying to find you talking to someone they wanted to accuse of treason), they would only be able to obtain the related unencrypted conversations if you were actively conversing at the time they tried to obtain them. Thus, if something happens such that your secret keys leave your control (like you get hacked, subpoenaed or rolled up), instead of someone gaining the ability to read every conversation you've had before ever, your conversations finished before the incident are safe. Unfortunately Ladar didn't enable forward secrecy for use with Lavabit - but before I get into how he could have, I have to get into a little math first.

Numbers do a lot of funny things when you put them together the right way. Take the number 3, for example - not the most significant number, but not insignificant either. Now multiply it by itself - now you have 9. Big whoop. Going back to UEA's follow-up post on modular arithmetic, if we were to take this mod 7, we would get the number 2 - again, big whoop. But for the sake of shits and giggles let's give this another go - now we have 27, which is 6 mod 7. Then we do this again, and again, and again. If you look closely once you're done, you see that somehow you managed to get all the positive integers less than 7. Is that strange? Not really - you just put 3 and 7 together the right way. You see, 3 is called a primitive root mod 7, meaning that you can generate every positive integer less than 7 by a power of 3 mod 7. See? Funny [1]. I'm going to call these primitive roots generators for the rest of this post both because I think it's easier to remember they "generate" all the numbers less than some other number, and I'm not going to type "primitive root" like 20 more times.

Now you might be thinking "Okay Cardiac Linfarction, that's a neat party trick, but I bet this only works for, like, 7 and 400." Nope! You see, the generators actually give you every number coprime to n, but if you check the last paragraph in UEA's post (or remember what being coprime means), every number less than a prime is coprime to it. In addition, if n is a prime you are guaranteed one generator, so this works for every prime number.

Advertisement

Alright, let's move back out of the world of math and back into the world of useful things. Say you have two people, Alice and Bob, who want to talk to each other secretly but have never met each other before so they don't have any secrets between them they can use as a key. What can they do? Well, they can start by agreeing on a prime number n and generator g to use for the rest of the protocol. They can do this out in the open so long as Alice and Bob can always agree to use the same prime and generator. Then Alice and Bob both generate random numbers less than n (let's call Alice's number a and Bob's number b), raise g to the power of their respective numbers mod n, and send them to each other. Now Alice has received g^b mod n and Bob has received g^a mod n. Now, they both know their respective random numbers, so they raise the number they received to their respective numbers mod n. Through the power of math exponentiation commutes, so in the end Alice gets (g^b)^a mod n = g^ba mod n = g^ab mod n = (g^a)^b mod n, which is what Bob has - They can forget their secret number and use this shared value to create an encryption key for that session. This is called the Diffie-Hellman key exchange (or Diffie-Hellman/DH for short) after its inventors Whitfield Diffie and Martin Hellman and allows for both forward secrecy and for two people to securely create a shared secret over a channel with an eavesdropper [2].

Okay, I promised pictures, so here it is - if you imagine the yellow paint is n and g and the secret colors are a and b then this is a great visual aid (so great, I stole it from Wikipedia):


"But Cardiac Linfarction," you may inquire, "why is it so hard to separate the mixture"? Well, think of it this way - If I asked you to find 5^7 mod 23 it'd be pretty easy - get 5^7 and take the modulus. But if I said "5^k mod 23 equals 19 - find k", it's going to take a bit more work to figure out what k is. This problem of trying to find k given g^k mod n is called the Discrete Logarithm Problem, and most mathematicians and cryptographers think this is a hard problem [3]. Coincidentally, you have to be able to solve the DLP to break DH.

Advertisement

Let's bring it all back together with an example using the security industry's favorite whipping post - TLS! One of the TLS setup steps is to exchange the secret data that generates the keys actually used to encrypt the session data. This is most commonly done with RSA - that is, the client will generate some secret data and encrypt it with the server's public key before sending it over the wire, safe in the knowledge that only the server has the private key that will decrypt it. But what if the server isn't the only one who has the server's private key? Then if someone was recording the traffic between the client and server they would capture the transmission containing the encrypted secret data and could then decrypt it - a little more fiddling and they would have the encryption keys to read the data from this session. Note that it doesn't matter how long it's been since they captured the data and when they get the key - it could be an hour or a millennium between the two, the math won't change. Some people see this as a problem, and instead use ephemeral DH (which generates a new random number every time on the server side and is the only form of DH in major use today) to exchange secret data - this way, even if someone records all the data they won't be able to decrypt it unless they can break the DLP.

In the end, had Ladar enabled ephemeral DH on Lavabit's servers, he would have been able to give his private key to the Feds with a clean conscience, safe in the knowledge that all his clients' data was safe - at least, until we shockingly discover he was storing everyone's passwords and emails in plaintext.


[1] Well, I wouldn't really call it funny ha-ha... so it fits right in on Deadspin!

[2] In addition to Diffie and Hellman, Ralph Merkle gave Hellman the idea of using discrete logs and should be credited but usually isn't; and once again Clifford Cocks, along with Malcolm Williamson and James Ellis at GCHQ, came up with the idea before Diffie/Hellman/Merkle but couldn't publish since their work was again classified, once more to which I can only say TOUGH BREAK. This does not apply to Elliptic Curve Diffie-Hellman, although the concept used there is very similar.

[3] Actually we hope this is a really hard problem; we've been getting much better at this over the years and since this is like one of the three mathematical properties that underpin all of modern crypto (and advances in this problem apply very easily to factoring, another of the three mathematical properties that underpin all of modern crypto) if this falls we are likely to be in for what has been called a cryptopocalypse.