Understanding RSA

RSA is a beautifully simple algorithm for asymmetric encryption, which uses the properties of modular arithmetic to transmit small messages securely. This article presents a possible thought process from which the algorithm could be derived, based on some key mathematical facts.

Note: This article requires some knowledge of modular arithmetic.

A Small Overview

RSA is an asymmetric encryption algorithm, which means that a message is encrypted using one key but decrypted using a different one. In practice, it is a way to ensure that a message can only be read by a specific person. To do this, the receiver of the messages creates a public encryption key, which anyone can use to send them a message, and a decryption key, which they keep private and which is the only way to read the messages sent.

The Key Insight

The starting point for the RSA algorithm is a simple fact derived from Euler's Theorem: If you take a residue $\displaystyle{ m }$ (for message) modulo $\displaystyle{ n }$ , and the two share no common factors, then there are infinitely many exponents $\displaystyle{ x }$ such that:

\displaystyle{ m ^{ x } ≡ m \text{ (mod n)} }

For instance:

\displaystyle{ 1 ^{ 25 } ≡ 1 \text{ (mod 35)} }

\displaystyle{ 2 ^{ 25 } ≡ 2 \text{ (mod 35)} }

\displaystyle{ 3 ^{ 25 } ≡ 3 \text{ (mod 35)} }

\displaystyle{ 4 ^{ 25 } ≡ 4 \text{ (mod 35)} }
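
These congruences are easy to check numerically. The following is a minimal Python sketch that uses the built-in three-argument `pow` for modular exponentiation; the variable names are only illustrative.

```python
# Check m^25 ≡ m (mod 35) for the four example messages.
n = 35
x = 25

# pow(m, x, n) computes (m ** x) % n without building the huge power.
results = [pow(m, x, n) for m in range(1, 5)]

print(results)  # [1, 2, 3, 4]
```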

Basically, the entire exponentiation becomes a no-op. This may not seem very useful, but if we rewrite $\displaystyle{ x }$ as the product of two numbers, let's say, $\displaystyle{ x = e \cdot d }$ , then, it must hold that:

\displaystyle{ m ^{ e \cdot d } ≡ m \text{ (mod n)} }

\displaystyle{ \left( m ^{ e } \right) ^{ d } ≡ m \text{ (mod n)} }

Which, if you look at it, is exactly what we want. Anyone with the encryption key $\displaystyle{ e }$ can encode a message:

\displaystyle{ \text{encrypted message} ≡ m ^{ e } \text{ (mod n)} }

And, from the encrypted message, you can only recover the original one with the decryption key $\displaystyle{ d }$ .

\displaystyle{ \left( \text{encrypted message} \right) ^{ d } ≡ m \text{ (mod n)} }

This is the basis of how the encryption scheme works. We create public $\displaystyle{ n }$ and $\displaystyle{ e }$ values, which anyone can use to encrypt a message, and we, and only we, can use our private key $\displaystyle{ d }$ to decrypt it. But now, we need to find a way of actually creating $\displaystyle{ n }$ , $\displaystyle{ e }$ and $\displaystyle{ d }$ , such that the original property holds, and more importantly, such that it is not feasible to derive the decryption key $\displaystyle{ d }$ from the public values, which would defeat the whole point.
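
As a toy illustration of this round trip, the sketch below reuses the modulus 35 and the exponent 25 from the earlier examples, split by hand as $\displaystyle{ e = 5 }$ and $\displaystyle{ d = 5 }$ . These values are assumptions picked only so that $\displaystyle{ e \cdot d = 25 }$ ; they give no security at all (the two keys are even equal), and the function names are hypothetical.

```python
# Toy round trip: encrypt with e, decrypt with d, both modulo n.
n = 35
e, d = 5, 5  # hand-picked so that e * d = 25; NOT a secure key pair


def encrypt(m, e, n):
    """Raise the message to the public exponent, modulo n."""
    return pow(m, e, n)


def decrypt(c, d, n):
    """Raise the ciphertext to the private exponent, modulo n."""
    return pow(c, d, n)


m = 2
c = encrypt(m, e, n)
recovered = decrypt(c, d, n)

print(c, recovered)  # 32 2
```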

Diving Deeper into Euler's Theorem

First of all, we must understand Euler's Theorem, the source of our key insight. It states that for a residue $\displaystyle{ m }$ modulo $\displaystyle{ n }$ , with no common factors between $\displaystyle{ m }$ and $\displaystyle{ n }$ , it holds that:

\displaystyle{ m ^{ φ \left( n \right) } ≡ 1 \text{ (mod n)} }

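Where $\displaystyle{ φ \left( n \right) }$ is Euler's totient function, which counts the integers from $\displaystyle{ 1 }$ to $\displaystyle{ n }$ that share no common factors with $\displaystyle{ n }$ . Computing it in general requires finding the prime factors of $\displaystyle{ n }$ , which is hard for large $\displaystyle{ n }$ .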

This equation, in turn, means that multiplying anything by $\displaystyle{ m ^{ φ \left( n \right) } }$ has no effect (as it is congruent to 1), and we can do it freely. From this, we get that:

\displaystyle{ m ≡ m \cdot m ^{ φ \left( n \right) } ≡ m \cdot m ^{ φ \left( n \right) } \cdot m ^{ φ \left( n \right) } ≡ m \cdot m ^{ φ \left( n \right) } \cdot \ldots \cdot m ^{ φ \left( n \right) } ≡ m ^{ 1 + q φ \left( n \right) } \text{ (mod n)} }

\displaystyle{ m ^{ q φ \left( n \right) + 1 } ≡ m \text{ (mod n)} }

Where $\displaystyle{ q }$ is any non-negative whole number.
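
The identity can be checked by brute force for a small modulus. This sketch assumes $\displaystyle{ n = 35 }$ , for which $\displaystyle{ φ \left( n \right) = 24 }$ , and tries every message coprime to $\displaystyle{ n }$ with the first few values of $\displaystyle{ q }$ ; the helper name `holds` is hypothetical.

```python
from math import gcd

n = 35
phi = 24  # φ(35), taken as given for this check


def holds(m, q):
    """True when m^(q*phi + 1) ≡ m (mod n)."""
    return pow(m, q * phi + 1, n) == m % n


# Try every message coprime to n, with q = 0, 1, 2, 3, 4.
checked = all(
    holds(m, q)
    for m in range(1, n)
    if gcd(m, n) == 1
    for q in range(5)
)

print(checked)  # True
```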

Now, if we set $\displaystyle{ x = e \cdot d = q φ \left( n \right) + 1 }$ , we have recreated the original property:

\displaystyle{ m ^{ x } ≡ m ^{ e \cdot d } ≡ m \text{ (mod n)} }

We have also greatly constrained the possible values of $\displaystyle{ e }$ and $\displaystyle{ d }$ , with $\displaystyle{ e \cdot d = q φ \left( n \right) + 1 }$ . This equation can be rewritten in a modular form:

\displaystyle{ e \cdot d ≡ 1 \text{ (mod φ(n))} }

\displaystyle{ d ≡ e ^{ - 1 } \text{ (mod φ(n))} }

This establishes a clear relationship between $\displaystyle{ e }$ and $\displaystyle{ d }$ , which can be further simplified by the fact that $\displaystyle{ e }$ is supposed to be public, which allows us to use a preset value (usually the prime $\displaystyle{ 2 ^{ 16 } + 1 = 65537 }$ ) without worrying too much about security. We then only need to make sure $\displaystyle{ e }$ is coprime with $\displaystyle{ φ \left( n \right) }$ for $\displaystyle{ d }$ to exist; because $\displaystyle{ e }$ is prime, this just means checking that $\displaystyle{ e }$ does not divide $\displaystyle{ φ \left( n \right) }$ . We have also gained an important requirement: to compute $\displaystyle{ d }$ , we only need to know $\displaystyle{ e }$ and $\displaystyle{ φ \left( n \right) }$ . Since $\displaystyle{ e }$ is public, this means $\displaystyle{ φ \left( n \right) }$ must be kept private as well. Then, finding $\displaystyle{ d }$ from $\displaystyle{ e }$ is just a matter of computing a multiplicative inverse, which can be done using the Extended Euclidean Algorithm.
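
One possible way to compute this inverse is sketched below, using the iterative form of the Extended Euclidean Algorithm. The function names `extended_gcd` and `mod_inverse` are hypothetical, and the sample values $\displaystyle{ e = 17 }$ and $\displaystyle{ φ \left( n \right) = 3120 }$ are small numbers assumed purely for illustration.

```python
def extended_gcd(a, b):
    """Return (g, x, y) such that a*x + b*y == g == gcd(a, b)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        quotient = old_r // r
        # Standard Euclidean step, carried along with the coefficients.
        old_r, r = r, old_r - quotient * r
        old_x, x = x, old_x - quotient * x
        old_y, y = y, old_y - quotient * y
    return old_r, old_x, old_y


def mod_inverse(e, phi):
    """Return d with (e * d) % phi == 1, or raise if none exists."""
    g, x, _ = extended_gcd(e, phi)
    if g != 1:
        raise ValueError("e and phi are not coprime, so no inverse exists")
    return x % phi  # normalise into the range [0, phi)


d = mod_inverse(17, 3120)
print(d)  # 2753
```

Since Python 3.8, the built-in call `pow(e, -1, phi)` returns the same inverse, so the hand-written version is mainly useful for seeing how the algorithm works.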

Euler's Totient Function

To complete the algorithm, we need to actually compute $\displaystyle{ φ \left( n \right) }$ . For our purposes, it is enough to know that the totient function is computed from all of $\displaystyle{ n }$ 's distinct prime factors $\displaystyle{ p _{ 1 } , p _{ 2 } , \ldots , p _{ k } }$ :

\displaystyle{ φ \left( n \right) = n \cdot \frac{ p _{ 1 } - 1 }{ p _{ 1 } } \cdot \frac{ p _{ 2 } - 1 }{ p _{ 2 } } \cdot \ldots \cdot \frac{ p _{ k } - 1 }{ p _{ k } } }

For example, $\displaystyle{ φ \left( 12 \right) = 12 \cdot \left( \frac{ 1 }{ 2 } \right) \cdot \left( \frac{ 2 }{ 3 } \right) = 4 }$ , because the distinct prime factors of $\displaystyle{ 12 }$ are $\displaystyle{ 2 }$ and $\displaystyle{ 3 }$ . Computationally, this means that finding $\displaystyle{ φ \left( n \right) }$ requires factoring $\displaystyle{ n }$ , which is famously hard to do efficiently for large numbers. On one hand, this means that other people will not be able to easily compute $\displaystyle{ φ \left( n \right) }$ , and, by extension, $\displaystyle{ d }$ , which is exactly what we want. On the other hand, it also means that we cannot compute it either, which makes the entire algorithm unworkable.
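
To make the formula concrete, here is a sketch that computes the totient by first factoring $\displaystyle{ n }$ with plain trial division. The function names are hypothetical, and trial division is an assumption chosen for simplicity: it is only practical for small inputs, which is the very difficulty described above.

```python
def distinct_prime_factors(n):
    """Return the distinct prime factors of n using trial division."""
    factors = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            factors.append(p)
            # Strip every copy of p so it is only recorded once.
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        factors.append(n)  # whatever remains is itself prime
    return factors


def totient(n):
    """Compute φ(n) = n * Π (p - 1) / p over the distinct primes p of n."""
    result = n
    for p in distinct_prime_factors(n):
        result = result // p * (p - 1)  # exact, since p divides result
    return result


print(totient(12))  # 4
```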

The solution is to work backwards: Instead of starting with an $\displaystyle{ n }$ and factoring it, we start with the prime factors and multiply them to compute $\displaystyle{ n }$ . The simplest non-trivial case for this is when $\displaystyle{ n }$ is a product of two distinct primes $\displaystyle{ n = p \cdot q }$ (from here on, $\displaystyle{ q }$ denotes the second prime, not the multiplier used earlier). Then,

\displaystyle{ φ \left( n \right) = φ \left( p \cdot q \right) = p \cdot q \cdot \frac{ p - 1 }{ p } \cdot \frac{ q - 1 }{ q } = \left( p - 1 \right) \left( q - 1 \right) }

Now, if we generate two primes, computing $\displaystyle{ n }$ and $\displaystyle{ φ \left( n \right) }$ is trivial, but, given only $\displaystyle{ n }$ , finding $\displaystyle{ φ \left( n \right) }$ requires finding the factors $\displaystyle{ p }$ and $\displaystyle{ q }$ , which, for large enough numbers, becomes computationally infeasible. This is the last piece of our puzzle, and we are ready to describe the algorithm in full.
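
The shortcut can be compared against the definition of the totient (the count of integers up to $\displaystyle{ n }$ that are coprime to it). The sketch below assumes the small primes 61 and 53 purely for illustration; `totient_brute_force` is a hypothetical helper and is far too slow for real key sizes.

```python
from math import gcd


def totient_brute_force(n):
    """Count the integers in 1..n that share no factor with n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


p, q = 61, 53          # small illustrative primes
n = p * q              # 3233
phi = (p - 1) * (q - 1)  # 3120, computed without any factoring

print(phi == totient_brute_force(n))  # True
```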

Putting It All Together

The first step in our algorithm is to generate two primes, $\displaystyle{ p }$ and $\displaystyle{ q }$ , from which we can compute our $\displaystyle{ n = p \cdot q }$ and $\displaystyle{ φ \left( n \right) = \left( p - 1 \right) \left( q - 1 \right) }$ . We share $\displaystyle{ n }$ with everyone, as part of our public key, while keeping the other values private.

We then choose a value for our $\displaystyle{ e }$ , usually $\displaystyle{ e = 2 ^{ 16 } + 1 }$ , and share it as the final part of our public encryption key. Then, using our private $\displaystyle{ φ \left( n \right) }$ , we compute the private decryption key $\displaystyle{ d ≡ e ^{ - 1 } \text{ (mod φ(n))} }$ . We have now finished the key generation, and it is time to send some messages.

Any person that knows both $\displaystyle{ n }$ and $\displaystyle{ e }$ can encrypt a message $\displaystyle{ m }$ with a simple exponential:

\displaystyle{ \text{encrypted message} ≡ m ^{ e } \text{ (mod n)} }

This message cannot be decrypted by anyone, even the original sender, without the private decryption key $\displaystyle{ d }$ . Since we know it, we can decrypt it just as easily as it was encrypted:

\displaystyle{ m ≡ \left( \text{encrypted message} \right) ^{ d } \text{ (mod n)} }

And thus, we have gotten back the original message, without the possibility of snooping by third parties.
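
The whole procedure fits in a few lines of Python. What follows is a toy sketch, not a secure implementation: the primes 61 and 53, the exponent 17, and the message 65 are assumptions chosen to keep the numbers readable, the function names are hypothetical, and real RSA additionally needs large random primes and message padding. The call `pow(e, -1, phi)` requires Python 3.8 or later.

```python
from math import gcd


def generate_keys(p, q, e):
    """Build a key pair from two primes and a public exponent."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be coprime with phi(n)")
    d = pow(e, -1, phi)  # modular inverse of e modulo phi(n)
    return (n, e), d      # (public key, private exponent)


def encrypt(m, public_key):
    n, e = public_key
    return pow(m, e, n)


def decrypt(c, d, n):
    return pow(c, d, n)


public_key, d = generate_keys(61, 53, 17)
message = 65
ciphertext = encrypt(message, public_key)
recovered = decrypt(ciphertext, d, public_key[0])

print(recovered == message)  # True
```

Note that the message must be smaller than $\displaystyle{ n }$ , since everything is computed modulo $\displaystyle{ n }$ .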

Conclusion

In this article we have gone through the mathematical principles that allow the RSA algorithm to work, alongside a possible process for going from these insights to an actual algorithm. That said, there is still some more to it. First of all, generating large primes is not trivial, and requires the use of probabilistic primality tests, like Miller-Rabin. Similarly, the Extended Euclidean Algorithm is rather complex, at least conceptually. There are also many possible optimizations, like replacing the totient function with the reduced totient function, for which Euler's Theorem also holds. Finally, it is very important to know that you must be very mindful about security with RSA, as it is extremely easy to shoot yourself in the foot, making the entire algorithm insecure. If you are not an expert, do not ever use a home-baked implementation of RSA for anything that actually needs to be secure. That said, understanding and implementing RSA is a fun and relatively easy project, and trying to improve its speed and security is a great learning experience in math, programming, and cryptography.