Understanding RSA
RSA is a beautifully simple algorithm for asymmetric encryption, which uses the properties of modular arithmetic to transmit small messages securely. This article presents a possible thought process from which the algorithm could be derived, based on some key mathematical facts.
Note: This article requires some knowledge of modular arithmetic.
A Small Overview
RSA is an asymmetric encryption algorithm, which means that its goal is to create a message which is encrypted using one key, but decrypted using a different one. In practice, it is a way to ensure that a message is only received by a specific person. To do this, the receiver of the messages creates a public encryption key, which any person can use to send them a message, and a decryption key, which they keep private and is the only way to read the messages sent.
The Key Insight
The starting point for the RSA algorithm is a simple fact derived from Euler's Theorem: If you have a congruence (for message) in a modulo , and they share no common factors, then there are infinitely many powers such that:
For instance:
Basically, the entire exponentiation becomes a no-op. This may not seem very useful, but if we rewrite as the product of two numbers, let's say, , then, it must hold that:
Which, if you look at it, is exactly what we want. Anyone with the encryption key can encode a message:
And, from the encrypted message, you can only recover the original one with the decryption key .
This is the base of how the encryption scheme works. We create public and values, which anyone can use to encrypt a message, and we, and only we, can use our private key to decrypt it. But now, we need to find a way of actually creating , and , such that the original property holds, and more importantly, such that it is not easily possible to derive the decryption key from the other variables, which would defeat the whole point.
Diving Deeper into Euler's Theorem
First of all, we must understand Euler's Theorem, the source of our key insight. It states that for a congruence in a modulo , with no common factors between and , it holds that:
Where is Euler's totient function, which requires finding the prime factors of , and is generally hard to compute.
This equation, in turn, means that multiplying anything by has no effect (as it is congruent to 1), and we can do it freely. From this, we get that:
Where is a whole number.
Now, if we set , we have recreated the original property:
We have also greatly constrained the possible values of and , with . This equation can be rewritten in a modular form:
This establishes a clear relationship between and , which can be further simplified by the fact that is supposed to be public, which allows us to use a preset value (usually the prime ) without worrying too much about security. We then only need to make sure is coprime with for to exist (made easier by the primality of ). We have also gained an important requirement: To compute , we basically only need to know . This means should be kept private as well. Then, finding from is just a multiplicative inverse, which can be done using the Extended Euclidean Algorithm.
Euler's Totient Function
To complete the algorithm, we need to actually compute . For our purposes, it necessary to know that the totient function is computed with all of 's unique prime factors :
For example, , because the unique prime factors of are and . Computationally, this means that finding requires factoring , which is a famously hard problem to do efficiently with large numbers. On one hand, this means that people will not be able to easily compute , and, by extension, , which is exactly what we want. On the other hand, this also means we cannot either, which makes the entire algorithm unworkable.
The solution is to backwards: Instead of starting with an and factoring it, we start with the prime factors and multiply them to compute . The simplest non-trivial case for this is when is a product of two primes . Then,
Now, if we generate two primes, computing and is trivial, but, given , finding requires finding the factors and , which for large enough numbers, becomes computationally unfeasible. This is the last step to our puzzle, and we are ready to describe the algorithm in full.
Putting It All Together
The first step in our algorithm is to generate two primes, and , from which we can compute our and . We share with everyone, as part of our public key, while keeping the other values private.
We then choose a value for our , usually , and share it as the final part of our public encryption key. Then, using our private , we compute the private decryption key . We have now finished the key generation, and it is time to send some messages.
Any person that knows both and can encrypt a message with a simple exponential:
This message cannot be decrypted by anyone, even the original sender, without the private decryption key . Since we know it, we can decrypt it just as easily as it was encrypted:
And thus, we have gotten back the original message, without the possibility of snooping by third parties.
Conclusion
In this article we have gone through the mathematical principles that allow the RSA algorithm to work, alongside a possible process for going from these insights to an actual algorithm. That said, there is still some more to it. First of all, generating large primes is not trivial, and requires the use of probabilistic primality tests, like Miller-Rabin. Similarly, the Extended Euclidean Algorithm is rather complex, at least conceptually. There are also many possible optimizations, like replacing the totient function with the reduced totient function, for which Euler's Theorem also holds. Finally, it is very important to know that you must be very mindful about security with RSA, as it is extremely easy to shoot yourself in the foot, making the entire algorithm unsecure. If you are not an expert, do not ever use a home-baked implementation of RSA for anything that actually needs to be secure. That said, understanding and implementing RSA is a fun and relatively easy project, and trying to improve its speed an security is a great learning experience in math, programming, and cryptography.