Beginning programmers often find it difficult to grasp the idea of one-way encryption. “How can I make an encryption that is impossible to reverse, even if I know the algorithm used to encrypt? And why would I ever want to do this?”
First, let’s review a very basic two-way encryption algorithm.
Take a string of text: Danny.
A basic algorithm could be to push each of the letters one position to the right, with the last letter wrapping around to the front.
Raw = Danny
Encrypt(“Danny”) = yDann
Obviously, this is a pretty poor encryption algorithm. A simple way of improving the encryption and making the output more scrambled is to encrypt the raw text twice.
Raw = Danny
Encrypt(Encrypt(“Danny”)) = nyDan
Still fairly poor, but it’ll suffice for this example. Our Decrypt function would simply push the letters back to the left, with the first letter wrapping around to the end.
Decrypt(“yDann”) = Danny
or
Decrypt(Decrypt(“nyDan”)) = Danny
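Here is a minimal Python sketch of the shifting scheme described above. The function names encrypt and decrypt and the shift parameter are my own choices for illustration; the idea is simply to rotate the string right to encrypt and left to decrypt.

```python
def encrypt(text: str, shift: int = 1) -> str:
    """Rotate the letters `shift` places to the right, wrapping around."""
    if not text:
        return text
    shift %= len(text)
    # The last `shift` letters move to the front.
    return text[-shift:] + text[:-shift]


def decrypt(text: str, shift: int = 1) -> str:
    """Undo encrypt() by rotating the letters `shift` places to the left."""
    if not text:
        return text
    shift %= len(text)
    # The first `shift` letters move to the end.
    return text[shift:] + text[:shift]


once = encrypt("Danny")           # "yDann"
twice = encrypt(encrypt("Danny"))  # "nyDan"
print(once, twice, decrypt(once), decrypt(decrypt(twice)))
```

Because decrypt exactly reverses encrypt, anyone who knows the algorithm can recover the original text, which is what makes this a two-way scheme.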
Before I explain one-way encryption, let me pose an example where it would be needed. Google.com has a database of millions of users. Each user has a username and a password stored in Google’s database. Any database administrator can view this data, including the passwords. So one way to protect users’ privacy is to encrypt the passwords before placing them into the database. Then, when a user logs into google.com with their username and password, the password is encrypted using the same algorithm and checked against the encrypted password in the database. Makes sense, right? If my password to Google was Danny, the database could store yDann instead. Then, when I log on to google.com and type Danny in the password field, google.com encrypts the password using the same algorithm mentioned above and compares it to yDann in the database. Since they are equal, I’m able to log in! Hurray!
But there’s a huge problem! What if a database administrator at Google discovered the algorithm written above? The admin could just copy the encrypted password for my account and then go home and decrypt it! Then the admin would be able to log into my account and access my information!
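To make the problem concrete, here is a small Python sketch of the login check and of the administrator’s attack. The database dictionary, the login function, and the username are all hypothetical stand-ins I made up for illustration; a real site would not look like this.

```python
def encrypt(text: str) -> str:
    """Toy two-way scheme: rotate the letters one place to the right."""
    return text[-1:] + text[:-1]


def decrypt(text: str) -> str:
    """Reverse of encrypt(): rotate the letters one place to the left."""
    return text[1:] + text[:1]


# Hypothetical user table: username -> encrypted password.
database = {"danny": encrypt("Danny")}  # stores "yDann"


def login(username: str, password: str) -> bool:
    """Encrypt the typed password and compare it with the stored value."""
    stored = database.get(username)
    return stored is not None and encrypt(password) == stored


print(login("danny", "Danny"))     # the legitimate user gets in
print(decrypt(database["danny"]))  # the admin recovers the raw password
```

The last line is the whole attack: since the scheme is reversible, reading the database is as good as reading the passwords.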
This is where hashing, often described as a form of one-way encryption, comes into play.
Let’s go back to my password: Danny
Let’s try another algorithm: change each letter to the number that corresponds to its position in the alphabet.
A = 1
B = 2
C = 3
D = 4
E = 5
F = 6
G = 7
H = 8
I = 9
J = 10
K = 11
L = 12
M = 13
N = 14
etc…
So, Encrypt(“ABC”) = 123 // 1, 2, 3
Seems simple, right? But this algorithm, as insecure as it may be, is an example of a one-way encryption algorithm!
Encrypt(“LC”) = 123 // 12, 3
ABC and LC have the same output! This means that, given the encrypted/hashed string 123, it’s impossible to know for certain what the text was before the hashing algorithm took place.
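The letter-to-number algorithm is easy to sketch in Python. The name letter_hash is my own, and I’m assuming that case is ignored and that anything other than the letters A to Z is skipped, since the description above doesn’t say.

```python
def letter_hash(text: str) -> str:
    """Replace each letter with its position in the alphabet (A=1 ... Z=26)
    and join the numbers together with no separator."""
    numbers = []
    for ch in text.upper():
        if "A" <= ch <= "Z":  # assumption: ignore non-letters
            numbers.append(str(ord(ch) - ord("A") + 1))
    return "".join(numbers)


print(letter_hash("ABC"))  # 123
print(letter_hash("LC"))   # 123
```

The information about where one number ends and the next begins is thrown away when the numbers are joined, and that lost information is what makes the function impossible to reverse with certainty.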
So in Google’s database, the password stored under my username could be 41141425.
Encrypt(“Danny”) = 41141425 // 4, 1, 14, 14, 25
But when the evil database administrator sees my stored password of 41141425, he (or she) won’t know whether my password is Danny, DKDNY, or DANNBE! Even if the algorithm is known, there’s no way to know for sure the value of the pre-hashed text. With longer passwords, there are even more possible combinations for each hash.
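To see how many candidates an attacker is left with, here is a rough Python sketch that lists every letter string the letter-to-number algorithm could have started from. The function name preimages is my own; it simply tries reading one digit or two digits at a time, keeping any reading that lands between 1 and 26.

```python
def preimages(digits: str) -> list:
    """Return every uppercase string whose letter positions (A=1 ... Z=26),
    joined without separators, produce `digits`."""
    if not digits:
        return [""]
    results = []
    for width in (1, 2):  # a letter is encoded as one or two digits
        chunk = digits[:width]
        if len(chunk) < width or chunk.startswith("0"):
            continue
        value = int(chunk)
        if 1 <= value <= 26:
            letter = chr(ord("A") + value - 1)
            for rest in preimages(digits[width:]):
                results.append(letter + rest)
    return results


print(preimages("123"))  # ['ABC', 'AW', 'LC']
```

Even the three-digit hash 123 has three possible originals, and the list grows quickly as the hash gets longer.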
Many computer scientists have spent their careers working on improving hashing algorithms. MD5 and SHA-1 have long been among the most popular hashing algorithms on the web (md5 is built into PHP); however, you should be careful not to rely on these algorithms alone, because reverse hash lookup tables around the web can map the hashes of common passwords straight back to the original text.
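As a rough illustration of the reverse lookup problem, here is a Python sketch using the standard hashlib module. The list of common passwords is made up for the example, and the salted PBKDF2 step at the end is my own suggestion of one common mitigation, not something covered above; treat the iteration count as a placeholder rather than a recommendation.

```python
import hashlib
import os

password = b"Danny"

# Plain, unsalted hashes: the same password always gives the same output.
md5_hex = hashlib.md5(password).hexdigest()    # 32 hex characters
sha1_hex = hashlib.sha1(password).hexdigest()  # 40 hex characters

# A tiny stand-in for a reverse lookup table (hypothetical password list).
common_passwords = [b"password", b"123456", b"Danny"]
lookup = {hashlib.md5(p).hexdigest(): p for p in common_passwords}
recovered = lookup.get(md5_hex)  # finds b"Danny" without reversing MD5

# Assumed mitigation: mix in a random per-user salt and use a slow function,
# so a precomputed table of plain hashes no longer matches.
salt = os.urandom(16)
salted = hashlib.pbkdf2_hmac("sha256", password, salt, 100_000)

print(recovered, salted.hex() in lookup)
```

The lookup never reverses the hash; it just hashes likely passwords ahead of time and matches the results, which is why an unsalted hash of a common password offers little protection.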