On almost every site you go to, if you lose your password, the site refuses to send it to you.  Instead it either creates a new one for you or sends you a link to change it.

Why is this?  Since they must know your password to check if you entered the correct one, shouldn't they be able to send it to you?

That's actually not entirely correct.  Any responsible site will not store your actual password.  Instead they store what is called a "hash" of your password.  Hashing is a one-way process of changing some value (in this case your password) into some other value (what actually gets saved).  It's one-way in that you can calculate the hash from your password, but there's no way to calculate your password from the hash.  (Actually that's not 100% true.  It's possible, though very difficult, to come up with a list of potential passwords that produce a given hash, but there's no guarantee of finding the exact one, and the list can be infinite if passwords are allowed to be arbitrarily long.)

So how exactly does hashing work?  How can you only go one way but not the other?  Let's make up a simple method of hashing (called a hashing algorithm).  It's not even remotely close to being as secure as ones that are used for real, but it's simple to explain and it gets the point across.

Take the password character by character.  If it's a number, leave it as is.  If it's a letter (upper or lower case), convert it to its position in the alphabet: A is 1, B is 2, and so on up to Z, which is 26.  After you do that, add up all the numbers.  For example, let's say our password is "cab123".  Doing the number translation gives us 3, 1, 2, 1, 2, 3, and adding those up gives 12.  This number is what actually gets saved as your "password".

The site doesn't know your actual password; all it knows is that when you hash your password you get the number 12.  When you try to log in, the password you typed is hashed and the result is compared to what's in the database.

Right now you're probably thinking "Well what if someone else guesses a password whose numbers and letters also add up to 12?"  Well, with our simple algorithm they would indeed be able to log in as you with the password "EG" (5+7).
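
If you'd like to see the toy algorithm in code, here's a rough Python sketch.  The names toy_hash and check_login are made up for illustration, and I'm assuming that letter case is ignored and that anything other than a letter or digit is simply skipped (the description above doesn't cover those cases).

```python
import string


def toy_hash(password: str) -> int:
    """Toy hash from this post: digits count as themselves, letters as A=1..Z=26."""
    total = 0
    for ch in password.lower():
        if ch in string.digits:
            total += int(ch)
        elif ch in string.ascii_lowercase:
            total += string.ascii_lowercase.index(ch) + 1
        # Assumption: any other character (symbols, spaces) is ignored.
    return total


def check_login(attempt: str, stored_hash: int) -> bool:
    """Hash the attempt and compare it to the saved number."""
    return toy_hash(attempt) == stored_hash


# The site only ever saves this number, never "cab123" itself.
stored = toy_hash("cab123")

print(stored)                          # 3+1+2+1+2+3
print(check_login("cab123", stored))   # the real password
print(check_login("EG", stored))       # a different password with the same sum
```

Both login checks succeed, which is exactly the weakness described above: "EG" is a completely different password that happens to land on the same number.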

Clearly real algorithms are more sophisticated than this.  Most of them create numbers that are up to 39 digits long (128-bit) or more, which makes it extraordinarily unlikely that someone would randomly guess something that hashes to the same thing as your password.  Even that's an understatement.  You're probably more likely to get hit by a falling plane within 5 minutes of winning the lottery than for someone just randomly typing to get something else that has the same hash as your password.
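
To give a feel for how big these numbers are, here's a small sketch using SHA-256 from Python's standard hashlib module.  SHA-256 is just my pick for illustration (it produces 256 bits, so it falls in the "or more" category); I'm not claiming it's what any particular site uses.

```python
import hashlib

# Hash the example password with SHA-256 (an assumption for illustration).
digest = hashlib.sha256("cab123".encode("utf-8")).hexdigest()

# Each hex character carries 4 bits.
bits = len(digest) * 4

# The same hash viewed as one very large number.
as_number = int(digest, 16)

# How many decimal digits the largest 128-bit number has.
digits_128 = len(str(2**128 - 1))

print(digest)
print(bits, "bits")
print(len(str(as_number)), "decimal digits in this hash")
print(digits_128, "decimal digits in the largest 128-bit number")
```

Compare that to our toy hash, where the result for a short password is a number you could count to on your fingers and toes.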

The point is that there isn't a way to calculate the original value from the hashed value.  In our simple example, given only the number 12 there's no way to figure out definitively that the password was "cab123".  It could have been "eg" or "321abc" or "93" or "ab9".  The point of hashing isn't to prevent someone else from getting into your account on that site.  If they were able to get the hash then it's likely they already have access to your other information on that site.  The reason for hashing is so that even if they do hack that one site, they don't know your original password that you may be using on other sites.

Real hashing algorithms make it hard to even get a list of possible passwords.  Pretty much the only way is to brute-force it by trying different passwords and seeing if they get a match.  That isn't as far-fetched as it sounds, however.

To combat brute-force attacks, most sites (hopefully) do something that is called "salting".  How salting works is actually amazingly simple.  Before hashing a password, a site will add some letters (or numbers or any other combination of characters) to either the front or back of the password.  This is called the "salt".  So if my password was "ILoveKittens" and the salt was "ungawunga" then the hash of "ungawungaILoveKittens" would be saved instead.  I'll get into how salting helps after I describe the kinds of attacks that it protects against.
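
Here's roughly what that looks like in Python.  This is only a sketch: the salted_hash function is a name I made up, and SHA-256 is an assumed stand-in for whatever hashing algorithm a site actually uses.

```python
import hashlib


def salted_hash(password: str, salt: str) -> str:
    """Put the salt on the front of the password, then hash the combined string."""
    combined = salt + password
    return hashlib.sha256(combined.encode("utf-8")).hexdigest()


# The example from the text: password "ILoveKittens", salt "ungawunga".
unsalted = hashlib.sha256("ILoveKittens".encode("utf-8")).hexdigest()
salted = salted_hash("ILoveKittens", "ungawunga")

print(unsalted)
print(salted)
print(salted == unsalted)
```

The two hashes have nothing visibly in common, even though the password is the same.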

One form of brute-force attack is what is known as a "dictionary attack", which takes different combinations of dictionary words as passwords and calculates a huge list of hashes.  The attacker can then compare that list to the hashed passwords in a database they have.  How did they get this database?  Maybe they started an illegitimate site whose main purpose is to get people's passwords.  Maybe they're a hacker that got into some other site's database.  Maybe they're a malicious employee at a company.  There are numerous ways this could happen.
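
A tiny version of a dictionary attack might look like the sketch below.  Everything in it is hypothetical: the "leaked" usernames and passwords, the five-word dictionary, and the choice of unsalted SHA-256 are all assumptions to keep the example small.

```python
import hashlib
from itertools import product


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Hypothetical stolen database: username -> unsalted password hash.
leaked = {
    "alice": sha256_hex("ilovekittens"),
    "bob": sha256_hex("x9$Tq!77vLp"),
}

# A (very) small dictionary of words to combine.
words = ["i", "love", "hate", "kittens", "puppies"]

# Pre-calculate the hash of every combination of 1 to 3 words.
lookup = {}
for count in range(1, 4):
    for combo in product(words, repeat=count):
        candidate = "".join(combo)
        lookup[sha256_hex(candidate)] = candidate

# Compare the pre-calculated hashes against the stolen ones.
cracked = {user: lookup[h] for user, h in leaked.items() if h in lookup}

print(cracked)
```

The password built from dictionary words is found immediately, while the random-looking one is not in the list at all.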

Another attack similar to a dictionary attack is called a "rainbow attack", where hashes are calculated for every combination of characters up to some length and stored in what's called a "rainbow table".  (Real rainbow tables use a clever trick to store all of this compactly, but the idea is the same.)  Doing every combination of characters takes exponentially longer as the password length grows, which is why longer passwords are recommended.

For example, if a password is 2 characters long and only contains numbers, then there are 100 different combinations: 10 possible digits (0-9) with 2 characters is 10².  If we add just 1 character to get a 3-digit password, now there are 10³ (1,000) different combinations, so each extra character makes it 10 times as difficult to create a full list.

If we used both numbers and letters, then with 2 characters we would have 10 numbers and 26 letters, which is 36² (1,296).  With 3 characters it is 36³ (46,656).  See how much faster that grew with letters in the mix?  If we treat lower-case and upper-case letters as different, a 2 character password gives us 62² (3,844), and with 3 characters it's 62³ (238,328).  If we add a 4th character to our case-sensitive alphanumeric password, that gives us 62⁴ (14,776,336) different combinations.
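
If you want to check the arithmetic or try other lengths, the whole calculation is one line of Python.  The function name combinations is my own, purely for illustration.

```python
def combinations(alphabet_size: int, length: int) -> int:
    """Number of possible passwords of exactly this length."""
    return alphabet_size ** length


alphabets = [
    ("digits only", 10),
    ("digits + letters", 36),
    ("digits + upper + lower", 62),
]

for name, size in alphabets:
    for length in (2, 3, 4):
        print(f"{name:>24}, {length} characters: {combinations(size, length):,}")
```

Try bumping the length up to 10 or 12 to see how quickly the numbers get out of hand.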

Even so, computers are getting very fast these days, and as of 2008 there are rainbow tables for every alphanumeric password under 10 characters.  That sounds terribly scary, and this is why passwords should be salted in addition to being hashed.  Salting does 2 things.  For one, it makes the password longer.  Remember that each additional character of length adds a ton of difficulty, and these rainbow tables can take years to create.  In our uppercase/lowercase/numeric password scenario, adding a single character makes it 62 times as hard.  If a table for 10 characters took a year to calculate, one for 11 characters would take 62 years, effectively making it infeasible.

Hashes and salts are usually stored together because the salt is needed to recreate the hash.  It would seem like an attacker could just take their existing list of passwords, add the salt to each one, and calculate new hashes.  While this is true, it isn't very viable, and this is the second thing salting does: a single rainbow table is only valid for one specific salt (or no salt).  If each user has a different salt (which they should), then instead of spending a year creating one table that covers every password under 10 characters (as they could if the passwords were unsalted), the attacker has to create a different table for every user.
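
Here's a sketch of per-user salts in Python.  The function names make_record and verify are hypothetical, the salt is put on the front of the password, and SHA-256 is again just an assumption for illustration.

```python
import hashlib
import hmac
import secrets


def make_record(password: str) -> tuple[str, str]:
    """Create the (salt, hash) pair a site would store for one user."""
    salt = secrets.token_hex(16)  # fresh random salt for every user
    digest = hashlib.sha256((salt + password).encode("utf-8")).hexdigest()
    return salt, digest


def verify(password: str, record: tuple[str, str]) -> bool:
    """Re-create the hash from the stored salt and compare."""
    salt, digest = record
    attempt = hashlib.sha256((salt + password).encode("utf-8")).hexdigest()
    return hmac.compare_digest(attempt, digest)


# Two users who happen to pick the same password.
alice = make_record("ILoveKittens")
bob = make_record("ILoveKittens")

print(alice)
print(bob)
print(verify("ILoveKittens", alice))
print(verify("ILoveKittens!", bob))
```

Even though both users chose the same password, their stored hashes are different, so a table built for one salt is useless against the other.  Real systems generally use a deliberately slow password-hashing function rather than a single round of SHA-256, but the salt idea is the same.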

Dictionary attacks are still effective against simple passwords though, since the number of combinations that make up real words is a lot smaller than the number of arbitrary combinations of characters.  This is why people are encouraged to pick strong passwords that can't be easily guessed by a computer putting together different combinations of known words and phrases.  For example, some sites require you to have a number and some combination of upper-case and lower-case letters, or even symbols.

It is a social responsibility of software developers to apply this level of security, but unfortunately (as far as I'm aware) there is no legal requirement for site-owners to create secure sites (except perhaps in the financial sector or other regulated businesses). Ultimately, it comes down to "How much do I trust this site?"  I would recommend that at the very least you have 2 different passwords.  One for sites of companies you trust to have good security, and a different one for more questionable sites.

Remember, all it takes is just one unsecured site and a semi-determined hacker to find your password.  If you're using the same password on other sites, you're not only risking your account on that site, but on every other site where you use that password.  This is especially dangerous with the common trend of using your email address as your username.  Your primary email is in a separate category and should have its own dedicated password.  Your email is the key to most sites you sign up with, because that's where your (hopefully new) password will be sent when you lose it.  If someone gets access to your email account, they could get your passwords to other sites in the same way.

The ideal solution is to have a different password for every site.  You may forget the lesser-used ones, but most sites provide a way to reset them by email.  Which once again brings up the point that you should keep your email password the most secure of all.  It's like your master key that opens everything.  You can also use different email addresses for different sites.  I personally use a separate email account for sites that I suspect may spam me or that seem otherwise untrustworthy.

I hope this post has been helpful and that you feel better knowing why most sites can't send you your passwords.  You should be extra wary of sites that can.

Feel free to comment on anything that is unclear or just flat out wrong (or heck, even grammatical errors).

Next week: Why do programs crash?  Feel free to suggest further topics, as I don't have one in mind for the week after yet.