April 14, 2020

Encoding != Encryption != Hashing

This is a very quick and highly simplified explanation of encoding, encryption and hashing, and of how they differ. Today I tweeted this:

I was genuinely surprised that this tweet got quite a lot of retweets and likes. However, I know from personal experience that a lot of people, including developers, have difficulty telling the terms apart. The difference between encryption and hashing in particular is unclear to many.

Encoding transforms data from one format into another so that it can be exchanged between different systems. Think, for instance, of MP4 (video) encoding, which transforms other video formats into MP4. Encoding is not a security measure and shouldn't be used as one. The scheme is publicly known and involves no secret key, so anyone who knows the scheme can decode the data again.
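To make the "not a security measure" point concrete, here is a minimal sketch using Base64, picked only as a familiar text encoding (it is not mentioned above). The variable names are illustrative.

```python
import base64

text = "hello world"

# Encode: bytes -> Base64 text. No key or secret is involved.
encoded = base64.b64encode(text.encode("utf-8"))

# Decode: anyone who knows the data is Base64 can reverse it.
decoded = base64.b64decode(encoded).decode("utf-8")

print(encoded)  # b'aGVsbG8gd29ybGQ='
print(decoded)  # hello world
```

Because decoding needs nothing but knowledge of the scheme, Base64-encoding a password or token hides nothing.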

Encryption is a security measure used to ensure the confidentiality of data. It is a reversible transformation of plaintext into ciphertext using an algorithm, called a cipher, and an encryption key. The key must be kept secret and shared only with the intended recipients, because anyone who holds the key, the ciphertext and the algorithm can turn the ciphertext back into the plaintext.
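The role of the key can be illustrated with a toy cipher. The sketch below XORs the message with a random key of the same length (a one-time pad). It is an illustration only, chosen because it needs nothing beyond the standard library. Real applications should use a vetted cryptography library, and the helper name xor_bytes is made up for this example.

```python
import secrets

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the matching key byte (toy cipher)."""
    if len(data) != len(key):
        raise ValueError("key must be exactly as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))

plaintext = b"meet me at noon"

# The key is random, as long as the message, and must be used only once.
key = secrets.token_bytes(len(plaintext))

ciphertext = xor_bytes(plaintext, key)   # encrypt
recovered = xor_bytes(ciphertext, key)   # decrypt with the same key

# A key that differs in even one bit no longer yields the plaintext.
wrong_key = bytes([key[0] ^ 1]) + key[1:]
garbled = xor_bytes(ciphertext, wrong_key)
```

Unlike the encoding case, knowing the algorithm is not enough here: without the key, the ciphertext cannot be turned back into the plaintext.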

Hashing algorithms are one-way transformations of the input into a hash value. The generated hashes are irreversible. There's no such thing as "dehashing".

There's also no legitimate reason to want to, because the purpose of hashing is to ensure integrity, not to recover the input from the hash value.
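As a small sketch of hashing used as an integrity check, the example below uses SHA-256 from Python's hashlib. The choice of SHA-256 and the sample messages are assumptions made for illustration.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()

original = b"important document, version 1"
tampered = b"important document, version 2"

digest = sha256_hex(original)

# Same input, same hash: the content is unchanged.
unchanged = sha256_hex(original) == digest

# A one-character change produces a different hash.
modified = sha256_hex(tampered) != digest

print(len(digest))  # 64 hex characters, regardless of input size
```

The digest always has the same length and there is no function that takes it back to the input. All a verifier can do is hash the data again and compare.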

Let's take password hashing as an example to explain the principle.

Account registration

  1. The user creates an account and specifies a username and password.
  2. When the user clicks on the register button, the username and password are sent to the webserver (hopefully over the secure HTTPS protocol).
  3. On the server, a random string (called a salt) is combined with the user's password, and the result is transformed into a hash using a cryptographic hashing function. Both the hash value and the salt are stored for each user. The purpose of the salt is to make each hash value unique, even for users with the same password.
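The registration steps above can be sketched roughly as follows. This is not a complete implementation: the in-memory users dictionary, the function names and the iteration count are all assumptions for illustration, and PBKDF2 (hashlib.pbkdf2_hmac) is used here as one possible salted password-hashing function rather than a plain hash of salt plus password.

```python
import hashlib
import secrets

# Illustrative value only; follow current guidance for real deployments.
ITERATIONS = 100_000

# Hypothetical stand-in for a user database: username -> (salt, hash).
users = {}

def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a hash from the password and the per-user salt."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )

def register(username: str, password: str) -> None:
    salt = secrets.token_bytes(16)  # fresh random salt for every user
    users[username] = (salt, hash_password(password, salt))

register("alice", "correct horse")
register("bob", "correct horse")

# Same password, different salts, so the stored hashes differ.
same_hash = users["alice"][1] == users["bob"][1]
```

Note that the password itself is never stored, only the salt and the derived hash.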

Account login

  1. The user logs in with their username and password.
  2. When the user clicks on the login button, the username and password are securely sent to the webserver.
  3. Server side, the same cryptographic hashing algorithm used at account creation is used to hash the password specified by the user combined with the salt that is stored for this user.
  4. The user is granted access only if this newly computed hash matches the hash stored at registration (an integrity check).
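A rough sketch of the login check described above, again with assumed names and PBKDF2 standing in for the hashing function. The comparison uses hmac.compare_digest, which compares in constant time. That detail is an addition for good practice, not something the steps above require.

```python
import hashlib
import hmac
import secrets

# Must match the value used at registration; illustrative only.
ITERATIONS = 100_000

def hash_password(password: str, salt: bytes) -> bytes:
    """Same derivation as at registration: password + stored salt."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )

def verify_login(password: str, salt: bytes, stored_hash: bytes) -> bool:
    """Hash the submitted password with the stored salt and compare."""
    candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, stored_hash)

# Simulated stored record for one user (created at registration).
salt = secrets.token_bytes(16)
stored_hash = hash_password("s3cret", salt)

ok = verify_login("s3cret", salt, stored_hash)      # correct password
rejected = verify_login("guess", salt, stored_hash)  # wrong password
```

At no point does the server "dehash" anything: it only repeats the one-way transformation and checks whether the results are equal.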

The same check should of course be done when a password is changed. The user should only be able to set a new password if the hash of the current password they enter matches the stored hash.