First SHA-1 collision, birthday paradox, should you rehash your will?

One of the most important properties of a cryptographic hash function is that of “Strong collision-resistance”, that is, it should be "computationally infeasible" to find distinct inputs x1, x2 such that hash(x1) = hash(x2).

For the SHA-1 hash function, which generates a 160 bit hash, to have strong collision resistance, only a brute force attack should be possible, which requires 2^80 computations.

A few weeks back, a team of researchers from Google and the CWI Institute in Amsterdam submitted the first ever successful SHA-1 collision attack, appropriately dubbed as SHAttered — that is, they created 2 pdf files that have the same SHA-1 hash value.

This is a remarkable accomplishment.

Though this attack took the equivalent processing power as 6500 years of single-CPU computations and 110 years of single-GPU computations, the total number of computations was only 9,223,372,036,854,775,808 or 2^63, which is more than 100000 faster than brute force (which requires 2^80 computations)

SHA-1, published by NIST in 1995, has been proven not to have strong collision-resistance property, and so officially broken.

Below are screen shots of 2 different pdf files that have the same SHA-1 hash!

shattered 1

shasum shattered-1.pdf
38762cf7f55934b34d179ae6a4c80cadccbb7f0a

shattered 2

shasum shattered-2.pdf
38762cf7f55934b34d179ae6a4c80cadccbb7f0a

This news doesn’t come as a surprise as flaws were found in SHA-1 as early as 2004 and theoretical attacks were proposed. Bruce Schneier, a well known cryptographer, said at the Crypto 2004 conference in 2004, “It’s time for us all to migrate away from SHA-1”. In 2011, NIST deprecated SHA-1. SHA-256 or SHA3 should be used.

In a few months, Google will release code that allows anyone to create a pair of PDFs that hash to the same SHA-1 sum given two distinct images with some pre-conditions.

Microsoft has said “Starting with the Windows 10 Anniversary Update, Microsoft Edge and Internet Explorer will no longer consider websites protected with a SHA-1 certificate as secure and will remove the address bar lock icon for these sites. These sites will continue to work, but will not be considered secure.” Chrome has removed support for sites with SHA-1 signed certificates.

This action is long overdue.

But it is important to understand what exactly was accomplished here.

Let’s say that you have written a will or a living trust, created a pdf file and computed its SHA-1 hash. You then emailed the SHA-1 hash to a bunch of your friends. Should you be worried now?

Is it possible for someone, who has access to the safe containing the pdf file, to create another pdf file with different data that has the same hash? Hell no.

You can sleep at night without worrying. This property of cryptographic hash function is called second-preimage resistance (or weak collision resistance) and for SHA-1 the brute force attack requires 2^160 computations. Even the proposed non-brute-force attacks require at least 2^128 computations, which is computationally infeasible for the foreseeable millenia.

A simplistic way of understanding feasibility of strong-collision vs weak collision attacks is by asking at a birthday party, “Is there at least one pair at the party that share the same birthday?” vs “My birthday is Jul 4, how many people have the same birthday?”. In the former case, it would be surprising (and perhaps because of that it is called the birthday paradox) to find out that the probability is 50% when there are as few as 23 attendees in the party. In the latter case, the probability is much lower.

As mentioned earlier,

Google will be releasing code that allows anyone to create a pair of PDFs that hash to the same SHA-1 sum given two distinct images with some pre-conditions. The key phrase here is “to create a pair of PDFs that hash to the same SHA-1”.

The corresponding will or the living trust analogy is that you dictate your will to your evil secretary and he creates 2 documents, one with your instructions and one favoring him and he tweaks them both, adding innocuous spaces, punctuation, filler words etc until they have the same SHA-1 hash. He shows the first doc and the SHA-1 to you but files the second doc. This is a relatively easier problem requiring 2^80 computations if done using a brute force attack and a computationally feasible 2^63 computations if the method used by Google is applicable.

Bruce Schneier quotes a NSA saying “Attacks always get better; they never get worse.”. So, if your don’t trust your secretary, ask him to use SHA-256 or SHA3 to hash the will, and sleep like a baby!

Leave a Reply