Collision attack

Collision attack

In cryptography, a collision attack on a cryptographic hash tries to find two arbitrary inputs that will produce the same hash value, i.e. a hash collision. In contrast to a preimage attack, neither the hash value nor one of the inputs is specified.

There are roughly two types of collision attacks:

  • Collision attack: find two arbitrary different messages m1 and m2 such that hash(m1) = hash(m2).
  • Prefix collision attack: given two different prefixes p1, p2 find two appendages m1 and m2 such that hash(p1 ∥ m1) = hash(p2 ∥ m2) (where is the concatenation operation).


Classical collision attack

Mathematically stated, a collision attack finds two different messages m1 and m2, such that hash(m1) = hash(m2). In a classical collision attack, the attacker has no control over the content of either message, but they are arbitrarily chosen by the algorithm.

Much like symmetric-key ciphers are vulnerable to brute force attacks, every cryptographic hash function is inherently vulnerable to collisions using a birthday attack. Due to the birthday problem, these attacks are much faster than a brute force would be. A hash of n bits can be broken in 2n / 2 time (evaluations of the hash function).

More efficient attacks are possible by employing cryptanalysis to specific hash functions. When there exist collision attacks that are faster than a birthday attack, a hash function is often denounced as "broken". The NIST hash function competition was largely induced by published collision attacks against two very commonly used hash functions, MD5[1] and SHA-1. The collision attacks against MD5 have improved so much that it takes just a few seconds on a regular computer.[2]

Hash collisions created this way are usually constant length and largely unstructured, so cannot directly be applied to attack widespread document formats or protocols. However, workarounds are possible by abusing dynamic constructs present in many formats. Such a malicious document would contain two different messages in the same document, but conditionally displays one or the other, depending on which of two collided hash values is present:

  • Computer programs have conditional constructs (if-then-else) that allow testing whether a location in the file has one value or another.
  • Some document formats like PostScript, or macros in Microsoft Word, also have conditional constructs.[3][4]
  • File formats that can include images, including TIFF and PDF, are vulnerable to collision attacks by using colliding hash values as indexed colors, such that text of one message is displayed with a bright color that blends into the background, and text of the other message is displayed with a dark color.[4]

Chosen prefix collision attack

An extension of the collision attack is the chosen prefix collision attack, which is specific to Merkle–Damgård hash functions. In this case, the attacker can choose two arbitrarily different documents, and then append different calculated values that result in the whole documents having an equal hash value. This attack is much more powerful than a classical collision attack.

Mathematically stated, given two different prefixes p1, p2, the attack finds two appendages m1 and m2 such that hash(p1 ∥ m1) = hash(p2 ∥ m2) (where is the concatenation operation).

In 2007, a chosen prefix collision attack was found against MD5, requiring roughly 250 evaluations of the MD5 function. The paper also demonstrates two X.509 certificates for different domain names, with colliding hash values. This means that a certificate authority could be asked to sign a certificate for one domain, and then that certificate could be used to impersonate another domain.[5]

Perhaps the best known real-world collision attack was published in December 2008 when a group of security researchers published a forged X.509 signing certificate that could be used to launch a rogue certificate authority, taking advantage of a prefix collision attack against the MD5 hash function. This meant that an attacker could impersonate any SSL-secured website as man-in-the-middle, subverting certificate validation in web browsers. What's worse, the rogue certificate would not be revokable by real authorities, and could also have an arbitrary forged expiry time. Even though MD5 was known to be very weak in 2004,[1] certificate authorities were still willing to sign MD5-verified certificates in December 2008.[6]

Attack scenarios

Many applications of crytographic hash functions do not rely on collision resistance, thus collision attacks do not affect their security. For example, password hashing and HMACs are not vulnerable.[7] For the attack to be useful, the attacker must be in control of the input to the hash function.

Digital signatures

Because digital signature algorithms cannot sign a large amount of data efficiently, most implementations use a hash function to reduce ("compress") the amount of data that needs to be signed down to a constant size. Digital signature schemes are often vulnerable to hash collisions, unless using techniques like randomized hashing.[8]

Note that all public key certificates, like SSL certificates, also rely on the security of digital signatures and are compromised by hash collisions.

The usual attack scenario goes like this:

  1. Mallory creates two different documents A and B, that have an identical hash value (collision).
  2. Mallory then sends document A to Alice, who agrees to what the document says, signs it and sends it back to Mallory.
  3. Mallory copies the signature sent by Alice from document A to document B.
  4. Then she sends document B to Bob, claiming that Alice signed the different document. Because the digital signature matches the document hash, Bob's software is unable to detect the modification.

See also


  1. ^ a b Xiaoyun Wang, Dengguo Feng, Xuejia Lai, Hongbo Yu: Collisions for Hash Functions MD4, MD5, HAVAL-128 and RIPEMD, Cryptology ePrint Archive Report 2004/199, 16 Aug 2004, revised 17 Aug 2004. Retrieved July 27, 2008.
  2. ^ M.M.J. Stevens (June 2007). On Collisions for MD5. "[...] we are able to find collisions for MD5 in about 224.1 compressions for recommended IHV's which takes approx. 6 seconds on a 2.6GHz Pentium 4." 
  3. ^ Magnus Daum, Stefan Lucks. "Hash Collisions (The Poisoned Message Attack)". Eurocrypt 2005 rump session. 
  4. ^ a b Max Gebhardt, Georg Illies, Werner Schindler. A Note on the Practical Value of Single Hash Collisions for Special File Formats. 
  5. ^ Marc Stevens, Arjen Lenstra, Benne de Weger (2007-11-30). Chosen-prefix Collisions for MD5 and Colliding X.509 Certificates for Different Identities. 
  6. ^ Alexander Sotirov et al (2008-12-30). "Creating a rogue CA certificate". 
  7. ^ "Hash Collision Q&A". Cryptography Research Inc. 2005-02-15. Archived from the original on 2008-07-17. "Because of the way hash functions are used in the HMAC construction, the techniques used in these recent attacks do not apply" 
  8. ^ Shai Halevi and Hugo Krawczyk, Randomized Hashing and Digital Signatures

External links

Wikimedia Foundation. 2010.

Look at other dictionaries:

  • Collision resistance — is a property of cryptographic hash functions: a hash function is collision resistant if it is hard to find two inputs that hash to the same output; that is, two inputs a and b such that H(a) = H(b), and a ≠ b.[1]:136 Every hash function with… …   Wikipedia

  • Collision (computer science) — Not to be confused with wireless packet collision. In computer science, a collision or clash is a situation that occurs when two distinct pieces of data have the same hash value, checksum, fingerprint, or cryptographic digest.[1] Collisions are… …   Wikipedia

  • Collision — For other uses, see Collision (disambiguation). A collision is an isolated event which two or more moving bodies (colliding bodies) exert forces on each other for a relatively short time. Although the most common colloquial use of the word… …   Wikipedia

  • Attack on Sydney Harbour — Infobox Military Conflict conflict=Attack on Sydney Harbour partof=the Battle for Australia during World War II caption=1 June 1942. A Japanese Ko hyoteki class midget submarine, believed to be Midget No. 14, is raised from Sydney Harbour date=31 …   Wikipedia

  • Collision course — This article is about the movement of one object or philosophy towards another. For other uses, see Collision course (disambiguation). A collision course, also known as a kamikaze run, is the deliberate maneuver by the operator of a moving object …   Wikipedia

  • Attack Squadron 46 (United States Navy) — Infobox Military Unit unit name= Attack Squadron 46 caption= VA 46 insignia dates= May 24 1955 – June 30 1991 country= United States allegiance= branch= United States Navy type= All Weather Attack role= size= command structure= Inactive current… …   Wikipedia

  • Hash collision — In computer science, a hash collision or hash clash is a situation that occurs when two distinct inputs into a hash function produce identical outputs.All hash functions have potential collisions, though with a well designed hash function,… …   Wikipedia

  • Preimage attack — In cryptography, a preimage attack on a cryptographic hash is an attempt to find a message that has a specific hash value. There are two types of preimage attacks:* First preimage attack : given a hash h , find a message m such that hash(m) = h …   Wikipedia

  • Correlation attack — In cryptography, correlation attacks are a class of known plaintext attacks for breaking stream ciphers whose keystream is generated by combining the output of several linear feedback shift registers (called LFSRs for the rest of this article)… …   Wikipedia

  • Birthday attack — A birthday attack is a type of cryptographic attack, so named because it exploits the mathematics behind the birthday problem in probability theory. Given a function f , the goal of the attack is to find two inputs x 1,x 2 such that f(x 1)=f(x 2) …   Wikipedia