collision-free;Preneel;hash value;digital signatures;digital timestamping;digital fingerprint;">
###
Question 94. What is a Hash Function?

A hash function *H *is a transformation that takes a variable-size
input *m* and returns a fixed-size string, which is called
the hash value *h* (that is, *h = H(m)*). Hash functions
with just this property have a variety of general computational
uses, but when employed in cryptography the hash functions are
usually chosen to have some additional properties.

The basic requirements for a cryptographic hash function are:

- the input can be of any length,
- the output has a fixed length,
*H(x)* is relatively easy to compute for any given *x
*,
*H(x)* is one-way,
*H(x)* is collision-free.

A hash function *H *is said to be *one-way* if it is
hard to invert, where "hard to invert" means that given
a hash value *h*, it is computationally infeasible to find
some input *x* such that *H(x) = h*.

If, given a message *x, *it is computationally infeasible
* *to find a message *y not equal to x* such that *H(x)
= H(y)* then *H* is said to be a *weakly collision-free
* hash function.

A *strongly collision-free* hash function *H *is one
for which it is computationally infeasible to find any two messages
*x* and *y* such that *H(x) = H(y).*

For more information and a particularly thorough study of hash
functions, see Preneel [Pre93].

The hash value represents concisely the longer message or document
from which it was computed; one can think of a message digest
as a "digital fingerprint" of the larger document. Examples
of well-known hash functions are MD2 and MD5 (see Question 99)
and SHA (see Question 100).

Perhaps the main role of a cryptographic hash function is in the
provision of digital signatures. Since hash functions are generally
faster than digital signature algorithms, it is typical to compute
the digital signature to some document by computing the signature
on the document's hash value, which is small compared to the document
itself. Additionally, a digest can be made public without revealing
the contents of the document from which it is derived. This is
important in digital timestamping (see Question 108) where, using
hash functions, one can get a document timestamped without revealing
its contents to the timestamping service.