Occasionally, a hashing algorithm may be proven to be insecure, meaning it no longer complies with the characteristics that we defined earlier. This has already happened with sha1. With time, other algorithms may prove to be insufficient for content addressing in IPFS and other distributed information systems. For this reason, and in order to support multiple cryptographic algorithms, we need to be able to know which algorithm was used to generate the hash of specific content.
So how can we do this? To support multiple hashing algorithms, we use multihash.
A multihash is a self-describing hash which itself contains metadata that describes both its length and what cryptographic algorithm generated it. Multiformats CIDs are future-proof because they use multihash to support multiple hashing algorithms rather than relying on a specific one.
Multihashes follow the TLV pattern (type-length-value). Essentially, the "original hash" is prefixed with the type of hashing algorithm applied and the length of the hash.
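The prefixing described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names make_multihash and split_multihash are hypothetical, only sha2-256 is handled, and the sketch assumes the type and length each fit in a single byte. The multihash format encodes both as unsigned varints, which take one byte for any value below 128, so the assumption holds for sha2-256.

```python
import hashlib
from typing import Tuple

SHA2_256_CODE = 0x12  # identifier of sha2-256 (18 in decimal)


def make_multihash(data: bytes) -> bytes:
    """Return the sha2-256 digest of data, prefixed with its type and length."""
    digest = hashlib.sha256(data).digest()
    # Assumption: both prefix values are below 128, so one byte each is enough.
    return bytes([SHA2_256_CODE, len(digest)]) + digest


def split_multihash(multihash: bytes) -> Tuple[int, int, bytes]:
    """Split a multihash with single-byte prefixes into (type, length, value)."""
    code, length, value = multihash[0], multihash[1], multihash[2:]
    if len(value) != length:
        raise ValueError("declared length does not match the digest size")
    return code, length, value


multihash = make_multihash(b"hello world")
code, length, value = split_multihash(multihash)
print(hex(code), length, value.hex())
```

Because the type and length travel with the digest, a reader of these bytes can tell which algorithm to use for verification without any outside context.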
type: identifier of the cryptographic algorithm used to generate the hash (e.g. the identifier of sha2-256 would be 18, or 0x12 in hexadecimal); see the multicodec table for all the identifiers
length: the actual length of the hash (using sha2-256 it would be 256 bits, which equates to 32 bytes)
value: the actual hash value
In order to represent a CID as a compact string instead of plain binary (a series of 1s and 0s), we can use base encoding. When IPFS was first created, it used base58btc encoding to create CIDs that looked like this:
QmY7Yh4UquoXHLPFo2XbhXkhBvFoPwmQUSa92pxnxjQuPU
Multihash formatting and base58btc encoding enabled this first version of the CID, now referred to as Version 0 (CIDv0), and its initial Qm... characters remain easy to spot.
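To see where the Qm... prefix comes from, here is a small sketch that base58btc-encodes a sha2-256 multihash using only the Python standard library. The helper base58btc_encode is written for illustration and is not taken from any IPFS library. Note also that the input is hashed as raw bytes, whereas IPFS hashes a structured block when adding a file, so the resulting string has the shape of a CIDv0 but will not match what IPFS would report for the same content.

```python
import hashlib

# Alphabet used by base58btc (it omits 0, O, I, and l to avoid lookalikes).
BASE58BTC_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58btc_encode(data: bytes) -> str:
    """Encode bytes as a base58btc string."""
    number = int.from_bytes(data, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = BASE58BTC_ALPHABET[remainder] + encoded
    # Each leading zero byte is represented by the first alphabet character.
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return BASE58BTC_ALPHABET[0] * leading_zeros + encoded


digest = hashlib.sha256(b"hello world").digest()
multihash = bytes([0x12, len(digest)]) + digest  # type, length, value
cid_v0_style = base58btc_encode(multihash)
print(cid_v0_style)
```

Every sha2-256 multihash begins with the same two bytes (0x12 and 0x20) and has the same total length of 34 bytes, which is why the encoded string always starts with Qm and is 46 characters long.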
However, with time, doubts arose about whether this multihash format would be sufficient:
base58btc?
To address these concerns, an evolution to the next version of a CID was necessary. In the following lessons we'll explore what was added to the specification to lead us to the current CID version: CIDv1.