Anatomy of a CID | Lesson 5 of 6

CIDv1: Multibase prefix

So now our CIDv1 in binary (0s and 1s) gives us this information:

<cid-version><multicodec><multihash>

Since binary CIDs are not very human-friendly, we can represent these binary CIDs in a string form (binary data represented as text). Example:

bafybeigdyrzt5sfp7udm7hu76uh7y26nf3efuylqabf3oclgtqy55fbzdi

Converting data between binary and string formats requires base encoding, so when working with string CIDs it's important that we know the type of base encoding that was applied to the binary data. But how can we identify this?

In CIDv0, hashes are always encoded with base58btc. Always. This means that we can safely interpret CIDv0 hashes assuming they are using base58btc. However, due to environment constraints (e.g. DNS names), we need the ability to support other base encodings as well. For that, you guessed it, we can add another prefix!

Multibase prefix

Multibase prefixes, which represent the base encoding used when converting CIDs between string and binary formats, are only used in the string form of the CID:

Multibase Prefix

Let's examine two examples of CIDs in their string form:

String form examples

We know the first one is a CIDv0 because it starts with Qm.... All hashes that start with Qm can safely be interpreted as base58btc as a CID of Version 0.

The second example starts with b, the base encoding prefix identifier for base32, which is used by default by most implementations of IPFS.

For the complete list of multibase identifiers, see this table.

Take the quiz!

How can we determine the base encoding method used to represent a binary CID as a string?

Oops, you haven't selected the right answer yet!