So far we've exclusively been discussing adorable images for the sake of our general merriment, but content addressing can be used on all different types of files and data, from JSON objects to term papers to videos. For cryptographic hashing to work, we need to know what data format we're working with and use an appropriate tool.
A CID (Content Identifier)
is a particular form of content addressing used on the decentralized web. It was developed for
IPFS
(a decentralized web protocol which we'll discuss in later tutorials), but has very broad implications.
A CID
is a single identifier that contains both a cryptographic hash and
a codec, which holds information about how to interpret that data. Codecs encode and
decode data in certain formats.
+-------+------------------------------+
| Codec | Multihash |
+-------+------------------------------+
Many formats and protocols use content addressing already. Tools like Git and protocols like Ethereum and Bitcoin are among them, but they differ in how to interpret the data and in what cryptographic function they use for hashing. CID
allows us to create a universal identifier for any of these systems.
Every CID
is an identifier that contains the codec
to interpret the data and a multihash
which is a self-describing hash (a hash that tells you what type of hashing function was used to create it).
+------------------------------+
| Codec |
+------------------------------+
| |
| Multihash |
| +----------+---------------+ |
| |Hash Type | Hash Value | |
| +----------+---------------+ |
| |
+------------------------------+
For lots more detail on how CIDs are constructed in IPFS, check out our Anatomy of a CID tutorial.
Feeling stuck? We'd love to hear what's confusing so we can improve this lesson. Please share your questions and feedback.