Identifying and retrieving data on the web

Welcome to ProtoSchool!

ProtoSchool offers tutorials on all sorts of decentralized web protocols, from IPFS to Filecoin, through text-based lessons, quizzes, and coding challenges. For it all to make sense, you'll first need a fundamental understanding of the decentralized web and how it's different from the centralized web that most of us are more familiar with.

If you're new to the decentralized web, this is the place to start! This introduction is code-free and designed to introduce you to some of the key concepts and terms you'll encounter throughout ProtoSchool.

How we identify and retrieve data

One of the most important differences between the centralized web and the decentralized web is the way we identify and retrieve data on each. Let's use a simple example to illustrate:

Two of your friends, Lars and Courtney, recommend the same book for your cat-loving child, but they describe the book to you in very different ways:

Lars: "Go to the Strand bookstore at 828 Broadway in New York City, take the elevator to the 2nd floor, find the 3rd bookcase on the right in the Children's section, and get the book that's 16 inches from the left on the top shelf."
Courtney: "Check out Cutest Kittens Ever by Anna Claybourne. Its ISBN-13 number is 9781682972168."

If your goal is to get a copy of the book, which of these descriptions do you find more helpful? Which gives you more options for how to acquire the book? In each case, once you've followed the instructions, how confident will you be that you've found the book your friend intended?

Location addressing and content addressing

One of your friends identified the book by its location, and the other by its content. (Not sure which is which? Hint: We love alliteration almost as much as Lars loves location addressing and Courtney loves content addressing.)

Location addressing points us to the location where data is stored by a specific entity. Lars pointed us to a specific bookshelf controlled by the Strand, where he knows they've previously kept this book, and hopes they continue to offer it there. This is how we identify data on the centralized web.

Content addressing instead provides a unique, content-derived identifier for the data, which we can use to retrieve the data from a variety of sources. We could have used the ISBN provided by Courtney to verify we'd found the right book at our local library, our neighbor's house, or the school book fair. This is how we identify data on the decentralized web.
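
This lesson doesn't require any coding, but for curious readers, here is a minimal Python sketch of the difference. It assumes a SHA-256 hash stands in for the content-derived identifier (the lesson doesn't name a specific hash function), and the shelf path, the `content_id` helper, and the sample data are all hypothetical names made up for illustration.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Derive an identifier from the content itself (assumed here: a SHA-256 hex digest)."""
    return hashlib.sha256(data).hexdigest()


book = b"Cutest Kittens Ever"
other_book = b"Some other book"

# Location addressing: the key says where to look, not what we'll find there.
shelf_location = "strand/floor-2/bookcase-3/top-shelf"  # hypothetical path
store = {shelf_location: book}
store[shelf_location] = other_book  # the store restocks the shelf

# Content addressing: the identifier is computed from the data,
# so we can check a copy from any source against it.
wanted = content_id(book)
sources = {"library": book, "neighbor": book, "book_fair": other_book}
matches = sorted(name for name, data in sources.items() if content_id(data) == wanted)

print(content_id(store[shelf_location]) == wanted)  # False: same location, different content
print(matches)  # ['library', 'neighbor']
```

The location still resolves after the shelf is restocked, but it now returns the wrong book, and nothing in the address reveals that. The content identifier, by contrast, lets us accept a matching copy from any source and reject a mismatched one.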

Let's take a deeper look at these two models.
