A five-stage arc · Introduction · 1 of 5 live

How librarians solved identity, decades before Big Tech tried.

Every digital object has many names. The work of keeping those names in agreement — of deciding when two strings refer to the same thing and when they do not — is one of the foundational hard problems in computing. Catalogers and reference librarians have been doing it, publicly and at scale, since the nineteen-sixties. This is a walking tour of how.

Start here

Learn how libraries know what things are. The arc moves from "Who is this?" to "How does the knowledge graph know?" — and ends with the older, harder question of how that knowing is kept true over time.

The Five Rooms

Each stage builds on the one before it.

  1. Stage 01

    What is this thing, really?

    Every digital object carries a fistful of identifiers: ISBN, OCLC number, LCCN, VIAF cluster, DOI, Wikidata Q-number. They all claim to point at the same thing. The work of deciding whether they really do is older than the web.

  2. Stage 02

    Distinguish

    in preparation

    Which Stephen King?

    There are hundreds of Stephen Kings in the world's catalogs. The discipline that resolves them — cluster by cluster, evidence by evidence — is what every identity-resolution system reinvents from scratch.

  3. Stage 03

    Classify

    in preparation

    Where does it sit in human knowledge?

    Dewey is hierarchical — a book belongs at one number. FAST is faceted — one book gets many headings that intersect. The difference between a tree and a graph, taught with real records.

  4. Stage 04

    Connect

    in preparation

    What does it touch?

    Authority data is a knowledge graph. Walk outward from an entity — works by, works about, influences, contemporaries, subjects in common — and watch the structure of human knowledge render itself.

  5. Stage 05

    Maintain

    in preparation

    Identity as stewardship.

    Names change. Records get corrected. Entities get merged and split. Cataloging is not clerical work; it is the ongoing care of identity, meaning, relationships, and trust. The stage where fingerprints become signatures, and stewardship becomes verifiable.
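
The merges and splits described in Stage 05 can be sketched with a small successor table. This is a minimal illustration under stated assumptions, not a description of how VIAF or the WorldCat Entities API handle retired records: the identifiers, the `SUCCESSORS` mapping, and the `resolve` function are all hypothetical. The idea is that a retired identifier is never simply deleted. A merge leaves one successor behind, and a split leaves several.

```python
# Hypothetical identifiers and successor table, for illustration only.
# A merged record points at one successor; a split record points at several.
SUCCESSORS = {
    "id-103": ["id-101"],            # id-103 was merged into id-101
    "id-101": ["id-100"],            # id-101 was later merged into id-100
    "id-200": ["id-201", "id-202"],  # id-200 was split into two entities
}


def resolve(identifier, successors=SUCCESSORS):
    """Return the sorted list of current identifiers that an old one leads to.

    An identifier with no entry in the table is treated as still current.
    """
    current = set()
    seen = set()
    stack = [identifier]
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # guard against accidental cycles in the table
        seen.add(node)
        next_ids = successors.get(node)
        if next_ids:
            stack.extend(next_ids)
        else:
            current.add(node)
    return sorted(current)


if __name__ == "__main__":
    print(resolve("id-103"))  # follows the chain of two merges
    print(resolve("id-200"))  # a split yields more than one candidate
    print(resolve("id-999"))  # unknown to the table, so assumed current
```

A split cannot be resolved mechanically: when `resolve` returns more than one identifier, a person (or further evidence) has to decide which successor a given citation meant. That decision is the stewardship this stage is about.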

The Authority Arc is built by Paul Clark using the public APIs of the Virtual International Authority File (VIAF) and — where access permits — OCLC's WorldCat Entities API. VIAF data is provided under the Open Data Commons Attribution License (ODC-BY). This site claims no affiliation with OCLC.