[PIG] Open Standoff Protocol for Digital Text

Team Members

We have internalized the patterns of print protocols from analog text to such an extent that we hardly recognize them.

Digital text is developing its own unique features, such as hyperlinks, checkboxes, emojis, and markdown, which set it apart from analog text. What other textual primitives are emerging or have yet to be realized?

Our proposal has two parts.

We intend to survey how and why older text protocols are breaking down. For example, we’ve adapted typography and semantic structure to fit digital environments.

We also want to make Iian Neill’s standoff editor and reader open source. Standoff markup is a way to disentangle HTML from text at a basic level. It also allows for the creation of annotations without altering the content. Standoff facilitates semantic separation of concerns, similar to CSS’s role in separating style concerns.
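To make the separation concrete, here is a minimal sketch of the general standoff idea in Python. It is not the data model of Iian Neill's editor; the `Annotation` class, the `render_html` function, and the annotation kinds are hypothetical names chosen for illustration. The plain text is never modified. Annotations live in a separate list and point at the text by character offsets, which also lets them overlap freely.

```python
from dataclasses import dataclass, field
from html import escape


@dataclass
class Annotation:
    """A standoff annotation (hypothetical shape): offsets into the text, not tags inside it."""
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    kind: str   # illustrative label, e.g. "entity" or "claim"
    attrs: dict = field(default_factory=dict)


def render_html(text, annotations):
    """Build one possible HTML view by wrapping each run of text in the kinds that cover it."""
    # Every annotation boundary is a cut point, so overlapping annotations never need nested tags.
    cuts = sorted({0, len(text)} | {a.start for a in annotations} | {a.end for a in annotations})
    pieces = []
    for lo, hi in zip(cuts, cuts[1:]):
        kinds = sorted(a.kind for a in annotations if a.start <= lo and hi <= a.end)
        chunk = escape(text[lo:hi])
        if kinds:
            classes = " ".join(kinds)
            chunk = f'<span class="{classes}">{chunk}</span>'
        pieces.append(chunk)
    return "".join(pieces)


text = "We think that words should have backpacks."
w = text.index("words")
b = text.index("backpacks")
annotations = [
    Annotation(w, w + len("words"), "entity"),
    Annotation(b, b + len("backpacks"), "metaphor", {"note": "carries extra context"}),
    Annotation(w, b + len("backpacks"), "claim"),  # overlaps both annotations above
]
html = render_html(text, annotations)
```

The HTML is just one view generated from the two layers. Dropping or swapping the annotation list changes the view without touching `text`, which is the same kind of separation CSS gives to styling.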

Short summary of your improvement idea

  • a way of adding multiple channels of reference on top of a plain text layer.
  • a way of encoding “partial contexts” or “fields of reference” along with a communication.
  • supports creation of inline entity graphs that are not dependent on external databases.
  • provides the reader with tools for querying and visualizing the text/graph nexus.

What is the existing target protocol you are hoping to improve or enhance?

Digital Text. We want to add features that allow access to tacit structures in text.

What is the core idea or insight about potential improvement you want to pursue?

Print encourages a sense of closure, a sense that what has been found in the text has been finalized, has reached a state of completion. - Walter Ong

Text has two faces. Most text editors, constrained by a print-centric paradigm, only allow us to see one - the final, polished draft.

There’s an implicit narrative about how one “should” write, which is to write within the lines, adhering to a linear, structured process. Yet as we write, our minds naturally seek out associations, metaphors, and flights of conceptual fancy - what the philosopher William James described as the “big, blooming, buzzing confusion” of our inner mental landscape.

We tuck away these exploratory, meandering thoughts and ideas. The messy, nonlinear process of writing is obscured in favor of presenting a clean, coherent final draft.

For example, here’s what’s running through my head as I’m writing:

We think that words should have backpacks.

Standoff markup allows the writer to make inline entity relations on words, phrases, and even discontinuous highlights. This approach enables previously flat text to gain a semantic topology, offering multiple views of the text and various ways to connect ideas that may not always result from surface-level meaning but rather from personal associations.
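As a rough illustration of what "inline entity relations" and "discontinuous highlights" could look like as data, here is a small Python sketch. Everything in it is an assumption made for the example: the `Entity` and `Relation` classes, the helper functions, the sample sentence, and the relation label are hypothetical and do not come from an existing standoff tool. An entity owns one or more offset ranges (more than one range makes the highlight discontinuous), and relations are stored next to the text rather than in an external database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    id: str
    spans: tuple  # one or more (start, end) offset pairs; several pairs = discontinuous highlight


@dataclass(frozen=True)
class Relation:
    source: str  # entity id
    label: str   # free-form, may encode a personal association
    target: str  # entity id


def span_of(text, phrase):
    """Locate the first occurrence of a phrase and return its (start, end) offsets."""
    start = text.index(phrase)
    return (start, start + len(phrase))


def surface(text, entity):
    """Return the highlighted text of an entity, joining discontinuous pieces."""
    return " ... ".join(text[s:e] for s, e in entity.spans)


def related(entity_id, relations):
    """List (label, other entity id) pairs one hop away, in either direction."""
    found = []
    for r in relations:
        if r.source == entity_id:
            found.append((r.label, r.target))
        elif r.target == entity_id:
            found.append((r.label, r.source))
    return found


text = "Words should have backpacks. A backpack carries what the sentence cannot."
entities = {
    "words": Entity("words", (span_of(text, "Words"),)),
    "backpack": Entity(
        "backpack",
        (span_of(text, "backpacks"), span_of(text, "carries what the sentence cannot")),
    ),
}
relations = [Relation("words", "wears", "backpack")]

highlight = surface(text, entities["backpack"])
neighbors = related("backpack", relations)
```

Because the graph travels with the text, a reader tool could answer questions like "what is connected to this highlight?" by calling something like `related` on the entity under the cursor, with no lookup against an outside store.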

What is your discovery methodology for investigating the current state of the target protocol?

UX interviews, case studies, failure states and friction logs.

In what form will you prototype your improvement idea? How will you field-test your improvement idea?

In the browser. A predefined data layer to simulate entity relations. Collaboration with other makers in this space to iterate and improve the ideas.

Who will be able to judge the quality of your output? Ideally name a few suitable judges.

  • Vint Cerf
  • John Underkoffler
    (We still need to confirm with them but they are familiar with the concepts.)

How will you publish and evangelize your improvement idea?

The kit will be on GitHub, potentially as a plugin. We will publish our research and give talks both virtually and in person.

What is the success vision for your idea?

  • Help bridge the gap between text artifacts and intentions.
  • Explore the latent space of digital text as it adapts to new interactions, devices and behaviors.

Amazing to see y’all propose this! Regardless of protocol acceptance we should see about making this happen!


used to do this with technical documents.

and, often we separate the writing from the doing but now i am thinking about runtime writing of contexts using your kind of protocol. as it might be a higher-volume use case, like a richer version of both RDF and tags


Thanks Boris! I’d really like to hear what resonates with you. It can be interpreted in so many ways, and we don’t want to assume we have it pegged.
There’s a lot here that applies to the open canvas working group (ex: TLdraw as text editor w hotkeys to pop over to margin-canvas, semantically link paratext and pop back inline) and many other TfT adjacent agendas.

This is fantastic! And I actually think this would work well.
The key superpower that standoff offers is a plurality of presentation modes. This could be one view. Imagine an anamorphic wire sculpture, where it looks like a giraffe from one angle but as you walk around it transforms into the shape of an elephant.

A variation on your example is a stack of minimaps, each acting as an agent interested in different aspects of the content.

I’d love to hear more about how this pattern worked for tech docs. (About half the ideas in here come from wrestling with internal docs, guides, and enterprise SaaS complexity.)

I’ve been waiting for standoff to be available as something that hopefully can take over from the lowest common denominator that is markdown for a while :wink:

some details off the top of my head:

the app is in excel and the doc in question is of a patent (in development)

colored columns:

  • 1st colored column includes some context (ex. “1.”, “[0001]”)
  • 2nd colored column includes some numbered instance
  • 3rd colored column and so on include numbered instances mentioned in the 2nd colored column but only the numbers are written

uncolored columns are roughly relationships between or of the numbered instances

Yes! His work is a direct inspiration. A few random thoughts on why writing and reading are stuck in a saddle point:

  1. Books are horrible products (but we love them). They have no feedback loops, just a blackhole of information asymmetry. It’s like an interface that’s just a manual. (Except for dictionaries. Brilliant interfaces. Words become addresses for themselves. 11/10 genius.) Here’s Walter Ong on it:

Of course, all language and thought is to some degree analytic: it breaks down the dense continuum of experience, William James’s “big, blooming, buzzing confusion,” into more or less separate parts, meaningful segments. But written words sharpen analysis, for the individual words are called on to do more. To make yourself clear without gesture, without facial expression, without intonation, without a real hearer, you have to foresee circumspectly all possible meanings a statement may have for any possible reader in any possible situation, and you have to make your language work so as to come clear all by itself, with no existential context. The need for this exquisite circumspection makes writing the agonizing work it commonly is. (Orality and Literacy)

  2. So it makes sense that a lot of stuff in books wouldn’t stick. (Thought experiment and ignoring tech hurdles: How would our reading habits change if, say, we already agreed with the author’s premise and the book “knew” that about us and could adapt?) Would a multiform book tailored to the audience be an improvement? (Maybe it’s good that books are bad products. Maybe they need to be that way.)

  3. This project isn’t really about goal-directed writing. It’s for the kind of writing when you don’t know what you’re talking about. I rarely grab the exact phrase I’m after so I have to string words together in a sentence, like “signature-shorthand-signal-icon-handle” (which could’ve been referring to XOXO in a note). To put it another way, it’s about the kind of writing we don’t normally pay attention to. Is that a space where casual protocols can grow?

  4. When we read a book or listen to a presentation, our mind does other things. We may not be able to recall the content of 9 hours of reading, but that may not be why we were reading in the first place. The same is true for writing.

  5. I kind of think the standard tools like typography, links, and tags don’t cut it for scratching out nonsense or jotting down an impression. Those tools were built with publications in mind. (Infinite canvas and TfTs are steps in the right direction, but they aren’t enough, perhaps too little direction.) But those tools are even more constraining for the multimodal world of audio transcriptions and gestural interfaces. The WYSIWYG will not hold.

  6. I have a few dozen sketches of what I think is missing. But I’d love to have a broader conversation, to look closely at what is missing in these personal exchanges with our devices, to translate those patterns into experiments, and to start baking new ideas into future contextual tools.


I like this. The uncolored columns seem clever and valuable. Was this better than a multi-sheet spreadsheet? For example, was it easier to scan through the whole thing as one sheet rather than clicking different tabs for particular views?

Unrelated: Somebody needs to make an XR newspaper, a newspaper you read in XR. Big broadsheets the eye can scan around. Prop it open with digital hands. Wink or click to expand a section, watch the broadsheet expand or recompose itself. I would so read Wikipedia that way.

oh it’s related. hiding columns kinda summarizes the paragraphs by showing only the first subject. this way i also don’t need to make it multi-sheet.

also if you do the node-thing on subjects you might even call this progressively summarized.

moreover, something like image clips to represent uncolored columns (mapping by llms?) can mean including uncolored columns in summaries.

this view in general ensures consistency of wording (by find & replace) and completeness of checking whether a patent wording makes sense (i did it manually pair by pair, but you can search and tag prior art / drawing on the view using llms).

combine all this and you have a tool kit for patent analysis and writing!
