Protocolizing Responsible AI

From a protocol perspective, takeaway #5 from the AI Index Report 2024 caught my attention:

  5. Robust and standardized evaluations for LLM responsibility are seriously lacking.
    New research from the AI Index reveals a significant lack of standardization in responsible AI reporting. Leading developers, including OpenAI, Google, and Anthropic, primarily test their models against different responsible AI benchmarks. This practice complicates efforts to systematically compare the risks and limitations of top AI models.

My take:

Pattern: The Paradox of Powerful Technologies

Throughout history, the most powerful technologies have often been double-edged swords, offering tremendous potential benefits alongside significant risks. In the early days of the nuclear power industry, the promise of clean, abundant energy led to a rapid proliferation of nuclear power plants, often without adequate safety protocols. Only after high-profile accidents like Three Mile Island and Chernobyl did the need for robust safety standards and international cooperation become fully apparent.

Weak Signal: The Rise of “AI Nationalism”

Just as the early nuclear era saw a race between nations to develop nuclear capabilities, the current AI boom risks giving rise to a form of “AI nationalism” – a competition between countries to achieve dominance in AI technology, potentially at the expense of safety, ethics, and international cooperation. This could lead to a fragmented and inconsistent approach to AI governance, with different nations pursuing their own agendas and standards.

Counter-Signal: The “Digital Dark Ages”

Regulatory failure or regulatory capture could lead to societal dependence on opaque and complex AI systems. This might result in a loss of skills, knowledge, and critical thinking, echoing the decline in many areas of knowledge during the historical Dark Ages. Concentrating power and knowledge in a few entities or systems could hinder progress and innovation, potentially requiring a “re-discovery” of knowledge in the future.

How might a stronger emphasis on protocolization help address these challenges? What protocols might be necessary to promote international cooperation, maintain transparency, and safeguard against the concentration of power in the hands of a few?


Low-hanging fruit for protocolizing responsible AI might be something akin to consumer food labelling: manufacturing info, ingredients listed from most to least prevalent, nutritional facts, evidence of certifications, etc.

I recall seeing Google do some work with model cards a while back. I expect that's the way to go here: establish a minimum baseline (probably off the back of an emergency event) and bootstrap from there.
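To make the food-label analogy concrete, here's a minimal sketch of what a "nutrition label" for a model might look like as a data structure. All field names are my own assumptions for illustration, not any published model-card standard:

```python
from dataclasses import dataclass, field

@dataclass
class ModelLabel:
    """Hypothetical 'nutrition label' for an AI model, loosely modelled
    on consumer food labelling. Field names are illustrative assumptions."""
    name: str                        # "manufacturing info"
    developer: str
    training_data_sources: list      # "ingredients", most to least prevalent
    eval_benchmarks: dict            # "nutritional facts": benchmark -> score
    certifications: list = field(default_factory=list)

    def render(self) -> str:
        # Produce a plain-text label, one section per labelling concept
        lines = [f"Model: {self.name} ({self.developer})"]
        lines.append("Data sources (most to least prevalent):")
        lines += [f"  - {src}" for src in self.training_data_sources]
        lines.append("Evaluations:")
        lines += [f"  {bench}: {score}" for bench, score in self.eval_benchmarks.items()]
        if self.certifications:
            lines.append("Certifications: " + ", ".join(self.certifications))
        return "\n".join(lines)

label = ModelLabel(
    name="example-llm-1",
    developer="Example Labs",
    training_data_sources=["web crawl", "licensed corpora", "human feedback"],
    eval_benchmarks={"TruthfulQA": 0.61, "ToxiGen": 0.43},
    certifications=["third-party red-team audit"],
)
print(label.render())
```

The point of a schema like this isn't the specific fields; it's that a shared, machine-readable shape is what lets labels from different developers be compared side by side, which is exactly what the AI Index says is missing today.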

You make an excellent point about the relevance of food labelling and provenance to the responsible development of AI systems. The health and safety aspects of food labelling, such as ingredient lists and nutritional information, provide a useful template for thinking about how we might disclose key information about AI systems.

However, the provenance aspect of food labelling, which allows issues to be traced back to specific suppliers, highlights a challenge in the context of LLMs. The complex and opaque nature of current LLM architectures, which involve training on vast and diverse datasets, makes it difficult to trace specific outputs back to their origins in the training data. This can hinder efforts to identify and mitigate biases, errors, or unintended consequences in LLMs, and complicate attempts to hold developers accountable for the actions and outputs of their models.
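Even if outputs can't be traced back through the model, a release could at least commit to exactly which data went in. A hedged sketch, assuming a simple content-addressed manifest (shard names and the manifest shape are illustrative, not an existing standard):

```python
import hashlib
import json

def shard_digest(data: bytes) -> str:
    # Content hash of one training-data shard
    return hashlib.sha256(data).hexdigest()

def build_manifest(shards: dict) -> dict:
    """Map each shard name to the sha256 of its contents, plus a root
    digest over the whole mapping. Publishing the root alongside a model
    release commits the developer to a specific training set, even when
    individual outputs can't be attributed to individual inputs."""
    entries = {name: shard_digest(blob) for name, blob in sorted(shards.items())}
    root = hashlib.sha256(json.dumps(entries, sort_keys=True).encode()).hexdigest()
    return {"shards": entries, "root": root}

manifest = build_manifest({
    "web_crawl_0001": b"...raw text...",
    "licensed_books": b"...more text...",
})
print(manifest["root"])
```

This doesn't solve attribution, but it makes provenance claims falsifiable: any later change to the data changes the root, so "what was this trained on?" becomes a question with a checkable answer.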

To address the traceability and accountability of LLMs, what do you think are the most important next steps? Should we focus on advancing technical innovations to improve model interpretability and provenance tracking, or should we focus on developing standards and protocols for transparency and governance?

Next steps? Hmm, the focus probably needs to be more on the transparency and governance side. I recall a clip of OpenAI's CTO getting quizzed about data provenance for their models. This sort of thing seems to be an open secret, and until someone says the quiet part out loud and that admission is effectively used as leverage, it'll continue to go round in circles. Or there'll be an emergency inciting event. It doesn't seem like social issues or the pursuit of optimisation will be enough to catalyse responsible development protocols.