[PIG] [Weekly Update] E2EE in Activity Pub

Okay, so this is the first post in the weekly updates on what we’re doing for the End to End Encryption in Activity Pub project as requested! We’re basically up and running and doing all kinds of stuff! (I’m not great at this, bear with me.)

I’m writing this first one, but if I get something wrong about what Evan’s up to, don’t blame him! He’s the more technical side of our partnership. My side is focused more on how it works with product, design flows, and the policy sides of things. There’s a chunk of overlap in the middle, of course. More on that later in the update.

So let me talk about the goal to start off with - what we’re attempting to do here is to put together a proposal for how end-to-end encryption might work for ActivityPub direct messaging. We want a thoroughly worked-through proposal for a spec that we can take to the W3C.

But obviously in order to do a proposal we have to think through how the whole thing is going to work, from the approach to encryption to the way we integrate that encryption into ActivityPub. Beyond that, we’re likely going to have to think about how people will know to trust certain clients, whether there’s some kind of certification or auditing of clients/servers, which approach is most comprehensible to normal humans, and precisely what kinds of language and flows we should recommend that clients use. Things at either end of that spectrum may influence which approach we take.

Plus there are some specific situations that ActivityPub and the Fediverse have that centrally controlled clients don’t have. We have to think through what kind of assurances we can provide people about how secure their communications are if and when we don’t have direct control of any of the clients or intermediary servers. Some of these may be easily solvable. Others, I suspect, won’t be. Plus we also have to think about how these services will degrade gracefully - i.e. how things should work, and how these services should communicate, when E2EE is not available. So lots of stuff to think through.

Where we’re starting, bluntly, is with research. Evan is digging into the current consensus on encryption - exploring MLS and Signal and other approaches there - and I’m digging into and documenting the conceptual approaches that these projects are taking and the user interfaces that they use and display. For me that work involves reading a bunch about how their encryption works, pulling out every one of the screens that use end-to-end encryption or handle messaging, digging into the various approaches they take to communicating whether it’s active or not, as well as a separate set of conversations about how your messaging archives are kept in sync.

It’s been a really interesting process so far and I’ve learned a lot already, and it’s exposed a few things I’m trying to think through in a bunch more detail.

I’m now going to go quite deep into an area that I’m trying to get my head around right now. This is very much work in progress. It’s me thinking through a problem in real time. And again it is an early stage of the grant, so I’m almost certainly going to say something ill-informed at this moment. Working through these mental sticking points is what the next few months, and all the conversations we’re planning to have, and more of my engagement with Evan, is for! If he’s reading this and shaking his head, then he’ll probably tell me that in our next catch-up. Or you’ll get an update about it later in the week.

Anyway, let me get into it. Most of these clients operate in a fairly similar way - when you log into a new client with your centralized account, a new public/private key pair is created in that client. The public key is then stored in the central repository attached to your account identifier. It has to be, because that’s how other clients will know how to encrypt something that only you can decrypt later.

If you have a dozen separate clients, then when someone tries to write something to you, they write it once and then their client encrypts it a dozen times and sends the copies to a centralized repository, where each of your clients can, either immediately or asynchronously, download the one that’s for it. Then behind the scenes each client decrypts the content and stores it locally on its device.

If you then reply to a message from one of those, not only does that message get encrypted multiple times to go back to every client of the person on the other end, it is also encrypted with the public keys of all your other clients - effectively sent sideways to keep all your clients up to speed. It’s a fascinating approach.
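
To make that fan-out a bit more concrete, here’s a minimal sketch of just the "who gets a copy" part, with no actual encryption in it. The `Device` type, the `device_id` field and the `fan_out_targets` function are all made-up names for illustration, not anything from Signal, iMessage or ActivityPub.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    owner: str      # account identifier, e.g. "alice@example.social"
    device_id: str  # hypothetical per-client identifier


def fan_out_targets(sending_device, sender_devices, recipient_devices):
    """Return every device that needs its own encrypted copy of a message.

    Copies go to all of the recipient's devices, plus the sender's *other*
    devices (the "sideways" copies that keep the sender's clients in sync).
    The sending device already holds the plaintext, so it is skipped.
    """
    sideways = [d for d in sender_devices if d != sending_device]
    return list(recipient_devices) + sideways


alice_phone = Device("alice@example.social", "phone")
alice_laptop = Device("alice@example.social", "laptop")
bob_devices = [Device("bob@example.org", name) for name in ("phone", "tablet", "web")]

# Alice replies from her phone: Bob's three devices plus her own laptop.
targets = fan_out_targets(alice_phone, [alice_phone, alice_laptop], bob_devices)
```

In a real client each entry in `targets` would then get the message encrypted to that device’s public key, which is why the number of encryptions grows with the number of clients on both sides.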

This is generally really secure, but it presents some particular issues for a protocol level process where we don’t actually write or maintain any of the clients or the servers in the middle that distribute them.

If you’re Signal or iMessage, you know that messages will only be accessible by clients you yourself control. And you will control all the servers. That’s not the case for us. And that presents all kinds of interesting conundrums (conundra?) that I’m trying to get my head around at the moment. We obviously can encourage people to choose a trusted client, but how do they know it’s trusted? Does there have to be an authority or organization that can do an audit on it to check it’s okay?

If not, it will obviously have access to your plaintext. How do you stop a bad client simply screenshotting your messages when they arrive and then posting them automatically to Twitter? At some level, you can’t. And can you be sure that the person at the other end isn’t using a suspicious client? Again, you can’t. So some of this is just going to be about how we recommend good clients, build trust, audit versions of the clients and so on.

There’s a lot of interesting “out of band” protocol stuff there.

And another interesting problem - the main man-in-the-middle attack works a bit like this: the central directory holds a list of all my public keys. What’s to stop a rogue instance from creating a new public/private key pair and adding the public key to my directory? Now any time anyone sends me a message, a copy is also encrypted to that rogue public key, and because the instance holds the matching private key it can capture and decrypt it.

The way to handle that is probably through open and transparent key management - i.e. that every client that connects to that server can in some way see what public keys are held for that individual user and can keep asking and checking that everyone is aware of how many keys they have. Perhaps we can find an interesting extra step that means a third party client that sends you a message can append every public key they sent the message to, and if there’s a discrepancy between how many keys the sending and receiving client can see, then everyone is informed and can choose to move somewhere else.
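
Here’s a rough sketch of what that discrepancy check might look like on the receiving side, assuming the sender attaches a fingerprint for every key it encrypted to. The function names and the use of SHA-256 hex digests as fingerprints are assumptions for illustration only; none of this is from an existing protocol.

```python
import hashlib


def fingerprint(public_key: bytes) -> str:
    """Short, comparable identifier for a public key (SHA-256, hex)."""
    return hashlib.sha256(public_key).hexdigest()


def compare_key_lists(sender_encrypted_to, recipient_knows):
    """Compare the keys a sender used with the keys the recipient recognises.

    "unexpected" keys are ones the sender encrypted to that the recipient's
    client has never seen (possibly injected by a rogue server).
    "missing" keys are the recipient's own devices that the sender was
    never told about.
    """
    sender_encrypted_to = set(sender_encrypted_to)
    recipient_knows = set(recipient_knows)
    return {
        "unexpected": sorted(sender_encrypted_to - recipient_knows),
        "missing": sorted(recipient_knows - sender_encrypted_to),
    }


# Placeholder byte strings stand in for real public keys.
my_keys = [fingerprint(b"phone-public-key"), fingerprint(b"laptop-public-key")]
rogue = fingerprint(b"key-added-by-rogue-instance")

# The sender saw three keys in the directory; I only know about two of them.
report = compare_key_lists(my_keys + [rogue], my_keys)
```

A client that finds anything in `report["unexpected"]` could warn both people in the conversation. What the warning should say, and what people should do next, is exactly the kind of UI question still to be worked out.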

Anyway, this is all work in progress, and half formed thoughts. It’s week 2, and I’m all over the shop! Opening up all the questions, putting all the mess on the table, and seeing what problems, opportunities, limits and possibilities emerge out of that.

Apart from all this, we’re also meeting with people. Among others, we caught up today with Ted Han, recently ex of Mozilla, to get his sense of where the work is. He gave me a really interesting perspective on the social structures that lie around and support encryption and build trust. And Evan asked the W3C community group that manages ActivityPub to form a task force on end-to-end encryption, and they agreed. So we can get input from other developers directly.

If you’ve skipped to the end for the tl;dr, here it is - the last couple of weeks and the next couple of weeks are really all about understanding what’s out there in encryption and protocol and social structures and UI, how people are thinking about the issue, how other people’s services work, how they communicate to their users, what kinds of flows there are. So next week’s update is likely to be roughly the same as this one, except probably a lot shorter. As we move into the first or second week of June, we’ll probably start narrowing in much more fully on the approach we think we should take, and start socializing that a bit more. In the meantime, if you have comments or questions feel free to post them below or e-mail me or Evan at tom.coates@gmail.com or evan.prodromou@gmail.com.


Weekly update 29 May 2024

So, Tom and I have a pretty wide division of labour between us on the E2EE over ActivityPub project. He’s working pretty hard on understanding the user experience of E2EE, and thinking about the social structures that will enable safe and effective use of the technology on a heterogeneous network.

For me, I’m focusing on the bits and bytes at the lower level. As one of the co-authors and designers of the ActivityPub standard, it’s important to me that our implementation is well-integrated into the fabric of the existing network. So, I’m trying to set up some technical requirements for our solution that meet those needs.

In particular, I’d like to see us find a solution with the following characteristics:

  • No additional identity requirements. Users should be able to find each other and initiate a conversation with their WebFinger addresses (user@domain) with nothing else needed.
  • No extra servers. To minimize the work needed for implementation, there shouldn’t be any additional server setup required, outside of the ActivityPub server. I think this means that the communications need to be in-band, using AP as a substrate.
  • Heterogeneous servers. Any compliant ActivityPub server should be usable. Unfortunately, this might not be the case starting out – not all AP servers support the ActivityPub API fully, and some even reject activities they can’t inspect. I hope that this project will give those implementations the incentive they need to support AP fully.
  • Heterogeneous clients. We’re building a prototype implementation, but our expectation is that other ActivityPub client apps should be able to support the same E2EE mechanism. This also means that Open Source, Web-based, or native clients should work. Most importantly, it means that the protocol should be easy enough to implement that client developers can’t screw it up too badly.
  • An open protocol. We intend to submit this work for approval as a W3C Social Community Group report, potentially becoming part of the next version of ActivityPub. That’s going to require royalty-free use of any patents or inventions.
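
As a small illustration of the first requirement, here’s a sketch of how a client might turn a user@domain address into a WebFinger lookup URL, using the `/.well-known/webfinger` path and `acct:` resource form from RFC 7033. The `webfinger_url` helper and the example address are made up for this post, and the sketch only builds the URL rather than fetching it.

```python
from urllib.parse import urlencode


def webfinger_url(address: str) -> str:
    """Build the WebFinger (RFC 7033) lookup URL for a user@domain address."""
    # Accept the "@user@domain" form some clients display, as well as "user@domain".
    user, sep, domain = address.lstrip("@").rpartition("@")
    if not sep or not user or not domain:
        raise ValueError(f"expected user@domain, got {address!r}")
    query = urlencode({"resource": f"acct:{user}@{domain}"})
    return f"https://{domain}/.well-known/webfinger?{query}"


url = webfinger_url("evan@example.social")
```

The response to that lookup is where a client finds the actor’s ActivityPub ID today, so nothing beyond the existing address would be needed to start a conversation.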

In terms of solutions, I’m exploring as much of the space as possible. My starting point is simple mechanisms like PGP email and OTR. I’ve also investigated OMEMO. More modern solutions like MLS and Signal Protocol might also be a good fit. I’m reading a lot, trying to understand the systems available, and figure out the right level of complexity.

We also need to be able to frame what’s in scope and out of scope for the project. I think Tom mentioned last week that the W3C’s SocialCG has started an E2EE task force in conjunction with the work we’re doing. I’ve taken the list of user stories that Tom and I have been developing and put them up on the task force’s GitHub repo for discussion.

Finally, we’re working on getting our last jury member for our grant review. I’ve been reaching out to people with experience in E2EE messaging to make sure we’ve got a good mix on the jury. Hopefully we’ll have more news soon.

Checking in for our weekly update! Last week, Tom and I worked on integrating his notes and reviews of existing E2EE messaging UIs with the available abstract protocols.

I also made a document covering some of the ways we can integrate E2EE into ActivityPub. I covered a range of models:

  • No integration
  • Links in profiles
  • Initiation and handoff
  • Full support for another messaging protocol, like XMPP or Matrix
  • Mastodon API for the client, ActivityPub protocol for server-to-server
  • ActivityPub API for the client, ActivityPub protocol for server-to-server

My inclination is to go for the last one. It requires the least work from server admins and end-users, and provides a really good user experience. However, it requires a lot more effort from software developers, and probably takes a lot of work just to get to “baseline” compliance.
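
To give a feel for that last model, here’s a very rough sketch of the kind of activity a client might build and POST to the user’s outbox through the ActivityPub API. The `Create` wrapper, `@context`, `actor` and `to` are ordinary ActivityStreams 2.0, but the `EncryptedMessage` object type and the use of `content` for base64 ciphertext are purely hypothetical placeholders; we haven’t chosen a format, and a real extension would need its own JSON-LD context.

```python
import base64
import json


def build_encrypted_create(actor: str, recipient: str, ciphertext: bytes) -> dict:
    """Wrap an already-encrypted payload in an ActivityStreams Create activity."""
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Create",
        "actor": actor,
        "to": [recipient],
        "object": {
            "type": "EncryptedMessage",  # hypothetical type, not in any spec
            "attributedTo": actor,
            "to": [recipient],
            # Hypothetical: ciphertext carried as base64 text so it survives JSON.
            "content": base64.b64encode(ciphertext).decode("ascii"),
        },
    }


activity = build_encrypted_create(
    actor="https://example.social/users/evan",
    recipient="https://example.org/users/tom",
    ciphertext=b"\x00\x01 opaque bytes produced by the E2EE layer",
)
body = json.dumps(activity)  # what the client would send to the outbox
```

The point of the sketch is that the servers in the middle only ever see the envelope (who it’s from and who it’s to), which is also why servers that reject activities they can’t inspect would be a problem for this approach.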

If we follow that modality, it leaves us with MLS and Signal as the leading protocols to implement. I’m reading up this week on each of these, and I’m going to do a sample architecture for each.

I’ve also been talking with Tom about some of the big questions he has on implementation, namely:

  • How public keys are stored (on client, on server, on an external keyserver)
  • How individual clients are addressed
  • How messages are archived

We’re planning our first jury meeting during the week at Edge Esmeralda in Healdsburg at the end of June, and we want to have these deliverables for that jury meeting:

  • Architectural approach
  • Use cases
  • Initial design artifacts

We’re also seeking an independent jury member to help us out – someone experienced in E2EE who will be supportive even if we don’t pick their favourite abstract protocol.
