Plurality in Practice POG Weekly Blog

This Week

This week has mostly been taken up by admin, planning and research: onboarding to the Summer of Protocols Program, concretizing our research aims and processes, and getting more background on QV/QF and proposed improvements to these protocols, in theory and in practice. Since our original proposal was for a PIG, our first step is determining how to adjust to the POG format.

Although we already have a plan for gathering a dataset, we have also reached out to people who have run or are running their own QV and QF initiatives. This has been useful both in terms of broadening our possible data pool and widening our reading list.

Next Week

  • Preparations for our proposed primary data gathering event
  • Selecting a third jury member
  • Follow-up with referrals from our initial interviews
  • Further reading around QV/QF and alternatives

Blockers

We don’t exactly have any blockers, but the preparations for our proposed data gathering event are taking up most of our time. This should ease after next week.

Musings

  • Looking into the development and evolution of MACI and other private voting protocols such as Fractal, along with Vitalik’s writings on voting, one interesting feature of adjustments to voting / funding mechanisms is how, in service of reducing collusion, they often redefine what collusion is. As part of this evolution, more nuanced mechanisms create new ways to collude in the negative space they fail to capture. Plural mechanisms occupy a particularly interesting pivot point here: once you start collecting relational identity data, how much are you obliged to gather to plug all the new potential collusion avenues you’ve introduced? Can this be simulated or generalized, or is it too context-specific?

  • Music will play a large part in our event. It’s interesting that you can spend almost as much hiring a gong as a Steinway piano.

  • I stayed with some friends who are planning a family trip to Disney World. The protocol explosion to handle theme park queues is fascinating.


Apologies for the missed entry – the admin and organization for the event in Berlin, which will form one of the datasets for this work, ended up taking more of our attention than expected.

The event took place on the 28th of May and was a success. Almost 60 participants met in person and took part in multiple votes using quadratic and plural voting.

We’ve begun anonymizing the dataset and have started the analysis and simulations. We also continue reaching out to organizers of other QV and COCM implementations to broaden our data pool and reading list.

Two weeks ago:

Prepping and testing tooling for the data gathering event. More research into other implementations of QV / QF and plural versions of them.

Last week:

We ran the data gathering event and began data anonymization, initial simulations and analysis.

Blockers:

Prepping and running the event absorbed far more time than we had budgeted. We seem to have gathered some good data, though.

We’ll post a second, more detailed diary post this week to get back on schedule.

One protocol musing from the event: even a modestly sized gong is an extremely effective way to get people’s attention.


This Week
This week we began the analysis of the voting data which we gathered at the Plural Research event in Berlin. As a first step we implemented the Connection-Oriented Cluster Match (COCM) [1] model in a jupyter notebook and replicated the voting outcome that our tool calculated. We choose to re-implement the model in python as it is more suited for data analysis and simulation than typescript. A nice side effect of doing this is that we were able to double check the integrity of the gathered data.

Some context on the event: participants submitted research proposals either individually or as a group, with groups able to fluidly form and dissolve over the course of the event. Participants then voted (with a fixed budget of 80 voice credits each) on which proposals should receive funding. Results were calculated quadratically and using COCM (see [1] for details), with participants’ employment affiliation and the newly formed research groups as the available clustering dimensions. We will write about this in more detail in upcoming posts.

One obvious and intuitive result is that the choice of vote mechanism matters. The bump chart below visualises the ranking variations for the different proposals under three distinct scoring methods: plurality score (calculated using COCM), quadratic score, and raw votes. The raw votes score is the total number of votes allocated to each proposal by all participants. The quadratic score is simply the sum of the square-rooted votes each participant allocated to a given proposal. The x-axis shows the three scoring methods, and the y-axis shows the rank positions under each, with rank 1 being the highest. For example, Proposal 1 has the highest plurality score (i.e. rank 1) with a value of 59.45 and the highest quadratic score with a value of 64.658, but only the second highest raw vote count (i.e. rank 2) with 196 votes.

Each line on the chart corresponds to a proposal from the voting results, tracking how the rank of each proposal changes across the three categories. In total, there were 35 distinct proposals that participants were able to vote on, and hence 35 ranks. The colour of each line indicates the magnitude of rank change across the categories: proposals with a significant rank change, defined as a difference greater than 4 ranks, are highlighted in dark red, while those with smaller changes are coloured grey. This colour-coding helps to quickly identify the proposals whose rankings vary most across the different scoring methods.
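For readers who want to follow along, here is a minimal sketch of how the raw and quadratic scores and their rankings are computed from a votes matrix. The vote allocations below are randomly generated placeholders rather than our event data, and the COCM plurality score is omitted since it requires the full clustering calculation from [1].

```python
import numpy as np

rng = np.random.default_rng(0)

n_participants, n_proposals = 36, 35

# Hypothetical allocations: each row spreads one participant's 80 voice
# credits across the 35 proposals (placeholder data, not the event votes).
votes = rng.multinomial(80, np.ones(n_proposals) / n_proposals, size=n_participants)

raw_score = votes.sum(axis=0)                 # total votes per proposal
quadratic_score = np.sqrt(votes).sum(axis=0)  # sum of square-rooted votes

# Rank proposals under each method (rank 1 = highest score).
raw_rank = (-raw_score).argsort().argsort() + 1
quadratic_rank = (-quadratic_score).argsort().argsort() + 1
print(raw_rank[:5], quadratic_rank[:5])
```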

We will continue with more in-depth analysis over the coming weeks.

Thoughts and Open Questions
We have been back and forth on an analytic approach that fits the dual goals of addressing our specific POG proposal while also producing results that generalize to protocol science more broadly.

What seems most promising is to take an empirical approach: simulating various counterfactuals to determine a measure of protocol risk, or perhaps resilience. Risks and mitigations will of course vary by use case, but it may be possible to produce a high-level set of guidelines for people to consider.

For example:

What factors should well-meaning protocol implementers / overseers be mindful of when implementing their protocols to reduce the risk of attack / error?

What factors should protocol participants be aware of when assessing their participation in protocols where the implementers / overseers may be ill-intentioned, careless or otherwise?

To a certain extent these factors are already known for many protocols (e.g., QV is vulnerable to Sybil attacks), but what is less explored is how much weight to attach to each one. The crypto ecosystem in particular seems prone to throwing out protocols because a theoretical issue has been identified, with minimal consideration of how likely it is to actually occur. Similarly, protocol vulnerabilities often attract attention more for how interesting they are to speculate on than for their likelihood of occurring or the consequences if they did.

What we need is a way to quantify and ascribe weight to these issues, beyond just identifying them.

To bring this back to voting: linear voting mechanisms have the feature that participants with more votes can exert undesirable control over the outcome. QV / QF directly addresses this, but as a trade-off introduces various Sybil issues. COCM attempts to address some of these issues (where voters are distinct entities but similarly value-aligned), but as a trade-off introduces a new definition of what it means to collude, and new problems for protocol implementers to wrangle with.

It’s hard to empirically measure these trade-offs, but what we can do is simulate certain ways things might go wrong. For example: “What if there had been a Sybil in this QV vote? What would the impact have been?” Or, conversely: “How large would a Sybil attack need to be to materially affect the outcome, and how hard would it be to coordinate?”
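As a minimal sketch of that first counterfactual, consider one attacker splitting their voice credits across k fake identities. Under the quadratic score this amplifies their influence by a factor of √k. The numbers below are hypothetical, not drawn from our event data.

```python
import math

BUDGET = 80  # voice credits per participant, as at our event

def quadratic_contribution(allocations):
    """Quadratic score contributed to one proposal by a list of vote counts."""
    return sum(math.sqrt(v) for v in allocations)

honest = quadratic_contribution([BUDGET])  # one identity: sqrt(80) ~ 8.94
for k in (2, 4, 8):
    # The same credits split evenly across k Sybil identities.
    sybil = quadratic_contribution([BUDGET / k] * k)
    print(f"k={k}: {sybil:.2f} vs honest {honest:.2f} "
          f"({sybil / honest:.2f}x amplification)")
```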

COCM introduces a whole new set of scenarios which are even more interesting, and which we’ll examine in future entries. For example, what happens if only partial information about the “true” underlying user relationships is available? To what extent does the property of collusion resistance still hold, if at all? Can the gathered experimental data suggest improvements to the model?

Next Week
We expect the analysis of the experimental data to keep us busy for the weeks to come as we work through the questions raised above. One aspect we have not yet discussed, but plan to cover in next week’s update, is a practical output of this research: a protocol for experimenters who wish to use the COCM model in their own tools, describing best practices for running such an experiment based on our experience of running ours.

References
[1]: Miller, J., Weyl, E. G., & Erichsen, L. (2022). Beyond Collusion Resistance: Leveraging Social Information for Plural Funding and Voting. Available at SSRN 4311507.


This Week
In this week’s update, we will discuss how we gathered information about pre-existing participant relationships, which our tool uses to calculate plurality scores within the Connection-Oriented Cluster Match (COCM) [1] framework. Although this is a relatively small dataset, it contains many interesting features which raise important questions for people implementing or participating in protocols that rely on social data from participants.

Some context on the importance of pre-existing participant relationships when calculating plurality scores: COCM leverages user relationships, for example professional affiliations, to provide bridging bonuses when participants from diverse groups reach a consensus. This approach aims to improve upon naive quadratic mechanisms, which often fail to balance incentives and capture all relevant dynamics. Specifically, naive quadratic mechanisms overlook overlaps between participants - whether financial, cultural, or political - regarding the decision at hand, potentially leading to suboptimal outcomes when heavily aligned participants converge on a decision. However, the fundamental assumption underlying the calculation of plurality scores using COCM is that the implementer/overseer captures the “true” relationships between participants relevant to the calculation. This is a very strong assumption. COCM may therefore only improve on classical quadratic mechanisms if the implementer/overseer can credibly verify that they have captured not only the “true” underlying relationships between participants but also the ones relevant to the vote at hand.

At the Plural Research event in Berlin we chose to gather information about participants’ professional relationships, as well as relationships they formed during the event by submitting a research proposal together as a group of collaborators. Importantly, we do not claim that we captured the true underlying relationships between participants, nor that we gathered the most relevant grouping dimensions (i.e. professional affiliations and research groups formed at the event). Instead, our research is intended to highlight and explain the differences in outcomes compared to classical quadratic mechanisms, especially in cases where the underlying assumptions of COCM are violated.

Let’s have a look at the relationships we gathered at the Plural Research event in Berlin: both pre-existing ones and those participants formed during the event. A visualisation of the relationship network is displayed above. The figure displays two networks: the left one only considers professional affiliations between participants, while the right one additionally includes research groups that participants formed during the event. We elicited professional affiliations simply by asking participants, via our tool, prior to the event. Research groups were formed freely during the event and registered using the same tool. We will talk more about the tool itself in the coming weeks.

Each node in the network represents one of the 36 participants who cast at least one vote at the event. Each edge denotes a connection between participants, where two participants are connected if they belong to the same group. In the left network, two connected participants share the same affiliation. Based on this data, there were many disconnected clusters of professional affiliations in the network, with approximately one third of participants not sharing a professional relationship with any other participant. In the network on the right, we include relationships that participants formed during the event, using different edge colours to distinguish the type of relationship: a red edge denotes a relationship due to a professional affiliation, a blue edge a relationship due to a research group formed during the event, and a green edge a relationship on both dimensions.
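Before interpreting the figure, here is a minimal sketch of how such a two-layer relationship network can be constructed with networkx. The participants and group memberships below are invented placeholders, not our event data.

```python
from itertools import combinations
import networkx as nx

# Hypothetical group memberships (placeholders, not the event data).
affiliations = {"OrgA": ["p1", "p2", "p3"], "OrgB": ["p4", "p5"]}
research_groups = {"rg1": ["p2", "p4"], "rg2": ["p3", "p5", "p6"]}

G = nx.Graph()
G.add_nodes_from(f"p{i}" for i in range(1, 7))

def add_layer(groups, label):
    # Connect every pair within a group, recording which layer(s) each
    # edge belongs to (affiliation, research, or both).
    for members in groups.values():
        for u, v in combinations(members, 2):
            layers = G.edges[u, v]["layers"] if G.has_edge(u, v) else set()
            G.add_edge(u, v, layers=layers | {label})

add_layer(affiliations, "affiliation")   # red edges in the figure
add_layer(research_groups, "research")   # blue edges; both layers = green
```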

One can see that participants from different professional backgrounds formed research groups, thereby connecting different parts of the network. This already demonstrates that the event was successful in one fundamental sense: there is a strong case that these individuals would never have interacted, let alone agreed to perform research together, without this event and the plural mechanisms it used.

But beyond this success, what can we say about COCM relative to QV? One trivial but important claim: the networks above certainly do not represent the true relationships between all participants. But is it a good enough representation, and what does good enough mean in this context? How far can we deviate from the true network before COCM performs worse than QV, or opens up avenues of error or exploitation which are not available under QV?

Protocol Musings
Another, perhaps more fundamental, question raised by this line of enquiry: who gets to decide what someone’s “true” identity is? COCM relies on choosing a category or dimension of identity along which to sort participants. In our case we chose self-reported employment affiliation, but the most appropriate dimension(s) will depend on context.

Quite apart from the administrative issue of how to confirm / curate / assign these self-reported details (What if someone makes an error? What if someone deliberately misrepresents their affiliation?), there’s a more fundamental issue of drawing distinctions.

To take an example close to home: the Ethereum Foundation is made up of multiple teams. These have a sufficient degree of independence that it’s reasonable for someone in the PSE team to claim that they have a different professional affiliation to someone in the ESP team (a note to EF - other letters are available!).

But a cynic might claim that these people have the same affiliation: Ethereum Foundation.

Who gets to arbitrate? The organizers? The individuals concerned? The participants as a group? (But then how could they decide, since the decision mechanism relies on the social identity data as input?) Some independent body or group (but they might lack context)? A protocolized decision based on some other source of truth such as a register of legal entities? Even seemingly benign decisions might have large ramifications on outcomes here.

We posit that the majority of relevant identity dimensions will run into similar issues. As with elsewhere in our research, our focus is not on what the correct answer is, but whether we can estimate the consequences of making a “wrong” choice.

Next Week
The analysis of the gathered social network of participants raises interesting questions that we wish to answer in the coming weeks. The most interesting, to our minds, is how the underlying social network of participant relationships affects the plurality score. In essence: what is the impact of unobserved relationships between participants on the calculated plurality score? We are currently working on a simulation framework to shed light on this.

We hope we can share the first results in the update next week.

References
[1]: Miller, J., Weyl, E. G., & Erichsen, L. (2022). Beyond Collusion Resistance: Leveraging Social Information for Plural Funding and Voting. Available at SSRN 4311507.


This Week

This week has been occupied with questions of truth. Broadly speaking, QV/QF and plural mechanisms like COCM [1] can be considered “adjustment protocols”, or even “correction protocols”, in that they take some input and adjust it to a state which the protocol implementer feels will better serve some goal. In QV, imbalances in voice credits between various holders are deemed too high, so all values are square-rooted to mitigate this. In COCM, further interaction terms and penalties are applied in an attempt to discount votes from correlated voters.

Two broad categories of questions arise:

  • Is the protocol implementer correct that the adjustment is warranted? What if that situation changes?
  • Does the protocol correctly capture the true state of the network it uses for adjustment? What damage is done if it doesn’t?

It’s the second category which interests us right now. To take QV as a simple example: for QV to be useful, the square-rooting adjustment needs to be warranted AND the voice credit values before the reduction need to reflect the “true” state among the underlying voters. The well-known Sybil issue is an obvious example of where this deviation from the true state may occur, and may even be encouraged.

EIP-7716

A brief detour into the bowels of Ethereum: EIP-7716 proposes anti-correlation penalties on validators who miss attestations (EIP-7716: Anti-correlation attestation penalties). In short, validators whose errors are correlated are likely to be large operators with unified infrastructure. They benefit from economies of scale, but they introduce centralization and therefore risk: a fault which affects one node is more likely to affect others. Identifying these validators and adding extra penalties when they make errors could help to mitigate this. This is work by dapplion, based on a suggestion by Vitalik Buterin (Supporting decentralized staking through more anti-correlation incentives - Proof-of-Stake - Ethereum Research).
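As a deliberately simplified sketch of the idea (this is not the EIP’s actual formula; see the EIP text for the real specification), one could scale a validator’s penalty for a missed attestation by how strongly its failure correlates with others in the same slot:

```python
def scaled_penalty(base_penalty: float, misses_this_slot: int,
                   average_misses: float, max_scale: float = 4.0) -> float:
    """Penalise a missed attestation more when many validators miss together.

    Hypothetical illustration only: EIP-7716 defines its own penalty
    computation; this just captures the anti-correlation intuition.
    """
    if average_misses <= 0:
        return base_penalty
    scale = min(misses_this_slot / average_misses, max_scale)
    return base_penalty * scale

# A lone failure (below-average correlation) is penalised lightly; a failure
# shared by a large fleet (high correlation) costs up to max_scale times more.
print(scaled_penalty(100, misses_this_slot=5, average_misses=50))    # 10.0
print(scaled_penalty(100, misses_this_slot=200, average_misses=50))  # 400.0
```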

We won’t go into too many details here, but initial analysis suggests this is a good adjustment protocol: a simple rule set based on a clear metric which serves as a good proxy for the underlying network state, with no obvious ways to game the system without supporting the aims of the protocol designer (e.g., a large centralized player could decentralize their architecture to avoid penalties. This wouldn’t satisfy the simplistic goal of breaking up these large players in favour of smaller ones, but it would still reduce risk, increase costs for those players and make the playing field more even for smaller stakers).

We’ll investigate this EIP further in future weeks, because we feel it makes a nice comparison to the more human complexities that arise from QV and COCM.

What if the protocol does not correctly capture the true state of the network?

Returning to COCM, this week we embarked on a simulation to try to assess the discrepancies and risks that arise when the identity data used by the protocol deviates from the true state of the relationships between participants. Capturing this true state (both in selecting the identity dimensions to be gathered and correlated, and in the accuracy of the data gathering) is a key assumption of COCM, underpinning both its utility and its anti-collusion property.

First, we simulate a network that defines relationships between agents and call this the “true” network. Second, we gradually remove relationships between agents to simulate situations where the implementer/overseer does not capture all information about the underlying network. Third, we calculate the plurality scores using both the “true” network and the networks with missing relationships. Finally, we define the difference between the obtained scores as the impact that such a violation has on the plurality score.

We generate the “true” network using the well-known G(n,p) random graph model [2], which generates a so-called Erdős–Rényi (ER) graph. The graph above illustrates an ER graph with parameters n=100 and p=0.05, where n specifies the number of nodes and p specifies the probability that an edge exists between any two nodes. The generation algorithm works as follows: 1) take the first pair of nodes; 2) generate a random number between 0 and 1; 3) if the number is smaller than or equal to p=0.05, connect the two nodes via an edge, otherwise do not; 4) take the next pair of nodes and repeat until all node pairs are exhausted.
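This procedure is implemented directly in networkx; a minimal sketch, assuming networkx’s gnp_random_graph (which performs exactly the pairwise coin flips described above):

```python
import networkx as nx

n, p = 100, 0.05
true_network = nx.gnp_random_graph(n, p, seed=42)  # the "true" ER network
print(true_network.number_of_nodes(), true_network.number_of_edges())
```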

To calculate the plurality score of the network generated by the ER model we must define groups and votes. We simply assume that agents who share an edge are in the same group, which seems natural given the way we generated the network. For votes, in this initial simulation we assume a uniform distribution: each agent votes 10 on a hypothetical option. We do this to avoid any interaction between the voting distribution and the underlying network affecting the plurality score; since we want to cleanly isolate the effect of the underlying network, we eliminate the interaction by holding votes uniform. Evaluating the effect of that interaction (if any) might be interesting, but it’s not part of this exercise. Under these assumptions the plurality score of the “true” ER network depicted above is 207.584. Let’s put this number into perspective.

We gradually remove edges from the network at random and recalculate the plurality score. By doing this, we transform the “true” ER network into a “partially true” version and compare the resulting plurality scores. The figure above illustrates the results of this exercise. The x-axis denotes the percentage of edges that we randomly remove from the “true” network: 0 means we remove no edges (i.e. the “true” network, with a plurality score of 207.584), 0.1 means we remove 10% of edges at random, and so on. Note that the case where we remove 100% of edges, with a score of 316.2278, corresponds to the score one would obtain using the classic quadratic voting model: no edges implies no groups. The y-axis denotes the mean plurality score. For each share of removed edges we simulate 100 networks, each with that percentage of edges randomly removed, and calculate the plurality score; the band around the mean displays the standard deviation of these scores. We can see that the result displayed in the figure is robust to variation in removed edges: within a given share, it does not matter much which particular edges get randomly removed.
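Below is a minimal sketch of this edge-removal experiment. The simple_cluster_score function is a deliberately simplified stand-in for our full COCM implementation (it treats each connected component as one pooled group, so it will not reproduce the exact scores in the figure; see [1] for the real calculation), but it shares the relevant limiting behaviour: with all edges removed it reduces to the classic quadratic score.

```python
import math
import random
import statistics
import networkx as nx

VOTE = 10  # uniform vote each agent casts on the hypothetical option

def simple_cluster_score(graph):
    # Square-root the pooled votes of each connected component and sum.
    # With all edges removed every node is its own group, so this reduces
    # to the classic quadratic score: n * sqrt(VOTE) = 100 * sqrt(10).
    return sum(math.sqrt(VOTE * len(c)) for c in nx.connected_components(graph))

def degraded_scores(true_network, share_removed, runs=100, seed=0):
    """Mean/stdev of the score after randomly removing a share of edges."""
    rng = random.Random(seed)
    edges = list(true_network.edges())
    k = round(share_removed * len(edges))
    scores = []
    for _ in range(runs):
        g = true_network.copy()
        g.remove_edges_from(rng.sample(edges, k))
        scores.append(simple_cluster_score(g))
    return statistics.mean(scores), statistics.stdev(scores)

true_network = nx.gnp_random_graph(100, 0.05, seed=42)
for share in (0.0, 0.1, 0.5, 1.0):
    mean, sd = degraded_scores(true_network, share)
    print(f"{share:.0%} removed: {mean:.1f} ± {sd:.1f}")
```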

In summary, the exercise we performed today clearly demonstrates that the plurality score depends on the available information about agents’ pre-existing relationships. The collusion resistance that COCM provides might therefore be in jeopardy, because participants may be able to collude by hiding (e.g. not reporting) existing relationships. Moreover, the incentive to collude could be substantial, as the potential difference in plurality scores resulting from partial information about the “true” network appears to be significant. However, further evaluation is needed to show that this difference is robust to, for example, changes in p, n, or the network model itself.

Next Week

Next week is Protocol Week at Edge Esmeralda. We’ve submitted “Plurality vs Authority” as our tension, which we hope will spark a discussion on the pressures between gathering relevant social information and curating / enforcing it.

Protocol Musings

People may enjoy the recent edition of the Epicenter Podcast, featuring Glen Weyl and Audrey Tang discussing the success of plural tools in Taiwan: x.com

References

[1]: Miller, J., Weyl, E. G., & Erichsen, L. (2022). Beyond Collusion Resistance: Leveraging Social Information for Plural Funding and Voting. Available at SSRN 4311507.

[2]: Gilbert, E. N. (1959). Random graphs. The Annals of Mathematical Statistics, 30(4), 1141-1144.