Sybil and Collusion Resistance via Community Graphs

We need to talk about the ‘C’ word getting thrown around lately.

Yes, I’m talking about Collusion.

It’s messing with our voting mechanisms and our quadratic funding rounds.

But what exactly is collusion?

It has a well-defined legal meaning, but that’s always within the context of clearly-defined rules or laws: to collude is to coordinate secretly to get around those rules.

But what if there are no clearly written rules? What if we’re dealing with brand new mechanisms like online governance and Quadratic Funding? What does it mean to collude?

Well, then the word is reduced to its negative connotation.

Collusion is the evil twin of coordination, just like manipulation is the evil twin of leadership.

If Bob is influencing people toward an outcome which is deemed self-serving, that’s manipulation. If he’s influencing people toward a group-benefiting goal, we call that leadership.

Similarly, if people are working together toward a positive outcome, we say they are coordinating. But if they’re doing it toward a self-serving outcome which benefits their group at the expense of the larger group, then we call that collusion.

A textbook example is when businesses collude to fix prices. A more relevant one is when money corrupts systems which aim to uphold certain values above money.

For example, think of governance and justice systems. We don’t want people ‘coordinating’ to buy/sell votes, bribe judges, or pay off dirty cops. Consequently, these systems have had to find ways to discourage collusion with (often severe) penalties.

But as we build out decentralized systems of governance, we are encountering very analogous threats of collusion, and yet we lack any of the tools that centralized systems have to discourage such behavior.

This is a big problem.

Sybil Resistance vs. Collusion Resistance

This outstanding paper perfectly demonstrates this through a real-life case study of a Proof of Personhood protocol called Idena.

The founder of Idena was gracious enough to expose the shortcomings of his project so that the rest of us can learn from them, at great risk to his own reputation. I commend him for pioneering a personhood protocol – that’s really hard. And I commend him even more for shining a spotlight on it afterward. We all make mistakes and learn from them, but it’s not often that we get to learn from others’ mistakes; it’s truly a gift.

Idena created a personhood mechanism which made their users perform quite a bit of work to prove, and maintain, their unique personhood. But in doing so, they did solve the sybil problem. With minimal exceptions, every Idena user was a human being with only a single account.

What additionally made this such a great case study is that there was a monetary incentive to try to game the system, in the form of an ongoing UBI token distribution to all account-holders. With real money on the line, we got to see what people actually did to game it.

They didn’t try to get overly clever about hacking the personhood mechanism. Instead, they did something much easier – they started colluding.

For example, a Russian company (‘puppeteer’) hired a bunch of workers (‘puppets’) in low-wage countries to perform the personhood ceremonies, but never gave them the private keys to their own accounts. They simply paid them some small fraction of the UBI token earnings and kept the rest for themselves.

Other versions involved set-ups where the puppets technically had access to their private keys, but they didn’t possess the know-how to be able to do anything with them. In one scenario, a company hired a bunch of children in Egypt as their puppets to perform the personhood ceremonies.

It’s an incredibly thought-provoking paper, both in its analysis as well as the discussion; would highly recommend it.

The main takeaway, if I had to choose one, is that sybil-resistance and collusion-resistance are really two sides of the same coin. If you incentivize sybil attacks, but you only solve for personhood, people will simply start colluding via some form of puppeteering, targeting low-wage and low-info humans and acting as the middle-man that takes a giant chunk of the money or power or whatever is up for grabs.

So as we think about sybil resistance, we must simultaneously solve for collusion resistance; it’s pretty clear what will happen otherwise.

Can’t We Just Math Our Way Out of This?

Okay, so can we use clever math and analysis to detect collusion based on people’s behavior?

Short answer is, not really. Because if the colluding takes place off-chain, how can we ever detect it?

The best we can do is mathematically guess whether people are colluding or not.

Various mechanisms have been proposed to use correlations of people’s votes (or contributions in the case of Quadratic Funding) to probabilistically derive collusion. But the problem is that not only is the on-chain data very limited, but fundamentally correlation does not prove collusion (any more than correlation proves causation).

Let’s take a simple example.

Every day, Bob, Alice and Sarah vote on what to have for lunch.

Their choices are:

a) Pizza-to-the-Moon (Italian Restaurant),
b) Cheddar Jack (Burger Joint), and
c) Tokyo Palace (Sushi place).

Over time, we notice that Bob and Alice seem to vote for Pizza-to-the-Moon quite often, and many times they are correlated in their votes, while Sarah seems to be about evenly split over time.

And suppose we also know some information about Bob and Alice’s social ties, that they know each other and have a lot of friends in common, while Sarah is more removed from both of them on the social graph.

Mathematically, given this data set, the anti-collusion mechanisms would discount Bob’s and Alice’s votes on the grounds that they are likely colluding, because either:

a) Their votes are highly correlated over time, and/or
b) They belong in similar social circles, so they are more likely to talk and collude

And it might very well be the case that collusion is happening, or maybe not. Consider two sets of scenarios.

Scenarios of set A:

Bob might be bribing Alice to vote a certain way, or vice versa. Or maybe Pizza-to-the-Moon is bribing one or both of them. Or instead of money, there might be social pressure exerted, or favors exchanged, and so on. It gets very messy and blurry, very quickly.

Scenarios of set B:

On the other hand, it might also be the case that Pizza-to-the-Moon just makes really good food, and the other two restaurants make crappy food. And Sarah happens to not care much about the taste of food, she only cares about the ambiance, and all three restaurants have equally satisfying ambiance. Or maybe Bob and Alice just like Italian food more than American or Japanese food, while Sarah likes them all about the same.

In both sets of scenarios (set A or set B), the data inputs into the algorithm would be the same, so there is no amount of math that can be done to derive whether collusion is truly happening.
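To make that concrete, here is a minimal Python sketch of the kind of correlation signal such a mechanism could compute. The vote histories, the `agreement_rate` helper, and the ten-day window are all made up for illustration; this is not any particular project's algorithm. Notice that the function only ever sees the votes. Whether the record was produced by a scenario from set A or from set B never enters the calculation, so the score comes out the same either way.

```python
from itertools import combinations


def agreement_rate(votes_a, votes_b):
    """Fraction of days on which two voters picked the same option."""
    if len(votes_a) != len(votes_b) or not votes_a:
        raise ValueError("vote histories must be non-empty and of equal length")
    same = sum(1 for a, b in zip(votes_a, votes_b) if a == b)
    return same / len(votes_a)


def pairwise_agreement(history):
    """Agreement rate for every pair of voters, keyed by (name, name)."""
    return {
        (x, y): agreement_rate(history[x], history[y])
        for x, y in combinations(sorted(history), 2)
    }


# Hypothetical ten-day record: a = Pizza-to-the-Moon, b = Cheddar Jack, c = Tokyo Palace.
# The reason behind each vote (bribe or taste) is not recorded anywhere.
history = {
    "Bob": list("aaaaaaabaa"),
    "Alice": list("aaaaaaaaab"),
    "Sarah": list("abcabcabca"),
}

scores = pairwise_agreement(history)
for pair, rate in scores.items():
    print(pair, rate)
```

Bob and Alice come out as the most correlated pair, which is exactly what a discounting rule would act on, and exactly what it cannot explain.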

Either you run an anti-collusion algorithm and you risk having everyone eating where they don’t actually want to eat, or you don’t run the algorithm and you risk rewarding collusion.

But what if the algorithm had access to all the real-world data and it could understand the context fully and assess every situation uniquely, and all that data was obtained and updated at a relatively low cost, and kept in an extremely privacy-preserving manner?

Well, good luck with that. I’d say it’s not exactly a pizza on the moon, but it’s at least a pie in the sky.

What Can We Do?

I’d like to propose an alternate path.

Let’s build out reputation systems which are Soulbound, and which empower people in communities to hold one another accountable to behave in an honest and sincere manner, whether they are voting, contributing to a Quadratic Funding round, or engaging in token-earning activities.

We don’t need to re-invent human social coordination and put it all on the blockchain. Powerful incentives like social status, fear of exclusion, social pressure, and so on, already exist. Let’s subsume these off-chain dynamics which evolved over the millennia and extend them to give ourselves more powerful and precise tools toward even better social coordination.

To achieve this, we need to not only have a way to represent unique personhood, but also a way to replicate within the digital space how we actually relate to one another: we form groups, and families and mini tribes, and clubs, and meetups, and squads, and a million other things. And we connect with and through those communities.

And it’s within the context of those communities that we have our relationships, and hold ourselves, and each other, to a higher standard. It’s where we form unwritten rules and norms and culture. It’s where we actively choose not to collude, because we value our reputation and our belonging to communities higher than the short-term monetary gain from gaming the system. Our social bonds mean everything to us, and our reputation is the glue that holds those bonds together.

Reputation holds so much more real-world data and context than we can hope to upload into a bunch of databases or blocks. It is the aggregation of all our social behavior, and from many disparate points of view. In other words, the best oracles for detecting collusion are other humans.

A properly designed system that can accurately represent communities, community membership, and our reputations with one another, in a decentralized, permissionless and privacy-preserving way, would elegantly address the problem of collusion, along with many other problems, while also opening the door to many new possibilities.

To illustrate, imagine something like an ‘Independence Score’ which everyone assigns to one another, denoting how likely that person is to not collude. This score could increase over time, so someone who has participated in 10 rounds and continues to receive a high score from everyone gets a higher total score than someone who is doing their first round. If you collude with someone, they are incentivized to give you a high score for the round you are colluding on, but then they are incentivized to change it to a low score afterward (as you might be colluding against them next time). After multiple rounds, the low scores will stack up. All the scores are normalized at the end so that everyone has an equal vote, and then QV is applied.

This is just a rough example, and can be improved in many ways.
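For anyone who thinks better in code, here is one possible way to sketch the Independence Score idea in Python. Several details are my own assumptions, since the description above leaves them open: each rater's scores are rescaled to sum to 1 within a round (so every rater has an equal say), a person's total is simply the sum of what they received across rounds (so a longer clean history counts for more), and the total is treated as a voice-credit budget where casting n votes costs n squared credits. The function names and the sample ratings are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def round_scores(ratings):
    """ratings: {rater: {ratee: raw score >= 0}} for a single round.

    Each rater's scores are rescaled to sum to 1 (assumption), so every
    rater distributes the same amount of weight. Self-ratings are ignored.
    """
    received = defaultdict(float)
    for rater, given in ratings.items():
        given = {person: s for person, s in given.items() if person != rater}
        total = sum(given.values())
        if total == 0:
            continue  # this rater gave nobody a positive score
        for ratee, s in given.items():
            received[ratee] += s / total
    return dict(received)


def independence_totals(rounds):
    """Sum each person's per-round score over all rounds."""
    totals = defaultdict(float)
    for ratings in rounds:
        for person, score in round_scores(ratings).items():
            totals[person] += score
    return dict(totals)


def qv_votes(credits):
    """Quadratic voting: n votes cost n**2 credits, so a budget buys sqrt(budget) votes."""
    return sqrt(credits)


# Hypothetical data. Round 1: everyone rates everyone else highly.
# Round 2: Bob drops his score for Alice after their collusion in round 1.
rounds = [
    {"Alice": {"Bob": 1, "Sarah": 1}, "Bob": {"Alice": 1, "Sarah": 1}, "Sarah": {"Alice": 1, "Bob": 1}},
    {"Alice": {"Bob": 1, "Sarah": 1}, "Bob": {"Alice": 0, "Sarah": 1}, "Sarah": {"Alice": 1, "Bob": 1}},
]

totals = independence_totals(rounds)
votes = {person: qv_votes(score) for person, score in totals.items()}
print(totals)
print(votes)
```

In this toy run Alice ends up with the lowest total once Bob revises his score, and repeated downgrades over more rounds would keep widening the gap.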

We are currently building Community Graphs, which will be fully open, decentralized infrastructure that will allow experimentation with many different types of reputation, attestation, and voting mechanisms. And these will all be stackable and composable, so that combining multiple identity or reputation systems might offer an even stronger signal than any one of them on its own.

To plagiarize David from Bankless, it’ll be Legos. But instead of Money Legos, it’ll be Social Coordination Legos.

I’m confident that a solution exists somewhere in this space of possibilities, though I have little idea what the optimal mechanism(s) will ultimately be. We should try and play around with as many as possible so that we learn what works well under which circumstances.

We should also expect to encounter new obstacles and unintended consequences along the way, so it’s important to always test new ideas in smaller, low-stakes environments before scaling anything out.

With that in mind, I’m super excited to explore this space, and welcome anyone interested to join us on the journey.

We’ll be submitting Community Graphs as a project in this funding round. Whether accepted or not, we would love to offer what we’re building (once it’s ready) as a tool to Octant in order to provide sybil and collusion resistance in future rounds.

This was quite the read, and cannot thank you enough for sharing it! We’ve been looking into ways in which we can build a reputation system with Ethereum Attestation Service or something else to help solve this issue. Have you considered the tech that they have built?

Would love to talk to you further about what this would look like in practice!!

Thanks James! I believe Jesse reached out to you already. Excited to be in touch and talk more about what we’re building. Would be great to understand the specific problems that Octant is facing and how we can address those in our roadmap. And yes, we’re talking to EAS as well; open to collaborating with anyone in this space to help solve this problem.

Interesting read! Particularly loved the puppeteering issue of Idena. The line between where coordination ends and collusion begins is certainly hard to pin down with exactness. For example, Gitcoin’s new cluster QF algorithm penalized wallets that vote in the same way, by putting them into one cluster and applying the quadratic penalty to them collectively. But this mechanism also penalized the on-the-ground coordination of projects that got people to vote for them, in addition to projects that usually support each other.
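As a rough illustration of that collective penalty, here is a simplified Python sketch. It is not Gitcoin's actual algorithm: I am assuming the simplest possible version, where contributions inside a cluster are pooled and the pool is treated as a single contributor. The wallet names, cluster labels, and function names are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def qf_total(contributions):
    """Plain QF: square of the sum of square roots of individual contributions."""
    return sum(sqrt(amount) for amount in contributions.values()) ** 2


def clustered_qf_total(contributions, cluster_of):
    """Simplified cluster variant (assumption): pool each cluster's
    contributions and take the square root of the pool, not of each wallet."""
    pooled = defaultdict(float)
    for wallet, amount in contributions.items():
        # Wallets with no cluster label form a cluster of their own.
        pooled[cluster_of.get(wallet, wallet)] += amount
    return sum(sqrt(total) for total in pooled.values()) ** 2


# Four wallets give 1 unit each; three of them were grouped into one cluster.
contributions = {"w1": 1.0, "w2": 1.0, "w3": 1.0, "w4": 1.0}
cluster_of = {"w1": "c1", "w2": "c1", "w3": "c1"}

plain = qf_total(contributions)                          # (1 + 1 + 1 + 1) ** 2
clustered = clustered_qf_total(contributions, cluster_of)  # (sqrt(3) + 1) ** 2
print(plain, clustered)
```

The clustered total is lower whether the three wallets were bribed or were simply rallied by a project doing honest outreach, which is the side effect described above.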

How well does what you’re building interface with cluster QF? And what are the main parameters in deciding a reputation score?

I’m unsure of how relevant it would be to Octant, since they neatly sidestep the Sybil issue by making allocations capital-based (you get votes based on how much capital you have staked in the form of locked GLM). So I’m also not sure how applicable it would be to making epochs more fair.