Decentralised content moderation
Published by Martin Kleppmann on 13 Jan 2021.
Who is doing interesting work on decentralised content moderation?
With Donald Trump suspended from Twitter and Facebook, and
Parler kicked off AWS, there is renewed discussion about
what sort of speech is acceptable online, and how it should be enforced. Let me say up front that
I believe that these bans were justified. However, they do raise questions that need to be
discussed, especially within the technology community.
As many have already pointed out, Twitter, Facebook and Amazon are corporations that are free to
enforce their terms of service in whatever way they see fit, within the bounds of applicable law
(e.g. anti-discrimination legislation). However, we should also realise that almost all social
media, the public spaces of the digital realm, are in fact privately owned spaces subject to
a corporation’s terms of service. There is currently no viable, non-corporate alternative space that
we could all move to. For better or for worse, Mark Zuckerberg, Jack Dorsey, and Jeff Bezos (and
their underlings) are, for now, the arbiters of what can and cannot be said online.
This situation draws attention to the decentralised web community,
a catch-all for a broad set of projects that are aiming to reduce the degree of centralised
corporate control in the digital sphere. This includes self-hosted/federated social networks such as
Mastodon and Diaspora, peer-to-peer
social networks such as Scuttlebutt, and miscellaneous blockchain
projects. The exact aims and technicalities of those projects are not important for this post.
I will start by focussing on one particular design goal that is mentioned by many decentralised web
projects, and that is censorship resistance.
Censorship resistance
When we think of censorship, we think of totalitarian states exercising violent control over their
population, crushing dissent and stifling the press. Against such an adversary, technologies that
provide censorship resistance seem like a positive step forward, since they promote individual
liberty and human rights.
However, often the adversary is not a totalitarian state, but other users. Censorship resistance
means that anybody can say anything, without suffering consequences. And unfortunately there are
a lot of people out there who say and do rather horrible things. Thus, as soon as
a censorship-resistant social network becomes sufficiently popular, I expect that it will be filled
with messages from spammers, neo-nazis, and child pornographers (or any other type of content that
you consider despicable). One person’s freedom from violence is another person’s censorship, and
thus, a system that emphasises censorship resistance will inevitably invite violence against some
people.
I fear that many decentralised web projects are designed for censorship resistance not so much
because they deliberately want to become hubs for neo-nazis, but rather out of a kind of naive
utopian belief that more speech is always better. But I think we have learnt in the last decade that
this is not the case. If we want technologies to help build the type of society that we want to live
in, then certain abusive types of behaviour must be restricted. Thus, content moderation is needed.
The difficulty of content moderation
If we want to declare some types of content as unacceptable, we need a process for distinguishing
between acceptable and unacceptable material. But this is difficult. Where do you draw the line
between healthy scepticism and harmful conspiracy theory? Where do you draw the line between healthy
satire, using exaggeration for comic effect, and harmful misinformation? Between legitimate
disagreement and harassment? Between honest misunderstanding and malicious misrepresentation?
With all of these, some cases will be very clearly on one side or the other of the dividing line,
but there will always be a large grey area of cases that are unclear and a matter of subjective
interpretation. “I know it when I see it”
is difficult to generalise into a rule that can be applied objectively and consistently; and without
objectivity and consistency, moderation can easily degenerate into a situation where one group of
people forces their opinions on everyone else, whether they like it or not.
In a service that is used around the world, there will be cultural differences on what is considered
acceptable or not. Maybe one culture is sensitive about nudity and tolerant of depictions of
violence, while another culture is liberal about nudity and sensitive about violence. One person’s
terrorist is another person’s freedom fighter. There is no single, globally agreed standard of what
is or is not considered acceptable.
Nevertheless, it is possible to come to agreement. For example, Wikipedia editors successfully
manage to agree on what should and should not be included in Wikipedia articles, even those on
contentious subjects. I won’t say that this process is perfect: Wikipedia editors are predominantly
white, male, and from the Anglo-American cultural sphere, so there is bound to be bias in their
editorial decisions. I haven’t participated in this community, but I assume the process of coming to
agreement is sometimes messy and will not make everybody happy.
Moreover, being an encyclopaedia, Wikipedia is focussed on widely accepted facts backed by evidence.
Attempting to moderate social media in the same way as Wikipedia would make it joyless, with no room
for satire, comedy, experimental art, or many of the other things that make it interesting and
humane. Nevertheless, Wikipedia is an interesting example of decentralised content moderation that
is not controlled by a private entity.
Another example is federated social networks such as Mastodon or Diaspora. Here, each individual
server administrator has the authority to
set the rules for the users of their server, but
they have no control over activity on other servers (other than to block another server entirely).
Despite the decentralised architecture, there is a
trend towards centralisation (10% of Mastodon instances
account for almost half the users), leaving a lot of power in the hands of a small number of server
administrators. If these social networks become more mainstream, I expect these effects to be
amplified.
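To make the federated model a little more concrete, here is a minimal sketch of what an
instance-level policy might look like. It is purely illustrative: the names are my own invention,
and it does not reflect Mastodon's or Diaspora's actual implementation.

```python
# A minimal, hypothetical sketch of instance-level moderation in a federated network.
# Class and method names are illustrative, not Mastodon's or Diaspora's actual code.
from dataclasses import dataclass, field

@dataclass
class InstancePolicy:
    """Moderation policy set by one server's administrator."""
    blocked_domains: set = field(default_factory=set)   # servers blocked entirely (defederated)
    silenced_domains: set = field(default_factory=set)  # accepted, but hidden from public timelines

    def accepts(self, author: str) -> bool:
        """Whether this server accepts an incoming post from the given account at all."""
        domain = author.split("@")[-1]
        return domain not in self.blocked_domains

    def shows_publicly(self, author: str) -> bool:
        """Whether the post may appear on this server's public timelines."""
        domain = author.split("@")[-1]
        return self.accepts(author) and domain not in self.silenced_domains

# Each administrator controls only their own server's policy.
policy = InstancePolicy(blocked_domains={"spam.example"}, silenced_domains={"edgy.example"})
print(policy.accepts("alice@spam.example"))       # False: that server is blocked entirely
print(policy.shows_publicly("bob@edgy.example"))  # False: accepted, but not shown publicly
```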
Filter bubbles
One form of social media is private chat for small groups, as provided e.g. by WhatsApp, Signal, or
even email. Here, when you post a message to a group, the only people who can see it are members of
that group. In this setting, not much content moderation is needed: group members can kick out other
members if they say things considered unacceptable. If one group says things that another group
considers objectionable, that’s no problem, because the two groups can’t see each other’s
conversations anyway. If one user is harassing another, the victim can block the harasser. Thus,
private groups are comparatively easy to deal with.
The situation is harder with social media that is public (anyone can read) and open (anyone can join
a conversation), or when the groups are very large. Twitter is an example of this model (and
Facebook to some degree, depending on your privacy settings). When anybody can write a message that
you will see (e.g. a reply to something you posted publicly), the door is opened to harassment and
abuse.
One response might be to retreat into our filter bubbles. For example, we could say that you see
only messages posted by your immediate friends and friends-of-friends. I am pretty sure that there
are no neo-nazis among my direct friends, and probably none in my second-degree network either, so such
a rule would shield me from extremist content of one sort, at least.
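As a rough illustration, a friends-and-friends-of-friends rule might be implemented along the
following lines (the follow graph and post representation here are hypothetical, not any particular
platform's data model):

```python
# Hypothetical sketch of a "friends and friends-of-friends only" feed filter.
# The social graph and post representation here are made up for illustration.
def second_degree_network(me, follows):
    """Accounts I follow, plus the accounts they follow."""
    direct = follows.get(me, set())
    indirect = set().union(*(follows.get(friend, set()) for friend in direct))
    return direct | indirect

def filtered_feed(me, posts, follows):
    """Keep only posts whose author is within two hops of me in the follow graph."""
    allowed = second_degree_network(me, follows)
    return [(author, text) for author, text in posts if author in allowed]

follows = {"me": {"alice"}, "alice": {"bob"}}
posts = [("alice", "hi"), ("bob", "hello"), ("stranger", "spam")]
print(filtered_feed("me", posts, follows))  # [('alice', 'hi'), ('bob', 'hello')]
```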
It is also possible for users to collaborate on creating filters. For example,
ggautoblocker was a tool to block abusive Twitter
accounts during GamerGate, a 2014
misogynistic harassment campaign that
foreshadowed
the rise of the alt-right and Trumpism. In the absence of central moderation by Twitter, victims of
this harassment could use this tool to automatically block a large number of harmful users so that
they wouldn’t have to see the abusive messages.
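In spirit, such a tool boils down to subscribing to blocklists curated by others and merging them
with your own blocks. The sketch below is hypothetical and is not ggautoblocker's actual design:

```python
# Hypothetical sketch of collaborative blocking: subscribe to blocklists maintained by
# others and merge them with your own. Not ggautoblocker's actual design.
class BlocklistSubscriber:
    def __init__(self):
        self.subscriptions = []        # community-maintained blocklists I have opted into
        self.personal_blocks = set()   # accounts I have blocked myself

    def subscribe(self, blocklist):
        self.subscriptions.append(set(blocklist))

    def is_blocked(self, account):
        """Blocked if I blocked the account myself, or any subscribed list contains it."""
        return account in self.personal_blocks or any(account in bl for bl in self.subscriptions)

me = BlocklistSubscriber()
me.subscribe({"harasser1", "harasser2"})   # import a community-maintained list
me.personal_blocks.add("troll42")
print(me.is_blocked("harasser1"), me.is_blocked("friendly_user"))  # True False
```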
Of course, even though such filtering saves you from having to see things you don’t like, it doesn’t
stop the objectionable content from existing. Moreover, other people may have the opposite sort of
filter bubble in which they see lots of extremist content, causing them to become radicalised.
Personalised filters also stop us from seeing alternative (valid) opinions that would help broaden
our worldview and enable better mutual understanding of different groups in society.
Thus, subjective filtering of who sees what, such as blocking users, is an important part of
reducing harm on social media, but by itself it is not sufficient. It is also necessary to uphold
minimum standards on what can be posted at all, for example by requiring a baseline of civility and
truthfulness.
Democratic content moderation
I previously argued that there is no universally agreed standard of acceptability of content; and
yet, we must somehow keep the standard of discourse high enough that it does not become intolerable
for those involved, and to minimise the harms e.g. from harassment, radicalisation, and incitement
of violence. How do we solve this contradiction? Leaving the power in the hands of a small number of
tech company CEOs, or any other small and unelected group of people, does not seem like a good
long-term solution.
A purely technical solution does not exist either, since code cannot make value judgements about
what sort of behaviour is acceptable. It seems like some kind of democratic process is the only
viable long-term solution here, perhaps supported by some technological mechanisms, such as
AI/machine learning to flag potentially abusive material. But what might this democratic process
look like?
Moderation should not be so heavy-handed that it drowns out legitimate disagreement. Disagreement
need not always be polite; indeed,
tone policing should not be
a means of silencing legitimate complaints. On the other hand, aggressive criticism may quickly flip
into the realm of harassment, and it may be unclear when exactly this line has been crossed.
Sometimes it may be appropriate to take into account the power relationships between the people
involved, and hold the privileged and powerful to a higher standard than the oppressed and
disadvantaged, since otherwise the system may end up reinforcing existing imbalances. But there are
no hard and fast rules here, and much depends on the context and background of the people involved.
This example indicates that the moderation process needs to embed ethical principles and values. One
way of doing this would be to have a board of moderation overseers that is elected by the user base.
In their manifestos, candidates for this board can articulate the principles and values that they
will bring to the job. Different candidates may choose to represent people with different world
views, such as conservatives and liberals. Having a diverse set of opinions and cultures represented
on such a board would both legitimise its authority and improve the quality of its decision-making.
In time, maybe even parties and factions may emerge, which I would regard as a democratic success.
Facebook employs
around 15,000 content moderators, and
by all accounts it’s
a horrible job.
Who would want to do it? On the other hand, 15,000 is a tiny number compared to Facebook’s user
count. Rather than concentrating all the content moderation work on a comparatively small number of
moderators, maybe every user should have to do a stint of moderation duty from time to time as
a condition of using the service? Precedents for this sort of thing exist: in a number of
countries, individuals may be called to jury duty to help decide criminal cases; and researchers are
regularly asked to review articles written by their peers. These things are not great fun either,
but we do them for the sake of the civic system that we all benefit from.
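Mechanically, the assignment could be as simple as randomly sampling a small panel of users for each
flagged item, as in the hypothetical sketch below; a real system would also need to handle recusal,
workload limits, and reviewer wellbeing.

```python
# Hypothetical sketch of jury-duty-style moderation: randomly assign a small panel of
# users to review each flagged item.
import random

def assign_moderation_duty(users, flagged_items, reviewers_per_item=3, seed=None):
    """Return a mapping from each flagged item to a randomly chosen panel of reviewers."""
    rng = random.Random(seed)
    return {item: rng.sample(users, k=min(reviewers_per_item, len(users)))
            for item in flagged_items}

users = [f"user{i}" for i in range(1000)]
print(assign_moderation_duty(users, ["post-123", "post-456"], seed=1))
```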
Moderators with differing political views may disagree on whether a certain piece of content is
acceptable or not. In cases of such disagreement, additional people can be brought in, hopefully
allowing the question to be settled through debate. If no agreement can be found, the matter can be
escalated to the elected board, which has the final say and which uses the experience to set
guidelines for future moderation.
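This escalation flow might look roughly like the following sketch; the verdict labels and
thresholds are made up, and a real process would of course involve human debate rather than
a vote count alone.

```python
# Hypothetical sketch of the escalation flow: initial reviewers vote; disagreement brings
# in more reviewers; a persistent split goes to the elected board. Verdict labels are made up.
from collections import Counter

def decide(votes, escalation_votes=None):
    """Return a verdict ('remove'/'keep'), 'needs_more_reviewers', or 'escalate_to_board'."""
    tally = Counter(votes)
    if len(tally) == 1:                   # the initial panel agrees
        return tally.most_common(1)[0][0]
    if escalation_votes is None:          # disagreement: bring in additional reviewers
        return "needs_more_reviewers"
    combined = Counter(votes + escalation_votes)
    top_two = combined.most_common(2)
    if top_two[0][1] > top_two[1][1]:
        return top_two[0][0]              # a clear majority emerged through debate
    return "escalate_to_board"            # still split: the elected board has the final say

print(decide(["remove", "remove"]))                    # remove
print(decide(["remove", "keep"]))                      # needs_more_reviewers
print(decide(["remove", "keep"], ["keep", "keep"]))    # keep
print(decide(["remove", "keep"], ["remove", "keep"]))  # escalate_to_board
```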
Implications for decentralised technologies
In decentralised social media, I believe that ultimately it should be the users themselves who
decide what is acceptable or not. This governance will have to take place through some human process
of debate and deliberation, although technical tools and some degree of automation may be able to
support the process and make it more efficient. Rather than simplistic censorship resistance, or
giving administrators dictatorial powers, we should work towards ethical principles, democratic
control, and accountability.
I realise that my proposals are probably naive and smack of “computer scientist finally discovers
why the humanities are important”. Therefore, if you know of any work that is relevant to this topic
and can help technological systems learn from centuries of experience in democracy in the civil
society, please send it to me — I am keen to learn more. Moreover, if there is existing work in the
decentralised web community on enabling this kind of grassroots democracy, I would love to hear
about it too.
You can find me on Twitter @martinkl, or contact me by email
(firstname at lastname dot com). I will update this post with interesting things that are sent to
me.
Here are some related projects that have been pointed out to me since this post was published. I
have not vetted them, so don’t take this as an endorsement.
- The Facebook/Instagram Oversight Board is quite close to what
I have in mind, and it has upheld
the suspension of Trump’s account.
- The recently launched
MIT Center for Constructive Communication
is an ambitious effort in this area.
- “The Decentralized Web of Hate”
is a detailed report by Emmi Bevensee on use of decentralised
technologies by extremists.
- Amy X. Zhang and her collaborators have
done a lot of research on moderation.
- Evelyn Douek argues that it’s not sufficient to
view content moderation as lots of individual decisions on individual pieces of content, but that accountability
requires a new form of institution that provides a dynamic, continuous governance structure.
- Jay Graber recently published a comprehensive
report comparing decentralised social protocols, and a
blog post
on decentralised content moderation.
- Wes Chow has written a
thoughtful and nuanced article
on decentralised content moderation, with lots of references to further reading at the end.
- A few people
mentioned Slashdot, Reddit, and Stack Overflow
as successful examples of community-run moderation.
- On the other hand, J. Nathan Matias is skeptical
that volunteers will be able to handle the challenges of content moderation at scale, since Facebook reportedly
spends $500m a year on it.
- Trustnet is a way of computing numerical scores for
the degree of trust in individual users, based on the social graph.
- Matrix, a federated messaging system, is
working on a
decentralised, subjective reputation system.
- Freenet has a web-of-trust-based, decentralised
user reputation system
(see also this Bachelor’s thesis).
- Waivlength is exploring a governance approach inspired by jury duty.
- Freechains is a peer-to-peer content distribution
protocol with an embedded user reputation system.
- Songbird is a sketch of a
decentralised moderation system for IPFS.
- Cabal allows users to
subscribe to other users’ moderation
actions, such as blocking and hiding posts.
- An app called Fantastic
is exploring mechanisms for moderation.
- Felix Dietze’s 2015 master’s thesis
explores community-run moderation. He is also working on
ranking
algorithms
for news aggregators.
- Twitter is trialling Birdwatch,
a crowdsourced effort to tackle misinformation.
- Coinbase’s approach
is to ban only content that is illegal in jurisdictions where they operate, or content that is
not considered protected speech
under the U.S. First Amendment.