The following is an excerpt from Media Diversity Institute and Textgain's chapter for the Courage Against Hate report. The full report is available to read and download.
Using AI and Advocacy-driven Counternarratives to Mitigate Online Hate
The Courage Against Hate initiative was brought together by Facebook to spark cross-sector, pan-European dialogue and action to combat hate speech and extremism. This collection of articles unites European academic analysis with contributions from practitioners who are actively working to counter extremism within civil society.
Detect Then Act: AI-Assisted Activism Against Online Hate
In recent years, Machine Learning (ML), and more specifically Natural Language Processing (NLP), has advanced to the point where it rivals humans at tasks such as predicting state of mind, age or gender from (anonymous) text. Since a common feature of online hate speech is pejorative language (e.g., clown, thug, scum), it should in theory be possible to isolate it automatically: train a tireless AI to detect pejorative language on social media platforms, remove those messages, and be done with it. In practice, the problem is more complex. AI is no doubt an integral part of the solution for managing hundreds of thousands of new messages per day, but careful consideration must be given to human rights and freedom of expression, as Facebook CEO Mark Zuckerberg has also recognized.
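To make the naive approach concrete, here is a minimal sketch of pure lexicon matching, using the pejoratives cited above (the term list and function names are ours, purely for illustration). Its blind spots preview exactly the complications discussed below:

```python
import re

# Illustrative only: a tiny lexicon built from the pejoratives cited above.
# Real detectors are ML classifiers trained on annotated corpora, not fixed lists.
PEJORATIVES = {"clown", "thug", "scum"}

def flag_pejoratives(message):
    """Return the pejorative terms found in a message (case-insensitive)."""
    tokens = set(re.findall(r"[a-z']+", message.lower()))
    return tokens & PEJORATIVES

for msg in ("Those people are scum.", "He juggled like a circus clown."):
    print(msg, "->", flag_pejoratives(msg) or "no match")
# The second message shows why naive matching over-flags: 'clown' is only
# pejorative in some contexts, which is where trained models earn their keep.
```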
First, removing offensive messages does not remove the underlying drivers. If anything, users who see their content blocked will likely only become more disgruntled. Most stakeholders might agree that nobody really minds if violent extremists are disgruntled when their inflammatory propaganda is removed, but not all extremist content is violent, and not all offensive content is extremist. In between lies a large grey area of content, set in a minefield of government regulation, societal norms and tech company policy. To illustrate: discriminatory online hate speech is illegal in many EU member states (cf. Germany's NetzDG), while in the US freedom of expression is protected by the First Amendment and Brandenburg v. Ohio, and tech companies have to navigate multiple jurisdictions worldwide.
Second, automatic techniques sometimes make mistakes or, worse, perpetuate human prejudices, with the risk of over-blocking (removing innocuous content) and under-blocking (ignoring undesirable content). Unlike human moderators, today's ML algorithms were not designed to account for their mistakes. This challenge is also highlighted in a recent publication in Nature Machine Intelligence (Rudin, 2019), which advocates simpler, more interpretable techniques for high-stakes decision making. If anything, one can argue that AI with societal impact should always have human supervisors in the loop. The aim of technology in a moderation setup is then not to replace humans, but to support their decision making by taking over the most repetitive tasks.
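Such a human-in-the-loop triage can be sketched in a few lines. The Message structure and the 0.5 threshold below are illustrative assumptions rather than any platform's actual pipeline; the point is that the model only filters and ranks, while a person makes every removal decision:

```python
from dataclasses import dataclass

@dataclass
class Message:
    text: str
    score: float  # classifier's estimated probability that the text is hateful

# Hypothetical threshold, to be tuned against measured over-/under-blocking rates.
REVIEW_THRESHOLD = 0.5

def triage(messages):
    """Queue flagged messages for human review instead of removing them.

    The model only prioritises; every removal decision is left to a human
    moderator, keeping a supervisor in the loop.
    """
    queue = [m for m in messages if m.score >= REVIEW_THRESHOLD]
    return sorted(queue, key=lambda m: m.score, reverse=True)

inbox = [Message("hello world", 0.02), Message("you absolute scum", 0.97)]
for m in triage(inbox):
    print(f"review ({m.score:.2f}): {m.text}")
```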
Social media platforms came with a vision of sharing knowledge globally: equal access to information, with equal voices. Upholding that vision is a shared responsibility of all members of society. Perhaps because we are still in our infancy as citizens of a global virtual community, some of our discussions still look like playground bullying, yet open debate remains democracy's best immune system.

This kind of self-regulatory aspiration underpins our Detect Then Act project, a collaboration between Textgain, the Media Diversity Institute, the University of Hildesheim (computer science, political science), the University of Antwerp (communication science) and the St Lucas School of Arts in Antwerp, supported by the European Commission's Rights, Equality and Citizenship programme under call REC-RRAC-ONLINE-AG-2018 (850511). Our aim is to counter online hate speech by encouraging bystanders to become upstanders. While online trolls test the boundaries and circumvent platform terms of service by framing their us-vs-them narratives as 'funny memes', the middle ground (Brandsma, 2017) largely stays silent in dismay. The project encourages volunteers to stand up to online hate and bullying by training them in digital resilience and relevant regulations, by providing AI-powered dashboards and ready-made counternarratives, and by protecting their privacy when they react.

The project complies with the EU's General Data Protection Regulation (GDPR). A private dashboard presents upstanders with a snapshot of today's and yesterday's most hateful messages on social media. These messages are collected through the platforms' official APIs, but no content is stored in a database: after two days, any messages that the AI has spotted are forgotten again for good. The identity of the authors of such messages is never revealed, and when upstanders decide to react, neither is theirs; in academia this is also called a double-blind approach. Messages that attract a lot of buzz are shown with a computer-generated photo and pseudonym, to make them stand out.
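A minimal sketch of how such a retention-and-pseudonymity policy might look in code follows. The hash-based alias scheme and the in-memory store are deliberate simplifications for illustration, not the project's production implementation:

```python
import hashlib
import time

RETENTION_SECONDS = 2 * 24 * 3600  # the project's two-day window
_store = {}                        # pseudonym -> (timestamp, message text)

def pseudonym(author_id):
    """Derive a stable display alias; the real handle is never stored.

    (A production system would add a secret salt or key so the hash
    cannot be brute-forced back to the original handle.)
    """
    return "user-" + hashlib.sha256(author_id.encode()).hexdigest()[:8]

def ingest(author_id, text):
    """Record a message under its author's alias, never the real identity."""
    _store[pseudonym(author_id)] = (time.time(), text)

def purge_expired(now=None):
    """Drop anything older than the retention window: 'forgotten for good'."""
    cutoff = (now or time.time()) - RETENTION_SECONDS
    for key in [k for k, (ts, _) in _store.items() if ts < cutoff]:
        del _store[key]
```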
Lessons Learned from Get The Trolls Out (GTTO): Creating Successful Counternarratives
Since 2015, GTTO has aimed to leverage social media to engage in dialogue around diverse forms of hate, including antisemitism, Islamophobia, anti-Christian sentiment, and related attempts to turn public opinion against migrants and asylum seekers. GTTO's main audience is young people, so its counternarratives have been tailored specifically to this group, which highlights an important point: the effectiveness of counternarratives depends heavily on demographics. Who are we targeting, and what is the best way to reach them?
For example, millennials (25-35 year olds) are an ideal audience for educational narratives: they engage well with all types of informative media, such as explainer videos, podcasts, infographics and fact memes, and they constitute the majority of GTTO followers and sharers. It has been more difficult to engage younger audiences, 16-25 year olds in particular, for a variety of reasons. First, the choice of platform is key: youngsters today are more active on Instagram and TikTok than on Facebook and Twitter, where we communicate with video clips and cartoons. Second, the choice of content creators matters. Within GTTO, content is designed by millennials. We have become mindful of this, and uptake will likely increase if we work directly with younger content creators.
One engagement technique used within GTTO is empowering young people to be producers instead of consumers. Two years ago, coinciding with the season finale of Game of Thrones, we launched a campaign called Game of Trolls to help young people tackle online hate in an instructive, actionable way. Deciding to fight fire with fire, we recruited trolls to join our ranks and trained them in 'positive trolling' through a series of hands-on tips on Facebook and Instagram. Our so-called Army of Good Trolls then responded to calls for help submitted via the hashtag #TrollWithLove. The campaign reached close to 1 million people on Facebook with the help of Facebook Ad Credits, which were critical to its success. This shows how synergy between CSOs and social media platforms can produce powerful, broadly visible counternarratives. However, it also highlights one of the challenges CSOs face in mitigating online hate: without the platform's support we could not have afforded the campaign. Creating effective counternarratives is not only about good ideas and demographics, but also about the (financial) resources and tools to put them into practice.

There are many other reasons why counternarratives might succeed or fail. In GTTO, we constantly review and adapt our strategy, not only because we want to improve as first-line practitioners, but also because the effectiveness of techniques changes over time. As target audiences widen and their interests expand, so too must the content we offer them and the ways we deliver it. Below are some practice-based insights that we have learned throughout this process:
Casual: To make counternarratives appealing, it is often important to strike the right tone. We avoid sounding like an NGO, which may come across as part of an establishment that teens push back against, creating distance. We try not to nag or preach, and instead look to carry people with us.
Fresh: To keep people engaged, we keep our content fresh by appropriating new trends and pop culture. For example, we used the movie release of Fantastic Beasts and Where to Find Them to launch a Fantastic Trolls and How to Fight Them guide: a 'bestiary' of the different types of online trolls and how people should (or shouldn't) engage with them.
Stimulating: It is one thing to share facts and figures, but to create effective engagement there needs to be a clear call to action for those who are consuming the content. How can they help, and why should they?
Multimodal: It is vital to use different forms of media throughout a campaign. Within GTTO, we use a mix of videos, images, infographics, memes, cartoons and articles, keeping in mind that walls of text will be scrolled past faster than visual content.
Persistent: We constantly adapt with new tones, new trends, new modes, new content. In a way, campaigning is a war of attrition: the content needs to keep flowing, and memes that don't work once may work in the future if the global landscape changes to make them more relevant.
Pragmatic: While it is splendid to get support from sponsors and social media platforms, to keep day-to-day work going it can be useful to rely on free apps for content generation. There is no formula or price tag for creating a viral meme; cheap and cheerful can also work.
Practical: In GTTO we continuously learn from others, for example from Vox in the case of explainer videos. There is no need to reinvent the wheel: when time and resources are scarce, best practices from other initiatives can often be boiled down to basic yet effective output.
What Will Always Remain Challenging, and Why
Initiatives like Detect Then Act and Get The Trolls Out continually adapt to evolving audiences, behavior and technology. The greatest hurdle is the ever-changing tide of hate: each day, new hateful memes, hashtags, images and videos emerge from the web's underbelly. Finding, let alone reacting to, everything is no longer possible, and ill-conceived reactions may even exacerbate the problem. Practitioners seeking to counter hate must be selective, focusing on what has the most impact. AI can help, but it is not without pitfalls. Quantitative and qualitative approaches should work side by side, and we need to close the gap between developers who might not fully grasp the problem and practitioners who might not fully grasp the technology. Algorithms can estimate how influential keywords are, and whether they are going to 'explode' at some point in the future, but arguably only human experts should operationalize that data (see the sketch below). Academic groundwork can be an advantage too, and tech companies should perhaps be less afraid to adopt open-source tools, strengthening their accountability towards society. Finally, mitigating hate is a responsibility shared by all members of society, demanding closer collaboration between law enforcement bodies, civil society actors, users, and tech companies.
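To illustrate the kind of signal such algorithms can produce, here is a minimal burst-score sketch. The two-window design and add-one smoothing are illustrative assumptions, not a description of a specific production system; note that the code only ranks keywords, leaving interpretation and action to human experts:

```python
from collections import Counter

def burst_score(keyword, recent, baseline):
    """Ratio of a keyword's recent frequency to its historical baseline.

    Scores well above 1.0 suggest the keyword may be about to 'explode';
    add-one smoothing avoids division by zero for previously unseen terms.
    """
    recent_rate = (recent[keyword] + 1) / (sum(recent.values()) + 1)
    base_rate = (baseline[keyword] + 1) / (sum(baseline.values()) + 1)
    return recent_rate / base_rate

# Toy frequency counts over two time windows, for illustration only.
baseline = Counter({"meme": 40, "news": 60})
recent = Counter({"meme": 10, "news": 12, "newslang": 30})
ranked = sorted(baseline.keys() | recent.keys(),
                key=lambda k: burst_score(k, recent, baseline), reverse=True)
print(ranked)  # a human analyst, not the algorithm, decides what to do with this
```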