One of the most fascinating examples of social innovation I’ve been tracking recently was the We Counter Hate platform, created by Seattle-based agency POSSIBLE (now part of Wunderman Thompson Seattle), which sought to reduce hate speech on Twitter by turning retweets of hateful messages into donations for a good cause.

Here’s how it worked: Using machine learning, the platform first identified hateful speech on Twitter. A human moderator then selected the most offensive and most dangerous tweets and attached an undeletable reply, which informed recipients that if they retweeted the message, a donation would be committed to an anti-hate group. In a beautiful twist, that group was Life After Hate, a non-profit that helps members of extremist groups leave and transition to mainstream life.
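For readers who think in code, the three steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the team's actual system: the word-matching `hate_score` function is a toy stand-in for their machine-learning model, and the threshold, the donation amount per retweet, and every name in the block are hypothetical, since none of them appear in the interview.

```python
from dataclasses import dataclass

# Hypothetical values: the article gives neither a threshold nor a donation amount.
FLAG_THRESHOLD = 0.5
DONATION_PER_RETWEET = 1.00


@dataclass
class Tweet:
    tweet_id: int
    text: str
    countered: bool = False
    retweets_after_counter: int = 0


def hate_score(text: str, flagged_terms: set[str]) -> float:
    """Toy stand-in for the ML classifier: share of words that are flagged terms."""
    words = text.lower().split()
    if not words:
        return 0.0
    return sum(w in flagged_terms for w in words) / len(words)


def candidates(tweets: list[Tweet], flagged_terms: set[str]) -> list[Tweet]:
    """Step 1: automated detection proposes tweets for human review."""
    return [t for t in tweets if hate_score(t.text, flagged_terms) >= FLAG_THRESHOLD]


def counter(tweet: Tweet, moderator_approves) -> bool:
    """Step 2: the counter-reply is attached only if a human moderator approves."""
    if moderator_approves(tweet):
        tweet.countered = True
    return tweet.countered


def pledged_donation(tweet: Tweet) -> float:
    """Step 3: every retweet of a countered tweet commits a donation."""
    if not tweet.countered:
        return 0.0
    return tweet.retweets_after_counter * DONATION_PER_RETWEET


# Usage with placeholder terms instead of real hate speech.
tweets = [
    Tweet(1, "placeholder_a placeholder_b"),
    Tweet(2, "nice weather in seattle today"),
]
flagged = candidates(tweets, {"placeholder_a", "placeholder_b"})
counter(flagged[0], lambda t: True)  # the moderator approves the counter
flagged[0].retweets_after_counter = 3
print(pledged_donation(flagged[0]))
```

The point of the sketch is the ordering: the software only nominates tweets, a person makes the final call, and the donation is tied to retweets that happen after the reply is attached.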

Unfortunately (and ironically), on the very day I reached out to the team, Twitter decided to allow users to hide replies in their feeds in an effort to empower people faced with bullying and harassment. That change eliminated the reply function, which was the main mechanism that gave #WeCounterHate its power and enabled it to remove more than 20M potential hate speech impressions.

Undeterred, I caught up with some members of the core team–Shawn Herron, Jason Carmel and Matt Gilmore–to find out more about their journey.

Afdhel Aziz: Gentlemen, welcome. How did the idea for WeCounterHate come about?

Shawn Herron: It started when we caught wind of what the citizens of the town of Wunsiedel, Germany were doing to combat the extremists who descended on their town every year to hold a rally and march through the streets. The townspeople had devised a peaceful way to upend the extremists’ efforts by turning their hateful march into an involuntary walk-a-thon that benefited EXIT Deutschland, an organization that helps people escape extremist groups. For every meter the neo-Nazis marched, 10 euros would be donated to EXIT Deutschland. The question became, how can we scale something like that so anyone, anywhere, could have the ability to fight against hate in a meaningful way?

Jason Carmel: We knew that, to create scale, it had to be digital in nature, and Twitter seemed like the perfect problem in need of a solution. We figured if we could reduce hate on a platform of that magnitude, even by a small percentage, it could have a big impact. We started by developing an innovative machine-learning and natural-language-processing technology that could identify and classify hate speech.

Matt Gilmore: But we still needed the mechanic, a catch-22, that would present those looking to spread hate on the platform with a no-win decision to make. That’s when we stumbled onto the fact that Twitter didn’t allow people to delete comments on their tweets. The only way to remove a comment was to delete the post entirely. That mechanic is what gave us a way to put a permanent marker, in the form of an image and message, on tweets containing hate speech. It’s that permanent marker that let those looking to retweet, and spread hate, know that doing so would benefit an organization they’re opposed to, Life After Hate. No matter what they chose to do, love wins.

Aziz: Fascinating. So, what led you to the partnership with Life After Hate and how did that work?

Carmel: Staffed and founded by former hate group members and violent extremists, Life After Hate is a non-profit that helps people in extremist groups break from that hate-filled lifestyle. They offer a welcoming way out that’s free of judgement. We collaborated with them in training the AI that’s used to identify hate speech in near real time on Twitter. With the benefit of their knowledge, our AI can even find hidden forms of hate speech (coded language, secret emoji combinations) in a vast sea of tweets. Their expertise was also crucial in aligning the language we used when countering hate, making it more compassionate and matter-of-fact, rather than confrontational.

Herron: Additionally, their partnership just made perfect sense on a conceptual level as the beneficiary of the effort. If you’re one of those people looking to spread hate on Twitter, you’re much less likely to hit retweet knowing that you’ll be benefiting an organization you’re opposed to.

Aziz: Was it hard to wade through that much hate speech? What surprised you?

Herron: Being exposed to all the hate-filled tweets was easily the most difficult part of the whole thing. The human brain is not wired to read and see the kinds of messages we encountered for long periods of time. At the end of the countering process, after the AI identified hate, we always relied on a human moderator to validate it before countering/tagging it. We broke up the shifts between many volunteers, but it was always quite difficult when it was your shift.

Carmel: We learned that the identification of hate speech was much easier than categorizing it. Our initial understanding of hate speech, especially before Life After Hate helped us, was really just the “movie version” of hate speech and missed a lot of hidden context. We were also surprised at how much the language would evolve relative to current events. It was definitely something we had to stay on top of.

We were surprised by how broad a spectrum of people the hate was coming from. We went in thinking we’d just encounter a bunch of thugs, but many of these people held themselves out as academics, comedians, or historians. The brands of hate some of them shared were nuanced and, in an insidious way, very compelling.    

We were caught off guard by the amount of time and effort those who disliked our platform would take to slam or discredit it. A lot of these people are quite savvy and would go to great lengths to attempt to undermine our efforts. Outside of the things we dealt with on Twitter, one YouTube “hate-fluencer” made a video, close to an hour long, that wove all sorts of intricate theories and conspiracies about our platform.

Gilmore: We were also surprised by how wrong our instincts were. When we first started, the things we were seeing made us angry and frustrated. We wanted to come after these hateful people in an aggressive way. We wanted to fight back. Life After Hate was essential in helping course-correct our tone and message. They helped us understand (and we’d like more people to know) the power of empathy combined with education, and its ability to remove walls rather than build them between people. It can be difficult to take this approach, but it ultimately gets everyone to a better place.

Aziz: I love that idea – empathy with education. What were the results of the work you’ve done so far? How did you measure success?

Carmel: The WeCounterHate platform radically outperformed expectations in identifying hate speech (91% success relative to a human moderator), as we continued to improve the model over the course of the project.

When @WeCounterHate replied to a tweet containing hate, it reduced the spread of that hate by an average of 54%. Furthermore, 19% of the “hatefluencers” deleted their original tweet outright once it had been countered.

By our estimates, the Hate Tweets we countered were shared roughly 20 million fewer times compared to similar Hate Tweets by the same authors that weren’t countered.

Gilmore: It was a pretty mind-bending exercise for people working in an ad agency, who have spent our entire careers trying to gain exposure for the work we do on behalf of clients, to suddenly be trying to reduce impressions. We even began referring to WCH as the world’s first reverse-media plan, designed to reduce impressions by stopping retweets.

Aziz: So now that the project has ended, how do you hope to take this idea forward in an open source way?

Herron: Our hope was to counter hate speech online, while collecting insightful data about how hate speech online propagates. Going forward, hopefully this data will allow experts in the field to address the hate speech problem at a more systemic level. Our goal is to publicly open-source the archived data that has been gathered, hopefully next quarter (Q1 2020).

I love this idea on so many different levels. The ingenuity of finding a way to counteract hate speech without resorting to censorship. The partnership with Life After Hate to improve the sophistication of the detection. And the potential for this same model to be applied to so many different problems in the world (anyone want to build a version for climate change deniers?). It proves that the creativity of the advertising world can truly be turned into a force for good, and for that I salute the team at Possible for showing what’s, well, possible.