Linguistic data analysis of 3 billion Reddit comments shows the alt-right is getting stronger

Originally published: Quartz on August 18, 2017 by Tim Squirrell (Posted Aug 19, 2017)

You probably have a good idea of who the so-called “alt-right” are: a group of white supremacists and nationalists, bound up by a fiery loathing of “political correctness,” “cultural Marxism,” and those pesky “social-justice warriors.” You might have also seen the articles that tell us to stop using that term and call them out for the fascists and neo-Nazis they are. In the wake of the “Unite the Right” protests in Charlottesville last weekend, these calls have only become more urgent. The phrase has become a catch-all for people like Richard Spencer, the head of the white supremacist National Policy Institute, and Milo Yiannopoulos, the online troll and provocateur who recently fell from mainstream conservative grace. But there are a lot more people it catches in its (inter)net.

The alt-right isn’t one group. They don’t have one coherent identity. Rather, they’re a loose collection of people from disparate backgrounds who would never normally interact: bored teenagers, gamers, men’s rights activists, conspiracy theorists and, yes, white nationalists and neo-Nazis. But thanks to the internet, they’re beginning to form a cohesive group identity. And I have the data to prove it.

The_Donald is a Reddit community with over 450,000 subscribers. It’s the breeding ground for the alt-right, and the fermenting vat in which this identity is being formed. According to data analysis by FiveThirtyEight, it’s U.S. president Donald Trump’s “most rabid online following,” and Reddit itself now claims to be the fourth most visited site in the U.S., behind only Facebook, Google, and YouTube.

As part of the Alt-Right Open Intelligence Initiative at the University of Amsterdam, I’ve been working to understand the language of the alt-right and what it can tell us about its members. Working with the UK Home Office’s Extremism Analysis Unit, I used Google’s BigQuery tool, which lets you trawl through massive datasets in seconds, to interrogate a collection of every Reddit comment ever made—all 3 billion of them.

Focusing on The_Donald, I used a script that shows which words are most likely to occur in the same comment. Combining this with a tool that measures the overlap in commenters between different parts of Reddit, I found that the alt-right isn’t just one voice: It’s made up of distinct constituencies that hold different opinions and express them in different ways, identifiable by the language they use and the other communities they post in.
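
The two measurements described above can be illustrated with a minimal Python sketch. This is not the script or tool used for the analysis, whose details aren't given here; the toy comments, the user names, and the function names `cooccurrence` and `commenter_overlap` are all hypothetical, and the use of Jaccard similarity as the overlap measure is an assumption.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical toy data: (author, subreddit, comment body). Not real Reddit data.
COMMENTS = [
    ("user_a", "The_Donald", "kek the globalist elites hate pepe"),
    ("user_a", "KotakuInAction", "sjw pandering again"),
    ("user_b", "The_Donald", "globalist elites and their puppets"),
    ("user_b", "conspiracy", "the elites are puppets"),
    ("user_c", "The_Donald", "kek pepe maga"),
    ("user_c", "KotakuInAction", "more sjw pandering"),
]


def tokenize(text):
    """Return the set of distinct lowercase words in one comment."""
    return set(re.findall(r"[a-z']+", text.lower()))


def cooccurrence(comments, subreddit):
    """Count how many comments in `subreddit` contain each pair of words."""
    pairs = Counter()
    for _author, sub, body in comments:
        if sub != subreddit:
            continue
        # Sorting makes each pair appear in one canonical order.
        for pair in combinations(sorted(tokenize(body)), 2):
            pairs[pair] += 1
    return pairs


def commenter_overlap(comments, sub_a, sub_b):
    """Jaccard similarity of the commenter sets of two subreddits (assumed measure)."""
    authors_a = {author for author, sub, _ in comments if sub == sub_a}
    authors_b = {author for author, sub, _ in comments if sub == sub_b}
    union = authors_a | authors_b
    return len(authors_a & authors_b) / len(union) if union else 0.0
```

On this toy data, `cooccurrence(COMMENTS, "The_Donald")` counts the pair `("elites", "globalist")` in two comments, and `commenter_overlap(COMMENTS, "The_Donald", "KotakuInAction")` comes out at two thirds, because two of the three commenters post in both places. A real analysis would also need to filter out common words such as "the" before the pairs mean anything.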

In other words, there’s a taxonomy of trolls. So who are they, and what language do they use?

The taxonomy of trolls

The 4chan shitposters. These men and boys (and they are almost exclusively male) come from 4chan, an image board in the deepest bowels of the internet. You’re most likely to see them deliberately provoking offense and outrage, often using the most extreme racist, sexist, and anti-Semitic slurs, but without necessarily fully buying into racist ideology. They’re the people you can’t argue with, because any attempt to engage them in a serious conversation will provoke an “only joking!” plea. Other users of The_Donald affectionately refer to them as “weaponized autists,” named for the orchestration of numerous hacks and leaks through the hacker collective Anonymous. You’ll see them talking about memes such as Pepe the Frog, “Kekistan,” and the “normies” they despise. Elsewhere on Reddit, you’re most likely to find them on /r/ImGoingToHellForThis, /r/CringeAnarchy, or any other deliberately offensive subreddit.

  • Most common words: kek, Pepe, deus vult, tendies, God Emperor Trump

Anti-progressive gamers. Closely related to the above, these trolls were radicalized over the course of the #GamerGate hate movement. They really like video games, and they really hate social-justice warriors, gay people, and feminists, all of whom they’re pretty sure major movie and game studios are “pandering” to with things like all-female screenings of Wonder Woman. You’re likely to see them talking about the trans community a lot (and repeating the words “there are only two genders” constantly). Elsewhere on Reddit, you’ll find them in gaming subreddits, or /r/KotakuInAction, which was the home of GamerGate.

  • Most common words: SJW, snowflake, pandering, tumblr, feminist, triggering, GamerGate, virtue signalling

Men’s rights activists. This group consists of those who explicitly campaign for men’s rights (custody battles and workplace deaths are their favorite talking points) and also includes anti-feminists and misogynists of all stripes. You’ll find them at /r/Incels (short for “involuntary celibates,” who want to have sex or find a partner but can’t, and blame women for this), /r/MGTOW (“Men Going Their Own Way,” who believe that they can only find true liberation in a female-dominated world by refusing to interact with women completely), the infamous /r/TheRedPill, and a few less popular “Manosphere” subreddits, as well as misogynistic sites like “Return of Kings.” You’ll find them referring to women as “females,” and men they perceive as weak as “cucks” (more on that later).

Anti-globalists. These people like Alex Jones, Steve Bannon, Sean Hannity, and conspiracy theories—and they talk about them an awful lot. They are far less enamored (yet still mildly obsessed) with George Soros, who funds everyone they hate, as well as Emmanuel Macron, John McCain, and Paul Ryan. Elsewhere, they can be seen on /r/uncensorednews (primarily news about bad things perpetrated by members of minority groups and left-wing people), and /r/conspiracy. Their hyperbolic conspiratorial language might sound absurd, but it’s become an increasingly coherent and important part of The_Donald since the subreddit began.

  • Most common words: globalist scum, the establishment, puppets, elites, masters, George Soros, cultural Marxist

White supremacists. It might seem surprising, but the language of white supremacy is actually quite uncommon in The_Donald. That’s because explicit racism is banned. Implicit or coded racism, however, is very common: for example, displaying Islamophobic sentiment and passing it off as criticizing Islamism, or claiming “Islam is not compatible with Western culture.” They also populate other subreddits like the now-banned /r/CoonTown and /r/GreatApes, as well as sites like Stormfront and the now-defunct The Daily Stormer.

  • Most common words: Islam, (creeping) Sharia, “deus vult”, “western culture”, various racial slurs

For a long time, these people would have very limited reason to interact with one another. There wasn’t much in common between meme aficionados, gamers, sexists, conspiracy theorists, and racists. Because the very nature of Reddit is to subdivide and find your own specific corner of the internet, these communities didn’t tend to run into each other all that much. But that’s now changed.

The_Donald’s identity

Over the last year and a half, these types of trolls have formed a central identity around Trumpism and have started to coalesce. Bored teenagers and gamers are becoming indoctrinated into hard-line anti-globalism, conspiracy theories, and Islamophobia, and it’s happening right before our eyes, on a publicly accessible forum.

The_Donald contains all of these different groups, marked out by their overlapping community memberships and the words that they (and only they) use. They’ve created an in-group language consisting of words like “MAGA” (Make America Great Again) and “based,” a word appropriated from rap culture. The latter is taken to mean “being yourself” and originated in the crack era. Then there is “centipede” (usually shortened to “pede”), a self-referential term originating from the viral video series “Can’t Stump the Trump,” which was popularized when one of its videos was tweeted by Trump himself.

But the keystone of this vernacular is “cuck.” A shortening of “cuckold,” an old word used to refer to men who allow their partners to sleep with other men (and often find sexual gratification in the humiliation of it), its use has become the sine qua non of alt-right group membership.

You’ll find cuck used in multiple senses. First, there’s “cuckservative,” used against conservatives who are seen as being too soft and allowing their countries (primarily European) to be “invaded” by Islam and Muslims. The racial connotations of the word were attached during a period when the word was incredibly popular in the now-banned /r/CoonTown, an explicitly racist subreddit.

Then, there’s the use of “cuck” in a more patriarchal sense. The GamerGate movement popularized the word on Reddit when they were banned from 4chan and migrated over to /r/KotakuInAction. They used it first to describe the jilted ex-boyfriend of Zoe Quinn, a games developer they ran a hate campaign against, before turning it against Christopher “moot” Poole, the administrator of 4chan, when he kicked them off his site.

Thirdly, you have what might now be the most standard usage of the word, which is to refer to those seen as liberal. You can see this in the popularization of words like “libcuck,” “cuckbook,” “starcucks,” and “cuck Schumer” in The_Donald. In the wider digital world, you might see it in below-the-line comments of articles on Facebook.

This leads us to the final type of usage, which is when anyone who isn’t part of the alt-right uses it to mock those who do use it, flipping its meaning entirely. As a result, it’s everywhere, and its story can tell us a lot about the different groups described above.

Frequency of “cuck” across different subreddits, 2014-2015. (Tim Squirrell)

Frequency of “cuck” across different subreddits, January 2017. (Tim Squirrell)
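
A per-subreddit frequency like the one charted above could be computed along the following lines. This is a rough sketch under stated assumptions, not the method behind the charts: it assumes "frequency" means the share of a subreddit's comments that contain the term, and the toy comments and the function name `term_rate` are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical toy data: (subreddit, comment body). Not real Reddit data.
COMMENTS = [
    ("The_Donald", "what a cuck move"),
    ("The_Donald", "maga"),
    ("KotakuInAction", "cuckservative nonsense"),
    ("KotakuInAction", "new game trailer"),
    ("KotakuInAction", "patch notes"),
    ("KotakuInAction", "sjw pandering"),
]


def term_rate(comments, pattern):
    """Share of each subreddit's comments that match `pattern` (case-insensitive)."""
    regex = re.compile(pattern, re.IGNORECASE)
    hits = defaultdict(int)
    totals = defaultdict(int)
    for subreddit, body in comments:
        totals[subreddit] += 1
        if regex.search(body):
            hits[subreddit] += 1
    return {sub: hits[sub] / totals[sub] for sub in totals}
```

The choice of pattern matters. With `r"\bcuck\b"` only the standalone word is counted, so the "cuckservative" comment is missed; with the looser `r"cuck"`, compounds such as "libcuck" and "cuckservative" are counted as well.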

The_Donald and other alt-right spaces are acting as meeting places for disaffected white men from all walks of life to share a communal hatred. They start out in different corners of the internet with different interests and different lexicons. They remain separate when they’re outside of The_Donald, but the more time they spend in there, the more pernicious views of the world they are likely to pick up by osmosis. They are forming a coherent group identity, represented in the language they have begun to speak, which coalesces around their common hatred of liberalism and their love of Donald Trump.

We’re witnessing the radicalization of young white men through the medium of frog memes. In order to see it, all you need to do is look at the words coming out of their mouths. The alt-right isn’t yet united, but it soon will be.

Monthly Review does not necessarily adhere to all of the views conveyed in articles republished at MR Online. Our goal is to share a variety of left perspectives that we think our readers will find interesting or useful. —Eds.