There’s A Fatal Flaw In The New Study Claiming YouTube’s Recommendation Algorithm Doesn’t Radicalize Viewers

By 12/30/2019
Concerns that YouTube’s recommendation algorithm funnels people toward content that promotes white supremacy or other forms of bigoted extremism are not a new thing. What is new, however, is a study that curiously claims the platform’s algorithm “actively discourages viewers from visiting radicalizing or extremist content.”

The study was independently published Dec. 24 by its two authors, Mark Ledwich (whose Twitter bio says he’s a coder with an interest in YouTube political analysis) and University of California, Berkeley postdoctoral student Anna Zaitsev. Ledwich announced the publication on Twitter, where he claimed the study shows YouTube’s algorithm “has a deradicalizing influence,” “DESTROYS conspiracy theorists, provacateurs and white identarians,” helps partisan channels like The Young Turks and Fox News, and “hurts almost everyone else.” He added that his own post-mortem article about the study “takes aim” at people like New York Times reporter Kevin Roose “who have been on myth-filled crusade vs social media.” (More on Roose later.)

Considering Ledwich and Zaitsev’s conclusion directly contradicts the majority of recent studies about how YouTube’s algorithm interacts with extremist content, their paper garnered immediate attention and prompted speculation that YouTube might not be feeding people radical content after all.

But the duo’s study has one fatal issue: they didn’t log in.

Over the course of their research, which Ledwich said looks at YouTube’s “late 2019 algorithm,” they investigated 768 channels and more than 23 million recommended videos. They selected the core channels by narrowing YouTube’s massive pool with two criteria: one, participating channels had to have more than 10K subscribers, and two, at least 30% of the channels’ content had to be focused on politics. Once they had the 768 channels, Ledwich and Zaitsev categorized them based on their content, sorting them into 18 different categories. Categories included conspiracy, libertarian, social justice, anti-social justice, educational, late-night talk shows, men’s rights activist, state-funded, and, oddly, “anti-whiteness.” (To be labeled an “anti-white” channel, a channel’s content must contain “simplistic narratives about American history, where the most important story is of slavery and racism,” and “exclusively frames current events into racial oppression. Usually in the form of police violence against blacks.”)
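For readers who want to see how narrow that selection rule is, the two criteria can be sketched in a few lines of Python. This is only an illustration of the thresholds described above, not the authors' actual code: the `Channel` record, the `political_share` field, the `meets_criteria` function, and the sample channels are all hypothetical, and how the study actually measured a channel's share of political content is not covered here.

```python
from dataclasses import dataclass

# Thresholds as described in the article: more than 10K subscribers,
# and at least 30% of content focused on politics.
MIN_SUBSCRIBERS = 10_000
MIN_POLITICAL_SHARE = 0.30


@dataclass
class Channel:
    """Hypothetical record for one channel (not the study's data model)."""
    name: str
    subscribers: int
    political_share: float  # assumed fraction of political content, 0.0 to 1.0


def meets_criteria(channel: Channel) -> bool:
    """Return True if the channel passes both selection thresholds."""
    return (
        channel.subscribers > MIN_SUBSCRIBERS
        and channel.political_share >= MIN_POLITICAL_SHARE
    )


# Made-up example channels, purely for illustration.
candidates = [
    Channel("Example Politics Daily", 250_000, 0.85),
    Channel("Example Cooking Show", 900_000, 0.02),
    Channel("Example Small Pundit", 4_000, 0.95),
]

selected = [c.name for c in candidates if meets_criteria(c)]
print(selected)
```

Note that nothing in a filter like this involves a viewer: it decides which channels are studied, not who is watching them.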

To examine what YouTube’s algorithm recommends to viewers, Ledwich and Zaitsev went through those channels’ videos and scraped each one’s recommendation data, which let them see what YouTube offered in the Up Next box to people watching each video. Crucially, however, Ledwich and Zaitsev did this while not logged in to a YouTube account.

“One should note that the recommendations list provided to a user who has an account and who is logged into YouTube might differ from the list presented to this anonymous account,” they state in the study report. “However, we do not believe that there is a drastic difference in the behavior of the algorithm. Our confidence in the similarity is due to the description of the algorithm provided by the developers of the YouTube algorithm. It would seem counter-intuitive for YouTube to apply vastly different criteria for anonymous users and users who are logged into their accounts, especially considering how complex creating such a recommendation algorithm is in the first place.”

This is inaccurate. Much of the controversy about YouTube’s algorithm stems from what’s been called a “wormhole” or “rabbit hole”: a downward spiral of more and more extreme videos fed to individual users who watched a few, often milder videos about the same topics. But those recommendations are specific to users and their viewing activity, which means trawling YouTube with a logged-out account simply isn’t going to result in the same kind of content.

Because YouTube had no personalization data to go off, each box of Up Next recommendations it served Ledwich and Zaitsev was a generalized, blank-slate collection of videos. The algorithm is literally incapable of introducing an anonymous, logged-out user to increasingly radical content.

And so, for them, the rabbit hole didn’t exist.

The Real Question Is: Can Anyone Accurately Study YouTube’s Algorithm?

A handful of experts have spoken up about Ledwich and Zaitsev’s methodology in the wake of the study’s publication. Princeton computer scientist Arvind Narayanan, who spent a year studying YouTube radicalization with a group of his students, tweeted an extensive takedown of the paper, primarily criticizing the fact that Ledwich and Zaitsev were logged out while collecting data.

“This study didn’t analyze real users. So the crucial question becomes: what model of user behavior did they use?” he wrote. “The answer: they didn’t! They reached their sweeping conclusions by analyzing YouTube without logging in, based on sidebar recommendations for a sample of channels (not even the user’s home page because, again, there’s no user). Whatever they measured, it’s not radicalization.”

Narayanan added that he and his students concluded there’s “no good way for external researchers to quantitatively study radicalization,” and that, “I think YouTube can study it internally, but only in a very limited way.”

Kevin Roose, the NYT reporter Ledwich specifically called out, also responded in a Twitter thread. (Roose is the author of a prominent piece called The Making of a YouTube Radical, which chronicles YouTube user Caleb Cain’s descent into what he described as “the alt-right rabbit hole.”)

“Personalized, logged-in, longitudinal data is how radicalization has to be understood, since it’s how it’s experienced on a platform like YouTube,” Roose wrote.

Other experts who’ve chimed in include Zeynep Tufekci, an associate professor at the University of North Carolina School of Information and Library Science, and Becca Lewis, a Stanford Ph.D. student studying online political subcultures and grassroots political movements.

Also among those urging people not to blindly accept the study’s findings is longtime digital media journalist Julia Alexander, whose tweet calling the study’s data collection process “flawed” and asking fellow reporters not to cover it has prompted a wave of backlash from people arguing that YouTube’s algorithm does not propagate extremist content.

Ledwich has responded to criticism with a Twitter thread about the study’s limitations.

Zaitsev, meanwhile, spoke to FFWD and appeared to contradict Ledwich, saying, “We don’t really claim that it does deradicalize, we rather say it is a leap to say that it does [radicalize]. The last few comments about deradicalization are blown out of proportion on Twitter.”

She later disputed FFWD’s claim that she was “distancing” herself from Ledwich’s comments.