There’s A Fatal Flaw In The New Study Claiming YouTube’s Recommendation Algorithm Doesn’t Radicalize Viewers

By James Hale • 12/30/2019

Concerns that YouTube’s recommendation algorithm funnels people toward content that promotes white supremacy or other forms of bigoted extremism are not a new thing. What is new, however, is a study that curiously claims the platform’s algorithm “actively discourages viewers from visiting radicalizing or extremist content.”

The study was independently published Dec. 24 by its two authors, Mark Ledwich (whose Twitter bio says he’s a coder with an interest in YouTube political analysis) and University of California, Berkeley postdoctoral student Anna Zaitsev. Ledwich announced the publication on Twitter, where he claimed the study shows YouTube’s algorithm “has a deradicalizing influence,” “DESTROYS conspiracy theorists, provocateurs and white identitarians,” helps partisan channels like The Young Turks and Fox News, and “hurts almost everyone else.” He added that his own post-mortem article about the study “takes aim” at people like New York Times reporter Kevin Roose “who have been on myth-filled crusade vs social media.” (More on Roose later.)

Considering Ledwich and Zaitsev’s conclusion directly contradicts the majority of recent studies about how YouTube’s algorithm interacts with extremist content, their paper garnered immediate attention and prompted speculation that YouTube might not be feeding people radical content after all.


But the duo’s study has one fatal issue: they didn’t log in.

2. It turns out the late 2019 algorithm
*DESTROYS* conspiracy theorists, provocateurs and white identitarians
*Helps* partisans
*Hurts* almost everyone else.

👇 compares an estimate of the recommendations presented (grey) to received (green) for each of the groups: pic.twitter.com/5qPtyi5ZIP

— Mark Ledwich (@mark_ledwich) December 28, 2019

Over the course of their research, which Ledwich said looks at YouTube’s “late 2019 algorithm,” they investigated 768 channels and more than 23 million recommended videos. They selected the core channels by narrowing YouTube’s massive pool with two criteria: one, participating channels had to have more than 10K subscribers, and two, at least 30% of the channels’ content had to be focused on politics. Once they had the 768 channels, Ledwich and Zaitsev categorized them based on their content, sorting them into 18 different categories. Categories included conspiracy, libertarian, social justice, anti-social justice, educational, late-night talk shows, men’s rights activist, state-funded, and, oddly, “anti-whiteness.” (To be labeled an “anti-white” channel, a channel’s content must contain “simplistic narratives about American history, where the most important story is of slavery and racism,” and “exclusively frames current events into racial oppression. Usually in the form of police violence against blacks.”)
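For readers who want to see how narrow that first filter is, here is a minimal Python sketch of the two selection criteria as described above. It is an illustration only, not the authors' code: the `Channel` fields, the function names, and the sample channels are all hypothetical, and the study does not say in this article how the share of political content was measured.

```python
from dataclasses import dataclass

# Thresholds as reported in the article; everything else here is assumed.
MIN_SUBSCRIBERS = 10_000     # channels needed "more than 10K subscribers"
MIN_POLITICAL_SHARE = 0.30   # "at least 30%" of content focused on politics


@dataclass
class Channel:
    name: str
    subscribers: int
    political_share: float  # hypothetical field: fraction of videos judged political (0.0 to 1.0)


def meets_criteria(channel: Channel) -> bool:
    """Return True if a channel passes both reported selection criteria."""
    return (
        channel.subscribers > MIN_SUBSCRIBERS
        and channel.political_share >= MIN_POLITICAL_SHARE
    )


def select_channels(channels: list[Channel]) -> list[Channel]:
    """Keep only the channels that pass both criteria."""
    return [c for c in channels if meets_criteria(c)]


# Made-up sample data, for illustration only.
sample = [
    Channel("example_politics", 250_000, 0.80),
    Channel("example_small", 9_500, 0.95),       # too few subscribers
    Channel("example_gaming", 1_200_000, 0.05),  # too little political content
    Channel("example_borderline", 10_001, 0.30),
]

selected_names = [c.name for c in select_channels(sample)]
print(selected_names)
```

Note that nothing in this filter touches viewer behavior: it only decides which channels get examined, not what any individual user is shown.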

To examine what YouTube’s algorithm recommends to viewers, Ledwich and Zaitsev went through those channels’ videos and scraped each one’s recommendation data, which let them see what YouTube offered in the Up Next box to people watching each video. Crucially, however, Ledwich and Zaitsev did this while not logged in to a YouTube account.

“One should note that the recommendations list provided to a user who has an account and who is logged into YouTube might differ from the list presented to this anonymous account,” they state in the study report. “However, we do not believe that there is a drastic difference in the behavior of the algorithm. Our confidence in the similarity is due to the description of the algorithm provided by the developers of the YouTube algorithm. It would seem counter-intuitive for YouTube to apply vastly different criteria for anonymous users and users who are logged into their accounts, especially considering how complex creating such a recommendation algorithm is in the first place.”

This is inaccurate. Much of the controversy about YouTube’s algorithm stems from what’s been called a “wormhole” or “rabbit hole”: a downward spiral of more and more extreme videos fed to individual users who watched a few, often milder videos about the same topics. But those recommendations are specific to users and their viewing activity, which means trawling YouTube with a logged-out account simply isn’t going to result in the same kind of content.

Because YouTube had no personalization data to go off, each box of Up Next recommendations it served Ledwich and Zaitsev was a generalized, blank-slate collection of videos. The algorithm is literally incapable of introducing an anonymous, logged-out user to increasingly radical content.

And so, for them, the rabbit hole didn’t exist.

The Real Question Is: Can Anyone Accurately Study YouTube’s Algorithm?

A handful of experts have spoken up about Ledwich and Zaitsev’s methodology in the wake of the study’s publication. Princeton computer scientist Arvind Narayanan, who spent a year studying YouTube radicalization with a group of his students, tweeted an extensive takedown of the paper, primarily criticizing the fact that Ledwich and Zaitsev were logged out while collecting data.

“This study didn’t analyze real users. So the crucial question becomes: what model of user behavior did they use?” he wrote. “The answer: they didn’t! They reached their sweeping conclusions by analyzing YouTube without logging in, based on sidebar recommendations for a sample of channels (not even the user’s home page because, again, there’s no user). Whatever they measured, it’s not radicalization.”

Narayanan added that he and his students concluded there’s “no good way for external researchers to quantitatively study radicalization,” and that, “I think YouTube can study it internally, but only in a very limited way.”

Kevin Roose, the NYT reporter Ledwich specifically called out, also responded in a Twitter thread. (Roose is the author of a prominent piece called The Making of a YouTube Radical, which chronicles YouTube user Caleb Cain’s descent into what he described as “the alt-right rabbit hole.”)

“Personalized, logged-in, longitudinal data is how radicalization has to be understood, since it’s how it’s experienced on a platform like YouTube,” Roose wrote.

The only people who have the right data to study radicalization at scale work at YouTube, and they have made changes in 2019 they say have reduced “borderline content” recs by 70%. Why would they have done that, if that content wasn’t being recommended in the first place?

— Kevin Roose (@kevinroose) December 29, 2019

Other experts who’ve chimed in include Zeynep Tufekci, an associate professor at the University of North Carolina School of Information and Library Science, and Becca Lewis, a Stanford Ph.D. student studying online political subcultures and grassroots political movements.

👇Yep, that “paper” isn’t even wrong. One tragedy of all this is that, at the moment, only the companies can fully study phenomenon such as the behavior of recommendation algorithms. There are some great external studies that do give us a sense—but that’s all we get. Yet. https://t.co/8PvHkskqZL

— zeynep tufekci (@zeynep) December 29, 2019

Fantastic thread on why quantitative methods are often ill-suited to studying radicalization on YouTube via the algorithm. https://t.co/LiiIsfPVzN

— Becca Lewis (@beccalew) December 29, 2019

Also among those urging people not to blindly accept the study’s findings is longtime digital media journalist Julia Alexander, whose tweet calling the study’s data collection process “flawed” and asking fellow reporters not to cover it has prompted a wave of backlash from people arguing that YouTube’s algorithm does not propagate extremist content.

Hey, a quick request, journalist friends: don’t write about the Violent Video Game report going around, but if you do, please reach out to anyone who covers this space on a daily or regular basis and other academics to understand why the data collection process is flawed. pic.twitter.com/Aj4BFVpYGm

— Chris Ray Gun ✈️ NY (@ChrisRGun) December 30, 2019

Speaking as an academic you appear to be advocating a kind of censorship. Don’t know this study first hand. But I do know research that people tend to be unduly critical of data that doesn’t confirm their worldview.

— Chris Ferguson 🎄 (@CJFerguson1111) December 29, 2019

This is amazing. “journalist” tells other “journalists” not to contradict the narrative they’ve been selling.
And these people wonder why they have zero credibility.
And even battier, good ole @SusanWojcicki is changing the platform because of these people.

— Randy (@Randy49961229) December 30, 2019

You: Oh no, an article refutes my favorite conspiracy theory! Quick, everyone, get out your shovels and bury it quick!

— Matt Hardly (@Matt_Hardly) December 29, 2019

The average independent youtuber is far more trustworthy and diligent than any mainstream legacy media outlet or personality, or anyone with a blue tick next to their name.

— Brad Pope (@BradPope20) December 30, 2019

Ledwich has responded to criticism with a Twitter thread about the study’s limitations, ending with this tweet:

8/ Recommendation-rabbit-hole proponents have been ignoring evidence and searching for compelling anecdotes since 2018.

I wish you all the best in dealing with your dissonance. I hope it’s not too painful.

END

— Mark Ledwich (@mark_ledwich) December 29, 2019

Zaitsev, meanwhile, spoke to FFWD and appeared to contradict Ledwich, saying, “We don’t really claim that it does deradicalize, we rather say it is a leap to say that it does [radicalize]. The last few comments about deradicalization are blown out of proportion on Twitter.”

UPDATE: Ledwich’s co-author has distanced herself from the main claim about the contentious research that the YouTube algorithm deradicalises users https://t.co/YFwfSrPV7i pic.twitter.com/SgzfoQ9fw8

— Chris Stokel-Walker (@stokel) December 30, 2019

She later disputed FFWD’s claim that she was “distancing” herself from Ledwich’s comments.