Reverse Engineering The YouTube Algorithm: Part II

On September 18, 2016, a team of Google researchers presented a paper titled “Deep Neural Networks for YouTube Recommendations” in Boston, Massachusetts, at the 10th annual Association for Computing Machinery Conference on Recommender Systems (or, as the cool kids would call it, the ACM’s RecSys ’16).

This paper was written by Paul Covington (currently a Senior Software Engineer at Google), Jay Adams (currently a Software Engineer at Google), and Emre Sargin (currently a Senior Software Engineer at Google) to show other engineers how YouTube uses Deep Neural Networks for Machine Learning. It gets into some pretty technical, high-level stuff, but what this paper ultimately illustrates is how the entire YouTube recommendation algorithm works(!!!). It gives a careful and prudent reader insight into how YouTube’s Browse, Suggested Videos, and Recommended Videos features actually function.

An Engineering Paper On The YouTube Algorithm For Dummies

While it was not necessarily the intent of the authors, it is our belief that the Deep Neural paper can be read and interpreted by and for YouTube video publishers. Below is how we (and when I say we, I mean me and my team at my shiny new company Little Monster Media Co.) interpret this paper as a video publisher.

In a previous post I co-wrote here on Tubefilter, Reverse Engineering The YouTube Algorithm, we focused on the primary driver of the algorithm, Watch Time. We looked at the data from our videos on our channel to try to gain insight into how the YouTube algorithm worked. One of the limiting factors of this approach, however, is that it comes from a video publisher’s point of view. In an attempt to gain some insight into the YouTube algorithm, we asked ourselves, and then answered, the question, “Why are our videos successful?” We were doing our best with the information we had, but our initial premise wasn’t ideal. And while I stand by our findings 100%, the problem with our previous approach is primarily twofold:

1. Looking at an individual set of channel metrics means there’s a massive blind spot in our data, as we don’t have access to competitive metrics, session metrics, and clickthrough rates.
2. The YouTube algorithm gives very little weight to video publisher-based metrics. It’s far more concerned with audience and individual-video-based metrics. Or, in layman’s terms, the algorithm doesn’t really care about the videos you’re posting, but it cares a LOT about the videos you (and everyone else) are watching.

But at the time we wrote our original post, there had been nothing released from YouTube or Google in years that would shed any light on the algorithm in a meaningful way. Again, we did what we could with what we had. Fortunately for us though, the paper recently released by Google gives us a glimpse into exactly how the algorithm works and some of its most important metrics. Hopefully this begins to allow us to answer the more pertinent question, “Why are videos successful?”

Staring Into The Deep Learning Abyss

The big takeaway from the paper’s introduction is that YouTube is using Deep Learning to power its algorithm. This isn’t exactly news, but it’s a confirmation of what many have believed for some time. The authors make the reveal in their intro:

“In this paper we will focus on the immense impact deep learning has recently had on the YouTube video recommendations system… In conjugation with other product areas across Google, YouTube has undergone a fundamental paradigm shift towards using deep learning as a general-purpose solution for nearly all learning problems.”

What this means is that, increasingly, there are likely to be no humans actually making algorithmic tweaks, measuring those tweaks, and then implementing those tweaks across the world’s largest video sharing site. The algorithm is ingesting data in real time, ranking videos, and then providing recommendations based on those rankings. So, when YouTube claims it can’t really say why the algorithm does what it does, it probably means that very literally.
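To make the ingest, rank, recommend loop a little more concrete, here is a toy sketch in Python. To be very clear, this is not YouTube’s code and nothing in it comes from the paper: the function names (`rank_videos`, `recommend`), the sample events, and the scoring rule are all assumptions made up for illustration. It swaps in a hand-written Watch Time total where the real system uses a learned neural network, which is exactly the part nobody can read off by hand.

```python
from collections import defaultdict


def rank_videos(watch_events):
    """Sum watch seconds per video and return video IDs, highest total first.

    Hypothetical scoring rule: total Watch Time stands in for whatever
    score a trained model would actually produce.
    """
    totals = defaultdict(float)
    for video_id, seconds_watched in watch_events:
        totals[video_id] += seconds_watched
    # Sort by total watch time, descending; ties fall back to video ID
    # so the output order is deterministic.
    return sorted(totals, key=lambda vid: (-totals[vid], vid))


def recommend(watch_events, already_watched, top_n=2):
    """Return the top_n ranked videos the viewer has not already watched."""
    ranked = rank_videos(watch_events)
    return [vid for vid in ranked if vid not in already_watched][:top_n]


# Made-up (video ID, seconds watched) events, as if streamed in from viewers.
events = [
    ("cats", 120.0),
    ("dogs", 45.0),
    ("cats", 60.0),
    ("birds", 90.0),
    ("dogs", 30.0),
]

print(recommend(events, already_watched={"cats"}))
```

Even in a toy like this, the ranking comes entirely from what viewers watched, not from anything the publisher did. Replace the simple sum with a model trained on a huge number of signals, and it becomes easier to see why nobody could point to a single line that explains a given recommendation.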