Back when online video was a different shape, searching for videos on YouTube relied on… well, honesty, really. Creators would put up a video, they would tell YouTube what was in it, and YouTube would look at the title and the description and the tags and take it from there. And immediately, people tried to cheat. It turned into an arms race: keyword stuffing, tag spam, the reply girls of 2012. If there was any way to guess the YouTube algorithm's priorities, then spammers exploited it. And desperate video creators changed what they were making to fit what they thought YouTube wanted, based on rumour, speculation, and extrapolating way too much from way too little data. So, more recently, YouTube has just kept quiet.

Their parent company, Google, learned this lesson back when online video was a RealPlayer window and connecting to the internet sounded like the screeching of a robot cat: if Google ever gave a hint about how to boost web sites up their search results, then there would be a rush of spammers trying to exploit that knowledge. So the advice was always: just make good things. We'll figure it out.
But now, on YouTube, it's not that the folks who control the algorithm won't tell people how it works. It's that they can't. And here's the evidence: a paper written by YouTube engineers, explaining that they're using Google's research into machine learning to recommend videos. That is the same kind of software that creates those weird Deep Dream images, that makes their text-to-speech sound so realistic, and that beat the world's best Go player at his own game.

And I know I'm simplifying a bit here, but machine learning, the way Google does it, is basically a black box. You give a neural network some input, like the game board in Go, and it gives you outputs: moves it thinks might work. Those outputs are tested, the results go back into the box, and then you repeat that process a billion or so times, and it starts to get really good. But no-one can look inside that black box and see how it works: it's designed by a computer, for a computer.
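To make that loop a bit more concrete, here's a deliberately tiny sketch in Python. It is not YouTube's system or DeepMind's training code; it's just the shape of the process the paper describes: a box with an adjustable number inside, an outside score that says how well it did, and a repeat button. The "game", the hidden target and the single weight are all invented for illustration.

```python
import random

# A made-up "game": the black box scores higher the closer its guess
# is to a hidden target. In the real thing the feedback would be a win
# or a loss in Go, or a viewer watching (or abandoning) a recommendation.
HIDDEN_TARGET = 42.0

def score(guess: float) -> float:
    return -abs(guess - HIDDEN_TARGET)   # bigger is better

# The "black box": one adjustable number standing in for millions of weights.
weight = 0.0

for _ in range(100_000):                         # "a billion or so times", scaled down
    candidate = weight + random.uniform(-1, 1)   # nudge the box and try again
    if score(candidate) > score(weight):         # did the feedback say it got better?
        weight = candidate                       # keep the change, not the reason

print(weight)   # lands near 42, but the loop never explains why it works
```

Even in a toy that small, what comes out is a number that works, not a reason it works. Scale that up to millions of weights and the "why" disappears completely.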
And neural networks are great if you're playing a game with an obvious scoring and points system. You win, or you lose. But training that black box on YouTube videos is a bit messier. It's not just that human behaviour is unpredictable and complicated: it's trying to work out what counts as winning in the first place.

If YouTube tells the algorithm "show videos that people like", then it'll kill any channel which talks about politics, where people hit dislike if they disagree. And it'll silence anyone who has a small but vocal group loudly disagreeing with them. If YouTube tells the algorithm "okay, show videos that people share", then videos about private things like medical issues or sex education vanish, and folks who have a small, loyal, but quiet fanbase disappear into their own little world. And YouTube creators, of course, would love the algorithm to recommend only their own videos... even when the rest of the world doesn't actually want to watch them.
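The trade-off is easy to see with made-up numbers. Here's a hedged sketch, not anything from the paper: three invented videos with invented stats, ranked three different ways depending purely on which signal counts as "winning".

```python
# Three invented videos with made-up stats, just to show how the choice of
# "winning" signal reorders them. None of these numbers come from YouTube.
videos = [
    {"title": "calm sex-ed explainer",  "likes": 900, "dislikes":  30, "shares":  40, "minutes_watched": 120_000},
    {"title": "divisive politics rant", "likes": 500, "dislikes": 480, "shares": 900, "minutes_watched": 150_000},
    {"title": "ten-minute conspiracy",  "likes": 300, "dislikes": 200, "shares": 600, "minutes_watched": 400_000},
]

objectives = {
    "people like it":  lambda v: v["likes"] - v["dislikes"],
    "people share it": lambda v: v["shares"],
    "watch time":      lambda v: v["minutes_watched"],
}

for name, scorer in objectives.items():
    ranking = sorted(videos, key=scorer, reverse=True)
    print(name, "->", [v["title"] for v in ranking])
```

Same three videos, three completely different "best" answers, and the only thing that changed was the definition of winning.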
So YouTube started out, according to the paper, giving their algorithm the reasonable goal of "increase watch time". But that has a few problems, because there's no way for a computer to determine quality, or truth. At least, not yet. The system doesn't understand context; it can't tell the difference between actual, reliable information and unhinged, paranoid conspiracy clickbait. Although, admittedly, neither can a lot of people, which is why these videos are getting a lot of traction. And it can't tell the difference between videos that are suitable for children, made with education in mind, and creepy, trademark-infringing unofficial efforts. It just knows what kids click on and what they watch. So, sure, the algorithm might increase watch time, in the short term… but a lot of the videos it recommends are going to be questionable at best and actively harmful at worst. And they're going to be the sort of thing that advertisers get really nervous about.

Remember, the algorithm is a black box. No-one knows what it's doing. All YouTube can do is change the feedback it's getting, change the signals that say "this is good" or "this is bad". If YouTube wanted a human to watch and categorise every video being uploaded as "safe" or "unsafe" in real time, they would need about 100,000 employees working shifts round the clock.
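That figure is roughly what back-of-the-envelope arithmetic gives you. As an assumption: around 400 hours of video were being uploaded every minute at the time, a commonly cited estimate rather than anything stated here, and reviewers work ordinary 40-hour weeks.

```python
# Back-of-the-envelope staffing estimate. The 400 hours/minute upload rate is
# an assumed, commonly cited figure for the period, not a number from the video.
upload_hours_per_minute = 400

# Watching everything in real time needs this many people viewing at any instant:
viewers_at_any_instant = upload_hours_per_minute * 60   # 24,000

# Uploads arrive 24/7 (168 hours a week); one employee covers about 40 of them.
shift_factor = 168 / 40

employees_needed = viewers_at_any_instant * shift_factor
print(round(employees_needed))   # ~100,800, so about 100,000 employees
```

And that's before breaks, holidays, or anyone to handle the appeals.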
Plus, that would expose them to legal issues: in most of the countries where YouTube has an office, if you let an algorithm do the filtering and then manually step in when you get a complaint, you're legally fine. But if you approve everything with a human in the loop, you are a publisher, and you're opening yourself up to some very expensive lawsuits.

The ideal algorithm, the ideal black box, from YouTube's point of view, would be one with a goal of "increase ad revenue", and which thought about the long term, which knew about social issues, and potential advertiser boycotts, and financial strategies, and public perception, and what's suitable for kids, and... and about truth. At which point, what you have is something that can do the job of YouTube's senior management team. And artificial intelligence hasn't gotten that good. At least, not yet.