How Machines Learn

Video Statistics and Information

Reddit Comments

So is Grey just completely obsessed with bees at this point?

πŸ‘οΈŽ︎ 365 πŸ‘€οΈŽ︎ u/[deleted] πŸ“…οΈŽ︎ Dec 18 2017 πŸ—«︎ replies

He also made a footnote: "How do Machines Really Learn?" https://www.youtube.com/watch?v=wvWpdrfoEv0

πŸ‘οΈŽ︎ 448 πŸ‘€οΈŽ︎ u/Clashin_Creepers πŸ“…οΈŽ︎ Dec 18 2017 πŸ—«︎ replies

Hahaha fellow humans what an enjoyable and entertaining video.

πŸ‘οΈŽ︎ 614 πŸ‘€οΈŽ︎ u/Horsepipe πŸ“…οΈŽ︎ Dec 18 2017 πŸ—«︎ replies

If you want to dive into the nitty-gritty of neural networks, I recommend checking out 3Blue1Brown's playlist on the subject.

πŸ‘οΈŽ︎ 165 πŸ‘€οΈŽ︎ u/Obligatory_Username πŸ“…οΈŽ︎ Dec 18 2017 πŸ—«︎ replies

Machine learning is the field in which I work, and while I often have some criticisms of how CGP Grey presents information, this video was really good, especially with the footnote. I might be using it in the future when explaining to people what I do.

πŸ‘οΈŽ︎ 321 πŸ‘€οΈŽ︎ u/Cranyx πŸ“…οΈŽ︎ Dec 18 2017 πŸ—«︎ replies

It's only a matter of time before Grey teaches a bot to create podcasts for him so he can retire and spend all of his time with Mr. Chompers.

πŸ‘οΈŽ︎ 126 πŸ‘€οΈŽ︎ u/Krohnos πŸ“…οΈŽ︎ Dec 18 2017 πŸ—«︎ replies
πŸ‘οΈŽ︎ 65 πŸ‘€οΈŽ︎ u/rockham πŸ“…οΈŽ︎ Dec 18 2017 πŸ—«︎ replies

So the development of machine learning is somewhat akin to natural selection, but humans and 'teacher bots' are setting selection factors?

πŸ‘οΈŽ︎ 76 πŸ‘€οΈŽ︎ u/natumel πŸ“…οΈŽ︎ Dec 18 2017 πŸ—«︎ replies

The only thing that I feel typically gets glossed over in videos that attempt to explain machine learning is just how much work humans are actually doing to create these algorithms. I think this video fell short in that way, but otherwise was very well done.

Creating a good data set for a machine learning algorithm is very difficult and very complicated. It's not just a matter of throwing as many bee photos and "three" photos at it as possible. Although that's important, it's also important to throw photos of things that aren't bees or threes at it and carefully monitor the effect these have on the model.

It's also critical that the data is cleaned, which, aside from being very painstaking work, is also very intellectual and involves a deep human understanding of the data and the correlations and boundaries within it. And these "non-bee or non-three" photos shouldn't be completely random either. The dog wearing a bee costume is actually a great example of the kind of human reasoning that goes into training machines. If humans didn't identify that scenario as problematic for the machine learning, they wouldn't be able to strengthen that part of the model to reduce false positives.

Data sets can have errors or human bias that strongly influence the final algorithm if the data set isn't carefully prepared and well understood.
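To make that concrete, here is a minimal, hypothetical sketch in Python of what composing such a data set might look like. Every file name, the three-way label scheme, and the class proportions are invented for illustration; they aren't from the video or any real pipeline:

```python
# Hypothetical sketch: composing a training set with deliberate hard negatives.
import random

def build_dataset():
    data = []
    data += [(f"bee_{i}.jpg", "bee") for i in range(1000)]       # positives
    data += [(f"three_{i}.jpg", "three") for i in range(1000)]   # the other class
    data += [(f"random_{i}.jpg", "other") for i in range(1000)]  # easy negatives
    # Hard negatives found by humans inspecting the model's mistakes,
    # e.g. dogs in bee costumes that an earlier model confidently called "bee".
    data += [(f"dog_in_bee_costume_{i}.jpg", "other") for i in range(50)]
    random.shuffle(data)
    return data

def false_positive_rate(model, holdout):
    """Fraction of true non-bees that `model` wrongly labels as 'bee'."""
    negatives = [(photo, label) for photo, label in holdout if label != "bee"]
    wrong = sum(1 for photo, _ in negatives if model(photo) == "bee")
    return wrong / len(negatives)
```

Monitoring `false_positive_rate` on a held-out set before and after adding the costume photos is exactly the kind of human-in-the-loop check being described here.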

So yes, it's true to say that no human truly understands the actual model in the most literal mathematical sense, but it's not true to say that humans have no insight into the kinds of factors that influence the result of the computation that happens within that model.

I know I'm being super pedantic, but I just think this topic is a bit overly mystified.

I like the analogy of comparing the human brain to a machine learning model, though. When somebody asks "how does the computer know this photo is of a dog?", just ask them the very same question about their own brain. They won't know exactly how all the neurons in their brain are connected and what signals they send, but they'll be able to explain the inputs and the insights that can be easily reasoned about, e.g. "I can see fur, four legs, a long nose, and a tail". Those are exactly the same factors the computer is looking at; it's just looking at them in a different way than you are as a human. Neither is completely understood in a literal sense when you ask the question "how does it look?", but that's sort of beside the point.

πŸ‘οΈŽ︎ 45 πŸ‘€οΈŽ︎ u/spoonraker πŸ“…οΈŽ︎ Dec 18 2017 πŸ—«︎ replies
Captions
On the internet, the algorithms are all around you. You are watching this video because an algorithm brought it to you (among others) to click, which you did, and the algorithm took note. When you open the TweetBook, an algorithm decides what you see. When you search through your photos, an algorithm does the finding. Maybe even makes a little movie for you. When you buy something, an algorithm sets the price, and an algorithm is at your bank watching transactions for fraud. The stock market is full of algorithms trading with algorithms.

Given this, you might want to know how these little algorithmic bots shaping your world work, especially when they don't. In Ye Olden Days, humans built algorithmic bots by giving them instructions the humans could explain: "If this, then that." But many problems are just too big and hard for a human to write simple instructions for. There's a gazillion financial transactions a second; which ones are fraudulent? There's an octillion videos on NetMeTube; which eight should the user see as recommendations? Which shouldn't be allowed on the site at all? For this airline seat, what is the maximum price this user will pay right now? Algorithmic bots give answers to these questions. Not perfect answers, but much better than a human could do.

But how these bots work exactly, more and more, no one knows. Not even the humans who built them, or "built them", as we will see. Now, companies that use these bots don't want to talk about how they work, because the bots are valuable employees. Very, VERY valuable. And how their brains are built is a fiercely guarded trade secret. Right now the cutting edge is most likely very "I hope you like linear algebra", but what the current hotness is on any particular site, and how the bots work, is a bit "I dunno", and always will be. So let's talk about one of the more quaint but understandable ways bots CAN be "built" without understanding how their brains work.

Say you want a bot that can recognize what is in a picture. Is it a bee, or is it a three? It's easy for humans (even little humans), but it's impossible to just tell a bot in bot language how to do it, because really we just know that's a bee and that's a three. We can say in words what makes them different, but bots don't understand words. And it's the wiring in our brains that makes it happen anyway. While an individual neuron may be understood, and clusters of neurons' general purpose vaguely grasped, the whole is beyond. Nonetheless, it works.

So to get a bot that can do this sorting, you don't build it yourself. You build a bot that builds bots, and a bot that teaches bots. These bots' brains are simpler, something a smart human programmer can make. The builder bot builds bots, though it's not very good at it. At first it connects the wires and modules in the bot brains almost at random. This leads to some very... "special" student bots sent to teacher bot to teach. Of course, teacher bot can't tell a bee from a three either; if the human could build teacher bot to do that, well, then, problem solved. Instead the human gives teacher bot a bunch of "bee" photos and "three" photos, and an answer key to which is what. Teacher bot can't teach, but teacher bot can TEST. The adorkable student bots stick out their tongues, try very hard, but they are bad at what they do. Very, VERY bad. And it's not their fault, really; they were built that way. Grades in hand, the student bots take a march of shame back to builder bot.
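As an aside for the curious: "teacher bot can't teach, but teacher bot can TEST" amounts to nothing more than grading against the answer key. A minimal sketch, with hypothetical names, where a student bot is just any function from photo to guess:

```python
# Minimal sketch: teacher bot only TESTS. A "student bot" is any function
# photo -> guess; the answer key maps each photo to its true label.
def test_student(student_bot, answer_key):
    """Return the student's grade: the fraction of photos labeled correctly."""
    correct = sum(1 for photo, truth in answer_key.items()
                  if student_bot(photo) == truth)
    return correct / len(answer_key)
```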
Those that did best are put to one side; the others, recycled. Builder bot still isn't good at building bots, but now it takes those left and makes copies with changes, in new combinations. Back to school they go. Teacher bot teaches - er, tests again, and builder bot builds again. And again, and again.

Now, a builder that builds at random, a teacher that doesn't teach (just tests), and students who can't learn (they just are what they are) in theory shouldn't work, but in practice it does. Partly because in every iteration, builder bot's slaughterhouse keeps the best and discards the rest, and partly because teacher bot isn't overseeing an old-timey, one-room schoolhouse with a dozen students, but an infinite warehouse with thousands of students. The test isn't ten questions, but a million questions. And how many times does the test, build, test loop repeat? As many as necessary.

At first, students that survive are just lucky, but by combining enough lucky bots, keeping only what works, and randomly messing around with new copies of that, eventually a student bot emerges that isn't lucky, that can perhaps barely tell bees from threes. As this bot is copied and changed, slowly the average test score rises, and thus the grade needed to survive the next round gets higher and higher. Keep this up, and eventually from the infinite warehouse (slaughterhouse) a student bot will emerge who can tell a bee from a three in a photo it's never seen before pretty well.

But how the student bot does this, neither the teacher bot, nor the builder bot, nor the human overseer can understand. Nor the student bot itself. After keeping so many useful random changes, the wiring in its head is incredibly complicated, and while an individual line of code may be understood, and clusters of code's general purpose vaguely grasped, the whole is beyond. Nonetheless, it works.

But this is frustrating, especially as the student bot is very good at exactly only the kinds of questions it's been taught to. It's great with photos, but useless with videos, or baffled if the photos are upside down, or things that are obviously not bees, it's confident are. Since teacher bot can't teach, all the human overseer can do is give it more questions, to make the test even longer, to include the kinds of questions the best bots get wrong.

This is important to understand. It's a reason why companies are obsessed with collecting data: more data equals longer tests equals better bots. So when you get the "Are you human?" test on a website, you are not only proving that you are human (hopefully), but you are also helping to build the test to make bots that can read, or count, or tell lakes from mountains, or horses from humans. Seeing lots of questions about driving lately? Hmm...! What could that be building a test for?

Now, figuring out what's in a photo, or on a sign, or filtering videos requires humans to make correct enough tests. But there is another kind of test that makes itself: tests ON the humans. For example, say entirely hypothetical NetMeTube wanted users to keep watching as long as possible. Well, how long a user stays on the site is easy to measure. So teacher bot gives each student bot a bunch of NetMeTube users to oversee; the student bots watch what their user watches, look at their files, and do their best to pick the videos that keep the user on the site. The longer the average, the higher their test score. Build, test, repeat.
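The build-test-repeat loop described above is, in essence, an evolutionary search. Here is a deliberately toy sketch of it: the "test" is matching a hidden target vector rather than classifying photos, and the population size, mutation rate, and generation count are all invented for illustration:

```python
# Toy version of the build-test loop: builder bot makes random student bots,
# teacher bot only scores them, and each generation keeps the best and
# breeds mutated copies of the survivors.
import random

GENES = 16  # size of a student bot's "brain" (illustrative)
TARGET = [random.uniform(-1, 1) for _ in range(GENES)]  # stands in for the test

def build_random_bot():
    return [random.uniform(-1, 1) for _ in range(GENES)]

def mutate(bot, rate=0.1):
    # Builder bot copies a survivor "with changes": small random tweaks.
    return [gene + random.gauss(0, rate) for gene in bot]

def test_score(bot):
    # Teacher bot can't teach, only test: higher scores are better.
    return -sum((gene - target) ** 2 for gene, target in zip(bot, TARGET))

population = [build_random_bot() for _ in range(100)]
for generation in range(200):
    population.sort(key=test_score, reverse=True)
    survivors = population[:10]  # keep the best, recycle the rest
    population = survivors + [mutate(random.choice(survivors)) for _ in range(90)]

best = max(population, key=test_score)
print("best score after 200 generations:", test_score(best))
```

Most production systems train a single model with gradient descent rather than breeding a population; that distinction is the subject of the footnote video linked in the comments above. But the select-and-mutate loop here matches the framing the video itself uses.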
A million cycles later, there's a student bot who's pretty good at keeping the users watching, at least compared to what a human could build. But when people ask, "How does the NetMeTube algorithm select videos?", once again there isn't a great answer other than pointing to the bot, and the user data it had access to, and, most vitally, how the human overseers direct teacher bot to score the test. That's what the bot is trying to be good at to survive. But what the bot is thinking, or how it thinks it, is not really knowable. All that's knowable is that this student bot gets to be the algorithm because it's point one percent better than the previous bot at the test the humans designed.

So everywhere on the internet, behind the scenes, there are tests to increase user interaction, or set prices just right to maximize revenue, or pick the posts from all your friends you'll like the most, or articles people will share the most, or whatever. If it's testable, it's teachable. Well, "teachable", and a student bot will graduate from the warehouse to be the algorithm of its domain. At least, for a little while.

We're used to the idea that with the tools we use, even if we don't understand them, someone does. But with our machines that learn, we are increasingly in a position where we use tools, or are used by tools, that no one, not even their creators, understands. We can only hope to guide them with the tests we make, and we need to get comfortable with that, as our algorithmic bot buddies are all around, and not going anywhere.

OK. The bots are watching. You know what's coming. This is where I need to ask you... to like... comment... ...and subscribe. And bell me. And share on the TweetBook. The algorithm is watching. It won't show people the video... unless you do this. Look what you've reduced me to, bots. What do you want? Do you want watch time? Is that what you want? Fine. (sigh...) Hey guys, did you know I also have podcasts you can listen to? Maybe even just in the background while you're tidying up your room for hours? Or whatever? There's hours of audio entertainment for you, and watch time for the bots overseeing your actions. Go ahead and - and take a click. Entertain yourself. Help me. Help the bots.
Info
Channel: CGP Grey
Views: 6,427,062
Rating: 4.9738102 out of 5
Keywords: cgpgrey, education, hello internet
Id: R9OHn5ZF4Uo
Length: 8min 54sec (534 seconds)
Published: Mon Dec 18 2017