Why Neural Networks can learn (almost) anything

Video Statistics and Information

Reddit Comments

One of the best videos on NNs on YouTube. I'd be more than happy to see similar ones.

👍︎ 4 👀︎ u/didinko 📅︎ Oct 19 2022 🗫︎ replies

Thank you for sharing. I really enjoyed this. Non-scientist and was easily able to understand it.

👍︎ 3 👀︎ u/rannieb 📅︎ Oct 19 2022 🗫︎ replies
Captions
You are currently watching an artificial neural network learn. In particular, it's learning the shape of an infinitely complex fractal known as the Mandelbrot set. This is what that set looks like: complexity all the way down.

Now, in order to understand how a neural network can learn the Mandelbrot set, really how it can learn anything at all, we will need to start with a fundamental mathematical concept: what is a function? Informally, a function is just a system of inputs and outputs, numbers in, numbers out. In this case you input an x and it outputs a y. You can plot all of a function's x and y values in a graph, where it draws out a line. What is important is that if you know the function, you can always calculate the correct output y given any input x.

But say we don't know the function and instead only know some of its x and y values. We know the inputs and outputs, but we don't know the function used to produce them. Is there a way to reverse-engineer the function that produced this data? If we could construct such a function, we could use it to calculate a y value given an x value that is not in our original data set. This would work even if there was a little bit of noise in our data, a little randomness; we can still capture the overall pattern of the data and continue producing y values that aren't perfect, but are close enough to be useful. What we need is a function approximation, and more generally a function approximator. That is what a neural network is.

This is an online tool for visualizing neural networks, and I'll link it in the description below. This particular network takes two inputs, x1 and x2, and produces one output. Technically this function would create a three-dimensional surface, but it's easier to visualize in two dimensions. This image is rendered by passing the x, y coordinate of each pixel into the network, which then produces a value between negative one and one that is used as the pixel value. These points are our data set and are used to train the network. When we begin training, it quickly constructs a shape that accurately distinguishes between blue and orange points, building a decision boundary that separates them. It is approximating the function that describes the data. It's learning, and it is capable of learning the different data sets that we throw at it.

So what is this middle section, then? Well, as the name implies, this is the network of neurons. Each one of these nodes is a neuron, which takes in all the inputs from the previous layer of neurons and produces one output, which is then fed to the next layer. Inputs and outputs: sounds like we're dealing with a function. Indeed, a neuron itself is just a function, one that can take any number of inputs and has one output. Each input is multiplied by a weight, and all are added together along with a bias. The weights and bias make up the parameters of this neuron, values that can change as the network learns. To keep it easy to visualize, we'll simplify this down to a two-dimensional function with only one input and one output.

Now, neurons are the building blocks of the larger network, building blocks that can be stretched and squeezed and shifted around, and ultimately work with other blocks to construct something larger than themselves. The neuron as we've defined it here works like a building block. It is actually an extremely simple linear function, one which forms a flat line (or a plane, when there's more than one input). With the two parameters, the weight and bias, we can stretch and squeeze our function and move it up and down and left and right. As such, we should be able to combine it with other neurons to form a more complicated function, one built from lots of linear functions.
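The neuron just described (multiply each input by a weight, sum the results, add a bias) is simple enough to write down directly. Here is a minimal sketch in Python; the function and variable names are purely illustrative and are not taken from the video or the visualization tool.

    def neuron(inputs, weights, bias):
        # A plain linear neuron: weight each input, sum the results, add the bias.
        return sum(w * x for w, x in zip(weights, inputs)) + bias

    # Simplified to one input and one output, this is just y = w*x + b,
    # a straight line we can stretch (w) and shift (b).
    y = neuron([0.5], weights=[2.0], bias=-1.0)   # 2.0 * 0.5 + (-1.0) = 0.0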
So let's start with a target function, one we want to approximate. I've hard-coded a bunch of neurons whose parameters were found manually, and if we weight each one and add them up, as would happen in the final neuron of the network, we should get a function that looks like the target function. Well, that didn't work at all. What happened? If we simplify our equation, distributing weights and combining like terms, we end up with a single linear function. It turns out that linear functions can only combine to make one linear function. This is a big problem, because we need to make something more complicated than just a line. We need something that is not linear: a non-linearity.

In our case we will be using a ReLU, a rectified linear unit. We use it as our activation function, meaning we simply apply it to our previous naive neuron. This is about as close as you can get to a linear function without actually being one, and we can tune it with the same parameters as before. However, you may notice that we can't actually lift the function off of the x-axis, which seems like a pretty big limitation. Well, let's give it a shot anyway and see if it performs any better than our previous attempt. We're still trying to approximate the same function, and we're using the same weights and biases as before, but this time we're using a ReLU as our activation function. And just like that, the approximation looks way better. Unlike before, our function cannot simplify down to a flat linear function. If we add the neurons one by one, we can see the simple ReLU functions building on one another, and the inability of one neuron to lift itself off the x-axis doesn't seem to be a problem; many neurons working together overcome the limitations of individual neurons.

Now, I manually found these weights and biases, but how would you find them automatically? The most common algorithm for this is called backpropagation, and it is in fact what we're watching when we run this program. It essentially tweaks and tunes the parameters of the network bit by bit to improve the approximation. The intricacies of this algorithm are really beyond the scope of this video; I'll link some better explanations in the description.

Now we can see how this shape is formed, and why it looks like it's made up of sort of sharp linear edges: it's the nature of the activation function we're using. We can also see why, if we use no activation function at all, the network utterly fails to learn. We need those non-linearities.

So what if we try learning a more complicated data set, like this spiral? Let's give it a go. It seems to be struggling a little bit to capture the pattern. No problem: if we need a more complicated function, we can add more building blocks, more neurons and layers of neurons, and the network should be able to piece together a better approximation, something that really captures the spiral. It seems to be working. In fact, no matter what the data set is, we can learn it. That is because neural networks can be rigorously proven to be universal function approximators: they can approximate any function to any degree of precision you could ever want. You can always add more neurons. This is essentially the whole point of deep learning, because it means that neural networks can approximate anything that can be expressed as a function, a system of inputs and outputs. This is an extremely general way of thinking about the world.
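As a rough illustration of that idea (with hand-picked numbers, not the video's actual parameters), here is how a few hard-coded ReLU neurons can be weighted and summed into a curve that no single linear function could produce:

    import numpy as np

    def relu(z):
        # Rectified linear unit: zero for negative inputs, unchanged otherwise.
        return np.maximum(0.0, z)

    def hidden_neuron(x, w, b):
        # One hidden neuron: a linear function followed by the ReLU activation.
        return relu(w * x + b)

    # Illustrative hand-picked parameters: (weight, bias) for three hidden
    # neurons, plus the weights and bias of the output neuron that sums them.
    hidden_params = [(1.0, 0.0), (2.0, -1.0), (-1.5, 1.5)]
    output_weights = [1.0, -0.8, 0.6]
    output_bias = 0.2

    def network(x):
        # The output neuron is itself just a weighted sum plus a bias,
        # applied to the hidden neurons' ReLU outputs.
        h = [hidden_neuron(x, w, b) for (w, b) in hidden_params]
        return sum(wo * hi for wo, hi in zip(output_weights, h)) + output_bias

    xs = np.linspace(-2.0, 2.0, 9)
    print([round(float(network(x)), 3) for x in xs])  # piecewise-linear, not flat

If the ReLU is removed (replaced by the identity), the same sums collapse algebraically back to a single w*x + b, which is exactly the failure described above.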
The Mandelbrot set, for instance, can be written as a function and learned all the same. This is just a scaled-up version of the experiment we were just looking at, but with an infinitely complex data set. We don't even really need to know what the Mandelbrot set is; the network learns it for us, and that's kind of the point. If you can express any intelligent behavior, any process, any task as a function, then a network can learn it. For instance, your input could be an image and your output a label as to whether it's a cat or a dog, or your input could be text in English and your output a translation to Spanish. You just need to be able to encode your inputs and outputs as numbers, and computers do this all the time: images, video, text, audio can all be represented as numbers, and any processing you may want to do with this data, so long as you can write it as a function, can be emulated with a neural network.

It goes deeper than this, though. Under a few more assumptions, neural networks are provably Turing complete, meaning they can solve all of the same kinds of problems that any computer can solve. An implication of this is that any algorithm written in any programming language can be simulated on a neural network, but rather than being manually written by a human, it can be learned automatically with a function approximator. Neural networks can learn anything.

Okay, that is not true. First off, you can't have an infinite number of neurons; there are practical limitations on network size and on what can be modeled in the real world. I've also ignored the learning process in this video and just assumed that you can find the optimal parameters magically. How you realistically do this introduces its own constraints on what can be learned. Additionally, in order for neural networks to approximate a function, you need the data that actually describes that function. If you don't have enough data, your approximation will be all wrong; it doesn't matter how many neurons you have or how sophisticated your network is, you just have no idea what your actual function should look like. It also doesn't make a lot of sense to use a function approximator when you already know the function. You wouldn't build a huge neural network to, say, learn the Mandelbrot set when you can just write three lines of code to generate it, unless of course you want to make a cool background visual for a YouTube video. There are countless other issues that have to be considered.

But for all these complications, neural networks have proven themselves to be indispensable for a number of famously difficult problems for computers. Usually these problems require a certain level of intuition and fuzzy logic that computers generally lack, and they are very difficult for us to manually write programs to solve. Fields like computer vision, natural language processing, and other areas of machine learning have been utterly transformed by neural networks. And this is all because of the humble function, a simple yet powerful way to think about the world. By combining simple computations, we can get computers to construct any function we could ever want. Neural networks can learn almost anything.
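For context on the "three lines of code" remark near the end: a function that tests whether a point belongs to the Mandelbrot set really is tiny. A minimal sketch in Python follows; the escape radius of 2 is standard, and the iteration cap of 100 is an arbitrary choice, not something specified in the video.

    def in_mandelbrot(c, max_iter=100):
        # Iterate z -> z*z + c; if |z| never escapes past 2, c is (approximately) in the set.
        z = 0
        for _ in range(max_iter):
            z = z * z + c
            if abs(z) > 2:
                return False
        return True

    print(in_mandelbrot(complex(-1.0, 0.0)))  # True: -1 stays bounded
    print(in_mandelbrot(complex(1.0, 1.0)))   # False: 1+1j escapes quickly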
Info
Channel: Emergent Garden
Views: 949,068
Id: 0QczhVg5HaI
Length: 10min 30sec (630 seconds)
Published: Sat Mar 12 2022