10.4: Neural Networks: Multilayer Perceptron Part 1 - The Nature of Code

Captions
Hi again! So maybe you just watched my previous videos about coding a perceptron, and now I want to ask the question: why not just stop here? So, okay, we had this very simple scenario, right, where we have a canvas with a whole bunch of points in it, a Cartesian plane, whatever we want to call it, and we threw a line in between, and we were trying to classify the points that are on one side of the line versus the points that are on the other side. That was a scenario where we had a single perceptron, a sort of processing unit; we can call it the neuron or the processor. It received inputs: it had x0 and x1, which were the x and y coordinates of the point, and it also had this thing called a bias, and then it generated an output. Each one of these inputs was connected to the processor with a weight: weight zero, weight one, weight two. The processor creates a weighted sum of all the inputs multiplied by the weights, and that weighted sum is passed through an activation function to generate the output (there's a quick code sketch of this feedforward step below).

So why isn't this good enough? Let's first think about what the limit is here. The idea is: what if I want any number of inputs to generate any number of outputs? That's the essence of what I want to do in a lot of different machine learning applications. Let's take a very classic classification problem, which is to say: okay, what if I have a handwritten digit, like the number 8, and I have all of the pixels of this digit, and I want those to be the inputs to this perceptron, and I want the output to tell me a set of probabilities as to which digit it is? The output should look something like: there's a 0.1 chance it's a zero, a 0.2 chance it's a one, a 0.1 chance it's a two, and so on for three, four, five, six, seven, and then, oh, a 0.99 chance it's an eight and a 0.05 chance it's a nine. I don't think I got those to add up to one, but you get the idea. So the idea here is that we want some type of processing unit that can take an arbitrary number of inputs. Maybe this is a 28 by 28 pixel image, so there are 784 grayscale values, and all of those come into the processor, which does its weighted sum and all of this stuff, and we get an output with some arbitrary number of probabilities to help us guess that this is an eight. In this model, why couldn't I just have a whole bunch more inputs and then a whole bunch more outputs, but still have one single processing unit?

The reason why I can't stems from an article, I'm sorry, a book, published in 1969 by Marvin Minsky and Seymour Papert, called Perceptrons. You know, AI luminaries here. In the book Perceptrons, Minsky and Papert point out that a simple perceptron, the thing that I built in the previous two videos, can only solve linearly separable problems. So what does that mean, anyway, and why should you care about that? Let's think about this: this over here is a linearly separable problem, meaning I need to classify this stuff, and if I were to visualize it all, I can draw a line in between; this stuff is of this class, and this stuff is of that class. The stuff itself is separable by a line. In three dimensions I could put a plane in, and that would be linearly separable too, because I can kind of divide the space in half and understand it that way. The problem is, most interesting problems are not linearly separable.
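Here is that feedforward step as a quick code sketch, a minimal, hypothetical version in JavaScript; the function names here are illustrative, not the ones from the library we'll eventually build:

```javascript
// A single perceptron: a weighted sum of the inputs, passed through
// an activation function. (Illustrative sketch, not the final library.)

// Simple step activation: which side of the line are we on?
function activate(sum) {
  return sum >= 0 ? 1 : -1;
}

// Weighted sum of all inputs times their weights, plus the bias.
function feedforward(inputs, weights, bias) {
  let sum = bias;
  for (let i = 0; i < inputs.length; i++) {
    sum += inputs[i] * weights[i];
  }
  return activate(sum);
}

// Classify a point (x0, x1) against some weights training has found.
console.log(feedforward([0.5, -0.2], [0.3, 0.8], 0.1)); // 1 or -1
```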
You know, there might be some data which clusters all here in the center that is of one class, but anything outside of it is of another class, and I can't draw one line to separate that stuff. And you might even be thinking: but there's still so much you could do with linearly separable stuff! Well, I'm going to show you a particular problem right now (I'm looking for an eraser, running around like a crazy person) called XOR, and I'm making the case for why we need to go a step further and make something called a multilayer perceptron. I'm going to lay out that case for you right now.

So you might remember me from my videos on conditional statements and boolean expressions. In those videos I talked about operations like AND and OR, which in computer programming syntax are often written as a double ampersand (&&) or two pipes (||). The idea is that I can make a truth table. I have two elements, A and B, and I'm asking about A and B. True and true yields true: if I am hungry and I am thirsty, I shall go and have lunch, right? True and false is false, false and true is false, and false and false is false. If I have a boolean expression A and B, I need both of those things to be true in order to get true.

Interestingly enough, this is a linearly separable problem. I can draw a line right here, and true is on one side and false is on the other side. And since this is a linearly separable problem, I could create a perceptron with two inputs, which are going to be boolean values, true or false, and I could train this perceptron to give me an output: if two trues come in, I should get a true; if one false and one true come in, I should get a false; if two falses come in, I should get a false. Great. Or I could do the same thing with OR. What changes if I do OR? Let me erase this dotted line, and now, with OR, all of these become true, because with an or operation, A or B, I only need one of them to be true in order to get true; but if both are false, I get false. And guess what: still a linearly separable problem. AND is linearly separable, OR is linearly separable, and we could have a perceptron learn to do both of those things (there's a little sketch of that in code below).
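To make that concrete, here's a little sketch of a single step-activation perceptron computing AND and OR with hand-picked weights. These particular weights are my own illustrative choices, not trained values:

```javascript
// One perceptron with a step activation. Inputs are 0 (false) or 1 (true).
function perceptron(inputs, weights, bias) {
  const sum = inputs[0] * weights[0] + inputs[1] * weights[1] + bias;
  return sum > 0 ? 1 : 0;
}

// Hand-picked (not trained) weights: the bias positions the dividing line.
const AND = (a, b) => perceptron([a, b], [1, 1], -1.5); // only (1,1) clears 1.5
const OR  = (a, b) => perceptron([a, b], [1, 1], -0.5); // any single 1 clears 0.5

console.log(AND(1, 1), AND(1, 0), AND(0, 0)); // 1 0 0
console.log(OR(1, 1), OR(0, 1), OR(0, 0));    // 1 1 0
```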
Now hold on a second: there is another boolean operator which you might not have heard of until this video, and it would actually make me very happy if somebody watching this has never heard of it before. It is called XOR. Can you see what I'm writing here? XOR: the X stands for exclusive. It's exclusive or, which means it's only true if one input is true and the other is false. Let me erase all this and lay it out: if one is true and one is false, it's true; if both are true, it's false; if both are false, it's false. This is exclusive or, a very simple boolean operation. However, I triple dog dare you, with a cherry on top, to draw a single line through here to divide the falses and the trues.

I cannot. This is not a linearly separable problem, and that is the point of all this rambling. I could draw two lines, one here and one here, and now I have all the trues in here and the falses outside of them. This means a single perceptron, the simplest one, cannot solve a simple operation like this. So this is what Minsky and Papert talked about in the book Perceptrons: the perceptron is an interesting idea conceptually, and it seems very exciting, but if it can't solve XOR, what are we supposed to do?

The answer, and maybe you've already thought of this yourself, is that I've missed a little piece of my diagram here. Let's say this is a perceptron that knows how to solve AND, and this is a perceptron that knows how to solve OR. What if I took those same inputs and sent them into both, and then took the outputs? This output would give me the result of AND, and this output would give me the result of OR. Well, what is XOR, really? XOR is actually OR but NOT AND, right? AND is linearly separable, and NOT AND is also linearly separable. So what I want is for both of these outputs to go into another perceptron that then does AND. This perceptron can solve NOT AND, this perceptron can solve OR, and those outputs come into here; the result is true when both OR is true and NOT AND is true, and those are exactly the only two cases where OR is true but AND is not (you can see this layering in the code sketch below). So the idea here is that more complex problems that are not linearly separable can be solved by linking multiple perceptrons together, and this is the idea of a multilayer perceptron: we have multiple layers. This is still a very simple diagram; you could think of this almost as if you were designing a circuit, deciding where electricity should flow. If these were switches, how could you get an LED to turn on with exclusive or? You would actually wire the circuit in basically exactly this way.
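Here is that layering written out as a code sketch, using the same hand-picked step-activation perceptron as above; the weights are again illustrative assumptions rather than trained values:

```javascript
// The same step-activation perceptron as in the sketch above.
function perceptron(inputs, weights, bias) {
  const sum = inputs[0] * weights[0] + inputs[1] * weights[1] + bias;
  return sum > 0 ? 1 : 0;
}

// NOT AND is also linearly separable: flip the AND weights and bias.
const NAND = (a, b) => perceptron([a, b], [-1, -1], 1.5); // 0 only for (1, 1)
const OR   = (a, b) => perceptron([a, b], [1, 1], -0.5);
const AND  = (a, b) => perceptron([a, b], [1, 1], -1.5);

// Two layers: the same inputs feed OR and NAND; their outputs feed AND.
// XOR is true exactly when OR is true but AND is not.
const XOR = (a, b) => AND(OR(a, b), NAND(a, b));

console.log(XOR(0, 0), XOR(0, 1), XOR(1, 0), XOR(1, 1)); // 0 1 1 0
```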
So, at some point I would like to make a video where I take that previous perceptron example and push it a few steps further to do exactly that. But what I'm actually going to do in the next videos is diagram out the structure of a multilayer perceptron: how the inputs work, how the outputs work, how the feedforward algorithm works (where the inputs come in, get multiplied by weights, get summed together, and generate an output), and then build a simple JavaScript library that has all the pieces of that neural network system in it. Okay, so I hope that this video gives you a nice follow-up to the perceptron, in the sense of why this is important. I'm not sure if I'm done yet; I'm going to go check the live chat, and if there are no questions or important things like that, this video will be over.

Oh yeah, I'm back. So there was one question which is important: somebody in the chat asked, what about the hidden layer? This is jumping ahead a little bit, because I'm going to get to it in more detail in the next video, but the way that I drew this diagram is pretty awkward, so let me try to fix it up for a second. Imagine there were two inputs, and I actually drew those as if they were neurons (I know I'm out of the frame, but I'm still here), and these inputs were connected to each of these perceptrons: each one was connected, and each connection was weighted. This is actually what's now known as a three-layer network. There is the input layer, there is the hidden layer, and there is the output layer. That part's obvious, right? These are the inputs, the trues and falses, and this is the output layer that should give us a result: are we true or are we false? And then the hidden layer is the set of neurons that sit in between the inputs and the outputs. They're called hidden because, as a user of the system, we don't necessarily see them; a user of the system is feeding in data and looking at the output. The hidden layer, in a sense, is where the magic happens. The hidden layer is what allows the network to get around this linear separability question.

The more layers and the more neurons, the more complexity in the system, and the more weights, the more parameters that need to be tweaked. We'll see that as I start to build a neural network library. The way that I want that library to be set up, I want to be able to say: make a network with ten inputs, three outputs, and one hidden layer with 15 hidden neurons, something like that (there's a rough sketch of that idea below). But there could be multiple hidden layers, and eventually, as I get further and further down this road, if I keep going, we'll see there are all sorts of other styles of how a network can be configured and set up: whether the output feeds back into the input, which is something called a recurrent network, or convolutional networks, where image-processing operations happen early on as part of the layers. So there's a lot of stuff in the grand scheme of things to get to, but this is the fundamental building block. Okay, so in the next video I'm going to set up the basic skeleton of the neural network library and look at all the pieces that we need, and then I'm going to keep going and look at some matrix math. That's going to be fun. Okay, see you soon!
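As a preview, here's a rough, hypothetical sketch of what that library setup might look like; the actual constructor and its arguments get built in the next videos, so these names are just a guess at the shape of it:

```javascript
// Hypothetical shape of the library setup described above.
// The real class is built step by step in the following videos.
class NeuralNetwork {
  constructor(numInputs, numHidden, numOutputs) {
    this.numInputs = numInputs;   // e.g. 784 grayscale pixel values
    this.numHidden = numHidden;   // one hidden layer of, say, 15 neurons
    this.numOutputs = numOutputs; // e.g. 10 digit probabilities
    // Weight matrices between the layers would be initialized here,
    // which is where the upcoming matrix math comes in.
  }
}

// "Ten inputs, three outputs, one hidden layer with 15 hidden neurons":
const nn = new NeuralNetwork(10, 15, 3);
```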
Info
Channel: The Coding Train
Views: 232,280
Rating: 4.9255605 out of 5
Keywords: live, programming, daniel shiffman, creative coding, coding challenge, tutorial, coding, challenges, coding train, the coding train, nature of code, artificial intelligence, itp nyu, neural network, intelligence creative coding, neural network artist, intelligence and learning, machine learning, perceptron, multilayered perceptron, neural network intro, XOR neural network, neural net, XOR perceptron, perceptron javascript, multilayer perceptron, multilayer neural net
Id: u5GAVdLQyIg
Length: 15min 55sec (955 seconds)
Published: Tue Jun 27 2017