Fundamentals of TensorFlow

Captions
So, hello everybody. My name is Sam Abrahams and I'm a Los Angeles-based machine learning engineer. One of the projects I'm part of is the TensorFlow White Paper Notes, an annotated guide to the white paper Google released for TensorFlow back in November. If you haven't read it and you're interested in how TensorFlow is set up and what its infrastructure looks like, I highly recommend it; it's an enjoyable, easy-to-read, genuinely interesting white paper. I'm also the maintainer of TensorFlow on Raspberry Pi, so if you want to try out TensorFlow, including its distributed functionality, on a Raspberry Pi, I have binaries up there for both Python 2 and Python 3. Check those out; they're completely free. And I'm a contributor to the TensorFlow project. It's a lot of fun getting the opportunity to work, even if only digitally, with a lot of really talented engineers, and if you want to improve your engineering ability I highly recommend getting out there and finding something, anything, that you think could be moderately improved. Give it your best shot; the people in charge of the TensorFlow project will guide you and help you solve the problem, rather than just doing it for you.

This talk is going to be a sprint; there's a lot I'm trying to cover. It seems like pretty much everyone here is here for TensorFlow, which is fantastic, so just by a show of hands: how many people have used TensorFlow before? How many have read about it, done some research, and have a good idea of what's going on with it? And how many have heard of TensorFlow but really don't know much beyond "it's used for machine learning, it's fast, Google made it, and I want to be a part of it"? Okay, so we have a fairly diverse mix, and I'll do my best to be thorough without boring those of you who are already using TensorFlow.

We're going to go over some core TensorFlow API terminology; I don't want anybody to be left behind as we move forward. On the whole, TensorFlow isn't a very complex workflow, but I don't want people to get lost when we start talking about graphs, edges, and adding nodes. Like I said, we'll talk about the TensorFlow workflow, I'll show you some example TensorFlow code, we'll do a short live-coding session, and we'll go over TensorBoard, the really cool visualization software that ships as part of the TensorFlow package.

So let's talk about the TensorFlow programming model. If any of you have used Theano before, it's very similar: it's another computation-graph-based engine. The user-facing API that pretty much everybody uses day to day is Python, but the majority of the actual execution code is written in C++.
The idea is that we have this very user-friendly Python interface in front, while the actual meat underneath is already-compiled C++ code, which makes it much more efficient than something written purely in Python. For each function in TensorFlow there are separate implementations for CPU and for GPU. In general everything can run on a CPU, and where an operation gets an additional benefit from a GPU, or where it's simply possible to run it there even if it isn't highly parallelized, TensorFlow is able to decide whether to place operations on the CPU or the GPU based on what can go where.

Since this is a data science hub, I assume a lot of you are familiar with NumPy, one of the linguae francae of data science and the standard Python n-dimensional array library. TensorFlow is designed to be completely integrated with NumPy: you can pass in NumPy objects as if they were TensorFlow objects, and the objects returned at the end of a TensorFlow run are actually NumPy objects. That means you can seamlessly go from manipulating your data, to passing it into a TensorFlow graph, to receiving transformed data back and playing around with it some more in NumPy.

TensorFlow programming, at the end of the day, boils down to two steps. First, you build the graph you want to run: you create all of the nodes for the different steps, but nothing has executed yet. Second, you use a TensorFlow session to run that graph as many times as you want. For machine learning, the graph you build is your model, your prediction step; you analyze the error, do some sort of gradient descent (for a neural network, you backpropagate), and then run it many times, with each run corresponding to a single training batch. A minimal sketch of this build-then-run pattern follows below.

So let's talk about graphs. At the core of everything is the graph; it's the primary structure in TensorFlow. In general, when you're making your own models you'll only be dealing with a single graph, but you should know that TensorFlow is capable of handling multiple graphs, and there's no reason not to; in fact, you can think of one big graph as something that could be split in half into two separate graphs if you'd like. Nodes represent computations or data transformations, and edges represent data transfers. If you haven't seen much graph theory, a graph is pretty much just circles and lines: the circles represent computations, like add, multiply, or matrix multiply, and the edges between them, the arrows, say that the output of this addition moves on to the next node, which might be a multiplication. We'll go through this quickly; hopefully it isn't too dull, but I don't want to leave anybody behind if they haven't seen graphs in mathematical form before.
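Here's a minimal sketch of that two-step pattern, building a tiny graph and then running it in a session. This isn't from the slides; it's an illustration using the session-style API the talk is based on (on TensorFlow 2.x you'd need the tf.compat.v1 equivalents):

    import tensorflow as tf

    # Step 1: build the graph. Nothing is computed here; each call just adds
    # a node to the default graph and returns a handle to its future output.
    a = tf.constant(3, name="a")
    b = tf.constant(4, name="b")
    c = tf.add(a, b, name="c")      # a node whose output will be a + b

    # Step 2: run the graph (as many times as you like) inside a session.
    with tf.Session() as sess:
        print(sess.run(c))          # -> 7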
So what is a data flow graph, also known as a computational graph? It's just a very nice way to visualize abstract mathematical computations. Take a very simple graph: we have some input a and some input b. The value b moves along an edge into an add node, the value a moves into the same node, and the add node outputs the value of a + b. Pretty simple, and the idea is that we can chain these together. We already went through nodes and edges: nodes represent operations, and edges carry the values between them.

So why use graphs? One, they're highly compositional, which makes them incredibly flexible for research purposes. Theano, Torch, and a lot of these other machine learning libraries are useful because they let researchers put together very fluid, flexible models without having to worry about whether the library can support them. If you're using a very heavy-handed, layer-by-layer approach, you may not get the flexibility you're looking for, and you may not be able to tweak your model to the degree necessary to get that extra percentage of accuracy.

In addition, graphs are useful for calculating derivatives, if any of you remember your chain rule. If you're getting into machine learning, I highly recommend falling in love with calculus again; ideally all of it, but at minimum the derivative bits. The idea is that we have a function f(x) and a function g(y), and we're looking at the composition g(f(x)). The goal is to get the derivative of g with respect to x, but there's this y in the middle, so how do we get there? The answer is that you take the derivative of g with respect to y and multiply it by the derivative of f with respect to x, math happens, and you have what you want. The key is that the chain rule doesn't just go two layers deep; it goes very deep, which is part of why they call it deep learning. You can have a network that is many, many layers deep and still get the correct derivative with respect to any of your inputs, and if you've gone through basic neural network training you know that backpropagation is all about applying the chain rule backwards through a graph.

Graphs also make it easier to implement distributed computing. This one is pretty straightforward: if you can segment your computation into individual chunks, it's much easier to place them on different devices and different threads, because the work is already segmented. And, as I said, neural networks are pretty much already computation graphs.
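Written out, the chain rule being invoked here is, in my notation rather than the slides': for y = f(x) and the composition g(f(x)),

    \frac{d}{dx}\, g\bigl(f(x)\bigr) \;=\; \left.\frac{dg}{dy}\right|_{y=f(x)} \cdot \frac{df}{dx}

Backpropagation is this product applied over and over, layer by layer, backwards through the graph.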
Okay, I'll try to sprint through tensors, just in case you're uncomfortable with the terminology. A tensor is just an abstraction of a matrix: a two-dimensional tensor is a matrix, a one-dimensional tensor is a vector, and a zero-dimensional tensor is a scalar, just a single number. As you go higher, you can imagine a three-dimensional tensor as a cube and a four-dimensional one as a hypercube, but in reality you can just think of them as higher-dimensional arrays or matrices. In TensorFlow most of the math actually happens in two dimensions anyway; that's not the case for every operation, but if it's easier for you to think of tensors as matrices, feel free to do so. I won't judge you, and that's what's important.

Just to show how you would define tensors in TensorFlow, for those who haven't seen it: you can write them as standard Python lists. A single number is a scalar; you can pass it into TensorFlow operations and it will be read as a tensor. The same goes for a single list, which is a vector; a list of lists, which is a matrix; a list of lists of lists, which is a three-dimensional tensor; and so on. Or, because TensorFlow integrates seamlessly with NumPy, you can do the same thing with the same sort of verbiage, wrapped in an np.array call. The reason you would want to do this, and it's actually pretty important, is that NumPy lets you specify the exact data type you'd like; you won't see it much in these slides, but I encourage you to use NumPy for this as you're learning TensorFlow. Python is a very loosely, even weakly, typed language, while TensorFlow has a very robust selection of types that go all the way down to the C++ implementation. Because Python doesn't have a way of saying whether you want a 32-bit floating-point number, a 64-bit floating-point number, a 64-bit integer, and so on, you can't specify that information without NumPy. If you use NumPy you can do exactly that, and you won't have to deal with weird type-mismatch errors as you move along. That's piece of advice number one: please use NumPy, even though I'm not doing it here because I'm lazy and slide space is limited. (There's a short sketch of this just below.)
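A quick sketch of that advice, passing NumPy arrays with explicit dtypes into TensorFlow ops; the shapes and values here are just for illustration:

    import numpy as np
    import tensorflow as tf

    # Plain Python lists work, but you can't control the precision:
    a = tf.constant([1, 2, 3])

    # With NumPy you can pin the exact type TensorFlow should use:
    b = np.array([1, 2, 3], dtype=np.int32)
    c = np.array([[1.0, 2.0], [3.0, 4.0]], dtype=np.float32)   # 32-bit floats, not 64

    d = tf.add(a, b)   # NumPy arrays are accepted anywhere a tensor is expected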
Like I said, I've been using the word "operations"; you'll also see the word "op." I've intentionally tried not to say "op" too much because of DevOps and that whole field, but operations, with a capital O, represent any sort of computation in TensorFlow. An operation appears to you as a Python function: you call it, you pass in zero or more tensors as input, and it performs some computation. These computations can be as simple as adding two tensors together, or they can have nothing to do with math itself; they may initialize variables or affect the graph in some other way. The main idea is that you pass in zero or more tensors and they return zero or more tensors.

The big thing that may catch some people off guard is that operations don't execute immediately. What you get back from that Python function is a handle to the operation, a handle to its output; you can pass that output into another operation, or you can run the graph up to that point. So just remember, when you're dealing with operations, they won't run instantaneously. When you say a = 1, b = 2, and c = tf.add(a, b), you won't get the number 3; you'll get a handle to a TensorFlow operation which, if you run it (and we'll go over running shortly), will return 3. Exactly: you have access to the operation itself, the concept of the operation, and it knows what inputs it takes, but it hasn't actually evaluated anything; it's evaluating lazily.

Here's a quick example using simple element-wise multiplication, not matrix multiplication. First we import TensorFlow; by convention we shorten it to tf, because writing out "tensorflow" every time is sadistic. We say a = tf.mul(3, 5), and notice we're passing these in directly as Python numbers; as I said, I'm a bad person for not using NumPy. What a holds is a handle to this multiplication node; it doesn't return the number 15 yet. What we have to do is start a session (we'll talk more about sessions soon), and once we've opened a session we can run a, and then, finally, it returns 15. Hooray.

Okay, so let's talk a little bit about placeholders. Placeholders are your input nodes. As you run a graph, you're not always going to want the same numbers every time; when you're doing machine learning, you need to pass in training data and the labels for that data. The idea is that, in order for TensorFlow to compute things like matrix multiplications, it needs to know the dimensions of the matrices involved. So when you specify a placeholder, you're saying: I don't have any data for you yet, but I do know it's going to be, say, 100 by 400, so TensorFlow can work out all the other shapes it needs and make sure everything matches up nicely. When you create a placeholder, you give it a data type and a shape. As a side note, you can also give all of these nodes names, which I'll do in the live-coding demo since it's good practice, but the essentials are the data type and the shape. Then you can use the placeholder as you would any other tensor: you can pass it into an addition operation, a multiplication operation, or a matrix-multiplication operation.
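A sketch of a placeholder definition along those lines; the name, shape, and the variable it's multiplied with are my own illustration, not from the slides:

    import tensorflow as tf

    # "I don't have data for you yet, but it will be float32 with shape
    # [100, 400]."  The name is optional, but naming nodes is good practice.
    my_placeholder = tf.placeholder(tf.float32, shape=[100, 400], name="my_input")

    # A placeholder can be used like any other tensor, e.g. in a matrix multiply.
    weights = tf.Variable(tf.zeros([400, 10]), name="weights")
    output = tf.matmul(my_placeholder, weights)     # shape [100, 10]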
Now let's talk about actually running a session. The session is in charge of coordinating your graph. By default TensorFlow has one graph hanging out in the background, and as you code and add more and more nodes, it keeps track of what's going where, who outputs to what, and which dependencies are involved; the session is the master of that. The most important method, as we saw before, is run: it's what actually runs the graph and gets you what you want. It takes two parameters, fetches and feed_dict.

Fetches is a list of objects, tensors or nodes in your graph, that you'd like the results for. The common example is the final layer of a neural network, but you can actually pass in any tensor in your graph: if there's an operation you'd like to run, you just pass it to the session. The nice thing is that you don't have to run your entire graph if that's not what you're interested in; because of the nature of the graph, it only runs the nodes the fetched node depends on, and it won't run the rest of the graph for no reason. As for what you get back: these operations output zero or more tensors, so going back to the multiplication example, run will just give you back the number 15, and you can save that to a variable if you'd like. There was also a question about caching during run: from what I understand, TensorFlow does its best, but for many of the most commonly used graphs, like a training loop, caching isn't very useful, because you end up with a dependency on a placeholder, your input, which changes each time you run. Still, TensorFlow does its best not to waste people's time and memory.

Then there's the feed_dict parameter. We talked about placeholders; there are actually several ways to give values to those input nodes, but the most common way you'll see it done these days, especially when you're not running a distributed session, is the feed_dict parameter. All the feed_dict is is a dictionary (that's the "dict" part) where each key is the handle to the node you want to give an input value to, and the value is the tensor value you want to give it. So for the earlier placeholder example, we would give it a feed dictionary where the key is my_placeholder, the handle itself, no quotes around it, and the value is the 100-by-400 tensor that we want to run with.
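A sketch of fetches and feed_dict working together; the placeholder name, shape, and the two fetched nodes are illustrative, not from the talk:

    import numpy as np
    import tensorflow as tf

    my_placeholder = tf.placeholder(tf.float32, shape=[100, 400], name="my_input")
    total = tf.reduce_sum(my_placeholder)
    doubled = my_placeholder * 2.0

    with tf.Session() as sess:
        data = np.random.rand(100, 400).astype(np.float32)
        # fetches: the nodes we want results for; feed_dict: values for the
        # placeholders those nodes depend on, keyed by the placeholder handle.
        total_val, doubled_val = sess.run([total, doubled],
                                          feed_dict={my_placeholder: data})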
Okay, here we go: TensorFlow variables. These are values that persist over time, that can take on different values and change, slowly or drastically, as the graph runs; think of the weights in the neural network you're training, or any parameters in your machine learning model, weights and biases and the like. If you know machine learning, the final values these hold after training are the parameters you've been trying to learn all along. Before you run the graph, you have to initialize them; that's just part of the housekeeping of a TensorFlow session. The basic idea is that TensorFlow has to know when you want each variable to get its starting value: we give the variable a starting value when we define it, but it's possible you've trained for a while, decided you messed up, and want to start again from scratch. So there's an operation for initializing a single variable, and a helper that says "initialize all my variables," which sets every variable to the starting value you gave it. Sorry for spewing that at you.

There used to be, and you'll still see it in most of the documentation, the direct tf.Variable call as the way to create your variables. That's okay, but for a multitude of reasons, mainly to do with passing variables around between different variable scopes, it's best practice to use tf.get_variable. This is hard to cover without describing what variable scopes are, but the basic idea is this: imagine a big graph where we define a variable in one section, and another section that mostly doesn't see that part of the graph, but both sections want to be able to manipulate the same variable. It becomes a lot simpler if you use tf.get_variable, because you give the variable a specific name and you can look it up by that name from either place. That's the basic idea of why you'd want to use get_variable; that said, I'm probably going to end up using tf.Variable anyway, because the graphs in this talk are pretty straightforward. When you want to update a variable's value directly, you can use its assign method.
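A sketch of the two ways of creating variables just mentioned, plus initialization and assign; the names, shapes, and scope are my own illustration:

    import tensorflow as tf

    # Direct construction: fine for simple graphs.
    weights = tf.Variable(tf.zeros([400, 10]), name="weights")

    # tf.get_variable plays nicely with variable scopes, so different parts
    # of a graph can look up and share the same variable by name.
    with tf.variable_scope("layer1"):
        shared = tf.get_variable("shared_weights", shape=[400, 10],
                                 initializer=tf.constant_initializer(0.0))

    with tf.Session() as sess:
        # Variables must be explicitly set to their starting values first.
        sess.run(tf.initialize_all_variables())   # newer API: tf.global_variables_initializer()
        # assign sets an exact new value (it does not increment).
        sess.run(weights.assign(tf.ones([400, 10])))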
So finally we're closing the loop on this intro section. What we're trying to do here is create the graph from the slide with TensorFlow code. First you create a placeholder, which we're calling x; that gives us the x node on the graph. Then we create a TensorFlow variable, start, with a starting value of zero, and that also just gets placed onto the graph; nothing has executed yet. Then we create a new node, y, using the assign method: the assign operation takes in the value of start, which is the variable itself, and it also takes in x.

The basic idea is that each of these lines of code represents an operation, and the way we pass values into each other defines the edges. To the question of how TensorFlow knows these nodes should be connected: we start with node x, and when we first write that line of code we'd have just x by itself, no arrows pointing anywhere, a node hanging out in space; we, as the programmers, have a handle to its output, but for now that's all the program cares about. The same goes for start: in isolation it's hanging out by itself. But once we call this assign operation, we're saying: place an assign operation on the graph, call it y, and then ask what inputs it needs in order to run. It sees that it takes the start value as an input (technically the start-plus-x part becomes an add operation that also gets placed on the graph, but for simplicity's sake we'll stick with the picture on the slide), so an edge gets drawn from start up into the assign node, and likewise from x. The way it works is that the new operation knows it needs to be placed on the graph, then it looks at its inputs and says: these are the things I depend on, so I'll have edges from them. Each node keeps track of its own dependencies; that's how it knows that, in order for it to run, both of those other nodes need to have run so it knows their outputs.

Another question: yes, start is actually a tf.Variable object, not a plain Python variable. It's a Variable object with its own set of methods, and one of those methods is assign, which, even though it's a method on the object, is still an operation. The x node, by contrast, is a placeholder; it can be used as a tensor and will be converted into a proper tensor object (I don't think Variable inherits from Tensor, but it behaves like one here), and a placeholder doesn't have an assign method of its own. And to be clear, assign doesn't increment; you give it the exact value you want. So if you want to increment start, you say start plus something: what this operation does is give start a new value, and we want that new value to be start plus x. Someone also pointed out that the slide is a bit confusing; to be honest, I wrote it a month or two ago and it needed updating. It's still technically correct, but I apologize for the confusion; my goal was just to showcase the variable functionality, and it maybe wasn't done in the most helpful way.

Then there was the question of whether you can have a cyclic graph: at some point, with more nodes, could you have a value over here that feeds back into start, giving you an endless feedback loop? It's a very good question, and the answer, in short, is no, you can't, because of the way dependencies work. In theory you'd end up with a paradox, chicken-and-egg style: what came first, start, or the operation that came before start, which is also after start?
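A sketch of the little graph being described, a placeholder x, a variable start initialized to zero, and an assign node that adds x into start; the dtypes and the fed value are my own choices:

    import tensorflow as tf

    x = tf.placeholder(tf.int32, shape=[], name="x")    # input node
    start = tf.Variable(0, name="start")                # stateful node, starts at 0
    y = start.assign(start + x)                         # assign op: start <- start + x

    with tf.Session() as sess:
        sess.run(tf.initialize_all_variables())
        print(sess.run(y, feed_dict={x: 3}))             # -> 3
        print(sess.run(y, feed_dict={x: 3}))             # -> 6  (state persists across runs)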
Now, you will see graphs for recurrent neural networks that look like they need a cyclic graph. I wish I had a better way to draw this than scribbling on the wall, but you'll see the basic recurrent-neural-network picture: a model that appears to feed back into itself. The thing is, that infinite, self-feeding picture of a recurrent network is more theoretical than something we can actually execute; we can't go infinitely into the past or infinitely into the future. So what we end up doing is what's called unrolling the model. When you see that looping picture, what's actually happening is closer to this: we take the value at one point in time, feed it into our model, observe its output at that time, and then feed the result into the same model again at the next step, stacked as many times as we'd like. It may seem hard to see how this matches the recurrent picture, but the key that makes it work is that the weights, all the big weight matrices inside, are identical at every step and are updated together. As soon as you do that, you end up with a model that in effect emulates true recursion, a true cyclic graph, without the headaches of a cyclic dependency.

If you actually tried to do the cyclic thing in TensorFlow, say by reassigning start to a new operation, so right after the previous code we wrote something like start = y + 2 in TensorFlow-speak, did we break TensorFlow? Under the hood, the graph is defined node by node, and the original start node stays right where it is. What happens is that we create a new node, somewhere else on the graph, that computes y + 2, and we call it start. Just because it has the same Python variable name doesn't mean it bumps the old start off the graph. What we do lose is the handle: we've changed what the Python name start points to, so instead of pointing at the old variable, it points at the new operation in the graph. Does that answer your question? And on the follow-up: I believe tensors are immutable, always, but you'll have to ask the next speaker to be certain; I didn't want to state it as fact. It's a very good question and a fun thing to explore. So that's building a graph, and also recurrent neural networks, and also unrolling them.
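A toy sketch of what "unrolling" means in code, not from the talk: the same weight variable is reused at every step, which is what makes the unrolled chain behave like the looping picture; the sizes and the single shared matrix are deliberately oversimplified:

    import tensorflow as tf

    num_steps, batch, dim = 3, 4, 8
    inputs = [tf.placeholder(tf.float32, [batch, dim]) for _ in range(num_steps)]

    # One shared weight matrix: every unrolled step reuses the same variable.
    W = tf.Variable(tf.random_normal([dim, dim]), name="shared_weights")

    state = tf.zeros([batch, dim])
    outputs = []
    for x_t in inputs:                                   # unroll a fixed number of steps
        state = tf.tanh(tf.matmul(state, W) + x_t)
        outputs.append(state)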
We've already gone through this, so I don't want to run it into the dirt: the basic idea is that you start a session and you have to initialize the variables. That's what I was talking about before; this initialize-all-variables thing is an op (I forgot to put the parentheses on the slide), you have to run that operation, and then you're allowed to use the variable.

Then there's a terminology point: the term "devices." A device is a CPU or a GPU; it does not mean a computer or a whole machine, and a machine can have multiple devices. I don't know if you saw this in the news, but Google announced that they've made their own tensor processing units for their cloud computing, which is going to be very cool, if expensive and maybe not yet practical. So just know that multi-device doesn't necessarily mean distributed: if you have a regular computer with a GPU in it, you can run a multi-device setup, with part of the graph executing on the GPU and part on the CPU, while everything stays within the same machine.

Now I want to talk about TensorBoard quickly, because it's very cool, and if you've worked with TensorFlow and haven't used it, it's very easy to get a lot of value out of it. TensorBoard lets you visualize your models and statistics from your graph, and it can help you make sure that your model matches what you sketched out on paper. So let's do a brief live demo. I don't need to get into the crazy details of scalar summaries and the like; I just want to show you how easy it is to visualize your graph instantly.

We'll open up a terminal, start a new Python notebook, and import tensorflow as tf; we run that and it works, perfect. Now let's create our basic graph. We'll say a equals some basic tensor (and like I said, I'm a bad person, I'm not using NumPy), and then b equals an actual matrix: 1 2 3, 4 5 6, 7 8 9. Now we have these two tensors, and we want to do a matrix multiply, so we'll say c = tf.matmul(...). Let's see if I get the shapes right on the first try... of course it's backwards, so we transpose, and there we go, thank you; vectors bad, matrices good. Now we have these nodes on our graph, we can open up a session and, just as a demo, say sess.run(c), and we get the correct array back. Hooray. But that's not what we're here for; what we're here for is to open up a writer.
Is it SummaryWriter? We have to pass it a log directory; the first argument is just "my_graph", so it's going to output some data there, and then we also pass in the actual graph from this session, sess.graph. Let me double-check the name in the TensorFlow API real quick; the API docs are a handy resource. And no, part of the magic is that you don't need to do anything more; we're not going over the full summary statistics, because that's a bit more involved. So it's tf.train.SummaryWriter with "my_graph" and the graph we'd like it to write. Great.

It doesn't look like much has happened, but if we go back to the directory we'll see the log output. Now we start TensorBoard: we run the tensorboard command with --logdir=my_graph. We see that something is happening, something good on port 6006, so if we go to localhost:6006 we get brought to TensorBoard. You'll see there's no scalar data; that's what I meant when I said we're not recording any summary statistics. But if we go to the graph view: look at that. We have our inputs here. For all my rambling, I sort of know what I'm doing.

You can see the three nodes we just wrote (and I apologize for the low resolution on this projector). What TensorBoard gives you is this highly interactive, very cool way of visualizing your nodes very quickly: you can see data coming in, you can click on any node and take a look at its attributes and information, and as your model scales up you can create name scopes, which let you collapse your graph into bigger and smaller pieces. Basically, all you need to do to get started with TensorBoard is add a tf.train.SummaryWriter, pass in your graph, and you can immediately start visualizing it; you don't have to do anything else. It's a great way to sanity-check that you're not doing anything really dumb.

Unfortunately, TensorBoard is not a live stream; what it does is save data to that my_graph directory, or whatever you want to call it. What you should use it for is saying "huh, that doesn't look like what I expected," or "this isn't connected to that," or "I'm completely missing something," or "I forgot to do this." It's really useful as a sanity checker, and once you get used to it, you can also use it to keep track of your training statistics over time and visualize them; there are histograms and scatter plots and all sorts of features that I can't show you in such a short live-coding session, but they're there.

To the question about the command: when you install TensorFlow, it comes with this tensorboard command.
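Putting the live demo together, it's roughly the following; this is my reconstruction, the exact values don't matter, and it uses the 0.x-era tf.train.SummaryWriter from the talk, which later became tf.summary.FileWriter:

    import tensorflow as tf

    # A couple of small tensors and a matrix multiply, just to get a few
    # nodes onto the graph.
    a = tf.constant([[1., 2., 3.]], name="a")            # shape [1, 3]
    b = tf.constant([[1., 2., 3.],
                     [4., 5., 6.],
                     [7., 8., 9.]], name="b")            # shape [3, 3]
    c = tf.matmul(a, b, name="c")                        # shape [1, 3]

    with tf.Session() as sess:
        print(sess.run(c))

        # Write the graph definition out so TensorBoard can display it.
        writer = tf.train.SummaryWriter("my_graph", sess.graph)
        writer.close()

    # Then, from a shell:  tensorboard --logdir=my_graph   and open localhost:6006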
So all you're typing is tensorboard, and then you pass the --logdir flag with the directory you want to use. If you recall, in the code I said to save it to my_graph, and that's why we were able to pass that in here. There are third-party tools too, but TensorBoard is one of those things everybody knows is cool; I just don't think people appreciate how simple it is to actually start making use of it. (Why is the font so small? Let me fix that... there you go.) So let people write down this simple bit of code really quickly. Any other questions about TensorBoard? I know there are a lot of features I didn't cover.

Okay, so about that "no scalar data" message, let me describe what you would be doing. This writer is actually a live object; it's something you can open and close as you would any other file writer. When you open it and pass in the session graph, it can write the graph down immediately, because the graph definition doesn't depend on the data flowing through it; it's just the connections, the types of operations involved, and which other nodes they depend on. But there's a whole lot more you can do with this writer, and that includes a number of summary operations. If you go to the TensorFlow API and just search for "summary," you'll see histogram summaries, scalar summaries, and so on. You pass those to the writer, calling writer.add_summary with, say, a scalar summary: "I want to keep track of this weight over time, every hundred iterations." You'll see sample code that takes your training step modulo some number and says, every 200 steps, save the value of this weight, save the percentage correct or incorrect, save the standard deviation of these variables, and whatever else. It's incredibly flexible; it is hands-on, you do have to tell it explicitly what you want, but it gives you the power to say exactly what to save and when. So when you first load up TensorBoard and it says "no scalar data found," don't panic.

Those of you who do data science know that a lot of what we look at is how much better our model does over time, over training iterations, how it compares to other models, and which statistics give us a good sense of whether it's doing well. You have your training error, your validation error, and your test error, and you can also construct models where it's worth tracking p-values or whatever else. The key here is that I've only touched the very tip of TensorBoard, introducing it as a way to visualize your code as a graph.
Because at the end of the day, TensorFlow is all about building a computation graph, and the more you get used to thinking of your code as "I've got circles, I've got arrows connecting those circles, and this is how I express that in code," the more fluent you'll become with TensorFlow. TensorBoard is a great way to get immediate feedback on how your mental image compares with what TensorFlow actually built. And again, TensorBoard also lets you do much more sophisticated data analysis, as well as simple things like tracking your test error over time and giving you that nice downward-sloping error curve.

The thing with TensorBoard is that you run your graph and TensorBoard saves your statistics; we can close this, come back tomorrow, and it's all still saved in that my_graph folder. As much as it's sad that TensorBoard doesn't work in real time and show data flowing through your graph as little moving dots, which would be really cool, it's also kind of nice: it's stable, you come back and the same statistics are there, and you can use it as a data dashboard if you're showing results to clients or colleagues.

To the question of how you decide which model is better: that comes down to the type of model you're training, and it goes beyond TensorFlow; it's model comparison in general. One example: obviously you look at whether your model learns, and at the end of training, which model has the lower test error or the higher test score. But there's more to it than that: how quickly does the model train? You can graph the learning over time. You can also effectively see whether you trained on the data enough: you look at your training curve, it's still sloping down at a pretty good clip when you stopped, say after only 3,000 training iterations, and you ask, what if we had done 10,000? You'd have a better idea that the model was still learning and maybe just needed to run longer. So for the question of how you compare models, TensorBoard lets you keep track of the statistics you think are important for that comparison: training error, training rate, the size of the model, even the number of parameters if you want to compute it, although that stays the same over time.

At the end of the day, though, TensorBoard isn't going to put a check mark on one model and declare it "super better." As with all of these machine learning tools, it's not a magic black box that will do the critical thinking for you.
What it tries to do is present data to you in a way that's useful and digestible. Basically, I use TensorBoard to keep track of my learning rates and, like I said before, my test, validation, and training error. For example, one project I'm working on right now is a predictive parking-meter application: we can look at the percentage of parking meters available at a given time, filter that by location, decide which areas we want to track for a given cycle, and feed that into TensorBoard. We can do it by block and have TensorBoard track all the blocks at the same time as one giant grid; that's a little overwhelming, but you can do it. At the end of the day, it's just a nice way to graph information coming out of your model.

In terms of comparing models, I'd say that looking at the graph visualization isn't going to tell you "this model is better, just look at it"; the graph view is better for debugging your code. Its use is to make sure that, when you build these models, your inputs are transformed correctly; I find it very useful to actually draw out what my inputs are, what the matrix sizes are, what connects to what, and, if I'm feeding in an autoencoder off to the side, to draw that in too. So the graph view I just showed you is good for debugging and visualizing your code, and the summary statistics are good for analyzing the actual training of your model.

To the last question: what happens is that you run your session, you write out all the different summaries you'd like, and the useful thing people generally do is use an operation called merge_all_summaries; all it does is find all of the summaries and bunch them together so you can easily run them in one go. One caveat: I believe (and I hope this gets updated soon) that TensorBoard pretty much reads the data once while the server is running, so to pick up new data you have to Ctrl-C and launch tensorboard again. I may be wrong about that, but in my experience the server doesn't reread the data as it changes.

Okay, I don't want to hold things up; the next speaker has cool stuff. Any last-minute questions? I know I didn't get deep enough into TensorBoard to satisfy everybody, but we'll showcase some more intricate examples of it later.
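Before moving on, a rough sketch of the summary workflow described above, again in the 0.x-era API; the toy "model" and all the names are mine, just to make the example self-contained:

    import tensorflow as tf

    # A toy "model": a single variable nudged toward a target, so there is
    # something whose value changes over training steps.
    x = tf.Variable(0.0, name="x")
    loss = tf.square(5.0 - x)
    train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

    # Summaries: scalars we want TensorBoard to track over time.
    tf.scalar_summary("loss", loss)            # newer API: tf.summary.scalar
    tf.scalar_summary("x", x)
    merged = tf.merge_all_summaries()          # newer API: tf.summary.merge_all

    with tf.Session() as sess:
        sess.run(tf.initialize_all_variables())
        writer = tf.train.SummaryWriter("my_graph", sess.graph)
        for step in range(200):
            sess.run(train_op)
            if step % 10 == 0:                            # only record every 10 steps
                writer.add_summary(sess.run(merged), step)
        writer.close()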
I'll try to do this as fast as I can, but for those of you who want to get deeper into TensorFlow, pretty much everything I've talked about so far was fairly high level, aimed at people who have just heard about it or haven't really practiced with it, so let's talk about the codebase, because TensorFlow is a very cool framework. Unfortunately I don't have a huge amount of time for this, but hopefully you're learning something and it doesn't feel too slow.

As Chris talked about, it starts at the bottom and stacks up from there. At the very root of TensorFlow is your standard C++ foundation, including heavy use of the Eigen library; if you haven't seen Eigen before, check it out, it's a very cool, highly optimized implementation of matrix math. On top of that you have the TensorFlow core framework: that's where the graphs, the tensors, and the operations live in C++. On top of that enterprise-level structure you have its actual implementation in tensorflow/core/kernels: that's where you see the detailed code, with the CPU implementations and the GPU implementations. Those get compiled, and then we need a way to connect them to Python, so TensorFlow uses SWIG, a standard way of taking compiled C++ code and hooking it up to higher-level languages, in this case Python. And then you have the whole Python library on top of that.

From what I understand, Eigen has a strong focus on scientific and mathematical implementations, so you get very efficient implementations of things like the hyperbolic tangent function, things that are useful in data science that you may not find among the simpler, more rudimentary (though very useful) arithmetic operations in other libraries. It's a TuxFamily project, not a Google one, and it has pretty good documentation, even if the website looks like it's from 1997. I highly recommend checking out Eigen, because if you're interested in implementing your own code in TensorFlow, knowing C++, or at least getting a good sense of the basic Eigen API, will give you a much more comfortable time reading the TensorFlow core code, since it calls a lot of Eigen.

For the rest of this talk I want to go over the structure of the TensorFlow repository, because it's not explicitly made clear what's what and where you'll find things, and then quickly go over some basic concepts for those of you who are already implementing, or thinking about implementing, your own operations. We'll do more of that at the workshop next Saturday; as you've seen from how long it takes me to get through the basics, we weren't going to be able to showcase a real operation here. So: the structure of the code, then the basics, and then I'll pass it off to Fabrizio.
When you go to the TensorFlow repository, it's structured like a Python project, which means the first folder you need to go into is itself called tensorflow. At the top level of the repository you have the README, a folder called third_party, and some other things, but for the meat of the actual TensorFlow implementation, go into the tensorflow folder.

From there, the big important folder is core. Core has pretty much all of the C++: the kernels, the framework itself, and the registration of the operations. Inside tensorflow/core, the ops folder holds the registration of operation signatures; the way TensorFlow works is that it has a fairly sophisticated, integrated way of saying "I'm going to have these operations available to me, and in order to run, this one needs exactly these parameters," and that's registered in core/ops. The kernels folder holds the actual implementations of that code: you'll see files ending in .cu.cc, which are the GPU implementations, along with the .h header files and the .cc files, so if you want to look at pretty much all the operations in TensorFlow, go straight into core/kernels. If you go into core/framework, you'll see the big structural pieces, the bones holding everything up, and core/platform holds the abstractions that let them paper over different operating systems.

If, instead of core, you go into the python folder inside tensorflow, you'll see the Python side: the wrappers, the tests, the gradient implementations, and the fundamental Python code. Something I wanted to mention: if you installed TensorFlow, there are files in your installed version that you won't see in the repository, the so-called generated ("gen") files, and if you really want to see what's going on, you can go into your Python or Anaconda installation, find the installed TensorFlow library, and look at those generated files. Finally, there's the contrib folder (there are other folders too, but this is the last main one), which contains contributed files, parts of the TensorFlow project that aren't fully adopted yet.

There's so much documentation inside the code, and so much that hasn't yet been written up properly in tutorials, that the only way to fully understand it is to dive right in and start reading the documentation in the code. Luckily, Google does a very good job of documenting their code with comments, so if you don't understand something, go in deeper: if you see a function or a class you don't understand, go to the top of the C++ file, look for the include statement that seems to go with it, open that file, and keep going deeper down the rabbit hole. Each of those files should have really solid explanations; it's going to be a wild romp.
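Roughly, the layout described above (paths as described in this talk, mid-2016; exact contents shift between versions):

    tensorflow/
      core/
        ops/        registration of operation signatures
        kernels/    C++ implementations of each op (plus .cu.cc GPU/CUDA files)
        framework/  the core graph / tensor / op machinery
        platform/   operating-system and platform abstractions
      python/       Python wrappers, gradients, tests, the user-facing API
      contrib/      contributed code not yet fully adopted into core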
One thing to know: this TensorFlow C++ implementation stuff isn't fully documented anywhere except in the code itself, so the only way to know, say, how to use a Tensor when building an operation is to look at the code. There are a lot of very cool utility files and classes included in the TensorFlow library that aren't mentioned elsewhere. There is a how-to on the TensorFlow website called "Adding an Op," but for many purposes it isn't really complete: it covers the high-level view of what you need to do, but it doesn't cover the actual C++ API you need to call to make a Tensor do what you want. There are a lot of really useful macros and classes in the library that Google uses in its own code but that aren't made explicitly clear to contributors. So go in there, start crawling and diving; the code itself is incredibly well documented, so get in there, learn the code, and have a good time with it. And again, if you want to contribute, they do a fantastic job of making you feel comfortable and making your code better, and it's your code that's getting better; they don't just change it themselves and say "thanks, I guess." It's "here's the thought, try this," then you work on it and give it back to them. It's a solid back and forth.

On the question about the word "kernel": in CUDA-style GPU processing, a kernel is the code that runs on each of your threads, more or less. In TensorFlow, a kernel is simply a C++ implementation of an operation; it's the code that actually runs when you call the Python function at the top. The term "kernel" is used all over the place these days: kernels in image processing, kernels in convolutional neural networks (which are often used for image processing), and so on; in TensorFlow, when you hear "kernel," it just means an implementation of an operation. You can have a kernel for the standard CPU implementation, a kernel for the GPU implementation, and a kernel for the TPU implementation. In fact, there's a long-standing example: an LSTM operation that a very talented contributor has been working on, which is supposedly 50 percent faster than the current LSTM op. So if you know that something isn't done right, feel free to re-implement it yourself.

Similar to what people do with Spark, the TensorFlow team, for better or worse, pays an incredible amount of attention to raw speed, which matters when you're training very sophisticated models, and because of that the Google team working on this is very interested in having correct, fast implementations. If you know a better, faster way to do something, go for it; they will love that.
I know people like to joke that Google is holding out on the good stuff, teasing us with only part of the picture, but Google really is interested in having as many people as possible learn this software as fully as humanly possible. From what I understand, from what I can tell, and from the actions of the developers over time, the only parts that are not in TensorFlow, especially at this point, are the things so deeply ingrained in Google's internal infrastructure that they aren't even relevant to TensorFlow by itself. They worked their butts off to get the distributed implementation out, which is what we're going to look at next, and they're constantly taking internal changes and pushing them to the public repository. They want people to know this stuff; they're interested in having people who know TensorFlow possibly come to work with Google without having to learn it from scratch. Google has no reason to hide the good stuff; they would rather have more people use TensorFlow so they can get more collaboration and more good implementations coming in from the public. They also do a very good job of working with contributors: even if the initial code isn't quite up to par, they work hard to make it good enough, and they want the people contributing to feel valued.

Speaking of which, one of the many things Fabrizio has contributed: he implemented the AdaGrad optimizer for TensorFlow, which was a pretty lengthy and intensive contribution, so major props. And to the question about it: the exact values you use in the update differ for different reasons, but for all intents and purposes you can look at it the same way you would regular gradient descent or stochastic gradient descent.

In any case, thank you so much for putting up with my incredibly long presentation, and thank you so much for sticking with it.
Info
Channel: Data Council
Views: 2,389
Rating: 5 out of 5
Id: EM6SU8QVSlY
Length: 81min 19sec (4879 seconds)
Published: Thu Jun 09 2016