Build a Deep Facial Recognition App // Part 5 - Training a Siamese Neural Network // #Python

Video Statistics and Information

Captions
What's happening guys, my name is Nicholas Renotte, and welcome to part five in the Siamese neural network series. In this series we're going from a sophisticated research paper to building a full-blown Siamese neural network that allows us to perform facial recognition on our own face. In this video we're going to be focused on training our neural network, so let's take a look at what we'll be going through.

We're covering quite a fair bit in this video, five key steps in total, all to do with training the neural network. Step one is defining our loss function, the loss function for a Siamese neural network. We'll then define an optimizer, which is going to help us backpropagate our updated weights throughout the network. Next we'll build a custom training step; this is a really useful skill when it comes to building sophisticated neural networks, because it allows you to handle multiple different types of inputs, something that's definitely required for more advanced styles of networks. Once we've built our custom training step, step four is defining our training loop, blending the training step with the loop so we can get to step five, which is training the model. On that note, ready to do it? Let's get to it.

Alrighty guys, so as I was saying, there are five specific steps we need to go through, and this entire video is about training our model, so we're really getting into the nitty-gritty now. First up, we need to set up our loss and our optimizer. Remember, the whole purpose of our neural network is to minimize the loss, because a smaller loss means it's performing better on the function we're trying to optimize. We'll set up our optimizer at the same time; that's what helps us perform backpropagation through our custom network. (There's a helicopter flying past my desk right now, kind of strange, but anyway.) Then in 5.2 we're going to establish some checkpoints, which basically means that if something screws up while we're training, we've got a placeholder we can go back to and reload our neural network from. We're then going to build our train step function, which defines what happens when we train on a single batch. The basic process is: we pass some data through our model and get a prediction, then we calculate the loss, comparing whether a specific image was determined to be verified or not against the actual value, using binary cross entropy. We then calculate the gradients for all of our weights across the network, and our optimizer uses them to backpropagate through the network and minimize that loss; this is where our learning rate comes in, because we're trying to get the loss as small as possible. Once that's done, we define our training loop, which goes through every single batch for every single epoch and applies our training step; the training step is applied per batch, and the training loop runs it over the entire dataset for the entire set of epochs. Then, finally, we'll train the model, so you'll actually be able to kick training off in this video. Okay, without further ado, let's go ahead and set up our loss function and optimizer.
Okay, that is our loss function now defined. It probably sounds more intimidating than it actually is, because it's a single line of code: binary_cross_loss = tf.losses.BinaryCrossentropy(). In certain circumstances you might also pass through from_logits=True, but I need to do a little bit more research as to whether we should apply it here; in this particular case I've tested it without that parameter and it looks to perform okay, but if anyone has any ideas, hit me up in the comments below. For now, that's our loss function.

The next thing we need to do is define our optimizer, and for this particular model we're going to use the Adam optimizer. There are a bunch of them out there, stochastic gradient descent being the other one I can think of off the top of my head, but there are plenty more; you can search for TensorFlow optimizers or look through tf.keras.optimizers and see the full list. So let's go ahead and define our Adam optimizer. Okay, that's our optimizer now defined: opt = tf.keras.optimizers.Adam(1e-4), where we've passed through a learning rate of 1e-4, which represents 0.0001. As I was saying, there are a ton of optimizers; if you look at the Keras documentation, some really popular ones are SGD (stochastic gradient descent), RMSprop, Adam, Adadelta, Adagrad, Adamax, Nadam and Ftrl. I've never used Ftrl, but there's a whole bunch available, and if you want a video on optimizers, hit me up in the comments below. For this particular case we're sticking with Adam, and we've already set our learning rate. Cool, that's our loss and our optimizer defined, so step 5.1 is done.
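Those two lines together, as a minimal sketch (assuming TensorFlow 2.x imported as tf, as in the earlier parts of the series):

```python
import tensorflow as tf

# Binary cross-entropy loss for the verification output (1 = verified, 0 = not)
binary_cross_loss = tf.losses.BinaryCrossentropy()

# Adam optimizer with a learning rate of 1e-4 (i.e. 0.0001)
opt = tf.keras.optimizers.Adam(1e-4)
```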
Next, for step 5.2, we're going to establish our checkpoint callbacks, which we'll use when we define our training loop further down. Okay, our checkpoint components are now set up, and it's three lines of code: first we define our checkpoint directory, then a checkpoint prefix, which I'll explain in a sec, and then the actual checkpoint object we'll use to save everything out.

The first line is pretty self-explanatory: we've defined a directory where all our checkpoints will be saved, called training_checkpoints. If that folder doesn't exist yet, let's create it now; I think TensorFlow might actually create it for you, but just to be on the safe side I've created a folder called training_checkpoints inside the same folder that we're building our model in. So the first line is checkpoint_dir = './training_checkpoints', which basically says: from the current folder, go into the training_checkpoints folder and save our stuff there. Then I've created a checkpoint prefix: checkpoint_prefix = os.path.join(checkpoint_dir, 'ckpt'). This means all of our checkpoints will follow a consistent format; they'll start with ckpt and then end with a sequence number, 001, 002 and so on. Finally, we've defined what we actually want to save out: checkpoint = tf.train.Checkpoint(opt=opt, siamese_model=siamese_model). We pass through keyword arguments for our optimizer and for the Siamese model we created in the last video, which means both of those components will be saved out as part of each checkpoint. We could probably get away with just saving the model, but in this case I've saved the optimizer as well. Okay, that's step 5.2 done.
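A sketch of those three lines, assuming opt and the siamese_model from the previous video are already in scope:

```python
import os

# Folder the checkpoint files get written to (created alongside the notebook)
checkpoint_dir = './training_checkpoints'

# Every checkpoint file name will start with 'ckpt'
checkpoint_prefix = os.path.join(checkpoint_dir, 'ckpt')

# Track both the optimizer state and the model weights
checkpoint = tf.train.Checkpoint(opt=opt, siamese_model=siamese_model)
```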
Now we're on to step 5.3, and this is the really good bit; defining the actual training step is one of the parts of coding neural networks that I find super fascinating. The train step is what's used to train on one batch of data: a batch comes through, we make a prediction, we calculate our loss, we calculate our gradients, and then we apply backpropagation through the neural network to ideally get the best possible model. These steps are pretty much the same whether you're building a Siamese neural network, a classification network, or, say, the super-resolution model I'm building at the moment; it's the exact same train step style. You can also wrap all of this inside a model class, and in some future tutorials I actually do that because it's a little cleaner, but in this case you'll see how to do it using the @tf.function decorator. Alright, let's start defining it.

Okay, those are the first three lines of our train step function, and the function doesn't actually do anything yet. First up, we're wrapping the function in the @tf.function decorator. The reason we do this is that it compiles the function into a callable TensorFlow graph; a key part of how TensorFlow works is the concept of a graph, where the computation is compiled into a graph so it can run in an efficient manner. So by using the @tf.function decorator we're compiling whatever happens inside this train_step function. Some really nifty engineering has gone into making that work; just know that when you create a train step function outside of a consolidated, compiled model, you need this decorator, and I'll show you the other method in some of the future tutorials I've got coming up. The next line defines the function itself, def train_step(batch):, where the positional argument batch is going to be one batch of data, and then I've just written pass because we haven't filled in the body yet; this is the beginnings of the function.

Now let's write out what we're actually going to do. I've added another few lines and started applying some comments. First is with tf.GradientTape() as tape:, which lets us start capturing what happens in our model. If you look at the documentation for tf.GradientTape, it's what allows us to perform automatic differentiation; differentiation is a component of calculus, and it's what lets us perform backprop across our neural network. Everything that happens inside this with block gets recorded on the tape, so we can eventually go and calculate the gradients of those operations; you'll see this in action in a sec.
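At this point the function is just a shell; roughly, it looks like this:

```python
@tf.function  # compile this training step into a callable TensorFlow graph
def train_step(batch):
    # Everything inside this block is recorded on the tape so that
    # gradients can be calculated from it afterwards
    with tf.GradientTape() as tape:
        pass  # forward pass and loss calculation get filled in below
```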
Then I've written a comment: get the anchor and the positive/negative image. Remember the batches we defined earlier? That dataset was called train_data, so let's take a look at it again. You'll have seen me do this in the previous tutorials: I've written test_batch = train_data.as_numpy_iterator(), and to grab one batch we can use the .next() method, storing the result in a variable called batch_1. If we take a look at that batch, it's made up of three components; I thought the first two might be consolidated, but three is perfectly fine. The first component is our anchor images, and its length matches how many images we've got in a batch, which in this case is 16, so we've got 16 anchor images in the first part of the batch. The second component is another 16 images, our negative or positive images, the ones we're going to use for validation; check the length and there are 16 there too. The last component is our actual labels; in this sample they all happen to be zeros, which I thought would have been a bit more shuffled, kind of strange, but that's fine, we'll leave it for now.

So inside the train step, we're effectively extracting that batch. We set X = batch[:2], slicing out the first two values; if you wrap those two components in np.array and check the shape, you can see we have two components, our anchor images and either the positive or negative images, each made up of 16 images of 100 by 100 pixels by three channels. That's all of our data stored in X; think of these as our features. Then we grab our label with y = batch[2] (when I tested it against the sample batch I'd accidentally typed batch_2; it should be batch_1 there), and if we take a look at y, those are all of our labels. So we've grabbed our features and our labels, and we can now pass them through to our Siamese neural network to start getting some predictions; let's go ahead and write out the rest of the train step function.
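Here's a sketch of that inspection, assuming the train_data pipeline, batch size of 16 and 100x100x3 images set up in the earlier parts of the series:

```python
import numpy as np

# Pull a single batch out of the pipeline to inspect its structure
test_batch = train_data.as_numpy_iterator()
batch_1 = test_batch.next()

len(batch_1)                 # 3: anchor images, positive/negative images, labels
np.array(batch_1[:2]).shape  # (2, 16, 100, 100, 3) - two sets of 16 images
batch_1[2]                   # the 16 labels (1 = verified, 0 = not)

# The same slicing used inside train_step:
X = batch_1[:2]              # features: anchor + positive/negative images
y = batch_1[2]               # labels
```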
Okay, those are the next two components of our train step: the forward pass and the loss calculation. First I've written yhat = siamese_model(X, training=True), which is actually making a prediction; we're passing our features X through the Siamese model we built earlier (really we should probably be passing the model into the train step function as an argument too, but we'll skip that for now). It's super important that you set training=True, because certain layers only activate when that flag is set; if you're using batch normalization or dropout, you need training=True for those layers to behave correctly during training. When you go and make actual predictions in the real world, you don't set it. Then we've calculated our loss: loss = binary_cross_loss(y, yhat), using the loss function we defined back in 5.1.

If you take a look at the documentation for tf.keras.losses.BinaryCrossentropy, you can see that when you call it you pass through the y_true value first and then the y_pred value, which is exactly what we're doing here: y is our true label from the batch, and yhat is the predicted value (y hat is just another name for the predicted outcome; you can call it whatever you like). Because the predictions are floats, they're interpreted as probabilities, which I think is why we can get away with not setting from_logits=True here, but again, if anybody knows better, hit me up in the comments below, I'd love to hear your thoughts. So far, then, we've done our forward pass, which makes the prediction, and we've calculated our loss.

We're almost there for our training step, and this is super cool guys, because these are the true building blocks of training a neural network: you produce a forward pass, you calculate your loss, and all that's left is to calculate the gradients and perform the backprop. So let's do it. Okay, that's our train step function now done (in this version we're not actually outputting the loss, but that's fine; if we wanted to output some information we definitely could, and best practice probably dictates wrapping this up inside a model class, which I've skipped here). The last two lines are: grad = tape.gradient(loss, siamese_model.trainable_variables), which uses the tape that recorded all of our operations to calculate the gradient of the loss, the first positional argument, with respect to every trainable variable in the Siamese model, the second positional argument. Then, to calculate and apply the updated weights, I've written opt.apply_gradients(zip(grad, siamese_model.trainable_variables)), which zips each gradient with its corresponding trainable variable and applies the update. In a nutshell, these two lines first calculate all the gradients for the weights in our model with respect to our loss, and then our optimizer applies the learning rate and nudges the weights so the loss gets a little bit smaller, a little bit closer to the optimum.
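Putting those pieces together, the whole train step looks roughly like this (a sketch that assumes siamese_model, binary_cross_loss and opt are defined as above; the return at the end is the addition discussed in the next part):

```python
@tf.function
def train_step(batch):
    # Record all operations so they can be differentiated afterwards
    with tf.GradientTape() as tape:
        # Get the anchor and positive/negative images
        X = batch[:2]
        # Get the label (1 = verified, 0 = not verified)
        y = batch[2]

        # Forward pass - training=True so layers like dropout and batch norm
        # behave as they should during training
        yhat = siamese_model(X, training=True)
        # Calculate the loss: y_true first, then y_pred
        loss = binary_cross_loss(y, yhat)

    # Calculate gradients of the loss w.r.t. every trainable variable
    grad = tape.gradient(loss, siamese_model.trainable_variables)
    # Apply the updates (backpropagation) via the Adam optimizer
    opt.apply_gradients(zip(grad, siamese_model.trainable_variables))

    return loss
```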
To picture what's going on: imagine the loss as a big curve, and our current weights put us at some point on it. By stepping each weight along its gradient, scaled by the learning rate, we should move in a direction that reduces the loss across the board as we keep training. If you want a deeper understanding of this, do hit me up; I'm more than happy to give you a bit more information.

For now, what we've got is our train step function. We might also add a print(loss) in there so we can see whether it actually outputs something as it runs (I can't remember if we need to do something a little more sophisticated to get the loss out), and we'll return it at the bottom with return loss and see if that works. So that's the train step function done; we wrote a ton of stuff, but in a nutshell: we record all of our operations, we grab our data, we perform a forward pass, we calculate our loss, we calculate all of our gradients, we update the weights across the network, and we try to return the loss so we can see it. The one disadvantage to training this way is that you have to define all of your progress bars and loss metric outputs yourself; when you wrap it all up inside a model class it looks a little cleaner, which is something I learned after I'd built this tutorial, but that's perfectly fine, I'll show you how to do that later. In this particular case, though, our train step function is now defined.
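As a side note, if you do want to see the loss as training runs, here is one way I'd surface it; this is my own suggestion rather than what's done in the video, and run_epoch is just a hypothetical helper to illustrate the idea:

```python
# Calling a @tf.function from eager code returns a concrete tensor, so the
# value train_step returns can be read with .numpy() and fed to the Keras
# progress bar as it ticks along.
def run_epoch(data):
    progbar = tf.keras.utils.Progbar(len(data))
    for idx, batch in enumerate(data):
        loss = train_step(batch)
        progbar.update(idx + 1, values=[('loss', loss.numpy())])
    return loss
```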
Now what we need to do is build our training loop, so let's kick that off. The training loop is going to be called train, I know, super unique. I've written def train(data, EPOCHS):, so we pass through our data and the number of epochs we want to train for, and inside it we're effectively going to loop through the epochs, loop through each batch, and run our train step.

The first loop is now done: I've written for epoch in range(1, EPOCHS+1):, starting at 1 so the count doesn't begin from zero, and then we print out which epoch we're up to, the current epoch out of the total, so it reads like 1 out of 50, 2 out of 50 and so on, which gives us a bit of a status update. Then we define our progress bar; again, this is where wrapping things up in a model class means you don't have to do it explicitly because the Keras API does it for you, but doing it this way gives you a little bit more control. I've written progbar = tf.keras.utils.Progbar(len(train_data)), passing through the length of our training data, because we're going to increment it every time we go through a batch. Then we need to loop through each batch, so I've written for idx, batch in enumerate(train_data):, which gives us a counter as well as the batch itself. Inside that loop we apply our train step function to the single batch, and then update the progress bar with progbar.update and the index from the enumerate, so every batch ticks the bar along and you get that cool little Keras progress bar that goes across the screen.

The last thing to do, and this one's optional, is to save our checkpoints. I've written if epoch % 10 == 0:, so we save every 10 epochs (you could change that if you wanted), and if that's satisfied we run checkpoint.save(file_prefix=checkpoint_prefix), using the prefix we defined earlier, so the files will start with ckpt. And I think that's our train function done: we loop through each epoch, define our progress bar, loop through each batch applying the training step, and save a checkpoint every 10 epochs, assuming we train for that many.

That's 5.4 done, so we can go on to step 5.5 and actually try to train the model. Let's define our epochs; we should set it to more than 10 so we can actually see whether the checkpoints get saved, so let's say EPOCHS = 50. Then, to start training, we run train and pass through our training data and our epochs. Actually, I just realised we've got an error there: the function parameter is called data, but inside the loops I'd referred to train_data; those references should use data, because we're going to pass train_data in as the argument when we call the function. With that fixed, let's go and train our model; I also want to see whether those print loss and return loss additions work. So I'm going to write train(train_data, EPOCHS), and fingers crossed this will work.
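With that variable name fixed, the finished loop looks roughly like this; the exact print formatting is my reconstruction, but the structure follows what's described above:

```python
def train(data, EPOCHS):
    # Loop through every epoch
    for epoch in range(1, EPOCHS + 1):
        print('\n Epoch {}/{}'.format(epoch, EPOCHS))
        # len(data) works here because the dataset's cardinality is known
        progbar = tf.keras.utils.Progbar(len(data))

        # Loop through every batch and apply the train step
        for idx, batch in enumerate(data):
            train_step(batch)
            progbar.update(idx + 1)

        # Save a checkpoint every 10 epochs
        if epoch % 10 == 0:
            checkpoint.save(file_prefix=checkpoint_prefix)

# Kick off training
EPOCHS = 50
train(train_data, EPOCHS)
```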
We're not actually seeing the loss value printed; it turns out we'd need eager execution for that. But you can see that we are actually training. The reason you're getting tensor objects rather than the actual loss value is that you need eager execution enabled to get the number printed out, but don't worry about it; the model is training pretty fast, so we'll be able to watch it finish, and in a nutshell our model is now training. If you want a little more information on how to print out the loss, hit me up in the comments below and I'm more than happy to help. I had a feeling this wasn't going to work because I remember doing it slightly differently in some of my other models, but that's perfectly fine: our model is training, no errors, and everything looks to be happening pretty successfully. So let's let this finish, and we should be good to go for the next tutorial.

On that note, that about does wrap it up. We've done a ton of stuff in this tutorial: it's still training at the moment, but we've defined our loss and our optimizer, established our training checkpoints, built a custom train step function, built a training loop, and kicked off the training of our model. The lesson I took away from this series is that it's probably more efficient to wrap all of this up in a model class, because you get the progress bar handling for free and you can print out your loss a whole heap more easily without having to work with eager execution, but this approach still works. Ideally, in the next video we'll actually get to evaluate the model, save it, and then perform a real-time test. Thanks so much for tuning in guys, hopefully you enjoyed this video; if you did, be sure to give it a big thumbs up, hit subscribe and click that bell, and if you've got any questions, comments or queries, by all means hit me up in the comments section below, I'll be happy to help you out. Thanks again for tuning in. Peace!
Info
Channel: Nicholas Renotte
Views: 3,365
Keywords: facial recognition, facial recognition python, siamese neural network, siamese neural networks for one-shot image recognition, siamese neural networks an overview, siamese neural network python, siamese neural network face recognition, siamese neural network example, siamese neural network tutorial
Id: Z3Bafw9DO7o
Length: 31min 7sec (1867 seconds)
Published: Thu Sep 30 2021