Build a Deep Facial Recognition App // Part 4 - Building a Siamese Neural Network // #Python

Captions
What's happening guys, welcome to part four in this Siamese neural network series, where we implement a Siamese neural network model from a paper all the way through to a final end application that we're eventually going to be building with Kivy. Now this tutorial is the one I've been waiting for, because we're starting to get into the deep learning component of this series. So let's take a look at what we're going to go through in this video. We're going to do three key things. First up, we'll build an embedding layer. This is effectively going to form a feature mapping pipeline for our model: we pass through an image, it goes through our embedding layer, and that converts our raw image into a data representation that we then pass through to our Siamese neural network. Think of it as translating the image into something that allows a neural network to determine whether or not a person is verified, almost like a data translator. Then we're going to create an L1 distance layer. I'll show this a little more once we look at the paper, but the way it works is that we have two streams of images, our anchor and either a positive or a negative, and these streams are going to be like rivers. The way we compare them is with this L1 distance layer: we bring the rivers together and use the L1 distance to decide whether the embeddings are similar enough to be verified. That's step two. And last but not least, we're going to compile them together to build a Siamese neural network, and then in the next video in this series we'll start training our Siamese neural
network model. Okay, but without further ado, let's get to the tutorial. Alrighty, so what we're going to do in this tutorial is three key things. First up we're going to build our embedding layer, then we're going to build our distance layer, which is going to be our L1 distance layer, and last but not least we're going to make our Siamese neural network model. Remember, we're replicating this paper: Siamese Neural Networks for One-shot Image Recognition. We've made a few tweaks, so the numbers are going to be a little bit off, but that's fine; we're going to be building this neural network here. As I said right at the start, think of this as having two streams of information: we pass through two input images and we effectively combine them down where it says L1 siamese distance. So you've got two streams, an anchor and either a positive or a negative; these get passed through our embedding layer, and then they're compared once you get down to the L1 siamese distance layer. Okay, let's kick this off. We're first going to create a function that takes our input image. The paper uses an image with the shape 105 by 105, but we've converted ours to 100 pixels by 100 pixels. The numbers at the top of the diagram represent the output shapes; the values down the bottom specify the different layers within the neural network. Because we've converted our images to 100 by 100, our shapes are going to be a little bit off, but that's perfectly fine, it'll still work. So first up,
what we need to do now is create a function that builds our embedding layer. We'll set up the function and then add to it incrementally. Okay, so that's the beginning of our model; I've written two lines of code there. I've written def make_embedding, so this defines a new function called make_embedding, closed off with a colon, and at the end we're going to return our final embedding model. Now you're probably thinking: Nick, where is this Model class coming from? Remember, in one of our earlier tutorials we imported a number of TensorFlow dependencies: namely the Model class, plus a bunch of layer types, the base Layer class, Conv2D, Dense, and MaxPooling2D, and I'll explain these in more detail, don't stress. Looking at the paper's architecture, the pattern is: convolution plus a ReLU activation, then max pooling, repeated a few times, then a final convolution plus ReLU, then a fully connected layer, then the siamese distance layer, then another fully connected layer with a sigmoid to produce an output. Lots of fancy words, but really we're just passing data through a neural network pipeline. So we're going to build this up step by step. First we want to deal with the input, so let's create our input layer, and now this is going to be
100 by 100 pixels, so these numbers are going to be a couple of pixels off the paper, but it should be pretty much the same. To define our input we use the Input layer. Cool, so that's our input layer created: I've written inp = Input and specified the shape we want, shape=(100, 100, 3), so 100 pixels by 100 pixels by three channels. If I print out inp, you can see we've got a Keras tensor with a leading shape of None, because that represents the batch size, and then 100 by 100 by 3. We can pass through a name here as well, name='input_image', rather than having the weird auto-generated name, so our input layer is going to be called input_image. Cool, that's the first part of our neural network done. If you wanted to stick with the exact shapes from the paper, all you'd need to do is change the input shape to 105 by 105.
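As a quick sketch, the input layer described here looks something like this, assuming the TensorFlow/Keras imports from the earlier tutorial in the series:

```python
# Minimal sketch of the input layer; 100x100x3 matches the resized
# images from the earlier parts of this series.
from tensorflow.keras.layers import Input

inp = Input(shape=(100, 100, 3), name='input_image')
print(inp.shape)  # (None, 100, 100, 3) - None is the batch dimension
```

Swapping the shape for (105, 105, 3) reproduces the paper's input exactly.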
If I do that, you can see our input tensor is now 105 pixels by 105 pixels. Cool. All right, what's next? The next layer we want to build is our convolution plus a ReLU activation. Our convolution takes two key things (really three, but we'll ignore the last): the number of filters we want, which in this case is 64, and the filter shape, which is 10 pixels by 10 pixels. Normally you'd also look at a parameter called stride, which is how far the filters move across the image, but here we know the stride is one, so you can ignore it. And we know there's an activation to apply as well, which is ReLU. So let's implement our convolutional layer using the Conv2D layer; I'm going to store it in a variable called c1, you'll see that in a sec. Okay, that's our Conv2D layer created: I've written c1 = Conv2D, passed through 64 filters as the paper says, specified the filter shape of 10 by 10, and applied a ReLU activation, which again the paper specifies. Then, in order to start connecting our neural network together, we grab our input and pass it through to the convolutional layer; this is how the Keras functional API wants its input. So let's take a look: if we look at c1, you can see the shape
is 96 pixels by 96 pixels by 64 channels, which matches the paper exactly. Wait, that's only because we still had 105 by 105 up here; I thought something weird was going on. If we change the input to 100 by 100, you can see it's a little bit different: rather than the paper's 96 by 96 by 64 we get 91 by 91 by 64. If we wanted exactly the same shape we could change the input back to 105 by 105, but we're going to stick with 100 by 100 and keep going, so I'll change it back to 100. Now we've got one more layer to implement, the max pooling layer, and then everything from here on out pretty much repeats itself, at least for our embedding layer. So let's implement our max pooling layer, and then we'll have our core building blocks ready. Okay, that's our max pooling layer implemented: I've created a new variable called m1 to represent max pooling layer 1, because we're going to do this multiple times, and set it equal to MaxPooling2D, passing through 64 and a shape of 2 by 2, so it's effectively going to take the max value out of each 2 by 2 area and return that, condensing down the amount of data we've got, and I've specified the padding as 'same'. This is something I noted when building this up originally: you want to have the
padding set to 'same' in order to replicate a similar output shape, and then we pass through our convolutional layer. If we copy this out and take a look at m1, it has the shape 46 by 46 by 64; the paper expects 48 by 48 by 64, but if we change our input layer back to 105 you'll see it mimics that exactly: 48 by 48 by 64. Cool, so we effectively have our core building blocks ready for this neural network: a convolution plus ReLU plus a max pooling layer. These two layers form a core block, and you'll hear the term "block" referred to really frequently whenever you're building neural networks; this block gets replicated a few different times. Obviously the repeats have different shapes: for our first block we've got a convolution with 64 filters of shape 10 by 10 and a max pooling layer of shape 2 by 2, but if we scroll on, the convolution shape changes a little, the max pooling shape changes a little, and so on. So let's implement our next block, these two layers over here. Should we copy them or write them from scratch? Let's write them from scratch. I'm just going to comment these up: first block, second block. Okay, that's our second block implemented, two more lines of code. I've written c2 = Conv2D, and this one we're passing 128 filters with a shape of 7 by 7, so our second block has a convolution of 128 filters with a 7 by 7 shape, plus a max pooling layer we'll come back to. All right,
so, a convolution with 128 filters of shape 7 by 7 with a ReLU activation, and then we pass through the result of our max pooling layer. Remember our max pooling layer is called m1, so inside parentheses at the end we append m1, the output of our max pooling layer from our graph, and store the result in a variable called c2. So this line implements that convolution layer from the paper. Then we've done our next layer as well, the max pooling layer: m2 = MaxPooling2D, again with 64 and a shape of 2 by 2 with same padding, and to that we pass the output of our c2 convolution. Now if we copy this up as usual and take a look at the output of our c2 block, we get a shape of 42 by 42 by 128, which pretty much mimics the paper; we're on 105 right now, which is why we're getting exactly the same shape. And if we take a look at our max pooling layer we get 21 by 21 by 128, which matches as well, so it's all looking good at the moment. Remember, this only has exactly the same shapes because up here we've got 105 by 105; in our actual model we'll be using 100 pixels by 100 pixels, but this is a good sanity check. Okay, what's next? We've done these two layers, we've got to do this layer, and then that one over there, plus a flatten, I think. All right, let's go ahead and build our third block. Okay, that's our third block
now done. We've written two lines again, pretty much exactly the same as what we wrote up above, but now we're passing the results from our second max pooling layer through to the next convolution. So c3 = Conv2D, specifying 128 filters with a shape of 4 by 4, as the paper shows, with a ReLU activation, and we pass the output of the max pooling layer from our second block, m2, as the input to that convolutional layer. Then we've specified another max pooling layer, pretty much identical to the ones we wrote above, same number and same shape: m3 = MaxPooling2D with 64, a shape of 2 comma 2 inside parentheses, padding='same', and the result of our c3 convolution passed through at the end. Cool. All right, so what's left? We don't have too much now. Let's see where we're up to: we've done our input, our first block, our second block, and our third block. Our next lines are a final convolution plus a fully connected layer, so this is effectively going to be a convolution, and we'll probably need a Flatten plus a Dense layer. Let's go ahead and implement that. Okay, I think that's our embedding layer done; I wrote an additional three lines of code there. We've created a final convolutional layer, specified as c4 = Conv2D, and this has 256
filters, which is what the paper shows, with a shape of 4 by 4 and a ReLU activation, and to that we pass the results of our third max pooling layer, m3. Then we flatten our convolution: you'll see this in a second, but effectively we're taking the output of our convolutional layer, which has three dimensions, and flattening it into a single dimension. Let's take a look at this. If we look at the output of m3, the max pooling layer from our third block (we didn't actually check those shapes before, so let's do this properly): c3 has a shape of 18 by 18 by 128, matching the paper; the output of m3 is 9 by 9 by 128, also matching; and then we take that m3 output and pass it through to our final convolution. If I look at c4, you can see it's 6 by 6 by 256, and again keep in mind this is based on the input shape of 105 by 105; if we change the input to 100 by 100 the output shape will be a little bit different, which is perfectly fine. Now it just so happens that if we multiply 6 by 6 by 256 we get 9216.
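The shapes quoted above can be sanity-checked without TensorFlow at all, using the standard output-size formulas: a 'valid' convolution at stride 1 gives in - kernel + 1, and a 'same'-padded pool at stride 2 gives ceil(in / 2). A quick sketch tracing the 105 by 105 case from the paper:

```python
import math

def conv_out(size, kernel):
    # 'valid' convolution with stride 1
    return size - kernel + 1

def pool_out(size, stride=2):
    # 'same'-padded pooling: output size depends only on the stride
    return math.ceil(size / stride)

s = 105
s = conv_out(s, 10)   # c1: 96
s = pool_out(s)       # m1: 48
s = conv_out(s, 7)    # c2: 42
s = pool_out(s)       # m2: 21
s = conv_out(s, 4)    # c3: 18
s = pool_out(s)       # m3: 9
s = conv_out(s, 4)    # c4: 6
flat = s * s * 256    # flattened feature vector
print(flat)           # 9216
```

Running the same trace from 100 instead of 105 gives the slightly different shapes mentioned throughout the video.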
So those units get flattened into our feature vector: if we take a look at the shape of f1, we have 9216 units, which is effectively 6 times 6 times 256. Remember, f1 is just flattening all of those elements together, so rather than having the shape 6 by 6 by 256, you've got a single dimension; that's the output of the Flatten layer. Then we've passed our flattened layer to our Dense layer, which should give us 4096 units back: if I type in d1, you can see we've got 4096, our 4096-unit feature vector, which eventually gets passed to our siamese distance layer, but we'll come back to that in a second. And keep in mind the last activation is a sigmoid, as specified in the paper: fully connected plus sigmoid. It's a little bit densely written up, but you can sort of see how these all fit together. Okay, I think that's it. Now we need to pass this through to our Model class in order to sort of compile it, because at the moment we haven't actually brought this all together as a model. When I first wrote the function up here, def make_embedding, I also specified the last line: return Model with inputs equals, outputs equals, and name equals. Now we need to fill those in. Okay, that's the final component done: I've basically just filled this out, so I've written Model, and inside of that I've passed inputs equals a set of square brackets containing our input from right up here, then specified outputs, and what I've passed through there is our final layer, which is the dense layer. So what we're going to be outputting is
this big feature vector with 4096 outputs. This is what I meant by the two streams or the two rivers: we're effectively going to have two rivers of data flowing through our neural network, and each of those rivers is going to output a feature vector of 4096 units. It's almost like translating input images of our faces into an embedding, or a feature vector. So, to recap what we've done there: I've written Model, and inside of parentheses specified inputs=[inp], outputs=[d1], which is this dense layer, and name='embedding'. If I copy this over, that effectively compiles our model, so if I write mod = Model(...) that's our final model, and if we type mod.summary() that gives us the summary. This is the model based on 105 by 105, but it gives you an idea of what our model actually looks like. The cool thing is that if I bring the paper's shapes over to this side, this is effectively one final model which mimics what we had in the paper. The paper is obviously a little bit small, but its input image is 105 by 105 by 1, so it looks like it might be single channel; we're doing ours in color, hence why ours is a 3. But 105 by 105, that's our first layer. Our next layer is 96 by 96 by 64, matching the paper; the next is 48 by 48 by 64, matching again. And if you go all the way to the end (I'm not going to do every single layer), we've got a dense layer which has 4096.
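Putting the whole function together, a sketch of make_embedding as built up above might look like this. Keyword spellings are assumptions based on the transcript, and the MaxPooling2D calls use the idiomatic 2 by 2 pool rather than the "64" mentioned in passing, since pooling layers have no filter count and the output shapes come out the same either way:

```python
from tensorflow.keras.models import Model
from tensorflow.keras.layers import (Input, Conv2D, MaxPooling2D,
                                     Flatten, Dense)

def make_embedding():
    inp = Input(shape=(100, 100, 3), name='input_image')

    # First block: convolution + ReLU, then max pooling
    c1 = Conv2D(64, (10, 10), activation='relu')(inp)
    m1 = MaxPooling2D((2, 2), padding='same')(c1)

    # Second block
    c2 = Conv2D(128, (7, 7), activation='relu')(m1)
    m2 = MaxPooling2D((2, 2), padding='same')(c2)

    # Third block
    c3 = Conv2D(128, (4, 4), activation='relu')(m2)
    m3 = MaxPooling2D((2, 2), padding='same')(c3)

    # Final block: convolution, flatten, 4096-unit dense with sigmoid
    c4 = Conv2D(256, (4, 4), activation='relu')(m3)
    f1 = Flatten()(c4)
    d1 = Dense(4096, activation='sigmoid')(f1)

    return Model(inputs=[inp], outputs=[d1], name='embedding')

embedding = make_embedding()
print(embedding.outputs[0].shape)  # (None, 4096)
```

Calling embedding.summary() prints the layer-by-layer shapes to compare against the paper's diagram.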
We're in business, guys, so that is our embedding layer done. Now, because our input shape is going to be slightly different, what we're actually going to have is 100 by 100 up here, so if I run all of these layers again you can see that our input changes the output shapes a little, but eventually we still get a 4096-unit output, because that's how many units our dense layer has. So we're good to go there; that's our make_embedding function done. If we go and run this, we're in business: if I type model = make_embedding(), that's our model generated, and model.summary() shows all of our different layers. Again, you could name these to make it a little cleaner, but I've sort of skipped that. This effectively forms our embedding, which is pretty much everything from the input all the way through to the 4096-unit dense layer, apart from the L1 siamese distance layer; we're going to do that now. Pretty cool, right? So that is step 4.1 done: we've built our embedding layer. Now we need to bring the streams together. Remember I was talking about the two rivers: as part of our neural network graph we're going to have our anchor and either a negative or a positive image, which forms the basis of our one-shot classification. If we want to join these rivers together we need some way to compare them. We're not actually going to be adding them together as the rivers combine; we're going to be subtracting them. This is our L1 siamese distance layer, and it's going to tell us how similar our images actually are, which is effectively what allows us to
perform our facial recognition, or facial verification. So let's go ahead and define this distance layer. It's going to take the outputs of our embeddings, our 4096-unit feature vectors, as its inputs, and it should output a value; we'll then pass that to a fully connected layer and output a final result. So let's do this. Okay, the first part of this is defining a new class: we're going to be creating a new class for our custom layer. Now this is actually really, really cool. As part of producing this tutorial I did a ton of learning, and this shows you how to create a custom neural network layer, so if you ever need to do some other custom stuff, this gives you a good template for how to go about it. I've also defined this so that when we go and export our model, we'll be able to bring this layer along as part of it. So let's finish this out. Okay, that's our L1 distance layer produced; I wrote four additional lines after I showed you the class definition. The init section pretty much speaks for itself and the call section is a little more important, but let's look at it in its entirety. I've written class L1Dist, with the name capitalized; you don't need to, but it's good practice, or at least common practice, with Python. To that we pass our Layer class, which comes from all the way up at the top: it's the abstract base class for our Keras layers. Then we're performing a little bit of inheritance inside of our __init__ function: I've written def __init__, the base init method inside of a Python class, and then I've passed through self, so we can
operate on our own instance, and then I've passed through **kwargs. This allows you to work with this specific layer as part of a bigger model: when it comes to actually exporting and importing it, having this defined makes your life a lot easier, and passing it through means that if we wanted to pass specific keyword arguments, the layer handles them innately. So: def __init__, and inside of parentheses self, comma, **kwargs, close parentheses, then a colon, and then we're just performing inheritance: super().__init__() (this should actually have the double underscores around init). Cool, that's that covered. Now, this is where the magic happens, so let me write a comment: magic happens here. The call function tells this layer what to do when some data is passed to it. I've written def call, passed through self, and then, remember, our two rivers are going to combine: our anchor image and either a positive or a negative are brought together and we compare their similarity. So I've written input_embedding, which is going to be the first river, and validation_embedding, the output of our second river. The first is effectively our anchor embedding; the second is either our positive or our negative embedding. And then we return tf.math.abs, which returns an absolute value, of the input embedding minus the validation embedding: input_embedding - validation_embedding. That's our L1 distance layer done. When we go and call this, it's pretty much the same as how we'd call any other layer, so we can write l1 equals
L1Dist(). Hmm, an error: the super object has no attribute init for L1Dist. Have we written something wrong? Let's check. Oh, actually I just hadn't rerun that cell after adding the underscores; my bad. So remember we had it without the double underscores around init, and that's the error you get; if I put the underscores back in and rerun, we're good. So that's our L1 distance layer produced. If we take a look at it there's nothing inside of it yet, because we're not passing anything through, but what we're effectively going to do is pass through our anchor embedding and our validation embedding, and then combine the result into a dense layer, our fully connected layer, to produce our final output; that's the last bit, which we'll do over here. So that is our custom L1 distance layer, and it's a defining characteristic of the Siamese neural network. Sometimes you'll see a slightly different function implemented here: they'll actually have three rivers or three streams, so you'll have an anchor plus a negative and an anchor plus a positive at the same time, and you compare all three as part of the same run. In this case we're just doing it with two streams, two images, as part of our graph, which is perfectly fine. So let's recap what we did: we created our L1 distance class, defined our __init__ method, which is pretty standard and performs inheritance, and then the actual magic: the call method performs a similarity calculation, really just grabbing one stream and subtracting it
from the other, and applying an absolute value function over the top. Nothing crazy there. Cool, that's our L1 distance layer defined, so that's 4.2 done. The last thing we need to do is combine all of this together: right now we've got sort of abstract components, our embedding model and our L1 distance layer, but our streams aren't all running simultaneously yet. We need to bring all of this together to produce a Siamese neural network, and that's what we do in step 4.3. We're going to create another function, def make_siamese_model, and bring it all together. Let's do our first bit and then take a step back and look at what we've done. Okay, so we're defining another model, similar to what we did up above, but bringing it all together now. First I've written def make_siamese_model, open parentheses, close parentheses, colon, then comments to handle our inputs. First up we define our two inputs, because with two streams we need two images coming through. We pass through our input image, specified as input_image, set equal to Input with name='input_image'; this creates one input, and then there's a secondary input, so let me separate them: this is going to be our anchor image input in the network, and this is the validation image in the network. So input_image = Input with two keyword arguments, name='input_image' and shape=(100, 100, 3), because that's the shape of our input images, and then we've passed through our validation
image. So I've written validation underscore image, that's our variable name, and I've set that equal to Input, and again two keyword arguments: this time the name of our input layer is going to be validation underscore image, and shape is 100 by 100 by 3. Cool. What we now need to do is take these inputs and actually pass them through to our embedding model, because right now they're just raw inputs; we need to take these raw input images, pass them through to our embeddings, and then combine them with our siamese distance layer. So let's go ahead and do that. Okay, let's pause there. I realized that rather than calling our embedding model just model, we should actually call it embedding, because otherwise it's going to get kind of confusing. So I'm going to run this again, run this, and then convert this to embedding. Cool, so our embedding layer, or our embedding model, is now going to be called embedding rather than just model; kind of dumb doing that. We are then going to wrap this up, so let me actually go and finalize it and then I'll explain it. Okay, so that is our siamese distance layer done, and we are actually passing through our two streams. I've written siamese underscore layer equals L1Dist, which is us beginning to use our distance layer, and then I've just gone and named it; again, I'm super pedantic, I like naming, so siamese underscore layer dot underscore name equals distance. This means that when we take a look at our model summary you'll actually be able to see that name there. And then I've gone and created a new layer now: I've written distances equals siamese layer, and to that what we're actually doing is passing through this input image to our embedding model which we had from up here. So again, I've wrapped this inside of a function but it's kind
of like redundant at the moment; doesn't matter, we'll take a look at that later, there are possible improvements there, so if somebody goes and cleans this up, dude, let me know, I'd love to see it. So we're taking our input image, which is effectively this, and we're passing it through to our embedding. If I go and run that and type in embedding and input image, that is effectively what we're doing: we're taking this input image, which has a shape of 100 by 100 by 3, and the output that we're going to get out of that is our 4096 units. Then what we're actually going to do is the same thing on our validation image as well. So let's do input embed equals this, and val embed equals embedding and then validation image. Cool, so that is effectively our two sets of data now transformed into our feature vectors, which is a 4096 unit output. If we take a look at input embedding, we're passing through an image which is 100 by 100 by 3, and the output that we get is 4096 values; and again, same thing that we're going to get from our validation embedding, 4096 units. Cool. So then what we finally do is go and take our siamese layer, which is going to be called L1Dist, and remember this is going to take two inputs; if we take a look at our call function, it takes our input embedding and our validation embedding, which is effectively what we've got there. So I'm just going to delete that because we don't need that, and that. What we're doing is taking our siamese layer and passing through our input embedding and our validation embedding, and that is going to output a vector which has 4096 units, which we're then going to finally pass through to another dense layer, which is going to output either a one or a zero. So we've still got to do that later, but by passing through these two embeddings to our siamese layer we get a result of
4096. So this represents the distances between our input embedding and our validation embedding. Now all we need to do is pass this through to our final layer to tell us, hey, are these embeddings similar enough to consider them the same person? That's our final component that we need to do in here, so let's go on ahead and do that. And that's done, guys, so that is our siamese model now produced. Remember I said that the last thing we needed to do was combine these distances into a final fully connected layer with a sigmoid activation; that is what this line is doing. So I've written classifier equals Dense, specifying that we want one unit, specifying that we want an activation of sigmoid, and to that what we're doing is passing through these distances, which have a shape of 4096 units. So we're passing 4096 units in and we're going to get one output value out, which will be either a one or a zero. If we take a look, if I grab this line and bring it over here, this should be distances, so let's define that: distances equals siamese layer. Boom, that is our classifier; you can see that we are going to have an output of shape one by one. Now the last line that I wrote is effectively combining all of this together: I've written return Model, remember this is our base level model class, and I've specified our inputs as being our input image and our validation image, and then our outputs are just going to be our classifier. So remember, we're going to have both of our streams running simultaneously, they're going to combine together on our l1 distance layer, and we're going to output our classification layer, which is this final little bit over here. Yeah, so what did we write there? Return Model, then inputs equals, and inside of square brackets we're passing through two inputs: input underscore image comma
validation image, so these are these two components over here, and then comma outputs equals classifier, which is outputting this classification layer there, and then we'll specify the name as siamese network. So if we go and do this, let's actually just copy this over here, siamese network; did I go and run that? Yeah, okay, cool. So if I run siamese network, that's a siamese network fully done now. If I type in dot summary, take a look at that: we've gone and passed through our input image, pardon me, so input image and a validation image, which then go through our embedding layer; they then get combined through our l1 distance layer, which you can see there, and we then combine that into a dense layer which outputs our single value. That is pretty cool, guys. We've now gone and combined all of that together, so that is our siamese neural network now done. Now if we go and use our function, so I'm going to call it siamese model and run make siamese model; that is exactly the same thing. Let's write down the bottom dot summary: that is our siamese neural network ready for one shot classification, not one hot, one shot. Cool, so we've now successfully gone and built this. Now in the next video what we're going to start doing is training this model, but for now let's actually take a look at what we did. So I'm just going to minimize that and that so we can actually see. In step one what we went and did is build our embedding layer, so we went and defined all of the different blocks that we saw over here; we then went and defined our distance layer, which represents our l1 distance, and remember it's really just subtracting the two rivers from each other to determine similarity; and then last but not least, we combined them all together inside of our make underscore siamese underscore model function, which then returns our siamese model which looks like this. And on that note, that does wrap it up, so thanks so much for tuning in,
guys, hopefully you enjoyed this video. If you did, be sure to give it a big thumbs up, hit subscribe, and tick that bell. See you in the next one, thanks again for tuning in, peace!
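Since the code itself never appears in these captions, here's a minimal sketch of the build the video walks through: the two input streams, the L1 distance layer from step 4.2, the sigmoid classifier, and the final Model. The embedding below is a cheap stand-in (global average pooling into a Dense with 4096 units) for the Conv2D/MaxPooling stack built earlier in the series, so treat the exact embedding architecture and layer names here as assumptions rather than the video's verbatim code.

```python
import tensorflow as tf
from tensorflow.keras.layers import Layer, Input, Dense, GlobalAveragePooling2D
from tensorflow.keras.models import Model


class L1Dist(Layer):
    """The siamese distance layer from step 4.2: elementwise |a - b|."""

    def call(self, input_embedding, validation_embedding):
        return tf.math.abs(input_embedding - validation_embedding)


def make_embedding():
    # Stand-in for the real embedding model built earlier in the series
    # (a Conv2D/MaxPooling block stack); only the 4096-unit output matches.
    inp = Input(shape=(100, 100, 3), name='embedding_input')
    x = GlobalAveragePooling2D()(inp)
    out = Dense(4096, activation='sigmoid')(x)
    return Model(inputs=inp, outputs=out, name='embedding')


embedding = make_embedding()


def make_siamese_model():
    # Two input streams: the anchor and the image we verify against
    input_image = Input(name='input_image', shape=(100, 100, 3))
    validation_image = Input(name='validation_image', shape=(100, 100, 3))

    # Combine the two embedding streams with the L1 distance layer
    # (the video sets the name after construction via _name = 'distance')
    siamese_layer = L1Dist(name='distance')
    distances = siamese_layer(embedding(input_image),
                              embedding(validation_image))

    # Squash the 4096 distances down to a single verification score
    classifier = Dense(1, activation='sigmoid')(distances)

    return Model(inputs=[input_image, validation_image],
                 outputs=classifier, name='SiameseNetwork')


siamese_model = make_siamese_model()
siamese_model.summary()
```

Calling `embedding(...)` on both inputs reuses the same weights, which is what makes the two streams a siamese pair; the summary should show the two inputs feeding the shared embedding, the `distance` layer, and the final one-unit dense output.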
Info
Channel: Nicholas Renotte
Views: 3,322
Keywords: python, machine learning, deep learning
Id: sQpPaW17TwU
Length: 47min 33sec (2853 seconds)
Published: Wed Sep 22 2021