Instance Segmentation using Mask-RCNN with PixelLib and Python

Video Statistics and Information

Captions
What's happening, guys, my name is Nicholas Renotte, and in this video we're going to be taking a look at instance segmentation using a model called Mask R-CNN. This means that we'll be able to segment ourselves from different objects within a particular frame, and we're going to be doing it in real time using OpenCV and a webcam. Let's take a deeper look at what we'll be going through.

So in this video we're going to be doing three key things. Specifically, we're going to be leveraging the Mask R-CNN model to perform something called instance segmentation. If you've seen an object detection walkthrough or an image classification walkthrough, this is sort of like the next level: rather than just classifying an image or detecting an object, this takes it one step further and actually traces out the shape of that particular object within the frame. In order to do this we're going to be leveraging a library called PixelLib, which works really, really well, so we're going to be installing that. We're then going to capture real-time video using our webcam, and then we're going to segment objects in real time and perform our instance segmentation.

Let's take a look at how this is all going to work. As I was saying, first up we're going to install PixelLib for Python; this is just a straight-up pip install, pretty simple. Then we'll use OpenCV to capture our frames in real time, pretty similar to what we've done for our other real-time detection videos. And last but not least, we're going to apply our Mask R-CNN overlay to be able to mask out our image. Ready to do it? Let's get to it.

Alrighty guys, so in order to go and perform our instance segmentation, there are three key things that we're going to walk through. As per usual, we're going to take this step by step, and all the code for the full tutorial is going to be available in the description below, so you can pick it up and run with it. In this particular case, our first step is going to be to install and import our dependencies (sorry, I've just stepped in front of that). Then we're going to set up our model; this is where we're going to load up a pre-trained Mask R-CNN model which already has pre-trained checkpoints that we're able to leverage. Then we're going to perform some real-time capture with our webcam, but again, you could do this with an image or with a pre-captured video, so feel free to do that if you'd like. Also, we're going to be doing it with a pre-trained model, so the key thing to note is that this is going to have pre-trained classes. If you'd like to see this done with a custom dataset, for example how to go through labelling, how to go through training, as well as leveraging that particular custom-trained model, do let me know in the comments below, I'd love to hear your feedback.

Now, in this particular case we're going to start off by installing and importing our dependencies, and there are three key dependencies that we're going to need. I'm going to be leveraging TensorFlow as our main backend, then we're going to need PixelLib, and the last dependency that we need is OpenCV. TensorFlow gives us our acceleration, or our deep learning platform, PixelLib gives us a nice wrapper to be able to leverage instance segmentation, and OpenCV allows us to work with our webcam and our images. The other key thing to note is that if you wanted to run this inside of your own IDE, say for example PyCharm or VS Code, you could do that; the only thing to keep in mind is that when we're using magic commands like the exclamation mark, you need to run those at the command prompt. So rather than having the install inside your Python script, do that at the command prompt instead. All right, cool, let's go on ahead and install these dependencies.
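The install step looks roughly like this (a sketch of a notebook cell; the pinned versions are the ones called out below, and if you're outside a notebook you'd run the pip command at a command prompt without the leading !):

```python
# Notebook cell: install TensorFlow, PixelLib and OpenCV in one line
!pip install tensorflow==2.4.1 tensorflow-gpu==2.4.1 pixellib opencv-python
```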
All righty, so in this particular case it looks like we've got all of our dependencies pre-installed, but what we've gone and done here is written one line of code. This one-liner allows us to install all of our key dependencies: I've written !pip install and then tensorflow==2.4.1 and tensorflow-gpu==2.4.1, so those two are going to give us TensorFlow. Now, if you've got a GPU on your machine, particularly an NVIDIA GPU, by all means do go ahead and install CUDA and cuDNN; if you need a hand with that, hit me up in the comments below. In this case I've already got it pre-installed on this particular machine, so we're good to go. The last two dependencies that we've gone and installed are pixellib and opencv-python: OpenCV is going to give us our webcam interactivity, and PixelLib is going to give us our instance segmentation wrapper.

Now that that's done, what we can go ahead and do is actually start importing our dependencies. So we've installed them, now we need to import them, so let's do it.
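The import cell walked through next is just three lines; sketched here for reference:

```python
import pixellib                                       # overarching PixelLib wrapper
from pixellib.instance import instance_segmentation   # pre-trained Mask R-CNN class
import cv2                                            # OpenCV for webcam access
```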
Okay, so in order to bring in our dependencies we've written three lines of code. The first one is just import pixellib, which gives us our overarching PixelLib wrapper. The next one is from pixellib.instance import instance_segmentation; you'll see in a second that we're going to be leveraging this particular class to actually load up our pre-trained checkpoints, and I'm going to show you where to get those checkpoints (they're also linked in the description below). Then our last line of code is import cv2, which gives us OpenCV. So all up, three lines of code: import pixellib, from pixellib.instance import instance_segmentation, and then import cv2.

Now the next thing that we need to do is go ahead and download the pre-trained checkpoints, or weights, for this particular Mask R-CNN model. In order to get these you can go to the Matterport GitHub repository, so github.com/matterport/Mask_RCNN/releases. From here you've got a whole bunch of checkpoints that you can go and leverage; in this particular case I believe I'm using the Mask R-CNN 2.0 release. If you wanted to download it, all you need to do is hit that, but again, I'm going to include this link in the description below. Down here you can see that we've got this mask_rcnn_coco.h5 file, so what you'd want to do is select that and go ahead and download it. You can see it's now downloading down there; let's zoom in so you can see it a little bit better: mask_rcnn_coco.h5. Those are the checkpoints that you want to download, and again I'm going to include a link to them in the description below so you can pick them up.

Once that's downloaded, what you want to go ahead and do is copy it into the same folder that you're working in. If I show you my current folder, you can see I've already got mask_rcnn_coco.h5 there, so I've already got my checkpoints in place. This is particularly important, because when we go and load the model we want to pass through the full path to those particular checkpoints. I've already got them there, all good to go, but again I'm going to make this available in the description below; all the links are going to be there, so you'll be able to see that. So once we've gone and downloaded that model, it's time to actually set up our model and load it into our notebook, so let's go ahead and do that.
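A minimal sketch of the model setup described next, assuming mask_rcnn_coco.h5 sits in the same folder as the notebook:

```python
# Create a blank instance segmentation model, then load the Matterport
# Mask R-CNN checkpoints pre-trained on COCO
segmentation_model = instance_segmentation()
segmentation_model.load_model("mask_rcnn_coco.h5")
```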
Okay, so that is our segmentation model now loaded. What I've gone and written there is two lines of code: the first one creates our instance segmentation model, and the second one loads up our pre-trained checkpoints. So I've written segmentation_model = instance_segmentation(), and this is really leveraging the instance_segmentation class that we imported up here. The second line actually loads up our checkpoints; think of the first line as creating the model, or creating a blank shell, and the second one as loading up our pre-trained checkpoints, or pre-trained weights and biases. So the second line is segmentation_model.load_model() (let's close this, we don't need that there), and to that we've passed through the full path to the weights and biases that we went and downloaded from the Matterport GitHub repository. In this case it's just mask_rcnn_coco.h5, which represents the file that we're looking at over here, so you can see that we've got that over there. Cool, all right, so that's now loaded.

Now the next thing that we want to go ahead and do is perform our real-time capture, and again we're going to be using OpenCV to do this. If you've watched any of my real-time detection videos, it's going to look really, really familiar; the only difference is that rather than using an object detection model, or BodyPix, or even MediaPipe, we're going to be using the segmentation model which we just set up over here. So let's go ahead and set up our real-time capture first, and then we'll go and apply our segmentation overlay. Let's do it.
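A sketch of the plain real-time capture loop walked through below (the device index 0 and the waitKey delay are assumptions; adjust them for your machine):

```python
# Connect to the webcam (0 here; try 1, 2, ... if you get a blank feed)
cap = cv2.VideoCapture(0)

while cap.isOpened():
    # Grab the image at this point in time from the webcam
    ret, frame = cap.read()

    # Render the raw frame back to the screen
    cv2.imshow('Instance Segmentation', frame)

    # Hit q on your keyboard to quit gracefully
    if cv2.waitKey(10) & 0xFF == ord('q'):
        break

# Release the webcam and close the pop-up window
cap.release()
cv2.destroyAllWindows()
```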
Okay, so what I've gone and done there is written a pretty standard set of real-time capture code. This is exactly the same as any code I would have used previously to perform real-time capture, so again, if you've gone through my object detection tutorial or any of the real-time MediaPipe tutorials, this is going to look really familiar to you. All up there are one, two, three, four, five, six, seven, eight lines of code that allow us to capture a real-time video feed from our webcam. But again, if you wanted to do this using an image or a video: for a video, all you need to do is put your path to the video here and it'll effectively do that. In this case we're going to be using a real-time video capture device, so we don't need to do that. If you wanted to do it on an image, all you need to do is replace this, so you're basically getting rid of your video capture loop and replacing it with a single image here, but again, if you want a little bit of help with that, hit me up in the comments below.

Let's take a look at what we just wrote. First up, I've set up our connection to our video capture device: I've written cap = cv2.VideoCapture, and to that I've passed through video capture device number zero, which represents the device that is my webcam. Now again, you might need to play around with this depending on which video capture device number represents your particular webcam; in this case it's 0 on my Windows machine and 2 on my Mac, so play around with it if it's not working. Then what we're doing is effectively looping through every single frame that we're getting from our webcam, so what I've written is while cap.isOpened() and then a colon. Then we read each frame from our webcam using cap.read(), and we unpack the results from that particular function, so we get our return value plus our frame. This frame actually represents the image at a point in time from our webcam; obviously our webcam is going to be on a loop, so that image is going to be consistently refreshing, but in this case it gives us our capture at that point in time. Then we use cv2.imshow to actually render that image back to the user on our desktop. To that we pass two key arguments: what we want our frame to be called (in this case I'm just calling it Instance Segmentation, you could name it whatever you want) and the image at a point in time, which is coming from the cap.read() function up here. Everything that you can see from here down is all to do with quitting out of OpenCV gracefully: basically what we're saying is that if we hit a wait key, or if we hit q on our keyboard, we're first going to break out of the loop, then we're going to use cap.release() to gracefully release our webcam, and then cv2.destroyAllWindows() to actually close our frame. I do find it fascinating that it's called destroyAllWindows; anyway, enough ranting, right?

Let's go on ahead and run this. If I go and hit Shift-Enter, this should effectively open up a little pop-up towards the bottom of our screen and we should be able to see a real-time video capture. That's looking all well and good: you can see I've got myself on the screen, so this actually represents Python grabbing our real-time video feed using our webcam, pretty cool, right? Again, nothing too crazy there, we've done that before. What we can do to quit out of this is just hit q, and that's going to close it down; again, this allows us to gracefully close our frame.

Now what we want to do is actually go ahead and apply a little bit of instance segmentation goodness, which is what we did all of this stuff for. So let's go ahead and add our last two lines, and we'll actually be able to apply our instance segmentation.
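The updated loop with those two extra lines looks roughly like this (same assumptions as before, and with cv2.imshow already switched over to the segmented image, as described below):

```python
cap = cv2.VideoCapture(0)

while cap.isOpened():
    ret, frame = cap.read()

    # Apply Mask R-CNN to the current frame; show_bboxes=True also draws boxes
    res = segmentation_model.segmentFrame(frame, show_bboxes=True)

    # segmentFrame returns a tuple; index 1 is the rendered output image
    image = res[1]

    cv2.imshow('Instance Segmentation', image)

    if cv2.waitKey(10) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
```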
Okay, that's all we really need to perform our segmentation; it's pretty lightweight, and I like how quickly you're able to work and operate with this library. What we've gone and done is added two additional lines of code to our real-time capture cell. What we've written is segmentation_model.segmentFrame. Now, there's a whole bunch of different functions that you can access from this segmentation model: you can choose to segment a frame, segment an image, or segment target classes. If you're segmenting a single image, I believe segmenting the image works better, because it'll actually pick up the image and dump it out; when you're doing stuff in real time, I believe the best method to use is segmentFrame. But again, if you find out any more information on that, hit me up in the comments below, happy to have a chat.

Cool. So, segmentation_model.segmentFrame, and to that I'm passing through one argument and one keyword argument. I'm passing through the frame that we're getting from our webcam from up here (again, if you had a different image, that's effectively where you'd be passing it in), and then I'm passing a second keyword argument, which is show_bboxes=True. If we take a look at the different arguments we can pass to this, we can choose to show our different bounding boxes, segment our target classes, extract our segmented objects, save extracted objects, and output the image as well; so if you wanted to output a particular image to a particular file path, you could do that too, in real time. Again, you've got a whole bunch of different keyword arguments that you can pass through over here, pretty cool, right?

Now, in this case we're pretty happy with our setup, so what we're then going to do is extract our image result from this res variable. Everything that we've written over here is being stored in a variable called res; what we need to do in order to extract our image, which has been segmented, is grab the second result, which represents index one. So we take res, inside of square brackets we pass through the number one, and we store that value inside of a variable called image. Now, as of right now we're not going to see those instance segmentation results on the screen, because in the cv2.imshow method we're still passing through our baseline frame; but if I change this to image and run it, we should get our real-time segmentation actually happening. So let's give it a sec: ideally we should get our pop-up and then we'll be able to see it. This is a good sign... and it looks like we've got a bit of an issue; let's take a look at that.

Editing Nick here: that particular error that you just saw on the screen was caused because I had my GPU completely utilised on this machine. In order to solve it, all I had to do was stop the other notebook that I had running, which had already preloaded PixelLib, and then kick it off again, and then this happened.

And there you go, you can see our instance segmentation is running. As soon as you run it you'll get a little pop-up and you'll see that it's able to start segmenting. In this case it's detecting our microphone as a skateboard, probably not the most accurate; what happens if we throw up our phone? You can see it's accurately classifying our phone, pretty cool, right? Now, if I take down the green screen you're going to be able to see a whole bunch of additional stuff which is segmented, and there you go, it looks like a bit of a rave in my apartment. You can see it's accurately detecting the couch, it's detecting a little potted plant somewhere over here (it's saying it's broccoli), it's detecting the potted plant over there as well, plus a whole bunch of chairs and the dining table. Is it getting the TV? I can't see; it looks like it might be saying refrigerator, but it looks like it's doing pretty well.

So that, in a nutshell, is how you can go and perform instance segmentation in real time using PixelLib and OpenCV. In this particular case we went and did three key things: we installed and imported our dependencies, we then set up our instance segmentation model and downloaded our key checkpoints from Matterport, and then we performed our real-time capture, which allows us to do awesome stuff like this. And on that note, that about wraps it up. Thanks so much for tuning in, guys, hopefully you enjoyed this video; if you did, be sure to give it a thumbs up, hit subscribe and tick that bell, and let me know what you thought, and also let me know what you'd like to use instance segmentation for. Thanks again for tuning in, peace.
Info
Channel: Nicholas Renotte
Views: 37,217
Keywords: instance segmentation tensorflow, instance segmentation deep learning, instance segmentation tutorial, instance segmentation mask rcnn, instance segmentation python, instance segmentation github
Id: i_-ud01wFhc
Length: 17min 12sec (1032 seconds)
Published: Wed May 19 2021