Segment Images & Videos in Python using Segment Anything Model (SAM) | YOLOv5 | YOLOv8 and SAM

Captions
Hello everyone. In this video tutorial I will talk about the latest AI model, released a few days ago by Meta, the parent company of Facebook: the Segment Anything Model (SAM). The Segment Anything Model is capable of identifying and extracting objects from images as well as from videos. It is an image segmentation model, and using it we can solve image segmentation problems; it can be used for various other tasks too. The primary aim of the Segment Anything Model is general segmentation, but I may use it to help me with annotation, since it can be a great annotation tool as well, as we will see in a later part of the video. SAM has been trained on 11 million images and 1.1 billion masks and has strong zero-shot performance on a variety of segmentation tasks. So let's dive into the details and see how the whole Segment Anything Model works.

This is the segment-anything.com website. The Home tab presents the details: the promptable design, the extensible output, and the zero-shot generalization, which you can review. If we scroll down, a demo is presented showing how you can do segmentation on images, and here you can again see that the Segment Anything Model has been trained on 11 million images and 1.1 billion masks. The dataset is available as well, so you can download it. The architecture of the model is also presented, and you can read the complete paper by clicking here, which takes you to the paper tab. There are also some frequently asked questions and the acknowledgments.

Let's first click on the demo and see how it works. We can take an image from the gallery or upload one, so let me first upload a sample image and see what happens. I have uploaded the image, and if I hover over it you can see SAM segmenting whatever is under the cursor. If I want to segment the complete image instead, I click on "Everything", and it performs segmentation on the whole image. If you want to cut out each object separately, click "Cut out all objects"; this might take a few seconds. Now you can see each object cut out on its own: the logo is cut out separately, the scoreboard, this player, the other player appearing here, even the tennis racket. We have around 100 objects here, each extracted individually. So in this way you can segment the complete image and cut out every object separately. Let me just close it; you can do multi-mask as well.
Going back to "Everything": we have covered "Cut out all objects", and I have shown you how to segment the complete image. Next, if you click on "Box", you can draw a bounding box on the image as a prompt. You can see I have drawn a bounding box around the person, and I can draw another one here as well; then we can cut out those objects by clicking "Cut out objects", and you can see the cut-outs. If I click on "Add mask", a mask is added at the point I click, and we can do multi-mask too: clicking the "Multi-mask" tab shows the multiple masks we have created. Unselect it and you can again cut out each of the objects separately.

You can also take a sample image from the gallery, so let me take this dogs image and load it. If I click "Everything", it segments the complete image, as you can see. If I then click "Cut out all objects", it cuts each object out separately, which might take a few seconds (the first few cut-outs shown are from the previous image; the rest are from the latest one). Now we have every object: the dog, the trees behind, the greenery behind; even the dog's tail is taken out separately, as you can see here. If I go back, click "Add mask", and pinpoint a region, we can then click "Multi-mask" and see how it works in the multi-mask tab; you can see we have done the multi-mask here as well. We can also click "Cut out objects" to extract each object as a separate piece, and we can draw a bounding box around an object too: click "Close", click "Box", add a mask, and draw the bounding box, as you can see here. So that is how the demo works.

If you want to read the paper, just go to the paper tab and it will redirect you to it; you can download the paper and go through it in detail. If you want to see the dataset, go to the dataset tab; they have made the dataset public. And if you go to the blog, it reviews what segmentation is, different segmentation models, and all the architecture details; you can view the
complete architecture, the training dataset, and every other detail, along with a comparative analysis against previous datasets such as Open Images and COCO.

The main aim of this video tutorial is to implement the Segment Anything Model in Python using Google Colab. In the next part we will see how we can run SAM on images and videos and how we can integrate it with YOLOv8 and YOLOv5 in Google Colab, so let's move towards that part.

This is the segment-anything GitHub repo, where all the details are presented. Right at the top it says that the Segment Anything Model has been trained on a dataset of 11 million images and 1.1 billion masks and has strong zero-shot performance on a variety of segmentation tasks. We will be using the pre-trained model. The repository provides code for running inference with SAM along with links for downloading the trained model checkpoints, and we will use those checkpoints to segment images and videos and to integrate SAM with YOLOv8 and YOLOv5. If you click a checkpoint link it downloads to your local system, but to get it into a Google Colab notebook we will paste the link into the notebook instead. Example notebooks are also provided, but we will prepare our own notebook for this project.

The installation requirements are listed as well: to run the Segment Anything Model you need Python >= 3.8, PyTorch >= 1.7, and torchvision >= 0.8. Installing both PyTorch and torchvision with CUDA support is highly recommended, so to run SAM on your local system you should have a GPU. You can install segment-anything with pip by cloning this repository and running pip install -e ., which installs all the required dependencies. A getting-started section shows how to run the model, which we will go through in detail in the Colab part, and a script is provided for converting the Segment Anything Model into ONNX format.

Then we have the model checkpoints, i.e., the trained weights: three versions of the model are available with different backbone sizes, ViT-H, ViT-L, and ViT-B, and the README shows the command for instantiating them, roughly as in the snippet below.
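For reference, this is the instantiation pattern from the segment-anything README; the checkpoint filename shown is the official ViT-H weight file, and you would swap in "vit_l" or "vit_b" and the matching file for a smaller backbone:

```python
from segment_anything import sam_model_registry

# "vit_h", "vit_l", or "vit_b" selects the backbone; the checkpoint file
# must match the chosen model type.
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
```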
Each model has a different backbone size, and the file sizes differ as well: from what I have seen, the ViT-L model is around 1.4 GB, while the ViT-H model is around 2.5 GB. Here the information about the dataset is presented, which I have already covered, and there are license, contributing, and contributors sections with further details.

So let's move towards the Colab part and see how we can do segmentation on images as well as on videos. In the first step you need to clone the GitHub repo into the Google Colab notebook, so click the code button on the repository, copy the URL, go back to the Colab notebook, paste it into the clone command, and run the cell; this might take a few seconds. Once it's done, you can see we have cloned the segment-anything repo into the notebook, with all its files visible in the file browser.

Now we need to set this cloned repo as our current directory: click "Copy path" on the folder and paste that path into the cd command, which changes the current directory. After setting the cloned folder as the current directory, run the next cell to install all the required libraries needed to run the Segment Anything Model on images and videos: pip install -e . installs every dependency listed in the setup.py file, so when we run SAM on images or videos we don't get errors saying some library is not installed.

In step 4 we download the pre-trained model, or, to put it better, the pre-trained model checkpoints. Go back to the repo, where the three versions with different backbone sizes are listed; right-click the one you want, copy the link address, paste it into the download cell, and run it. The pre-trained checkpoint now downloads into the Colab notebook, which might take a few seconds, so let's wait until it finishes. Done: we have cloned the GitHub repository, set it as our current directory, and downloaded the pre-trained model checkpoint, which we can see in the file browser. Those four setup cells look roughly like the sketch below.
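A minimal sketch of those Colab setup cells, assuming the ViT-H checkpoint (the URL is the official one from the repository README):

```python
# Cell 1: clone the segment-anything GitHub repo into the Colab workspace
!git clone https://github.com/facebookresearch/segment-anything.git

# Cell 2: set the cloned folder as the current directory
%cd /content/segment-anything

# Cell 3: install the package and every dependency listed in setup.py
!pip install -e .

# Cell 4: download the ViT-H pre-trained model checkpoint (~2.5 GB)
!wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
```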
In the next step we import all the required libraries. We import matplotlib because we want to display the input and output images in the Colab notebook; we also clear the CUDA memory and check whether the GPU is available, and it is.

Next I download some sample images. I took a few sample images from Google and placed them on my Drive, so I can download them directly from Drive into this Colab notebook. After downloading, if I refresh the file browser, you can see the images inside the segment-anything project folder: a car image and several others.

One thing to note: OpenCV (cv2) reads images in BGR, i.e., blue-green-red, format, but to display an image using matplotlib we need to convert it from BGR to RGB. So I convert the image from BGR to RGB and display the input image, which you can see here: multiple people walking, with some buildings in the background.

Now we load the model checkpoint we downloaded earlier, the weights file you can see in the file browser. The model type is vit_h, matching the ViT-H checkpoint, and the GPU is available, so we load the model weights and push the model to the device; running this cell might take a few seconds. Once it finishes, the details of the model are printed, which I'll skip over. These cells look roughly like the sketch below.
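A minimal sketch of the image-loading and model-loading cells, where the image filename is a placeholder for whichever sample you downloaded:

```python
import cv2
import torch
import matplotlib.pyplot as plt
from segment_anything import sam_model_registry

# Use the Colab GPU when available, otherwise fall back to CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

# OpenCV reads images in BGR order; matplotlib expects RGB, so convert.
image = cv2.imread("sample_image.jpg")  # placeholder filename
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
plt.imshow(image)
plt.axis("off")
plt.show()

# Load the downloaded ViT-H checkpoint and push the model to the device.
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
sam.to(device=device)
```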
Now we apply the SamAutomaticMaskGenerator to segment the images; we will see how to segment videos in a later part of the video. SamAutomaticMaskGenerator is imported from the segment_anything library, as you can see here. The automatic mask generator scans the image based on the points we provide: here we set points_per_side, and using this grid of points the mask generator segments the image. If you face a GPU memory error, you can make the points_per_side value a bit lower, like 8 or 4.

You can also see that we have set the prediction IoU threshold (pred_iou_thresh) to 0.9. The default value set by the segment-anything repository is 0.86, but I have set it a bit higher at 0.9: the lower the IoU threshold, the more objects the generator will pick up and segment, but a lot of junk can be added and the segmentation quality suffers; if you increase the IoU threshold, you get more accurate objects and less junk. I have also set the stability score threshold (stability_score_thresh), which you can adjust as well.

Run the cell: we apply the mask generator to the image and print the length of the masks list to see how many masks we got. Currently I am applying the Segment Anything Model to a single image, but you can apply it to multiple images and compare the results. Here is the output image; let me open it in a new tab. You can see we have segmented the image, and the segmentation is done very well: every person is segmented, the building is segmented, and this tree is segmented as well, so the segmentation results are quite impressive.

You can test it on another image too: let me just change the filename to the new image and apply the mask generator to this sample image, which might take a few seconds. Now the segmentation is done on this second image as well: the person is segmented, the glass panels of the building are segmented, and the table too, so that works perfectly fine.
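The mask-generation cells described above might look like this sketch, reusing the sam and image objects (and the matplotlib import) from the previous snippet; the stability score value and the coloured-overlay display are illustrative assumptions, not the notebook's exact code:

```python
import numpy as np
import matplotlib.pyplot as plt
from segment_anything import SamAutomaticMaskGenerator

mask_generator = SamAutomaticMaskGenerator(
    model=sam,
    points_per_side=32,           # lower to 8 or 4 if you hit a GPU memory error
    pred_iou_thresh=0.9,          # higher than the default; fewer, cleaner masks
    stability_score_thresh=0.96,  # assumed value; filters out unstable masks
)

masks = mask_generator.generate(image)  # one dict per detected mask
print(len(masks))

# Paint each mask onto a copy of the image in a random colour.
overlay = image.copy()
for m in masks:
    overlay[m["segmentation"]] = np.random.randint(0, 255, size=3, dtype=np.uint8)
plt.imshow(overlay)
plt.axis("off")
plt.show()
```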
Now we will apply the Segment Anything Model to video, and for this we will install the metaseg package. Let me tell you about it: to apply SAM to a video I will be using the segment-anything-video GitHub repo by kadirnar; give it a star if you like his work (I'm not logged into GitHub at the moment, otherwise I would star it myself; he has done impressive work). Basically, this repo is a packaged version of the Segment Anything Model, and the credit for it goes to its author.

First of all we install the metaseg package into the Colab notebook, which might take a few seconds. We have seen how to run SAM on an image; now we will see how to run it on videos. Import SegAutoMaskPredictor from metaseg by copying the import line from the repo, pasting it into a cell, and running it, and clear the CUDA memory. Then download some sample videos: I took a few from the Pexels site and placed them on my Drive, so I can download them directly from Drive into the Colab notebook.

Now we implement SAM on video. Go to the repository and copy the video-prediction code into the notebook. As we saw, the Segment Anything repository comes with three pre-trained model checkpoints, vit_h, vit_l, and vit_b, each with a different backbone size. In the model_type field you can write "vit_l" to use that model, "vit_b" if you want the smaller one, or "vit_h" for the default. Then write the name of the demo video; I downloaded one from Drive into the Colab notebook and it is named video2.mp4. I call video_predict and define points_per_side, points_per_batch, and min_area; if you face a GPU memory error, set points_per_side to 8 and points_per_batch to 32. The cell looks roughly like the sketch below.
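That video cell, as sketched from the metaseg README at the time of the video; parameter names may differ in later versions of the package:

```python
from metaseg import SegAutoMaskPredictor

# Runs SAM frame by frame over the input video and writes an output video.
results = SegAutoMaskPredictor().video_predict(
    source="video2.mp4",    # the demo video downloaded from Drive
    model_type="vit_l",     # "vit_h", "vit_l", or "vit_b"
    points_per_side=16,     # drop to 8 if you face a GPU memory error
    points_per_batch=64,    # drop to 32 if you face a GPU memory error
    min_area=1000,          # discard masks smaller than this many pixels
    output_path="output.mp4",
)
```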
Now run the cell: we are applying the Segment Anything Model to the video we passed in. This is quite a time-consuming process, so I will pause the recording until it finishes and then show you the output demo video. It has started; it's at one percent... now two percent, with an estimate of about 23 minutes remaining, so I'm pausing here and will come back with the rest of the code and the output video once it completes.

We have now run the Segment Anything Model on the video, and here is the output. The preview window is quite small, so let me download the video and show you full-screen what the output looks like. You can see that we are able to segment the buildings and the trees, and we are able to apply segmentation to the cars as well, each region in its own colour (green here, pink there). The model works very well.

Next we will integrate the Segment Anything Model with YOLOv5 and YOLOv8 and see how that works. For this I go back to kadirnar's segment-anything-video repository, where instructions are given for integrating SAM with YOLOv5 or YOLOv8. I copy that code, then install the required packages; ultralytics is the package we need to run YOLOv8 models. Next I import the required libraries and pass a sample image of myself: YOLOv8 with the yolov8n.pt pre-trained weights does the detections (YOLOv8 Nano is the smallest model, less accurate but the fastest), and for the segmentation I use the vit_b SAM model, the smallest of the three available versions (vit_h, vit_l, vit_b), each of which has a different backbone size. It turned out the sample image was missing, which caused an error, so let me upload it again: click upload, go to pictures, and select the image. The image is uploaded now, so let's run the cell again. As a generic illustration of this detect-then-segment pattern, see the sketch after this paragraph.
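The repository ships its own helper code for this; purely as a generic sketch of the same idea (detect objects with YOLOv8, then prompt SAM with each detected box), the integration could look like the following, where the image and checkpoint filenames are assumptions:

```python
import cv2
import torch
from ultralytics import YOLO
from segment_anything import sam_model_registry, SamPredictor

device = "cuda" if torch.cuda.is_available() else "cpu"

# YOLOv8 Nano: the smallest, fastest YOLOv8 detector.
detector = YOLO("yolov8n.pt")

# ViT-B: the smallest of the three SAM backbones.
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth").to(device)
predictor = SamPredictor(sam)

# Load the sample image (assumed filename) and hand it to SAM once.
image = cv2.cvtColor(cv2.imread("sample_image_1.jpg"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)

# Prompt SAM with each YOLOv8 bounding box to get one mask per detection.
boxes = detector(image)[0].boxes.xyxy.cpu().numpy()
for box in boxes:
    masks, scores, _ = predictor.predict(box=box, multimask_output=False)
    print(box, masks.shape, scores)
```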
Hopefully it works now; previously the image was not loaded, which is what caused the error. We get some warnings, which we can safely ignore, and here are the results: you can see that we are able to do the segmentation, and with YOLOv8 a bounding box has been created around me. It has detected me as a person, as the green bounding box alongside me shows, and it has done a complete segmentation on me as well. So in this way we have integrated YOLOv8 with the Segment Anything Model. That's all from this video tutorial; see you all in the next video. Till then, bye bye.
Info
Channel: Muhammad Moin
Views: 2,998
Keywords: segment anything model, segmentation, yolo, yolov8, object segmentation, object detection, meta, SAM, pytorch, opencv, python, computer vision, research and development
Id: HFE09sPLK7Q
Length: 30min 42sec (1842 seconds)
Published: Fri Apr 14 2023