Image Inpainting with Segment Anything Model (SAM) and Stable Diffusion

Video Statistics and Information

Captions
Hello everyone, today I'll show you image inpainting using the Segment Anything Model and Stable Diffusion. Basically, image inpainting is image reconstruction or image repainting. In the first image you can see a car and a girl in a desert area: first I have replaced this area with a highway, as you can see here; second, I have replaced the sky area with the Eiffel Tower; and finally, I have replaced the girl with a young man. So today I'll show you a step-by-step implementation of image inpainting using the Segment Anything Model and Stable Diffusion.

Before going to the implementation, I would like to share the basic idea of how these two models work together for image inpainting. Here is the basic diagram of how the Segment Anything Model and the Stable Diffusion model work together. The Segment Anything Model (SAM) takes an RGB image as input, and from this input image it generates lots of segmentation masks, one for each part of the image. Here you can see some samples of the segmentation masks; each mask is a 2D binary image. The first mask is for the sky portion, the second mask is for this area, and the third mask is for the wheel of the car. In this way the Segment Anything Model returns many segmentation masks, each a 2D binary image, along with other information such as the bounding box, point coordinates, area, and more.

The Stable Diffusion model needs only this segmentation mask as input. Stable Diffusion inpainting is used for image generation, and it takes three inputs: the original image, the segmentation mask, and a text prompt. From these three inputs the Stable Diffusion model generates a new image. So if we send the original image, the segmentation mask generated by the Segment Anything Model, and a text prompt to the Stable Diffusion model, it will generate a new image inside the masked area. This is the basic idea of how the Segment Anything Model generates segmentation masks and how the Stable Diffusion model uses these masks, the original image, and a text prompt to generate a new image inside the masked region; in this way the Segment Anything Model and Stable Diffusion work together for image inpainting. Finally, here you can see the output of the Stable Diffusion model: I have used several masks step by step to generate new images with Stable Diffusion, and this is the final output.

If you are interested in the details of the Segment Anything Model, you can watch my other video on my channel; I'll put the video link in the description box, and you can also watch it from this link here. Now let's go for the step-by-step implementation of image inpainting using the Segment Anything Model and Stable Diffusion in Google Colab, so let's get started.

Now I'll show you how Stable Diffusion generates an image using the masks from the Segment Anything Model. First of all, we need to set the runtime: change the runtime type to GPU and save it; note that if you restart the runtime, all other sessions will be closed. Then we check the GPU status, and here you can see the default GPU has been assigned. Next we get the current working directory of this Colab notebook and assign it to a HOME variable, so you can see the home directory. First we need to install the Segment Anything Model from this git link.
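Roughly, these first setup cells look like the following sketch, assuming a Colab notebook and the official facebookresearch/segment-anything repository:

```python
# Check which GPU Colab has assigned (a T4 is the usual default)
!nvidia-smi

# Store the current working directory of the notebook in a HOME variable
import os

HOME = os.getcwd()
print("HOME:", HOME)

# Install the Segment Anything Model from its GitHub repository
!pip install -q 'git+https://github.com/facebookresearch/segment-anything.git'
```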
After the installation of the Segment Anything Model, we need to download its weights. As you may know, there are three types of Segment Anything Model weights: ViT-H is the heavy model, ViT-L is the large model, and ViT-B is the base model. The ViT-H model is more accurate, but the ViT-B model is faster because it is more lightweight; I will use the weights of the ViT-H model. We create a weights directory inside the home directory, and inside that weights directory we download the model weights. Here you can see the weights directory has been created, and inside it the ViT-H weights of the Segment Anything Model have been downloaded.

Now, inside the home directory, we need to install some important packages such as torch, torchvision, supervision, and diffusers for Stable Diffusion, one by one. Supervision is used for visualization and for image annotation, and then we install Stable Diffusion. After completing the installations, we assign the device, and here we get cuda:0. Then we initialize the Stable Diffusion model; in the pipeline we use Stable Diffusion inpainting to generate images using the masks from the Segment Anything Model. So the initialization of the Stable Diffusion model is done.

Next we need to assign some important parameters of the Segment Anything Model. Here you can see the path: we have downloaded the ViT-H model weights, but the cell was set to the base model, so we need to change the model type to ViT-H and also copy the model weights path and paste it here. For the details of the Segment Anything Model, you can watch my other video linked in the video description box. Next we confirm that the checkpoint file exists; it returns True. This model type and checkpoint path are used to initialize the Segment Anything Model, and then we use the automatic mask generator to generate masks for our image; these masks will be sent to the Stable Diffusion model for generating the new image.

Here we use OpenCV to read the image, convert the BGR image to RGB, and resize the image to 512 by 512, because Stable Diffusion inpainting only works with this image shape. First of all, we need to upload an image into the home directory: click here and upload the image, and after uploading we can rename it so it is easier to handle. Here you can see the original image is very large, so we need to resize it to 512 by 512.

Now we send this RGB image to the SAM automatic mask generator, and it returns information for every segmented part of the image; for details you can see my other video on the Segment Anything Model. The length of the result shows the number of masks returned by the automatic mask generator, and for each segmented part of the image the automatic mask generator returns information such as the segmentation, area, bounding box, predicted IoU, point coordinates, stability score, and crop box. Here you can see that inside the loop we append all of this information to a results variable.
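Putting those steps together, the corresponding cells might look roughly like this sketch; the ViT-H checkpoint file name and URL are the official ones, while the runwayml/stable-diffusion-inpainting model ID and the car.jpg file name are assumptions, since the exact identifiers are only visible on screen in the video:

```python
# Install the supporting packages (torch and torchvision come preinstalled on Colab)
!pip install -q torch torchvision supervision diffusers transformers accelerate

import os
import cv2
import torch
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator
from diffusers import StableDiffusionInpaintPipeline

HOME = os.getcwd()

# Download the ViT-H (heaviest, most accurate) SAM weights into HOME/weights
!mkdir -p {HOME}/weights
!wget -q -P {HOME}/weights https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
CHECKPOINT_PATH = os.path.join(HOME, "weights", "sam_vit_h_4b8939.pth")
print("checkpoint exists:", os.path.isfile(CHECKPOINT_PATH))

DEVICE = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Stable Diffusion inpainting pipeline: fills the masked region from a text prompt
pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting",   # assumed inpainting checkpoint
    torch_dtype=torch.float16,
).to(DEVICE)

# Segment Anything Model with the ViT-H backbone, plus the automatic mask generator
MODEL_TYPE = "vit_h"
sam = sam_model_registry[MODEL_TYPE](checkpoint=CHECKPOINT_PATH).to(device=DEVICE)
mask_generator = SamAutomaticMaskGenerator(sam)

# Read the uploaded image, resize it to 512x512 (the shape the inpainting pipeline
# works with) and convert OpenCV's BGR channel order to RGB
IMAGE_PATH = os.path.join(HOME, "car.jpg")        # hypothetical file name
image_bgr = cv2.imread(IMAGE_PATH)
image_bgr = cv2.resize(image_bgr, (512, 512))
image_rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)

# Generate a mask for every segmentable part of the image; each result is a dict
# with 'segmentation', 'area', 'bbox', 'predicted_iou', 'point_coords',
# 'stability_score' and 'crop_box'
results = mask_generator.generate(image_rgb)
print("number of masks:", len(results))
```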
After keeping this information in the results variable, we can show all of the segmentations, bounding boxes, and point coordinates. Basically we only need the segmentations here; the bounding box and point coordinate information is not required for the Stable Diffusion model. We sort all of this information by the area returned by the Segment Anything Model, so that after sorting, segmentation zero gives the largest segmented part of the image, segmentation one gives the second largest segmented part, and so on; that is why we sorted everything by area, as you can see here.

If we plot the first ten largest segmented parts: the total number of generated masks is 81 (the length of the result gives this number), and from these 81 masks we draw the first ten largest segmented masks of the image generated by the Segment Anything Model. Here you can see masks 0, 1, 2, 3, 4, 5, 6, 7, 8, 9: this one is for the wheels of the car, this one is the human body, and so on. We can also draw them individually; here we draw the first, largest segmented part of the image. You can see the largest mask here: this part of the original image is the largest segmented part.

So we can replace this part of the image using Stable Diffusion; we can generate a new image here using Stable Diffusion and this segmented part of the image. Here we need to assign a prompt, "a free highway in Germany", and the mask image is segmentation zero, which is this part. It is not a highway at the moment, it looks like a desert, but we will replace this portion, the largest segmented part of the image, with a highway; you can use this kind of descriptive language in the prompt to the Stable Diffusion model.

Inside the Stable Diffusion model we just need to send a prompt, an RGB image, and the mask. Here I have assigned the mask image to be the largest segmented part of the image, so I send this mask, the original image, and the prompt, and based on these three inputs the Stable Diffusion model generates a new image; I keep this new image in the stable-diffusion output variable. Now I execute this; we had not executed the previous cell, so first we need to execute that, then this one, and here you can see Stable Diffusion generating a new image from these three inputs: the prompt, the original image, and the mask.

The mode of this output image is RGB and it is a PIL image, so we need to convert it to a NumPy array and then convert this RGB image to BGR. Then we compare the original image and the Stable Diffusion output image side by side. Wow, a highway has been generated by the Stable Diffusion model! If we change the prompt, a new road will be generated; let's check it with "a highway in France". We just change the prompt, execute it again, convert the result to a NumPy array, and here you can see another new highway has been generated by Stable Diffusion.
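A sketch of the sorting and the first inpainting pass, continuing from the variables above (the prompt is the one from the video; matplotlib is used here for the side-by-side view, which may differ from the plotting utility shown on screen):

```python
import cv2
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image

# Sort the masks by area so results[0] is the largest segmented region,
# results[1] the second largest, and so on
results = sorted(results, key=lambda r: r["area"], reverse=True)

# First pass: replace the largest segmented region with a highway
prompt = "a free highway in Germany"
mask = results[0]["segmentation"]                         # boolean 512x512 array

image_pil = Image.fromarray(image_rgb)
mask_pil = Image.fromarray(mask.astype(np.uint8) * 255)   # binary mask as 0/255

stable_diffusion_out = pipe(prompt=prompt, image=image_pil, mask_image=mask_pil).images[0]

# The pipeline returns a PIL image in RGB mode; convert it to a NumPy array
# (and to BGR if you want to keep working with OpenCV)
sd_out_np = np.array(stable_diffusion_out)
sd_out_bgr = cv2.cvtColor(sd_out_np, cv2.COLOR_RGB2BGR)

# Compare the original and the inpainted result side by side
fig, axes = plt.subplots(1, 2, figsize=(12, 6))
axes[0].imshow(image_rgb); axes[0].set_title("original");  axes[0].axis("off")
axes[1].imshow(sd_out_np); axes[1].set_title("inpainted"); axes[1].axis("off")
plt.show()
```

Re-running the same cell with prompt = "a highway in France" produces a different road inside the same masked area.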
Basically, Stable Diffusion uses the mask generated by the SAM model, the original image, and a prompt; these three inputs are processed by the Stable Diffusion model. Here you can see the prompt, the image, and the mask, and inside this mask area a new image is generated and blended into the original image based on the prompt. So we have got this new image inside the largest segmented part of the image.

Now we will replace the other parts one by one: we would like to replace this portion, which is the sky, and also this one here. First we need to find the mask for this region generated by the Segment Anything Model, so we check the masks for this portion; here you can see that this mask looks suitable for the sky. This is segmentation zero, this is segmentation one, this is segmentation two, and the human body is segmentation four. So now we will first use segmentation two to replace the sky with something else.

How can we do this? It is basically the same as before: this time I will send another prompt and this mask. Now I will send the mask for the sky generated by the Segment Anything Model, the new image generated by the Stable Diffusion model, and a prompt to generate a new image inside this region. Here you can see the prompt to replace the sky, "a beautiful remote view of the Eiffel Tower"; the mask is segmentation two, and the original image is the new image produced by Stable Diffusion. So the new prompt is "a beautiful remote view of the Eiffel Tower", the image is the Stable Diffusion output, and the mask is segmentation two; the new image is now being generated by Stable Diffusion inpainting.

The image mode is RGB and it is a PIL image, so first we convert it to a NumPy array, then convert this RGB image to BGR, and we also convert the first output from RGB to BGR again for display. Here we check the three images: the first is the original image, then the second one, and this is the third one. Here you can see the Eiffel Tower! Basically, I sent the prompt "a beautiful remote view of the Eiffel Tower" together with the mask of the sky portion and the previous image generated by Stable Diffusion as the original image, and a new image has been generated here.
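Continuing the sketch, the second pass feeds the previous output back into the pipeline with the sky mask (results[2] in this walkthrough) and the new prompt:

```python
import numpy as np
from PIL import Image

# Second pass: replace the sky region of the previous result with the Eiffel Tower
prompt = "a beautiful remote view of the Eiffel Tower"
sky_mask = results[2]["segmentation"]                 # sky mask in this example

stable_diffusion_out_2 = pipe(
    prompt=prompt,
    image=stable_diffusion_out,                       # previous 512x512 PIL output
    mask_image=Image.fromarray(sky_mask.astype(np.uint8) * 255),
).images[0]
```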
Now I would like to replace this girl with a young man. How can we do this? We just write the prompt, "a young man with black T-shirt", and this time we send this image as the original image with segmentation mask five; but let me show you which segmentation mask we actually need. I would like to replace this person: this is segmentation mask zero, this is one, this is two, three, and this is four, so to generate an image for this mask we just need to use segmentation 4. We can check it again by plotting the single mask: if we plot segmentation 5 with the original image it is not the right one, so it should be 4. Yes, we have got this segmented mask, segmentation 4, to replace this portion and generate a young gentleman with a black shirt, so the prompt is "a young man with black T-shirt".

Now the image is being generated by Stable Diffusion using the mask of this portion. I have sent the new prompt, "a young man with black T-shirt", and this image as the original image, because it is the second Stable Diffusion output; the prompt is this prompt and the mask is... oh, something was missing here, we just need to use segmentation 4, so we execute this again.

If we plot these four images in a two-by-two grid: oh, nice, here you can see a young man has replaced her. It is really fantastic. In this way, step by step, we can transform the image using the segmentation masks from the Segment Anything Model and Stable Diffusion. This was the original image; in the first step we used the largest segmented mask generated by the SAM model and generated a new image there using a prompt; in the second step we replaced the sky with a remote view of the Eiffel Tower; and in the third step we used the fifth largest segmented mask to replace the person in the image and generated a new image using Stable Diffusion. It is really fantastic, and if you change the prompt and execute it again and again, different images will be generated by Stable Diffusion each time.

So this is all about SAM with Stable Diffusion. If you are interested in watching my next videos, please subscribe to my channel and keep in touch with me. Thank you all, thank you very much.
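And a sketch of the final pass plus the two-by-two comparison grid (again, the mask index 4 is specific to this example image):

```python
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image

# Third pass: replace the person (results[4] here) with a new subject
prompt = "a young man with black T-shirt"
person_mask = results[4]["segmentation"]

stable_diffusion_out_3 = pipe(
    prompt=prompt,
    image=stable_diffusion_out_2,
    mask_image=Image.fromarray(person_mask.astype(np.uint8) * 255),
).images[0]

# Show the full progression in a 2x2 grid
images = [image_rgb, np.array(stable_diffusion_out),
          np.array(stable_diffusion_out_2), np.array(stable_diffusion_out_3)]
titles = ["original", "step 1: highway", "step 2: Eiffel Tower sky", "step 3: young man"]
fig, axes = plt.subplots(2, 2, figsize=(10, 10))
for ax, img, title in zip(axes.flat, images, titles):
    ax.imshow(img)
    ax.set_title(title)
    ax.axis("off")
plt.show()
```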
Info
Channel: SILICON VISION
Views: 1,291
Keywords: sam github, sam roboflow, sam model, segment anything model, segment anything model github, segment anything colab, segment anything github, segment anything paper, segment anything demo, segment anything model paper, computer vision, opencv python, deep learning, SAM explanation, extract a part of image using SAM, python, segment anything model from roboflow, segment anything model roboflow, image segmentation, artificial intelligence, image inpainting, stable diffusion
Id: x4-kC0fNU8c
Length: 21min 36sec (1296 seconds)
Published: Tue Aug 01 2023