Training (Fine-Tuning) Your Own Stable Diffusion Model Using Colab

Video Statistics and Information

Captions
Hello everyone, today we are going to fine-tune our own Stable Diffusion model using Google Colab. Text-to-image models are becoming more and more popular, and although we can create very beautiful images (and videos as well), they have some limitations, and those limitations depend on the dataset the model was trained on. The model obviously wasn't trained on our faces, so if we want to create a portrait of ourselves painted by Van Gogh, for example, we can't, because the model has no information about our faces. This is where fine-tuning helps: fine-tuning means training a pre-trained model on something new, something the model doesn't know.

Last year Ruiz et al. introduced DreamBooth, which is simply a technique to teach Stable Diffusion new subjects and new contexts. For their experiments they used Imagen, a pre-trained model that is not available to everyone, only to Google researchers. If we look at the example in their paper, they input three to five images of a fluffy dog together with its class name, "dog" (because it's a dog, right?), and fine-tune the model with DreamBooth; the model then learns a unique identifier for that specific dog. At inference time, when you prompt "a [V] dog on the beach", for example, you get a picture of that exact dog on the beach, which is very difficult to achieve with other models like DALL·E 2 or even with Imagen, the pre-trained model they used. Looking at another example from the paper, the prompt is "a [V] clock in the jungle". If you give the same prompt to DALL·E 2, you don't get the same object (so no fidelity) and you don't get the jungle (so no new context). If you give it to Imagen, you get the new context, because the clock does seem to be in a jungle, but not fidelity, because the clock looks a little different from the one we want. With DreamBooth you get both fidelity and new context: exactly the same clock, in the jungle.

I'm going to use Google Colab because it provides access to a powerful GPU. If you want to follow along, I'll leave all the links I use during the video in the description. This is the main Colab interface, and we are simply going to run this notebook; it's really straightforward, and I'll explain what each cell does. Just make sure, at the top right, that you are signed in with the Google account you actually want to use to run Stable Diffusion, and then we can start by running the first section. In this first section we connect our Colab to our Google Drive, so all of the folders we are going to use will live inside Google Drive; it will ask for authorization to connect to Google Drive, and you need to accept it. After that we run the next cell, Dependencies, which installs the libraries we need for training the model and then running Stable Diffusion.
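The notebook wires these first two cells up for you; the sketch below only shows roughly what they do, assuming Colab's standard Drive-mount API. The package list is an assumption, since the notebook pins its own dependencies.

```python
# Minimal sketch of the first two cells: mount Google Drive, then install
# the libraries needed for training and inference. The package list here
# is an assumption; the notebook installs its own pinned dependencies.
from google.colab import drive

drive.mount('/content/gdrive')   # asks you to authorise access to your Drive

# In a Colab cell you would then run something like:
# !pip install -q diffusers transformers accelerate safetensors
```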
Once that's done we move on to Model Download, where we choose which model we want to fine-tune. The latest model on Hugging Face is 1.5, which is selected by default, so if you want to use that you just run the cell; otherwise there is a drop-down list where you can choose previous models, or you can add a path to a Hugging Face model yourself. In that case you go to Hugging Face, copy the link by clicking on the small square icon, and paste it here. Alternatively, if you have downloaded a .ckpt file, you can upload it to Google Drive: click here to open your Google Drive, navigate to wherever you saved your models, right-click the model, choose Copy path, paste the path here, and then run the cell. I'm going to use model version 1.5, so I'll remove this path and run it.

Once this is done, the DreamBooth part starts. We are now in the DreamBooth session, and the first thing we need to do is create or load a session. Let's assume we don't have a session at all, so we need to create a new one. First we give it a session name: this will be the name of the folder that gets created, where our model is going to live. In my case I decided to fine-tune the model on my face, because I want to create some nice pictures of myself, maybe with some makeup, so I'm going to write "laura_test" just because I'm testing it, but you can call it whatever you want. That's the only thing we need to do in this section to create a new session. If you want to load a previous session instead, you write its name in the session name field: for example, if yesterday I created a session called "cat_example", I would simply write "cat_example" here and run the cell. Let's go back to "laura_test", because that's what I actually want, and run this section. Now, I'm not going to lie: I already created this laura_test session before recording this tutorial, so the session is found and it loads my trained model.

Once this is done, let's move on to the next cell, Instance Images. Again, let's assume this session is new and we haven't uploaded any images yet, so we untick "Remove existing instance images", because there are no existing images to remove. Then, if you have a folder with the pictures you want to upload, you can add its path in this text box; obviously the folder needs to be inside Google Drive, not outside of it. For example, for my laura_test session I have six instance images in a folder, so I right-click the folder, choose Copy path, and paste the path here. If you don't have a folder and want to upload pictures directly from your computer, you can leave this field empty, it's optional, and I'll show you how in a moment. But first, note these two options: "Smart crop images" and "Crop size", which is 512 by default. The pictures you upload should all have the same dimensions. The dimensions are up to you, but keep in mind that version 1.5 was trained on 512x512 pixels, so it's really ideal to use the same dimensions for fine-tuning as well, and that's why the default is 512.
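Before moving on to preparing the images, a quick aside on the Model Download cell above: it accepts either a Hugging Face model ID or the path to a .ckpt file in your Drive. The sketch below only illustrates those two kinds of source with recent versions of the diffusers library, outside the notebook; the Drive path is an example.

```python
# Illustration of the two kinds of model source: a Hugging Face repo ID,
# or a .ckpt checkpoint stored in Google Drive. The notebook downloads the
# model itself; this only shows the difference. Paths are examples.
from diffusers import StableDiffusionPipeline

# Option 1: a model hosted on Hugging Face, e.g. Stable Diffusion v1.5
pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

# Option 2: a checkpoint you uploaded to Drive (the path you get with "Copy path")
# pipe = StableDiffusionPipeline.from_single_file(
#     "/content/gdrive/MyDrive/models/my_model.ckpt"
# )
```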
Back to the instance images: if you tick "Smart crop images", whichever image you upload will be cropped automatically. This works, but if you want to be more precise, and I would suggest being more precise, since we are only uploading five to ten images it doesn't take much effort, you can resize the pictures yourself with Photoshop, or use a very handy website that Excalibur suggested. It's very easy: you browse for an image on your computer, upload it, choose the width and height, in our case 512 by 512, move the crop around, and save the file. You can add several files at once, and when you save, it downloads them all together, which is quite useful; and you can see the result here is 512 by 512. Another thing I need to point out is that the pictures you upload have to have exactly the same file name, with only the number changing. In my case I'm uploading six images named identically except for the number: laura_1, laura_2, laura_3, laura_4, laura_5 and laura_6. That's how they have to be.

So let's go back to the notebook. I'm going to untick "Smart crop images" because I already cropped my images with that website, run the cell, and here there is a button called "Choose Files": click it and select the pictures you want to upload, the ones I showed you before, and the upload is pretty quick. One more thing: the files all have to have the same extension, so all PNG or all JPEG; you cannot mix one PNG with one JPEG and so on.

Once that's done, proceed to the next cell, Captions. In this section we can create or update captions. Let's run it and see what happens. Here we can see all the pictures we have uploaded, and what we have to do is edit the caption for each one: for example "a photo of Laura looking to the right", so we specify the name of the subject, Laura, and then save; then "a photo of Laura looking at the camera while talking", and save; and so on, for every image you uploaded. That was the case where you are creating a new folder with new images. If instead you have a folder with images you used for training a previous model and you want to replace them, you tick the "Remove existing instance images" box: that removes all of the pictures from the folder so you can upload the new ones.

Let's move on to Concept Images (regularization). In this section you upload a different set of pictures, and there should be many more of them: if the instance images were 5 to 10, here there should be between 100 and 200. This is mainly for when you want to reproduce a specific pose or generate a specific style, like Picasso or Van Gogh. Very importantly, these pictures need to be similar to what you are training your model on: for example, if you are training the model on a woman, you should upload pictures of many different women; if you are training it on a specific dog, you should upload pictures of many dogs in different positions, different dogs, but still dogs.
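Going back to the image preparation step for a moment: if you'd rather script the cropping and renaming than use Photoshop or a website, the small sketch below does the same job with the Pillow library, center-cropping each picture to 512x512 and giving every file the same name apart from the number. The folder names are examples.

```python
# Center-crop each instance image to 512x512 and save them with a
# consistent name and extension (laura_1.png, laura_2.png, ...).
# The folder names here are examples.
from pathlib import Path
from PIL import Image, ImageOps

src = Path("raw_photos")        # your original pictures
dst = Path("instance_images")   # the folder you will upload to the notebook
dst.mkdir(exist_ok=True)

for i, path in enumerate(sorted(src.glob("*")), start=1):
    img = Image.open(path).convert("RGB")
    img = ImageOps.fit(img, (512, 512))   # crop to a square, then resize
    img.save(dst / f"laura_{i}.png")      # same name pattern, same extension
```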
As for the concept images: if you are training your model on a specific face, like in my case, you can skip that section entirely and not run it, and that's what I'm going to do now.

Now we are in the training part. You will see a lot of numbers here, but don't get scared: everything is explained very well in this notebook, I think, and it's easy to understand. The first two inputs are the training steps and the learning rate we want to use. There is a very good write-up by Patil, Cuenca and colleagues (sorry if I misspell the names) where they run experiments on training Stable Diffusion with DreamBooth, and they report some interesting results; it's worth a read. Something really relevant they found is that with DreamBooth, Stable Diffusion overfits really easily, so we need to pay attention to the combination of learning rate and training steps. They found that a lower learning rate usually works better, and looking at their results: when training on objects, a good combination is a learning rate of 2x10^-6 with around 400 training steps, while for faces, which is my case, the right combination is a learning rate of 1x10^-6 to 2x10^-6 with many more training steps, around 1200 to 1500. That is what you have here by default, so I'm going to use it for now. After you train the model, if you realize the generated images are of bad quality or noisy, it means the combination you used wasn't quite right, so you should come back to this cell, change the training steps and the learning rate, train again and see if anything changes. For example, if I use this combination and get very bad images, I would come back and maybe increase the training steps to 2000 and run again; if it's still bad, I might go back to 1500 training steps but change the learning rate to 1x10^-6, and see what happens. You just need to play around with this a little; sometimes you get lucky and don't need to change anything.

Moving on, we have the number of steps to train the text encoder. By default this is 350 and I'm going to leave it like that; if you read the note, it says 200 to 450 steps is enough for a small dataset, which is what we have, a very small dataset of 5 to 10 images, and you can set it to 0 to disable it, which we don't need. Then there is the text encoder concept training steps, which I'm going to keep at 0 because I don't have concept images, since I'm training only on my face; but if you are training on objects or on different poses, as we said before, you need to update this. The note says to set it to 1500 steps for 200 concept images, so if you uploaded 100 to 200 concept images you will probably want something between 1000 and 1500 steps. Then you have the learning rate for both the text encoder and the concept text encoder, and here too the notebook gives a suggestion: it should be between 4x10^-7 and 1x10^-6. The default is 1x10^-6, which I think is fine for now. Also, something useful about the learning rates: you don't have to type them manually; there is a drop-down list, so you can click it and easily switch between different options, and the same applies here.
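As a quick reference, the starting points discussed above can be summarised like this. The variable names below are just for illustration; in the notebook these are form fields, not code you type.

```python
# Starting points from the DreamBooth experiments discussed above.
# Names are illustrative only; the notebook exposes these as form fields.
recommended = {
    "objects": {"learning_rate": 2e-6, "training_steps": 400},
    "faces":   {"learning_rate": 1e-6, "training_steps": 1500},  # 1e-6 to 2e-6, 1200-1500 steps
}

# Text-encoder settings used in this video (small dataset of 5-10 images):
text_encoder_training_steps = 350        # 200-450 is enough; 0 disables it
text_encoder_learning_rate = 1e-6        # suggested range: 4e-7 to 1e-6
concept_text_encoder_training_steps = 0  # no concept/regularization images here
```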
Next there is the "offset noise" checkbox: you only need to tick it if you are training the model on a specific style, which we are not doing now, so I don't need it. You can also upload a text file for each image as external captions; I don't have those, so I won't use it. Then you have the resolution, which is set to 512. You can change it in the drop-down list, but make sure it is never larger than the crop size you used before: we cropped to 512, so you cannot go above that; if we had cropped to 1024, for example, we could have chosen anything between 512 and 1024, such as 960. I don't think the remaining options are that relevant, they are just about how your checkpoints are saved, so we can run this cell as it is. Training takes about 10 to 15 minutes, at least for me. I'm not running it now because, as I said, I already ran it to save time, but you should obviously run it.

Next is the section "Test the trained model". Here you don't really have to do anything, because your model is already trained; the only thing is the checkbox "Use localtunnel", in case you want to use localtunnel instead of Gradio. Once you run this cell you get a local URL; click on it, the localtunnel page opens if you chose localtunnel, you click to continue, and hopefully the Stable Diffusion web UI appears, and it does. Cool, so now we can start creating pictures. At the top you can see the name of the model we chose: in my case it's laura_test.ckpt, but you will find the model under whatever name you chose. Let's start with something very simple just to see if it actually works: I'll type "a photo of Laura in the jungle", leave all the settings as they are, press Generate and see what happens. OK, it seems interesting, but it doesn't look very nice, I must admit. Before changing all of these settings, something that came out of the research I mentioned before is that if you are getting poor pictures you can also try changing the sampling method to DDIM, which is this one, and generate again. Hopefully I get something nicer; maybe not.

So let's add a negative prompt and change a few other options, because the quality of the picture is actually decent, it's not noisy, so maybe we can get something nicer just by adjusting these small things. For example, I can change the prompt to "a portrait of Laura in the jungle, hyperrealistic, ultra detailed, cinematic lighting, natural, candid", and maybe "focus"; I usually also add camera names like Nikon or Canon; and we can give more weight to "hyperrealistic", "natural" and "candid". When you wrap a word in brackets you give it more importance, so the prompt gives more weight to that word: one pair of brackets adds a little weight, two or three or four add more, so the more brackets you add, the more the importance of that word increases. In the negative prompt we can write something like "ugly, low resolution", and put those in brackets as well; in the negative prompt the brackets work the same way, so here I'm giving more importance to avoiding ugly results.
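The video does all of this through the AUTOMATIC1111 web UI, but if you prefer to script the generation, the sketch below shows a rough diffusers equivalent of the same settings (DDIM sampler, CFG scale 7, 25 steps, a negative prompt). The checkpoint path is hypothetical, and the bracket-weighting syntax is a web-UI feature, so plain prompts are used here.

```python
# Rough script equivalent of the web-UI settings used above: DDIM sampler,
# CFG scale 7, 25 sampling steps and a negative prompt. The checkpoint path
# is hypothetical; the video uses the AUTOMATIC1111 web UI instead.
import torch
from diffusers import StableDiffusionPipeline, DDIMScheduler

pipe = StableDiffusionPipeline.from_single_file(
    "/content/gdrive/MyDrive/models/laura_test.ckpt",   # hypothetical path
    torch_dtype=torch.float16,
).to("cuda")
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)  # DDIM sampling

image = pipe(
    prompt=("a portrait of Laura in the jungle, hyperrealistic, ultra detailed, "
            "cinematic lighting, natural, candid"),
    negative_prompt="ugly, low resolution, morbid, long neck, crossed eyes",
    guidance_scale=7,        # CFG scale: how closely the image follows the prompt
    num_inference_steps=25,  # sampling steps
).images[0]
image.save("laura_jungle.png")
```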
Back in the web UI, in the negative prompt I'll also add "morbid", "long neck", "crossed eyes", and maybe "man", because the result looks a bit masculine. Let's also tick "Restore faces" and see what happens. I'll leave the CFG scale at 7: this controls how closely the image follows the prompt, and if you hover over the label it says that lower values produce more creative results; usually something between 7 and 15 works quite well, so I'll leave it as it is. Then I can increase the sampling steps to maybe 25: this increases the number of iterations, so in theory more steps give better results, but generating the image gets slower. Let's generate again and see. OK, I think the face looks a bit less like me; let's increase this and generate again. You need to play a little with all of these options we have in Stable Diffusion. OK, we're getting there; let's generate again. I like this one, it's not too bad, I'd say, although it still doesn't look exactly like me. Let's increase this again, up to 15; now it's getting noisy, so let's go back to what it was before, 10. Let's change the prompt: I'd like something with red lips; let's see what happens. Whoa, she's even wearing my same necklace.

So let's go back to the notebook. The next step is to upload our trained model to Hugging Face, so we can reuse it later. First we stop the running cell, and then we go to this section, where we need to name our concept: this will be the name you want to give to your dataset on Hugging Face, and in my case I'll simply call it "laura". Then you can choose whether to also upload the images you used for training the model; I'll tick that, so I'll have those pictures on Hugging Face as well. Then you need a Hugging Face token. To create one you need a Hugging Face account, which is free; once you have created your account, click the link here, which takes you to the Access Tokens page of your account. There you can create a new token, or reuse one you already have, click the icon to copy it, and paste it here. After inserting your token, run the cell: it will ask you to choose the files you want to upload to your dataset, so click and add the same pictures you used, and it uploads everything to Hugging Face.

Something very cool about training the model, I think, is that even if you don't train it on every part of the object or subject, it is still able to reproduce it. For example, I trained a model on a cat using only a few images of the cat's face, from the front and from the side, and then when I was generating pictures I thought: let me try asking for a picture from the back, because the model never really saw what the cat looked like from behind, and it was able to reproduce the cat from the back. Now, I'm not sure that is exactly how that cat looks from behind, because I just took the pictures from the internet, but it was very convincing. Actually, I can show you: I trained the model using these pictures, all quite similar, which I took from Unsplash.
When I generated the picture from the back, this was the result, and it looks quite good, right? Here is the picture from the front, and I tried the jungle for the cat as well; it looks pretty good. I think all of the pictures look quite good and quite similar to the pictures I used for training the model. It takes a little while to upload the dataset to Hugging Face, but I think it's well worth having, because later you can just use a link to load your checkpoint again: you don't have to save the model anywhere else, it's there on Hugging Face and you can use it whenever you want, and you can make it private or public depending on whether you want to share it with other people. So it's definitely something worth doing, in my opinion.

OK, cool, now it says "Your concept was saved successfully, click here to access it", so let's do that. And that's great: you can see my dataset, with all the pictures I used for training the model, and under "Files and versions" you have all the information and the model itself; you can get comments from other people, and you can choose whether to make it public or keep it private. Now that we have trained the model, let's see how we can reopen it and use it again without going through all of these steps. I'm going to restart the notebook now and reload.
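For reference, the upload step described above can also be done by hand with the huggingface_hub library. This is only a sketch of roughly what the notebook cell does for you; the repo name and folder path are examples.

```python
# Rough manual equivalent of the "upload to Hugging Face" cell, using the
# huggingface_hub library. Repo name and folder path are examples only.
from huggingface_hub import login, create_repo, upload_folder

login(token="hf_...")  # the access token copied from your Hugging Face account

repo = create_repo("laura", private=True, exist_ok=True)
upload_folder(
    repo_id=repo.repo_id,
    folder_path="/content/gdrive/MyDrive/Sessions/laura_test",  # example path
    commit_message="Upload fine-tuned DreamBooth model and instance images",
)
```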
Info
Channel: Laura Carnevali
Views: 34,000
Keywords: stable diffusion, stable diffusion v2, diffusion, ai art, diffusion model, generative ai, generative art, openai, stability ai, ai artist, imagen, stable diffusion on m1, stable diffusion with python, stable diffusion hugging face, stable diffusion github, stable diffusion v1.5, stable diffusion tutorial, googlecolab, colab, train ai model, train stable diffusion, fine-tuning model, fine-tuning, train sd, fine tuning, training stable diffusion, training ai, dreambooth
Id: c6r25rT8DV0
Length: 29min 57sec (1797 seconds)
Published: Thu Apr 06 2023