Stable diffusion tutorial. ULTIMATE guide - everything you need to know!

Video Statistics and Information

Captions
Are all your friends and colleagues posting AI images? Is even your neighbor Jones getting more attention than ever after updating his profile with AI images from his vacation? Do you feel left out, have no idea what to do, and just want to create pictures of dogs wearing Star Wars clothes? Then this tutorial is for you. My name is Seb and I'll be your guide. Now, one of these six images is real and the rest are AI-made. Can you spot which one? If you dare, post your guess in the comments; I'll give you a second. If you stick around, I'll show you which one was the real image at the end, and I'll show you how to make AI images just as good as these, and all I spent was 5 minutes. Alright, what I'm about to show you might feel like a daunting task, but it's really, really easy, and I'm going to guide you through every step of the way. First, Google "automatic1111" and enter the stable-diffusion-webui GitHub page (I'll also provide this link in the description). Scroll down to the installation instructions for Windows and start with Python: download the Windows installer (64-bit) and run it. It's super important that you check the "Add Python to PATH" box; check it, then just install, and we can close this and head back into the GitHub page. Next we're going to install Git: download the 64-bit Git for Windows installer and run it; you can click through everything and leave the defaults. Then start a command prompt by typing CMD in the Explorer address bar, and copy-paste the git clone command from the GitHub page, which will copy the files needed to your computer. Next, go to Hugging Face to download the model; I'll link that in the description, and you can also click the official download link on the GitHub page. First you need to make an account, and when you've created one, go back to that page and click to access the repository. When you've done that and logged in, you should reach the weights page.
There you're going to see the weights; ignore the "full" one and just download the standard one. It's a pretty big file, about four gigabytes, so it's going to take some time. When that's finished, enter your Users folder, where there's now a folder called stable-diffusion-webui; this was created by the git clone we ran from the command prompt earlier. Enter this folder, then models, then Stable-diffusion: this is where your model goes, so just drag it in there and rename it to "model". Now we can run stable diffusion. Head back into the main directory and look at the file webui-user. Let me give you a little tip: open it with Notepad, and before the call line, add "git pull". What happens is that every time before you run stable diffusion, this command pulls the latest files from GitHub, so it automatically updates to the latest version. Save that, exit, and run webui-user. It says "already up to date"; that's because we ran git pull and already have the latest files. Now, this may take some time depending on your computer and bandwidth; I've spoken to users that have been waiting at this stage for about 50 minutes, but that's the worst case, so just stick with it and leave it; it's going to finish eventually. The installation completed in about 5 minutes on my system, which is fairly fast, and stable diffusion is now up and running locally on your computer. You access it by entering this URL in a web browser, so let's copy that in. This is our user interface for stable diffusion. Everything runs in the background in the command window, and that's where you'll see all the progress from creating images, but you don't need to look at it; you can use the web user interface. So let's talk about what this is. First, we have some tabs up here.
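The auto-update tip above boils down to a one-line edit of webui-user.bat. A sketch of what the edited file might look like (the variable lines here follow the default file shipped with the repository at the time and may differ in your copy):

```bat
@echo off

set PYTHON=
set GIT=
set VENV_DIR=
set COMMANDLINE_ARGS=

git pull

call webui.bat
```

With `git pull` placed before the `call webui.bat` line, the repository is refreshed every time you launch the web UI.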
The first tab is text to image, where you create an image out of pure text. Let's say, for example, we want "photograph of a woman with brown hair, portrait". Now, before you get antsy and press generate, I want to go into the settings, because we're going to change one thing: the live preview. We want to show the image creation progress, not for every step, but let's set it to every three steps; apply settings, then go back. Now we can generate our first image, and here we have a photograph of a woman with brown hair in a portrait. It isn't the best result we can get out of stable diffusion, but then again, we didn't do much here. What you need to do first is work with the prompt. You state what you want, starting with your subject, here "photograph of a woman with brown hair", and then you add more information: it's a portrait, we could say "photo by Irving Penn", and we could add details like "hyperrealism" or "8K". Let's run this again and see if anything changes. Now we get a much more interesting result. It may not be what you were looking for, but you can change that with the prompt. If you're feeling a little lost with prompts, I recommend looking at great stable diffusion images, copying their prompts, and then changing them the way you want. How do you find those prompts, you may ask? Let's go to lexica.art, a stable diffusion search engine with a huge library of images where every image shows you its prompt. For example, this image looks like Albert Einstein: someone has actually written "a portrait of a Kurdish Albert Einstein in Kurdish clothes", and there's lots more info: it's "by Greg Rutkowski", an artist who is very famous in the stable diffusion community; they wanted a digital art style, not a photo, which clearly shows; and there are more style keywords, like "trending on ArtStation" and "anime arts".
"HD", "8K", "highly detailed", etc.; you can add anything you want here to adapt the prompt and style your image better. Now let's see if we can find something to play with, and copy this prompt. Go back, keep "photograph of a woman with brown hair" because that's what we wanted, paste the rest in, and remove "friendly humanoid robot". We have a little too much prompting now, so it's going to cut some of the end, but we'll run it anyway and see what happens. Compared to our starting image, this looks a lot more creatively interesting, and that's how you play with prompts to get the results you want. Now let's look a little more at the settings you can change. The prompt is one of the biggest parts of stable diffusion, but there are lots and lots of other options. The second thing you'll see is the sampling steps. We ran at the default, 20 sampling steps with the Euler ancestral sampling method. 20 sampling steps is usually quite low, but Euler ancestral is one of the samplers that handles that pretty well. If we look at this sampler-versus-steps comparison, we have Euler ancestral here, and our 20 steps fall between these columns; since 16 steps already shows a good result, 20 will work as well. The steps are like iterations of your image: every image starts as pure noise, like you see here for all the samplers, and with every step, or iteration, the sampler tries to form more of your image. Here they've apparently tried to create some sort of Medusa woman, and in this noise it's trying to find the image of a woman, or Medusa; you can see it gets a little closer every step. Now you might say: why not use Euler ancestral for everything, since it makes images even at low step counts? Well, you could do that.
But it's a bit inconsistent in what you're getting: as you can see, from steps 8 to 16 up to 64 the image keeps changing, which it doesn't for many of the others. LMS, or k-LMS, is one of the most widely used samplers because it gives consistent results; you can see the image stays consistent most of the way. I recommend starting with k-LMS, but you'll need at least 50 sampling steps; a value between 50 and 70 is a good start, and if you're a beginner, stick with 50 and play around with the other samplers later. So let's take that advice: change the sampling steps to 50, change the sampler to LMS, and remove some text from the prompt so we get back down to 75 tokens. Since we changed the prompt, the sampling steps, and the sampling method, our image is going to change quite a lot, but let's run it and see what happens. This is a fairly interesting image, though the eyes look a little weird. There's an option to fix that called restore faces, but if you want this exact image restored with that function, you can't just tick the box and press generate again, because of one thing down here: the seed. The seed is randomized each time, and if you run now, you'll get new starting noise that produces a completely new image. Let me show you: we'll tick restore faces and run again with otherwise identical settings, and as you can see, we get a different image. If you want the previous image back, it's saved in the outputs folder inside stable-diffusion-webui, together with the settings that produced it, so you can always find an old image and its settings. If we copy all those settings from that file, the same prompt, the same sampling steps, the same sampler, and the same seed, we'll get the same image we had previously. This is super important and something you can learn a lot from.
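The seed behavior described here can be illustrated with a toy model (plain Python, not actual stable diffusion code): the seed fixes the starting noise, so identical settings plus an identical seed reproduce the identical image.

```python
import random

def starting_noise(seed, size=8):
    """Toy stand-in for the initial latent noise: the seed fully
    determines it, which is why a saved seed reproduces an image."""
    rng = random.Random(seed)
    return [rng.gauss(0, 1) for _ in range(size)]

# Same seed -> same starting noise -> same image (given same settings).
assert starting_noise(1234) == starting_noise(1234)

# A new seed (what the -1 / randomize option gives you) -> new noise,
# and therefore a completely different image.
assert starting_noise(1234) != starting_noise(5678)
```

This is why the web UI saves the seed alongside every output: the seed plus the other settings is all it takes to regenerate the picture.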
So, let's see if restore faces actually does something about the eyes. This generation took a little longer than before (I sped the video up): first the image generated as usual, then a second pass started to fix the face, and now the eyes look much, much better and the whole face looks more normal. Let's look at the rest of the settings. You have the width and height; stable diffusion has been trained on images of 512 by 512, so you'll get the most consistent results at that size. Now you're thinking, "well, I want 1024 by 1024, or 1920 by 1080". If you have hardware that can handle that, sure, go ahead, but most likely it will crash your stable diffusion, so let's leave it at 512 by 512 for now. The batch count and batch size control how many images are generated, so you can adjust them up or down, and when you create more images you also get a grid. Let's look at that, but remove restore faces to save some time; I'll set the count to four and generate. Now we have four new images, all saved in your output folder as well, so you can view them individually or as the grid. If you're a keen observer you might say: "well, Seb, why aren't all four images the same? You have a seed set down here, and you said that if all the settings are the same, the images will be the same." You're absolutely right, but when you run batches like this, every image in the batch gets a new seed: the first image gets the seed you entered, the next image gets the seed after that, and so on. If we want all-new images, we can set the seed back to -1, or press the button that sets it to -1. And if you ran with a random seed, you can press the reuse button to get the old seed back; you'll always have the last one down here from your previous generation, and in your output folder, so it's easy to bring back.
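The per-image seed rule for batches can be written down directly. A small sketch (the increment-by-one behavior is as described above; this is an illustration, not the web UI's code):

```python
def batch_seeds(first_seed, batch_count):
    """Seeds assigned within a batch: image i gets first_seed + i,
    so only the first image uses the exact seed you entered."""
    return [first_seed + i for i in range(batch_count)]

print(batch_seeds(1000, 4))  # [1000, 1001, 1002, 1003]
```

So to reproduce, say, the third image of a four-image batch on its own, you would enter the original seed plus two.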
Now let's set the batch count back and talk about the scale. The scale is how closely the AI will listen to your prompt. Working with this image, if we lower the scale all the way down to 1, the AI will barely listen to your prompt: it will have a glance at it and say, "alright, this looks cool, maybe I'll listen, but I'll make something I like instead". Let's see what happens. We did get a woman, and the hair could be brown, but it's not a portrait, and it doesn't feel like it listened much to the prompt. So let's raise the scale to the max; now we're basically telling the AI, "you'd better look at this, or we're going to have a talk later". Let's see what it does. Now we have a photograph of a woman with brown hair, the Art Deco style is much, much clearer, and it's a portrait. Looking at the prompt we copied from lexica, you might also say: "wait a minute, this woman doesn't have exotic alien features, and the background is probably supposed to be cyberpunk". I agree, and that can come down to several factors. Firstly, this is a very long prompt, and everything in it is weighted differently: items at the beginning have high importance and items at the end have lower importance. You can change this. Say we want the exotic alien features (I have no idea what that would look like, but let's try): we can add parentheses around those words to accentuate them, and you can add as few or as many as you want; the more you add, the more focus those words get. Let's run this with the exact same settings as last time, though I'm expecting less than great results. We're starting to see some exotic alien features, but we ended up with this warped mess. You might ask what's happening, and I'll tell you: the reason is that our scale is too high.
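Under the hood, the scale slider is the guidance weight of classifier-free guidance: at each step, the noise prediction is pushed from the unconditional prediction toward the prompt-conditioned one. A toy scalar version (real implementations operate on latent tensors, not single numbers):

```python
def guided_prediction(uncond, cond, scale):
    """Classifier-free guidance: move from the unconditional
    prediction toward the prompt-conditioned one by `scale`.
    Small scales follow the prompt loosely; very large scales
    overshoot, which is where the warped artifacts come from."""
    return uncond + scale * (cond - uncond)

print(guided_prediction(0.2, 1.0, 1))   # 1.0  -> prompt followed loosely
print(guided_prediction(0.2, 1.0, 14))  # 11.4 -> prompt strongly enforced
```

The formula makes the failure mode visible: the larger the scale, the further the prediction is pushed past anything the model actually saw in training.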
k-LMS is very consistent, but it has issues under 50 sampling steps, and it has issues when the scale goes too high, so we're going to drop the scale down. A value between 7 and 14 is usually good, so let's set it at 14; you can go higher, up to 20, but you have to play around with that value. Let's run again, keeping the exotic alien features, and see what happens. Now the AI is starting to understand what we want, and all these features, I assume, are the exotic alien features. We can actually test that: remove the parentheses and run again, and with the emphasis gone it's now just a piece of cloth. That's how you can adapt a prompt you already have without reworking it; prompting is the primary way to work with your images. The rest of the settings need to be understood too, obviously, but once you do understand them, most of your work will be in the prompts. Now let's randomize the seed again, generate four new images, and see if we get something to work with in image to image; I'll show you how that works. We have four completely new images, and I'm thinking we should change one of them into a photograph. Let's take this one, which bears a slight resemblance to the woman in Battlestar Galactica, and send it to image to image. What this tab does is let you use an input image to create new images from; it could be a paint sketch you have, or a more refined image like this one. Let's clear the prompt, because we want something new, and put in "photo of a woman with brown hair, ocean in background, 8K, outdoor photography". I figured that since we already have a blue background, we might use it and see if we can get an ocean out of it. We have the same settings as before and sampling steps at 50.
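The parenthesis emphasis used above follows a simple rule in the A1111 web UI: each surrounding pair of parentheses multiplies a word's attention weight by 1.1 (square brackets divide by 1.1). A toy version of just the parenthesis part (the real prompt parser handles much more, such as explicit `(word:1.5)` weights):

```python
def emphasis_weight(token):
    """Strip nested parentheses and return (word, weight), where each
    pair of parentheses multiplies the attention weight by 1.1."""
    weight = 1.0
    while token.startswith("(") and token.endswith(")"):
        token = token[1:-1]
        weight *= 1.1
    return token, round(weight, 3)

print(emphasis_weight("alien"))      # ('alien', 1.0)
print(emphasis_weight("((alien))"))  # ('alien', 1.21)
```

Because the multiplier compounds, stacking many parentheses escalates quickly, which is why heavy emphasis plus a high scale produced the warped mess earlier.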
We're going to change the sampler to k-LMS, which we talked about earlier. For image to image there's a new setting, denoising strength, and this is the most important setting you will ever use for image to image; I've made a whole tutorial about it. Since you already start from an image, you can't just begin from full noise like in the images we made previously, because then nothing would resemble your input. That's why you can adjust how much of the image is broken down into noise and then rebuilt into a new image. Let me show you how that works. Say we set it to zero: that means we're not denoising at all, we're keeping everything from the input image. Generate, and we get the exact same image back, because we gave stable diffusion no new noise to work with. Now let's set it to one, which breaks the image down to nothing and builds it up again. That's a problem, because we wanted something resembling the input, and now we're not getting that; we're only getting the prompt, because we've denoised and broken the image down so far that stable diffusion can't remember what it was. We do have a photo of a woman with brown hair with an ocean in the background, which is all correct, but it no longer looks like the input. So this is where you need to play with the denoising strength: the more you want to move away from your image, the higher you set it, and the more you want to keep, the lower.
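One way to see why strength 0 returns the input untouched and strength 1 forgets it: a common implementation detail in diffusers-style img2img (an assumption about the internals, not something shown in the video) is that the input is noised part-way and only strength × steps sampling steps are actually run from there:

```python
def img2img_schedule(denoising_strength, sampling_steps):
    """Steps actually executed in img2img: the input latent is noised
    to the point matching `denoising_strength`, then denoised from
    there. 0 -> nothing changes; 1 -> start from (almost) pure noise,
    so the input image is essentially forgotten."""
    return int(denoising_strength * sampling_steps)

print(img2img_schedule(0.0, 50))   # 0  -> exact input image back
print(img2img_schedule(0.35, 50))  # 17 -> stays close to the input
print(img2img_schedule(1.0, 50))   # 50 -> only the prompt remains
```

So the slider is really choosing how far back along the diffusion process the input image gets pushed before being regenerated.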
Let's say we start fairly high, because we want to change a fair amount, the art style and everything, but we don't want to change her position or angle. Generate, and let's see whether we've gone too far from the image or stayed true to it. I think this is a pretty good result: we have the ocean in the background, a beautiful sky, and a woman with brown hair. It's not quite as close to the original image, but the angle is correct, and this object, which was a hat, still looks fairly decent. Now, say you feel your result doesn't resemble the input enough: turn the strength down, say to 0.35, and you should be even closer to your input image. And we are, much, much closer, but we're also further from the prompt. That's why you need to play with this setting; a good way of working with image to image is to start high, and as you get closer to what you want, lower the strength. Let's try 0.5 and see where that gets us. Now the face is very close to our original image, but the background is not. Let's see if we can work with that. Say you have this image and you're super happy with everything except the background, which you want to change. If you ran it again with a randomized seed or a changed strength, you'd get a whole new iteration. Instead, we can send it to inpaint; that's this tab here. Inpainting lets you mask out part of your image and change only that part. You can change the size of your brush, then paint either what you want to keep or what you want to replace. Let's paint over the woman, because we want to keep her as much as possible; this doesn't need to be perfect, but the closer, the better. Now let's change the prompt: since we're keeping the woman and only want ocean in the background, keeping the old prompt would give us a woman with brown hair in the areas where we only want ocean, so let's put "photo of ocean background". And if we leave the mask mode as is, it will try to make the ocean inside the painted area, which we don't want; we want the ocean where we haven't painted, so let's change the mode to "inpaint not masked".
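Conceptually, inpainting composites freshly generated pixels into the original according to the mask, and "inpaint not masked" just flips which side of the mask is preserved. A toy per-pixel sketch (flat lists stand in for images; this is a hypothetical helper for illustration, not web UI code):

```python
def composite(original, generated, mask, inpaint_not_masked=True):
    """Blend generated pixels into the original. mask[i] == 1 means
    'painted'. With inpaint_not_masked, painted pixels are kept from
    the original and everything else is regenerated."""
    keep_original = mask if inpaint_not_masked else [1 - m for m in mask]
    return [o if k else g
            for o, g, k in zip(original, generated, keep_original)]

woman = [5, 5, 5, 5]   # toy 'original' pixel values
ocean = [9, 9, 9, 9]   # toy freshly generated pixel values
mask  = [0, 1, 1, 0]   # we painted over the woman (middle pixels)

print(composite(woman, ocean, mask))  # [9, 5, 5, 9]
```

The "mask blur" setting mentioned later softens the hard 0/1 boundary in this blend, which is why it helps with visible seams at the mask edge.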
You can also upload a mask you've made in Photoshop or wherever; it accepts a mask that is black and white. Next there's the "masked content" setting, which decides what the masked area is filled with before stable diffusion starts creating from your new prompt. Ninety percent of the time you'll want "original" or "latent noise", and which one depends on what was in the image before. Since we've masked here, we should stay on "original": if you want to add something similar to what's already there, say a mountain where the image is already white and gray with some parts of green, that works perfectly, but if you want something completely different, I'd recommend "latent noise". Latent noise can produce issues where the new content doesn't blend properly into the background, though; original handles that better. I wouldn't recommend "latent nothing" or "fill" for now; you can play around with them, but you'll probably get inconsistent results. So let's keep "original", and we have the same sampling steps, sampling method, and sizes as before. Just like in image to image, there's a denoising strength here: the higher you set it, the more it will change, and since we're keeping very little of the masked-out background, we can change a lot of this image. Let's raise it and see what happens. Now we're getting a photo of the ocean in the background, and the sky looks decent; however, this part up here actually turned into some kind of hair, and that's not something we wanted. Let me show you how to paint over the parts you want fixed. I'll open Photopea, which is basically Photoshop online, take the image we got, put it in there, and paint with blue: we just want to remove this hat, or whatever it is, so stable diffusion reads it as ocean or sky. Again, this doesn't have to be perfect.
The AI is pretty good at understanding what things are, as long as you give it a reference. Okay, let's export this as a JPEG and save it as "woman ocean". Back in stable diffusion, we just drag and drop that in. You do need to repaint the mask; if you have a mask image, that works too (you'd use the upload-mask option), but we're going to draw it, so let's continue with that and generate. Now we're getting much closer to the result we're looking for. There are still some artifacts around here and here, but you can play with that: work the mask a little more carefully, paint over the spots again, or play with the "mask blur" setting, which is basically how much blur is applied to the edge of the mask. What you can do now is keep working from this image. A good trick when you have something you like is to send it to image to image and generate a new image from the one you have, so let's prompt "photo of woman with brown hair and ocean in background". Now we're redrawing the whole image; there's no mask here. Previously we had a very high denoising strength because we wanted to change the image a lot, but now we don't, so let's lower it to 0.3 and generate. This is starting to look much better; the artifacts around the face are disappearing. The eye has turned out a little weird, but if we tick restore faces we can generate a new one. Just remember that if you want this exact face, you need to save the seed and enter it again; this time we'll just run with a random seed. Generate. This is a pretty good result: the eye is better, still not perfect, but you can work on that by running new generations until you find one you like. Let me show you what I'd do to get a perfect result out of this one: I'd drag it back into image to image and maybe raise the strength just a little.
And I would run several images, because you can't always expect a good image every time, so you'll need to run multiple batches. Let's generate, with restore faces still ticked. This one's pretty close to the original, and I think it's the best one: the eye looks completely normal, and so does the mouth. So let's send it to the Extras tab and talk about upscalers. Here you can enlarge your image, and you do that with something called an upscaler. You have a couple of options: I'd recommend SwinIR, which is my favorite; LDSR gives good results too but takes a lot longer; the ScuNET variants give good anime results; and ESRGAN is not bad either. Let me show some comparisons between BSRGAN, ESRGAN, and SwinIR. The first image is BSRGAN, and the next one is ESRGAN; you can see it's definitely improving, as the first has color bleeding all over the place, while ESRGAN has more detail and is much crisper, and SwinIR takes it even further, which is why it's my recommendation. So let's pick SwinIR, set the resize to four times the size, and press generate. And here we have our final image, now at 2048 by 2048.
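Upscaling by 4x multiplies each dimension, which is how 512 by 512 becomes 2048 by 2048. A toy nearest-neighbour upscale makes the size arithmetic concrete (real upscalers like SwinIR or ESRGAN are learned models that invent plausible detail rather than repeating pixels):

```python
def nearest_neighbour_upscale(image, factor):
    """Repeat each pixel `factor` times in both directions.
    `image` is a list of rows of pixel values."""
    return [[px for px in row for _ in range(factor)]
            for row in image for _ in range(factor)]

tiny = [[1, 2],
        [3, 4]]
big = nearest_neighbour_upscale(tiny, 4)
print(len(big), len(big[0]))  # 8 8  (a 2x2 image scaled 4x -> 8x8)
```

The pixel count grows with the square of the factor (here 16x), which is why upscaling is done once at the end rather than generating at full resolution.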
There you have it: how to work with stable diffusion from start to finish. We've installed stable diffusion and covered text to image, image to image, inpainting, and Extras. There are more advanced features, like textual inversion, DreamBooth, and animation in stable diffusion; if you want to learn more about those, check out the other tutorials on my channel. For now, I hope you've learned something and that you have fun creating your very own AI art. Good luck and have fun! Oh, and I almost forgot: we had those six images at the start, and one of them was real while the rest were AI images. I'm not sure which one you guessed, but if you put it in the comments, I applaud your braveness. It was actually number four, and from number four I created all the other generations.
Info
Channel: Sebastian Kamph
Views: 605,721
Keywords: difusion, difussion, stabledifusion, stable difusion, stable defusion
Id: DHaL56P6f5M
Length: 33min 35sec (2015 seconds)
Published: Mon Oct 03 2022