AUTOMATIC1111 FULL TUTORIAL - Text to Image with Stable Diffusion

Captions
Today I'm going to show you how to use AUTOMATIC1111. We're going to go over all the basic settings you need to know to create the images you want. Just a quick intro: AUTOMATIC1111 is the most popular interface for creating images using Stable Diffusion. It's totally free and open source, and you can even run it locally if you have a good enough GPU. In this video, we're going to go over all you need to know to generate images from text prompts. I'm going to assume you already have it installed and running. Let's jump right in.

Just so you know, I'm using the latest version of AUTOMATIC1111, which is version 1.6. If you have an older version, you can update it by simply opening this webui-user.bat file in Notepad, WordPad, or a code editor, and then adding git pull right before the call statement. This will update your AUTOMATIC1111 to the latest version. But you don't have to update it to follow along in this tutorial; most of the settings I'll go over will be the same in older versions as well.

So after opening the interface, the first thing you'll notice at the top is the checkpoint. This defines the style of image that you want to generate. Usually the default Stable Diffusion checkpoint isn't good enough, so you'll want to search for a checkpoint that suits the style you want. A good place to search for checkpoints is Civitai. Simply head over to civitai.com, click on the Models tab, and then in the filters select Checkpoint only. Then you'll find hundreds, if not thousands, of checkpoints to choose from. For example, this one gives you an anime-style image, this one gives you a Disney Pixar style of image, and this one gives you a more realistic-looking image. Once you've found the one that suits you, click into it and then download the checkpoint. Just a warning, though: these checkpoints are usually huge, like a few gigabytes in size, so it's going to take a while to download. Make sure you save the checkpoint to the folder models/Stable-diffusion. Once you've done all these steps, it'll show up in this checkpoint dropdown. If you don't see it, click this refresh icon or reload the interface.

Next, you'll see a bunch of tabs. Each of these offers different functions. For this tutorial we'll go over the text-to-image tab, but just know that you can also do image-to-image, and upscale images in this Extras tab. I'll make tutorials about these tabs as well in the future. These other tabs are more advanced, and most of you will probably never need to use them. One quick thing to note is that in the Settings tab you can change the format of the images you create. For example, you can set it to JPEG instead of PNG. You can also set the image quality, max size, etc.

All right, let's go back to the text-to-image tab. Like the name implies, this tab allows you to type in a text prompt to generate an image. The first box is the positive prompt. This is what you want the image to contain. Now, there's a whole art behind prompting, so if you want to get good at it, look at previous examples and notice patterns in the keywords that people are using. Again, a good resource for that is Civitai. If you click on whatever checkpoint you're using, you can scroll down to view images that other people have generated using that checkpoint. If you click on an image, most of them will show the prompts and negative prompts that were used. Notice that a few keywords are used quite often, such as masterpiece, 8K, absurdres, and intricate details. These keywords add more detail and sharpness to your image. For us, let's try: 1girl, brown hair, wearing black shirt, 8K, masterpiece, absurdres, intricate details, best quality, sunny, and city. It's always best to specify as much detail as possible about your character and the background. So for example, what color is her hair? What is she wearing? What's the background like? Is she in the city? Is she indoors, outdoors, at the beach, at the park, etc.?

And then there's the negative prompt. This is what you want to exclude from the image. Again, browse through the images on Civitai to get a sense of what other people are using for the negative prompt. You'll notice some common keywords such as EasyNegative, deformed limbs, extra fingers, fewer fingers, extra limbs, blurry, chromatic aberration, low res, low quality, etc. These are things that you don't want to see in your image. For us, let's try the following: EasyNegative, paintings, sketches, worst quality, low res, normal quality, blurry, monochrome, grayscale, extra fingers, fewer fingers, extra limbs, deformed limbs, bad anatomy, disfigured, and NSFW. Don't worry, I'm going to paste both the positive and negative prompts in the description below. To generate the image, simply click Generate. Once the image has finished generating, you can click into it, and then to save it, right-click and choose Save image.

Let's jump down to seed real quick. The seed is a number that defines the starting point of your image. Even if you keep all the other settings the same, if you use a different seed number you'll get a completely different image. You can see that the previous image we generated has a seed of 123841846. If we want to generate the exact same image, not only do we need to keep all the settings the same, but we also need to use the same seed number. So let's paste in that number here, click Generate, and see what we get. You can see that it's the same image. If you change the seed to another number, you'll get a completely different image, even if you keep everything else the same. If you set the seed to -1 or click this button, it will randomly generate a number for you. When I'm making new images and I don't really care about referencing anything I've done in the past, I usually just leave it at -1. But for this tutorial, we're going to tweak all of these other settings so you can see what they do, and in order to get a valid before-and-after comparison, we'll need to keep the seed number the same throughout all these tweaks.

All right, let's jump back to the sampling method. This is basically the algorithm that is used to generate the image. Each of them has subtle differences, so it's pretty much trial and error to see which one works best for you and the checkpoint. In general, Euler a tends to be the fastest, but the face doesn't really look realistic. These other two, DPM++ 2M Karras and DPM++ SDE Karras, tend to give you the best quality results. I'll run these three for you so you can compare and contrast and see for yourself. We'll use the same prompt and keep the same seed so you get a valid comparison. So here's Euler a, here's DPM++ 2M Karras, and finally here is DPM++ SDE Karras. And here's a comparison so you can decide for yourself which one works best for you.

Next up, we have sampling steps. This is how many rounds you want the AI to go through to generate your image. Each round adds slightly more detail and definition to your image, but at a certain point, if you have too many rounds, it kind of overdoes it and you're going to get some weird results like noise and excessive sharpness. Generally a value between 20 and 30 works best. I'm going to keep everything the same and only adjust the sampling steps so you can see how that affects your image. Let's do 5 steps, and then 10 steps, 20 steps, 40 steps, and finally 80 steps. And here are the results.

Let's skip hi-res fix and refiner for now; we'll come back to these in a second. Below that, we have width and height. This defines the dimensions of your image. Pretty straightforward: instead of 512 by 512, if you set this to 800 by 512, then the image that you generate will be 800 by 512.
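
As a rough illustration of the seed behaviour described above, here is a minimal Python sketch. It is a toy model, not AUTOMATIC1111's actual code: the function names resolve_seed and starting_noise are made up for this example, and a handful of random numbers stands in for the real starting noise. It only shows the principle that the same seed gives the same starting point, and that -1 means "pick a seed at random".

```python
import random


def resolve_seed(seed: int) -> int:
    """Return the seed to use; -1 means 'choose one at random', as in the UI."""
    if seed == -1:
        return random.randrange(2**32)
    return seed


def starting_noise(seed: int, count: int = 4) -> list:
    """Toy stand-in for the starting noise (hypothetical helper).

    A private generator seeded with `seed` yields a few Gaussian values,
    so the same seed always yields the same values.
    """
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(count)]


if __name__ == "__main__":
    seed = resolve_seed(123841846)  # the seed from the example image
    print(starting_noise(seed) == starting_noise(seed))      # same seed, same start
    print(starting_noise(seed) == starting_noise(seed + 1))  # different seed, different start
```

Reusing the seed only reproduces an image when every other setting also stays the same, which is why the comparisons in this tutorial keep the seed fixed.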

Next, we have batch count and batch size. Both of these define the number of images that you want to generate in one go. If you set either one to 2, it'll spit out two images. So what's the difference? Well, if you want to get into the details, first of all you can think of this whole image generation thing as being like the engine of a car. Batch size is the number of images to generate simultaneously from just one start of the engine. If you set the batch size to 3, the engine starts once and creates three images in parallel. This option is slightly faster, but you'll need to have enough VRAM to handle it. If you don't, you can use batch count. If you set the batch count to 3, basically you start the engine to create one image, then you start the engine again to create the second image, and so on. In other words, the images are produced one after another, in series. So again, if you have enough VRAM, batch size will be slightly faster. If not, you can always use batch count.

Next up is the CFG scale, or guidance. This is how closely you want the engine to follow your prompt. If you drag this all the way to the left, it'll follow your prompt less. If you drag it to the right, it'll follow your prompt more, but sometimes too literally, and it'll give you some strange results. Generally a value from 6 to 8 works best. Let's try it with a few values. So here are the same settings, but I'm going to set the CFG scale to 1. And here it is with 3. And then let's set it to 7, and then 10, and finally 30.
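
The two ideas in this part can be sketched in a few lines of Python. Both functions are illustrative and their names are made up: plan_runs mirrors the "engine start" analogy for batch count and batch size, and apply_guidance uses the standard classifier-free guidance formula, which is not spelled out in the video and is included here as background. The numbers are toy values; real predictions are large tensors.

```python
def plan_runs(batch_count: int, batch_size: int) -> list:
    """List which images each sequential run ('engine start') produces."""
    runs = []
    image_number = 1
    for _ in range(batch_count):
        # One run creates batch_size images in parallel.
        runs.append(list(range(image_number, image_number + batch_size)))
        image_number += batch_size
    return runs


def apply_guidance(uncond: list, cond: list, cfg_scale: float) -> list:
    """Classifier-free guidance on toy numbers.

    Moves the prediction away from the unconditioned result and toward
    the prompt-conditioned one, scaled by cfg_scale.
    """
    return [u + cfg_scale * (c - u) for u, c in zip(uncond, cond)]


if __name__ == "__main__":
    print(plan_runs(batch_count=3, batch_size=1))  # three runs, one image each
    print(plan_runs(batch_count=1, batch_size=3))  # one run, three images at once
    print(apply_guidance([0.0, 1.0], [1.0, 3.0], cfg_scale=7.0))
```

In this sketch, a scale of 1 returns the prompt-conditioned values unchanged, and larger scales push further in the same direction, which is one way to picture why very high values can produce strange results.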

And here are all the results.

Finally, let's go back and talk about hi-res fix and refiner. Hi-res fix upscales your image by a multiple. So if your image is 512 by 512 and you select upscale by 2, then your image will be doubled in each dimension, so it will be 1024 by 1024. For us, let's go with 1.5. You can choose different algorithms for upscaling. Again, each of these has subtle differences, so you'll need to play around with each one to see which works best for you. Let's test out a few right now. So here's Latent, here's ESRGAN (hope I pronounced that right), and then here's R-ESRGAN.

Hi-res steps is similar to sampling steps above. This is how many steps of upscaling you want. If you set it to 0, it's not actually zero steps; it uses the number of sampling steps above, which in our case is 20. Like sampling steps, the more you have, the more defined your image will be. So if you set the hi-res steps to just 1, you're not going to get a very defined image. Usually a range of 15 to 25 steps is best, so let's set it to 15 and see what we get.

And then denoising strength is how much you allow the upscaler to deviate from your original image. If you drag it all the way to the left, it's not going to make any changes to the original image. If you drag it all the way to the right, you're going to get a very different image. So keeping it around the middle is a good balance between retaining your original image and adding some additional details and elements. Let's try a few values of denoising strength while keeping everything else constant. So here's denoising strength at 0.1; you can see it's very similar to the original image. Here's denoising strength at 0.5. And finally, let's set it to 0.9.

As a final note on hi-res fix, though, I would not recommend using it to upscale your images here. That's because if you generate a lot of images and you have this option on, it's going to upscale all of those images, whether you like them or not, and that takes up a lot of resources and a lot of time. It's best to generate tons of small images, choose only the ones that you like, and then send those images to the Extras tab, where you can upscale them further. It has exactly the same settings that you see here.

Okay, next up is the refiner window. This is only for SDXL, which is a newer version of Stable Diffusion. SDXL is actually quite early, and the quality of results isn't as good yet. But if you're curious how this refiner thing works, it adds more details to your image by running your prompt through an additional model. Again, this refiner option is only for SDXL, which isn't really mature as of right now, so it's better to keep using the standard version of Stable Diffusion and leave this option off.

The final thing I want to mention is LoRAs. At the beginning, we chose a checkpoint, which defines the style of the image we want to generate. Well, we can actually add more models on top of this to define our images even further. These models are called LoRAs, which are basically small add-on models used together with a checkpoint. They're usually trained to generate certain types of objects. To find and download LoRAs, simply go back to Civitai, and then in the Models tab click the filter dropdown and select only LoRA. For example, let's say you want to add a capybara to your image. Chances are your original checkpoint was not trained on any images of a capybara, so it wouldn't even know what a capybara is. Let's actually try that out. Let's add capybara to our prompt without using any LoRA and see what it gives us. You can see the result isn't great.

Next, let's download this capybara LoRA and save it into the Lora folder inside models. Going back to our interface, here in this Lora tab you should be able to see the capybara LoRA. If not, try hitting this refresh button, and if you still don't see it, try restarting the whole interface. Now, to add the LoRA to your image, you need to click on the LoRA, and after that you can see the following text added to your prompt. You'll notice a number next to the LoRA name. This is the weight of the LoRA: how important you want the LoRA to be, or how much emphasis you want to place on it. Usually a value between 0.3 and 0.8 works well. If you set it to 1 or 2, it's going to be too extreme. Something to note is that you can also set a negative value, so if you set this to -1, for example, it's going to really exclude capybaras from your image. Think of this as being like a negative prompt, but for your LoRA. For us, let's try 0.6.

Now, it's also important to add in trigger words. For some LoRAs, you need a word or set of words in your prompt to activate the LoRA. Going back to the page on Civitai where we found this LoRA, you can find the trigger word here, which is capybara. So let's go back and add the word capybara to our prompt. Now, all else being equal, let's click Generate and see what it gives us. It looks pretty good to me. So here's a side-by-side comparison without the LoRA and with the LoRA.

All right, we covered a lot today, but that is all the basics you need to know to create awesome images from text prompts. Note that there are a lot of other capabilities in the other tabs, such as image-to-image and Extras, so stay tuned for tutorials on those. If you found this video helpful, remember to like, subscribe, and stay tuned for more content. Also, we built a site where you can search for all the AI tools out there. Check it out at ai-search.io
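
For reference, the LoRA step described above can be sketched in Python. This assumes the tag that the web UI adds to the prompt has the form <lora:name:weight>; the helper names and the file name capybara_example are hypothetical and only serve the illustration. The weight 0.6 and the trigger word capybara come from the walkthrough.

```python
def lora_tag(name: str, weight: float = 1.0) -> str:
    """Build a LoRA tag in the assumed <lora:name:weight> form."""
    return f"<lora:{name}:{weight}>"


def add_lora(prompt: str, name: str, weight: float, trigger_words=()) -> str:
    """Append trigger words and a LoRA tag to an existing prompt."""
    parts = [prompt.strip().rstrip(",")]
    parts.extend(trigger_words)           # words some LoRAs need to activate
    parts.append(lora_tag(name, weight))  # the tag that carries the weight
    return ", ".join(part for part in parts if part)


if __name__ == "__main__":
    # "capybara_example" is a made-up file name, not the LoRA from the video.
    print(add_lora("1girl, brown hair", "capybara_example", 0.6, ["capybara"]))
```

Changing only the weight argument (for example 0.3, 0.8, or a negative value) while keeping the seed fixed is a simple way to compare how strongly the LoRA affects the result.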
Info
Channel: AI Tools Search
Views: 8,755
Id: hwsvcbFeUTs
Length: 17min 22sec (1042 seconds)
Published: Fri Sep 22 2023