Improve Stable Diffusion Prompt Following & Image Quality Significantly With Incantations Extension

Video Statistics and Information

Greetings everyone. In this tutorial, I am going to introduce you to a new extension: SD Web UI Incantations. This extension allows us to use some of the very newest algorithms. The first is Perturbed Attention Guidance, shown in the UI as PAG Scale. This attention mechanism allows us to generate images that follow our prompts better, as you are seeing right now. The second is Multi-Concept T2I-Zero / Attention Regulation, which also helps us generate images that follow our prompts better. You can read their papers; the links are all here. There is also Seek for Incantations.
So how are we going to install and use this extension? This is my Automatic1111 Web UI installation folder. First I will update it to the latest version, so I start CMD and run git pull. If you don't know how to install the Stable Diffusion Automatic1111 Web UI, follow this tutorial. I will put its link into the description of the video, and you will be able to install and start it from scratch. Moreover, I also have an automated installer for Stable Diffusion Web UI. I will put its link into the description as well.
So how are we going to install this extension? Let me start my Web UI. This is currently the latest version, 1.9.0 as you are seeing, and it has started. Let's go to Extensions, and from here I will use Install from URL, which you see here, and copy the link. I will put the link into the description, don't worry. Then click Install. There is also the After Detailer extension, which is amazing for automatically inpainting your generated faces to improve their quality. I will install it as well. Then follow what is happening in here. Since I already have After Detailer installed, it gave me an error. After doing this operation, close your terminal and start your Stable Diffusion Web UI again. I prefer to do it this way, as it works better.
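The update and install steps above can also be sketched as a small script. This is only a minimal sketch and not something shown in the video: the folder path and the extension URL are placeholders (the real link is in the video description), `build_commands` and `run_commands` are hypothetical helper names, and it assumes `git` is available on the PATH and that cloning a repository into the `extensions` folder is equivalent to using Install from URL.

```python
import subprocess
from pathlib import Path


def build_commands(webui_dir, extension_url):
    """Return the git commands for updating the Web UI and cloning an extension."""
    webui = Path(webui_dir)
    # Derive the target folder name from the last segment of the URL.
    name = extension_url.rstrip("/").split("/")[-1]
    if name.endswith(".git"):
        name = name[:-4]
    target = webui / "extensions" / name
    return [
        ["git", "-C", str(webui), "pull"],             # update the Web UI itself
        ["git", "clone", extension_url, str(target)],  # fetch the extension
    ]


def run_commands(commands):
    """Run each command in order and stop at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Both values are placeholders; replace them with your own folder and URL.
    cmds = build_commands(
        "path/to/stable-diffusion-webui",
        "https://example.com/placeholder/extension-repo.git",
    )
    for cmd in cmds:
        print(" ".join(cmd))  # print only; call run_commands(cmds) to execute
```

As in the video, the Web UI should be restarted after the extension is added so that it can install any new packages.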
It will install the necessary new packages, and then our installation is ready. We can see the new extension here: Incantations. There is also the After Detailer extension, and we can see Perturbed Attention Guidance, Multi-Concept T2I-Zero, and Seek for Incantations.
So how are we going to test them? This is my CivitAI profile. I am going to test them on these hard-to-generate images, which contain PNG info. These images were generated with my OneTrainer fine-tuned model of myself. I have recently published this amazing tutorial, which is over two hours long. You can watch it to learn how to train a model of yourself if you are interested. However, to use this extension you don't need a custom or trained model. You can use it with any model.
So let's right-click and save this image into the Downloads folder. Go to the PNG Info tab and load the image. Then send it to the txt2img tab. I don't know why, but the Automatic1111 Web UI activates the high-resolution fix even though I didn't use it, so just turn it off. The sampling method and schedule type are selected; the schedule type is now displayed here. I find DPM++ 2M SDE Karras to be the best. These are the default values that I have used, and this is the seed.
I also need to enable After Detailer. Let me show you my After Detailer settings. In Detection, I detect only the first face in the image. In Mask Preprocessing, I don't change anything. In Inpainting, I only change the inpaint denoising strength to 0.5, and I use separate steps to get better face quality. Then there is Incantations, which I haven't used yet.
So let's first get the original image. Generate. The speed is currently 3 it/s. The image is generated; however, the face is not exactly the same. This is probably because the scheduler has changed. So use a separate sampler: select SDE from here and also Karras here.
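The PNG Info step above works because the Web UI writes the generation settings into a text chunk of the saved PNG. The following is a minimal sketch of reading such chunks with only the Python standard library. It assumes the settings are stored in an uncompressed `tEXt` chunk under the keyword `parameters`; prompts with characters outside Latin-1 may be stored in a different chunk type that this sketch does not handle. The helper names are hypothetical, and the demo bytes are a stripped-down stand-in rather than a viewable image.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text(data):
    """Return a dict of tEXt chunks (keyword -> text) from raw PNG bytes."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    texts = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            # A tEXt chunk is "keyword", a zero byte, then the text.
            keyword, _, text = body.partition(b"\x00")
            texts[keyword.decode("latin-1")] = text.decode("latin-1")
        elif ctype == b"IEND":
            break
        pos += 12 + length  # 4 bytes length + 4 bytes type + data + 4 bytes CRC
    return texts


def make_chunk(ctype, body):
    """Build one PNG chunk (used here only to create tiny demo data)."""
    crc = zlib.crc32(ctype + body) & 0xFFFFFFFF
    return struct.pack(">I", len(body)) + ctype + body + struct.pack(">I", crc)


if __name__ == "__main__":
    # Demo data with made-up settings; a real file would be read with
    # open(path, "rb").read() instead.
    demo = (
        PNG_SIGNATURE
        + make_chunk(b"IHDR", struct.pack(">IIBBBBB", 1, 1, 8, 2, 0, 0, 0))
        + make_chunk(b"tEXt", b"parameters\x00example prompt\nSteps: 40, Seed: 1")
        + make_chunk(b"IEND", b"")
    )
    print(read_png_text(demo).get("parameters"))
```

Reading the raw text this way is a convenient check that a downloaded image still carries its settings before loading it into the PNG Info tab.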
Then I will generate the image again to get the exact same image, and I got the same base image, as you are seeing right now. So it is time to test the improvement from Incantations. First, let's try with Perturbed Attention Guidance activated and generate the image again. We got the output. Are there any differences? Let's save this image as well; I save it with the previous one. Then let's go to imgsli and make a new album. Let's load the first and second images to see the effect of Perturbed Attention Guidance. So here it is. Let's make it full screen. Okay, the left one is the original and the right one is the new one. Is there an improvement? Yes. For example, look at the hands here. This is the original and this is the new one. You see the hand is automatically improved. I can see that. There are also some other improvements in this part. So overall, there is an improvement without doing anything else. You see, this is a free improvement to our image generation, and I like it. Why not use it when we can get an improvement?
So this was the first test. What happens if you increase the scale? As you increase it more, the quality degrades, unfortunately. So 1 is probably the sweet spot.
Then let's try Multi-Concept T2I-Zero. There are so many options. I tried all of them; however, there aren't that many improvements. Let's make the ending step 40 because we are generating images with 40 steps. Then there is one thing that I have found improves the quality, which is the EMA Smoothing Factor. I set it to 1, and this really helps. And that's it. Let's generate the image. When you activate both of them, both Perturbed Attention Guidance and Multi-Concept T2I-Zero, it doesn't bring an improvement. Actually, I think they conflict, so you don't get better images. We are getting the results. Okay, the image is generated. Let's go back to imgsli and make another comparison: default versus Perturbed Attention Guidance.
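To give a rough idea of what the PAG Scale value controls, here is a minimal sketch of how a perturbed-attention term can be combined with ordinary classifier-free guidance. It follows my understanding of the formulation in the Perturbed Attention Guidance paper and is not the extension's actual code: the function name is hypothetical, the noise predictions are plain lists of floats for illustration, and the classifier-free guidance convention used here (unconditional prediction plus a scaled difference) is an assumption.

```python
def cfg_with_pag(eps_uncond, eps_cond, eps_perturbed, cfg_scale, pag_scale):
    """Combine three noise predictions element-wise.

    eps_uncond:    prediction without the prompt
    eps_cond:      prediction with the prompt
    eps_perturbed: prediction with the prompt, self-attention perturbed
    """
    return [
        # Classifier-free guidance term plus a push away from the
        # perturbed (degraded) prediction, scaled by pag_scale.
        u + cfg_scale * (c - u) + pag_scale * (c - p)
        for u, c, p in zip(eps_uncond, eps_cond, eps_perturbed)
    ]


if __name__ == "__main__":
    # Toy single-value "predictions", made up for illustration only.
    for scale in (0.0, 1.0, 3.0):
        print(scale, cfg_with_pag([0.0], [1.0], [0.5], 7.0, scale))
```

With `pag_scale` set to 0 the function reduces to plain classifier-free guidance. Larger values push the result further from the perturbed prediction, which is one plausible reading of why very high PAG Scale values degraded the image in the test above.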
And let's also add the Multi-Concept T2I-Zero image and upload all. Okay, let's make it full screen again. So this is default versus Perturbed Attention, and now there is also Multi-Concept. Let's see the difference. The right one is Multi-Concept and the left one is the default. There are some improvements, such as the hands being better. The overall image is, I don't know, looking good. Not much improvement, but looking good. Maybe it didn't work very well on this image, but I see some improvements. For example, the eyes of the dinosaur are certainly improved. You see, from this to this: this one has some errors in the eye, but this one has better eyes. The environment is, I think, also better, as you are seeing right now. So these have also improved.
Let's compare Perturbed Attention Guidance with Multi-Concept T2I-Zero. So the right one and the left one, as you are seeing right now. I don't know which one is better. It is a matter of personal opinion, but yeah, from left to right, from left to right. So it is up to you; you can use either of them to get some improvement. This is a free improvement. You can decide which one to use; both of them are good.
So what happens if we activate both of them at the same time? Okay, I have activated both Perturbed Attention Guidance and Multi-Concept T2I-Zero at the same time. Let's see what happens. By the way, my speed has degraded greatly right now because both of the algorithms are activated. And this is the output when both of the algorithms are activated. Let's also compare it. So this will be "both". Let's make a new album: default, Perturbed Attention, Multi-Concept, and both. Okay, and here are the results. So let's compare both and default. Okay, on the right there is both and on the left the original. Let's see. It doesn't look like there is much improvement here. As you are seeing, the eyes are worse.
I don't know, the teeth are looking decent. But this is it. So you can play with it and see which one works best. However, this is supposed to improve all of your images without doing anything extra.
There is also Seek for Incantations. Let's try it as well. I couldn't get better images with Seek for Incantations. Also, this may give you some errors. I think it uses the BLIP model for something like prompting, similar to IP-Adapter. So let's see the result. Currently, only Active is selected. There are also Append Generated Caption and Deepbooru Interrogate. So we will see. And this is the result of Seek for Incantations. You can see the images from left to right. I don't know if there is much improvement. Actually, it looks like there is some degradation, but it is up to you to decide.
So let's try the other options. Let's also enable Append Generated Caption. When Append Generated Caption is selected, we can see the generated caption here; you see, "dinosaur riding". So these are the generated captions, and I think they are appended to my original prompt. Let's see what we will get. Okay, this is the result when we appended more captions. I don't know if there is an improvement, but you can see it. I think the face is now not very good compared to the original face. The hands don't look better either. And the pose and everything else are not looking better, if you ask my opinion.
So let's also use Deepbooru Interrogate. There was an explanation here: Deepbooru Interrogate uses Deepbooru to interrogate instead of CLIP. And there were some tips, if I remember. If true, it will append. Okay, it says that for Deepbooru Interrogate, disabling this is recommended. So when this option is selected, we will disable this and try again and see what will happen.
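The idea of appending a generated caption to the original prompt, as described above, can be pictured with a small sketch. This is not the extension's code: `append_tags` is a hypothetical helper, the prompt and tags in the demo are made up, and skipping duplicates is my own design choice for illustration.

```python
def append_tags(prompt, tags):
    """Append comma-separated tags to a prompt, skipping ones already present."""
    existing = [part.strip() for part in prompt.split(",") if part.strip()]
    seen = {part.lower() for part in existing}
    for tag in tags:
        clean = tag.strip()
        # Compare case-insensitively so "Forest" and "forest" count as one tag.
        if clean and clean.lower() not in seen:
            existing.append(clean)
            seen.add(clean.lower())
    return ", ".join(existing)


if __name__ == "__main__":
    # Example values are invented for illustration.
    print(append_tags("photo of a man, forest", ["forest", "blue shirt"]))
```

Because the appended tokens become part of the prompt, the generated image can change noticeably, which matches what the comparisons in this part of the video show.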
The Deepbooru style of tagging is like this: 1boy, against tree, animal, bamboo, bamboo forest, blue shirt, branch. These other tokens are added to the prompt, so it is going to generate a very different image. And you see, it is like this. On the left is the original image and on the right is the new image. I don't know why this exists, but in some cases it might give you better images. However, with my DreamBooth-trained model, it doesn't. But in this image, we can notice that the environment is better than in the original image, I think. So you can play with these options and see which one works best for you.
Also, on my CivitAI profile, you can find a lot of prompts and amazing images. All of them have PNG info embedded, so you can download them, look in the PNG Info tab, and see the prompt and which settings I used to generate them. I hope you have enjoyed it. Please subscribe and like, and hopefully see you in another amazing tutorial video.
Channel: SECourses
Views: 2,428
Id: lMQ7DIPmrfI
Length: 11min 52sec (712 seconds)
Published: Mon Apr 15 2024