Improve Stable Diffusion Prompt Following & Image Quality Significantly With Incantations Extension

Video Statistics and Information

Captions

Greetings everyone. In this tutorial, I am going to introduce you to a new extension, SD Web UI Incantations. This extension allows us to use some of the very newest algorithms. The first is Perturbed Attention Guidance, shown in the UI as PAG Scale. This attention mechanism allows us to generate images that follow our prompts better, as you are seeing right now. The second is Multi-Concept T2I-Zero / Attention Regulation, which also allows us to generate images that follow our prompts better. You can read their papers; the links are all here. There is also Seek for Incantations.
So how are we going to install and use this extension? This is my Automatic1111 Web UI installation folder. First I will update it to the latest version, so I start CMD and run git pull. If you don't know how to install the Stable Diffusion Automatic1111 Web UI, then follow this tutorial. I will put the link to this tutorial into the description of the video and you will be able to install and start it from scratch. Moreover, I also have an automated installer for Stable Diffusion Web UI. I will put the link to this into the description as well.
So how are we going to install this extension? Let me start my Web UI. This is the latest version, as you are seeing, 1.9.0, and it has started. Let's go to the Extensions tab and from here I will install from URL: copy the link here and click Install. I will put the link into the description, don't worry. I will also install After Detailer. This extension is amazing for automatically inpainting your generated faces to improve face quality. Then follow what is happening in here. Since I already have After Detailer installed, it gave me an error. After doing this operation, close your terminal and start your Stable Diffusion Web UI again. I prefer you to do it this way so it will be better. It will install the necessary new packages, and our installation is ready, so we can see the new extension here: Incantations.
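The update and install steps above can be sketched in Python. This is a minimal sketch, not the Web UI's own code: it assumes that "install from URL" amounts to cloning the extension repository into the `extensions` folder of the installation, and `WEBUI_DIR` and `EXTENSION_URL` are hypothetical placeholders (the real link is the one given in the video description). The functions only build the git commands and do not run them.

```python
from pathlib import PurePosixPath

def update_command(webui_dir):
    """Build the command that updates the Web UI checkout (git pull)."""
    return ["git", "-C", str(webui_dir), "pull"]

def install_extension_command(webui_dir, extension_url):
    """Build the command that clones an extension into <webui_dir>/extensions."""
    # Derive the folder name from the last URL segment, dropping a ".git" suffix.
    name = extension_url.rstrip("/").split("/")[-1]
    if name.endswith(".git"):
        name = name[:-4]
    # PurePosixPath keeps the printed path the same on every platform.
    target = PurePosixPath(webui_dir) / "extensions" / name
    return ["git", "clone", extension_url, str(target)]

# Hypothetical placeholders, not real paths or URLs.
WEBUI_DIR = "stable-diffusion-webui"
EXTENSION_URL = "https://example.com/author/some-extension.git"

print(update_command(WEBUI_DIR))
print(install_extension_command(WEBUI_DIR, EXTENSION_URL))
```

To actually execute a command, it could be passed to `subprocess.run(cmd, check=True)`. As in the video, the Web UI should be restarted afterwards so the new extension can install its packages.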
There is also the After Detailer extension, and we can see Perturbed Attention Guidance, Multi-Concept T2I-Zero, and Seek for Incantations. So how are we going to test them? This is my CivitAI profile. I am going to test them on these hard-to-generate images. These images contain PNG info. They were generated with my OneTrainer fine-tuned model of myself. I have recently published this amazing tutorial; it is over two hours. You can watch it to learn how to train a model of yourself if you are interested in that. However, to use this extension you don't need a custom model or a trained model. You can use this extension on any model.
So let's right-click and save this image into the Downloads folder. Go to the PNG Info tab and load the image, then send it to the txt2img tab. I don't know why, but the Automatic1111 Web UI is activating the high resolution fix even though I didn't use it, so just turn it off. The sampling method and schedule type are selected. Schedule type is now displayed here. I find DPM++ 2M SDE Karras to be the best. These are the default values that I have used. This is the seed.
I also need to enable After Detailer. Let me show you my After Detailer settings. In Detection, I detect only the first face in the image. In Mask Preprocessing, I don't change anything. In Inpainting, I only change the inpaint denoising strength to 0.5 and I use separate steps to get better face quality. And then there is Incantations. I haven't used them yet.
So let's first get the original image. Generate. The speed is currently 3 it/s. The image is generated; however, it is not exactly the same. The face is different, probably because the scheduler has changed. So use a separate sampler: select SDE from here and also Karras here. Then I will generate the image again to get the exact same image, and I got the same base image, as you are seeing right now. So it is time to test the improvement of Incantations.
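The PNG info that is loaded above is a block of text saved with the image. A minimal sketch of how such text could be split into prompt, negative prompt, and settings follows. It assumes the common layout (prompt lines, an optional "Negative prompt:" line, then one final comma-separated settings line); the sample text and its values are hypothetical, not the ones from the video, and this is not the Web UI's own parser.

```python
import re

# "Key: value" pairs separated by commas; quoted values may contain commas.
_PARAM = re.compile(r'\s*(\w[\w \-/]*):\s*("(?:\\.|[^\\"])*"|[^,]*)(?:,|$)')

def parse_parameters(text):
    """Split generation-info text into prompt, negative prompt and settings."""
    lines = text.strip().split("\n")
    settings_line = lines[-1]          # assumed: the last line holds the settings
    prompt, negative = [], []
    in_negative = False
    for line in lines[:-1]:
        if line.startswith("Negative prompt:"):
            in_negative = True
            line = line[len("Negative prompt:"):]
        (negative if in_negative else prompt).append(line.strip())
    settings = {
        key.strip(): value.strip().strip('"')
        for key, value in _PARAM.findall(settings_line)
    }
    return {
        "prompt": "\n".join(prompt),
        "negative_prompt": "\n".join(negative),
        "settings": settings,
    }

# Hypothetical sample in the assumed layout.
sample = (
    "photo of a man riding a dinosaur\n"
    "Negative prompt: blurry\n"
    "Steps: 40, Sampler: DPM++ 2M SDE, Schedule type: Karras, "
    "CFG scale: 7, Seed: 12345, Size: 1024x1024"
)
info = parse_parameters(sample)
print(info["settings"]["Sampler"], info["settings"]["Schedule type"])
```

Checking the parsed sampler and schedule type against what the UI shows is one way to notice the kind of mismatch described above, where the regenerated image differed until the sampler and scheduler were set again.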
First, let's activate Perturbed Attention Guidance and generate the image again. We got the output. Are there any differences? Let's save this image as well; I save it with the previous one. Then let's go to imgsli.com and make a new album. Let's load the first image and the second one to see the effect of Perturbed Attention Guidance. So here it is. Let's make it full screen. Okay, the left one is the original and the right one is the new, improved one. Is there an improvement? Yes. For example, look at the hands here. This is the original and this is the new one. You see the hand is automatically improved. I can see that. And there are also some other improvements in this part. So overall, there is an improvement without doing anything else. You see, this is a free improvement to our image generation, and I like it. Why not use it while we can get an improvement? So this was the first test. What happens if you increase it? As you increase it more, the quality degrades, unfortunately. So one is probably a sweet spot.
Then let's try Multi-Concept T2I-Zero. There are so many options. I tried all of them; however, there aren't that many improvements. So let's make the ending step 40, because we are generating images with 40 steps. Then there is one thing that I have found improves the quality, which is the EMA Smoothing Factor. I make it one; this really improves it. And that's it. Let's generate the image. When you activate both of them, both Perturbed Attention Guidance and Multi-Concept T2I-Zero, it doesn't bring improvement. Actually, I think they conflict, so you don't get better images. And we are getting the results. Okay, the image is generated. Let's go back to imgsli and make another comparison: default versus Perturbed Attention Guidance. Let's also add Multi-Concept T2I-Zero and upload all. Okay, let's make it full screen again. So this is default versus Perturbed Attention, and there is also now Multi-Concept. Let's see the difference.
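To give a rough idea of what the PAG Scale value does, here is a minimal sketch of how a perturbed-attention term is commonly combined with classifier-free guidance. It is an illustration under assumptions, not the extension's code: the three predictions are plain lists of floats with made-up values, and the function name is hypothetical.

```python
def guided_prediction(uncond, cond, perturbed, cfg_scale, pag_scale):
    """Combine three noise predictions into one guided prediction.

    uncond:    prediction without the prompt
    cond:      prediction with the prompt
    perturbed: prediction with the prompt but with perturbed self-attention
    """
    return [
        u + cfg_scale * (c - u) + pag_scale * (c - p)
        for u, c, p in zip(uncond, cond, perturbed)
    ]

# Made-up numbers, only to show the effect of the scale.
uncond = [0.0, 1.0]
cond = [1.0, 2.0]
perturbed = [0.5, 2.5]

print(guided_prediction(uncond, cond, perturbed, cfg_scale=7.0, pag_scale=0.0))
print(guided_prediction(uncond, cond, perturbed, cfg_scale=7.0, pag_scale=1.0))
```

With a scale of 0 the extra term disappears and only the usual guidance remains. A larger scale pushes the result further away from the perturbed prediction, which is consistent with the quality degrading when the scale is raised too far.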
So the right one is Multi-Concept and the left one is the default. There are some improvements, such as the hands are better. The overall image is, I don't know, looking good; not much improvement, but looking good. Maybe this didn't work very well on this image, but I see some improvements. For example, the eyes of the dinosaur are certainly improved. You see, from this to this: this one has some error in the eye, but this one has better eyes. I think the environment is also better, as you are seeing right now. So these also improved.
Let's compare Perturbed Attention Guidance with Multi-Concept T2I-Zero. So the right one and the left one, as you are seeing right now; I don't know which one is better. It is a personal opinion, but yeah, from left to right, from left to right. So it is up to you; you can use either of them to get some improvements. This is a free improvement. So you can decide which one to use; both of them are good.
So what happens if we activate both of them at the same time? Okay, I have activated both Perturbed Attention Guidance and Multi-Concept T2I-Zero at the same time. Let's see what happens. By the way, my speed is degraded greatly right now, because both of the algorithms are activated. And this is the output when both are activated. Let's also compare it. So this will be both. Let's make a new album: default, Perturbed Attention, Multi-Concept, and both. Okay, and here are the results. So let's compare both and default. On the right, there is both, and on the left, the original. Let's see. It doesn't look like there is much improvement here. As you are seeing, the eyes are worse. I don't know, the teeth are looking decent. But this is it. So you can play with it and see which one works best. However, this is supposed to improve all of your images without doing anything extra. There is also Seek for Incantations. Let's also try it.
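The four-way comparison above (default, Perturbed Attention, Multi-Concept, both) only works if everything except the two switches stays fixed. A minimal sketch of such a test plan follows; the dictionary keys and the seed value are hypothetical and are not settings names from the extension.

```python
def comparison_runs(seed):
    """List the four runs from the comparison, all sharing one seed."""
    plan = [
        ("default", False, False),
        ("perturbed_attention", True, False),
        ("multi_concept", False, True),
        ("both", True, True),
    ]
    return [
        {"label": label, "pag": pag, "multi_concept": multi, "seed": seed}
        for label, pag, multi in plan
    ]

# Hypothetical seed; in practice it comes from the PNG info of the base image.
runs = comparison_runs(seed=12345)
for run in runs:
    print(run)
```

Keeping the seed, sampler, and step count identical across the runs means that any difference seen in the side-by-side album comes from the two algorithms and not from a changed setting.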
I couldn't get better images with Seek for Incantations. Also, this may give you some errors. I think it uses the BLIP model for prompting, something like IP-Adapter. So let's see the result. Currently, only Active is selected. There are also Append Generated Caption and Deepbooru Interrogate. So we will see. And this is the result of Seek for Incantations. You can see from the left image to the right image. I don't know if there is much improvement. Actually, it looks like there is some degradation, but it is up to you to decide.
So let's try the other options. You can see both of the images here, this one and this one. Okay, let's also try Append Generated Caption. When Append Generated Caption is selected, we can see the generated caption here; you see, "dinosaur riding". So these are the generated captions, and I think they are appended to my original prompt. So let's see what we will get. Okay, this is the result when we appended more captions. I don't know if there is improvement, but you can see it. I think the face is now not very good compared to the original face. The hands don't look better either. The pose is not looking better, and nothing else is either, if you ask my opinion.
So let's also use Deepbooru Interrogate. There was an explanation here: Deepbooru Interrogate uses Deepbooru to interrogate instead of CLIP. And there were some tips, if I remember; if true, it will append. Okay, it says that for Deepbooru Interrogate, disabling this is recommended. So when this option is selected, we will disable this and try again and see what will happen. The Deepbooru style of tagging is like this: 1boy, against tree, animal, bamboo, bamboo forest, blue shirt, branch. These other tokens are added to the prompt, so it is going to generate a very different image. And you see, it is like this. On the left is the original image and on the right is the new image. I don't know why this exists, but in some cases this might give you better images.
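Appending a generated caption or Deepbooru-style tags to the original prompt can be pictured with a small sketch. This is only an illustration: the function name is hypothetical, skipping tags that are already in the prompt is a design choice made here, and the extension may combine the text differently. The tags are the ones read out above; the prompt is made up.

```python
def append_caption(prompt, caption_tags):
    """Append caption tags to a prompt, skipping tags it already contains."""
    existing = {part.strip().lower() for part in prompt.split(",")}
    extra = [tag.strip() for tag in caption_tags
             if tag.strip().lower() not in existing]
    return ", ".join([prompt.strip()] + extra)

# Hypothetical prompt; tags as listed in the Deepbooru example.
prompt = "photo of a man riding a dinosaur, blue shirt"
tags = ["1boy", "against tree", "animal", "bamboo",
        "bamboo forest", "blue shirt", "branch"]

print(append_caption(prompt, tags))
```

Because every appended tag becomes part of the prompt, a long tag list can shift the result a lot, which matches the very different image seen with Deepbooru Interrogate.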
However, in my DreamBooth-trained model image generation, it doesn't. But in this image, we can notice that the environment is better than in the original image, I think. So you can play with these options and see which one works best for you. Also, on my CivitAI profile, you can find a lot of prompts and amazing images. All of them have PNG info embedded, so you can download them, look in the PNG Info tab, and see the prompt and which settings I used to generate such images. I hope you have enjoyed it. Please subscribe and like, and hopefully see you in another amazing tutorial video.

Info

Channel: SECourses

Views: 2,428

Id: lMQ7DIPmrfI

Length: 11min 52sec (712 seconds)

Published: Mon Apr 15 2024