Insane AI Upscaling in Stable Diffusion! Easy tutorial. Google Colab included

Video Statistics and Information

Captions
Hello everyone! In this video I'll show you how to get from this to that; in other words, how to upscale an image by a factor of 64, which, as you can see, looks absolutely stunning. We'll also discuss how to avoid losing the original image information during upscaling, because upscaling can sometimes completely alter your initial picture.

Let's start with the basics of how upscalers work, so you understand what kind of result to expect and which ones are most suitable for you. Simplifying greatly, there are two types of upscaling. The first works through mathematical interpolation. It's like looking at a point on a graph and assuming the missing point is probably somewhere nearby; essentially the same thing happens with pixels. The algorithm predicts new pixels from the neighboring ones. There are many such algorithms, and they can be combined to achieve decent results, but they only slightly improve quality, making the image a bit sharper. The problem with this first method is that these algorithms have no information about the object itself. They can't recognize it, so they can't add detail that wasn't there, for example fine details on skin; they don't understand that it is skin, they merely shuffle pixels around. (A minimal code sketch of this interpolation approach follows below.) This is where the second approach comes in: a generative, AI-based one, where we use a model like Stable Diffusion or a dedicated upscaling network to "imagine" what the image could be and add genuinely new information. This approach allows for incredible results, increasing resolution almost without limit; we're only bounded by what the model knows about the object and by computational resources.

As usual, open the Google Colab notebook via the link in the description under this video. I made that notebook specially for you, and it still works perfectly without disconnections, which, to be honest, still surprises me. First run cell 1, then download the model in step 2. I prefer Realistic Vision V5, but you can use whatever you like; just paste the appropriate link here. Then we need a ControlNet model; a bit later I'll explain why. This cell downloads a lot of ControlNet models, I think almost any you'd need, so it takes a bit more time, but in my opinion there isn't much difference between waiting three minutes and five. Then run step 3 to launch Stable Diffusion and wait a little. (A rough sketch of what these download cells typically do is included after the interpolation example below.)

At first glance it seems quite simple: just change the resolution here, say to Full HD, and you'll get the same result at a much higher resolution. That is not the case, because there's a problem. Stable Diffusion models, especially 1.5-based ones like Realistic Vision here, are trained on images at a specific resolution, 512 by 512. Our model simply doesn't know what a cat looks like at Full HD resolution. If you use a high resolution that isn't suitable for the specific model, you may get artifacts, mutations, and strange pictures; as you can see here, that doesn't look okay. Even a slight change of resolution can produce small distortions in anatomy and faces. They will be hardly noticeable, but they will be there. If you look carefully at images on Civitai, you've probably noticed that quite a few are slightly distorted. So this method doesn't work; let's stay at the base resolution. To upscale our images instead, we can use the hires fix feature.
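To make the first, interpolation-based family concrete, here is a minimal sketch using Pillow. Nothing here is specific to Stable Diffusion; the file names are placeholders, and bicubic and Lanczos stand in for the whole family of smoothing algorithms described above.

```python
from PIL import Image

# Classical, non-generative upscaling: new pixels are interpolated
# from their neighbours, so no new detail can appear.
img = Image.open("input.png")          # placeholder file name
w, h = img.size

# Two common interpolation kernels; both only smooth, never invent.
bicubic = img.resize((w * 4, h * 4), Image.BICUBIC)
lanczos = img.resize((w * 4, h * 4), Image.LANCZOS)

bicubic.save("upscaled_bicubic.png")
lanczos.save("upscaled_lanczos.png")
```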
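And as a rough idea of what the notebook's download cells in steps 2 and the ControlNet step typically do: the commands below are an assumption based on common AUTOMATIC1111 Colab setups, not the actual contents of the linked notebook, and the URLs are placeholders, not real download links.

```python
# Hypothetical Colab cell (Python notebook with shell escapes).
# Step 2: fetch a checkpoint (e.g. Realistic Vision V5) into the
# WebUI's model folder. The URL below is a placeholder.
!wget -O /content/stable-diffusion-webui/models/Stable-diffusion/realisticVisionV5.safetensors \
    "https://example.com/path/to/realisticVisionV5.safetensors"

# ControlNet models live in the extension's own models folder.
!wget -O /content/stable-diffusion-webui/extensions/sd-webui-controlnet/models/control_v11p_sd15_canny.safetensors \
    "https://example.com/path/to/control_v11p_sd15_canny.safetensors"
```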
Here we can change the upscaler, let's say Latent, and the target resolution; for example, upscale by 2, so the final image will be 1024 by 1024. Let's generate. So how does it work? Stable Diffusion generates the image at the base resolution of 512 by 512, and only then a specific upscaler, Latent or whichever you like, upscales it to the final 1024 by 1024. Because we used the Latent upscaler, the image can change significantly during generation. Keep in mind that if you generate images with ControlNet, this significantly changes the final result; in my experience it isn't really usable together with ControlNet.

Now let's discuss the situation where you already have an image and need to upscale it. Open the img2img tab and drop the image here; let's choose this one. Here you need to enter your prompt. If you don't know the prompt for this specific image, you can press the Interrogate button, and the model will infer a prompt for you. But I know the prompt for this image, so I just paste it here. Then choose a sampling method. I prefer DPM++ 2M Karras, but again, it strongly depends on the specific image: for art, for anime style, for photorealistic images there will be different samplers, and different upscalers too. Set the resolution to 1024 by 1024, and now the very important parameter: denoising strength. With a low denoising strength we get almost the same image. It will be better in terms of resolution, but still without much added detail. As you can see, it looks nice, but if you compare it with the first one, you may notice that not much has changed. It's almost the same image, slightly better, but not what we're after. If you raise the denoising strength a lot, you completely change the image. For example, let's try 0.7. We get a good result, much better than the previous one in detail and resolution, but as you can see, it's a completely different image.

So what can you do in this situation? Most YouTubers advise finding the specific denoising-strength value that gives a perfect result. In my experience it's sometimes almost impossible to find that sweet spot. So there is another way: ControlNet. To use it, add the photo here, set the preprocessor to Canny, which extracts the edges of the image, and enable it. Now let's generate with high denoising strength and ControlNet. Here we go; we have our result, and in my opinion it looks better, though there are some strange artifacts in her hair. Let's try a lower denoising strength, and also a lower ControlNet weight. Here we go; this result looks better. Yes, much better. You can experiment with ControlNet and with different models to get the result you want, because, as I said before, it strongly depends on your specific image and your particular case. Sometimes you can find that one denoising-strength value and get a good result without ControlNet, but in my experience it's very useful to know this ControlNet trick. So let's save the image, and then we'll load it again here. (Sketches of how these hires fix and img2img-plus-ControlNet settings map onto the WebUI's API follow below.)
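For readers who drive the WebUI programmatically, here is a minimal sketch of the hires fix settings above, assuming an AUTOMATIC1111 instance launched locally with the --api flag. The field names follow the /sdapi/v1/txt2img endpoint; the prompt and output file name are placeholders.

```python
import base64
import requests

# Hires fix via the AUTOMATIC1111 web API. Values mirror the UI
# settings discussed above: generate at 512x512, then upscale x2.
payload = {
    "prompt": "photo of a cat, highly detailed",  # placeholder prompt
    "width": 512,
    "height": 512,            # base resolution the model was trained on
    "enable_hr": True,        # turn hires fix on
    "hr_upscaler": "Latent",  # the latent upscaler from the video
    "hr_scale": 2,            # 512x512 -> 1024x1024
    "denoising_strength": 0.6,
    "steps": 25,
}
r = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload)
image_b64 = r.json()["images"][0]     # images come back base64-encoded
with open("hires_result.png", "wb") as f:
    f.write(base64.b64decode(image_b64))
```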
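The img2img-plus-ControlNet combination can be scripted the same way. Note that the sd-webui-controlnet extension's payload format has changed across versions, so treat the keys inside "alwayson_scripts" below as assumptions from one common version; the file names, prompt, and model name are placeholders.

```python
import base64
import requests

def b64(path):
    """Read an image file and return it base64-encoded for the API."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode()

# img2img upscale with a Canny ControlNet preserving the composition.
payload = {
    "init_images": [b64("lowres.png")],             # placeholder input
    "prompt": "photo of a woman, highly detailed",  # placeholder
    "width": 1024,
    "height": 1024,
    "denoising_strength": 0.5,   # lower = closer to the input image
    "alwayson_scripts": {
        "controlnet": {
            "args": [{
                "input_image": b64("lowres.png"),
                "module": "canny",                    # edge preprocessor
                "model": "control_v11p_sd15_canny",   # assumed model name
                "weight": 0.8,    # the "ControlNet weight" slider
            }]
        }
    },
}
r = requests.post("http://127.0.0.1:7860/sdapi/v1/img2img", json=payload)
with open("controlnet_result.png", "wb") as f:
    f.write(base64.b64decode(r.json()["images"][0]))
```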
Once again, in this case the prompt isn't important; just write something like "detailed". Then, in the Script dropdown, choose SD upscale, and we can upscale the image using the same approach. Remember that even on Google Colab we have a hard VRAM limit, only 16 GB, which is surprisingly not that much, so we should use SD upscale. It works in a rather tricky way: it divides the whole image into small tiles and processes each tile independently, which lets you generate images at a much higher resolution while using only a small amount of VRAM. Tile overlap, evidently, controls how much neighbouring tiles overlap. You can choose different parameters here, and it depends on how much detail your image has; in my case I don't think high values are necessary. For the scale factor, let's try 4 and hope our VRAM is sufficient. Here we can also choose different upscalers, and once again it depends on your specific image. Let's try a basic one, Lanczos. Don't forget to lower the denoising strength; 0.1 should be sufficient, I suppose. Press Generate. Here we go; we have our result, and you can compare the final images. It looks nice, in my opinion surprisingly better than the previous one. I especially like Lanczos, because the face looks more natural here; the last one is the anime-style upscaler, and you can clearly see it's too different, it looks like a completely different image. So, as I said before, the best upscaler strongly depends on your specific use case. For this image the best one, in my opinion, was Lanczos, the simplest upscaler, and ESRGAN 4x also looks quite good. That surprised me, because I was sure R-ESRGAN 4x+ would be my favorite; in the majority of cases it looks much better.

As a matter of fact, you can also upscale your images in the Extras tab in the same way. Just add your image here. You can combine different upscalers, for example Lanczos and Nearest, adjust parameters like CodeFormer visibility, which, as I remember, is responsible for fixing faces, and blend the two upscalers' outputs to get the best result. But in my opinion it's not the best option, because there is no tiling here, so you're strongly restricted by your video memory, and most likely you won't be able to upscale to 4000-plus pixels. (Sketches of the tiled SD-upscale idea and of blending two upscaler outputs follow below.)

So, in my opinion, the best way to combine upscalers is simply to use Photoshop or similar software, and let me show you how I usually do it. We need to pick the best upscalers for our case with the best parameters: for me that's ESRGAN 4x and Lanczos, I suppose; definitely not the anime one. Open Photoshop, or any similar software, it doesn't matter, and open the images. Once again we can compare the upscaled results and choose which is best for us. In my opinion the ESRGAN 4x result has the best hair, so let's take the hair from that one using the eraser tool. It doesn't look neat, I know, but it's fine for our purposes. Activate the Lanczos layer underneath, and now we have the best hair from ESRGAN on top: without it, and with it, it looks much better. But here we can see an artifact, which isn't great. Let's fix the artifacts on her face; for some reason the same artifact appears with almost every upscaler. It's not a serious problem, though, because we're in Photoshop and can fix it with various instruments, the clone stamp for example.
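To show the idea behind SD upscale's tiling, here is a toy sketch in Pillow. A Lanczos resize stands in for the per-tile diffusion pass, purely for illustration; the real script feeds each tile through img2img and blends the overlap regions rather than overwriting them.

```python
from PIL import Image

def tiled_upscale(img, scale=4, tile=512, overlap=64):
    """Toy version of SD upscale's tiling: split the image into
    overlapping tiles, 'upscale' each tile independently, and paste
    the results onto one big canvas. Peak memory depends on the tile
    size, not on the final image size."""
    w, h = img.size
    out = Image.new("RGB", (w * scale, h * scale))
    step = tile - overlap
    for top in range(0, h, step):
        for left in range(0, w, step):
            box = (left, top, min(left + tile, w), min(top + tile, h))
            t = img.crop(box)
            t = t.resize((t.width * scale, t.height * scale), Image.LANCZOS)
            # Later tiles overwrite the overlap region, hiding seams.
            out.paste(t, (left * scale, top * scale))
    return out

big = tiled_upscale(Image.open("lowres.png"))   # placeholder file name
big.save("tiled_result.png")
```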
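And the Extras-tab trick of mixing two upscalers reduces, in its simplest form, to a weighted blend of two same-sized results (the Photoshop workflow above is the masked, per-region version of the same idea). File names are placeholders, and both images must share the same dimensions.

```python
from PIL import Image

# Blend two upscaler outputs, e.g. Lanczos for the natural face and
# ESRGAN for the crisp hair.
a = Image.open("upscaled_lanczos.png").convert("RGB")
b = Image.open("upscaled_esrgan.png").convert("RGB")

mixed = Image.blend(a, b, alpha=0.35)   # 0.0 = all a, 1.0 = all b
mixed.save("combined.png")
```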
The clone stamp is not the best instrument, but it's just an example of how this might look. Let's also fix her eyes a little. Now it looks better, with our best hair. And what about outpainting? I use Photoshop Beta, so I can use the Generative Fill tool to extend her hair properly. Let's outpaint the image with Generative Fill and just leave the prompt empty. That looks better; in my opinion this is the best version. We could also try to generate the lower part of her body, but I suppose it won't work because of Photoshop's NSFW filter. Let's try anyway; who knows, maybe it'll work, but I'm sure it won't. Definitely. Yes, I was right: it violated the user guidelines. So let's add the initial image to Photoshop to compare. There is our initial image, and you can see the difference in size, which is just insane. Here we go; now we can compare the results. What we had before, our initial image, and now our final result. Voila. In my opinion that looks far better. It's not ideal, there are some little artifacts and some problems with the hair, but you can work on that; all those problems are quite easy to fix. It's just an illustration of how far you can go with Stable Diffusion in the upscaling process. So I'd call that a success; it looks really nice. There are a lot of videos on YouTube about upscaling images with Stable Diffusion, but I tried to make this one a bit different, because it's quite hard to create an upscaled version of your image that looks the same as the initial one. With the help of the SD upscale tiling script and ControlNet, you can achieve exactly that result, which I think is really, really nice. I hope you liked this video. Let me know if you found it helpful by pressing the like button or writing a comment. Bye bye!
Info
Channel: marat_ai
Views: 8,546
Keywords: stable diffusion upscale, automatic1111 upscale, 8k image upscale, stable diffusion hires.fix, Photo upscale stable diffusion, Image enhancement tutorial, Photo upscaling guide, Image upscale tutorial, Photo enhancement tips, Upscale photos effectively, Enhance picture quality, Improve image resolution, High-quality image scaling, 4k photo upscale, upscale photo google colab, stable diffusion google colab, how upscale image in stable diffusion, ai upscale
Id: G3orvT6USPg
Length: 14min 25sec (865 seconds)
Published: Wed Aug 16 2023