Hi guys, today I will show you
all about Stable Diffusion. It is a deep learning, text-to-image
model released in 2022 based on diffusion techniques. It is primarily used to generate
detailed images based on text descriptions. Many AI tools are not really usable in real work
yet, but Stable Diffusion is a different story. The guys from Vivid-Vision showed us how they use it in their workflow during our studio tour, and it was really inspiring. If you haven’t watched it yet,
the link is in the description. ### All the calculations are done by the GPU. You need a computer with a **discrete Nvidia video card** with at least 4 GB of VRAM. An integrated GPU will not work. Working with AI requires a lot of trial and error, so a good GPU will speed up your process dramatically. Luckily, I was sent the NVIDIA GeForce RTX 4090 by the sponsor of this video - Nvidia Studio. Here are some benchmarks. More iterations per second mean faster results. As you can see, this card is the top GPU right now, and NVIDIA is currently the dominant hardware supplier for AI work. Now is a great time to get started, as demand is high and growing, and the results speak for themselves. ### Next, let me show you how to install it. It’s not as easy as installing standard
software, which is why I’ve included this part. I have also created a blog post with detailed explanations, links, and things to copy and paste, so here I will just quickly go through it. Don’t worry if it’s too fast; the link to the blog is in the description. Go to the link provided in the blog
post and download the Windows installer. Make sure to download the exact version I’ve
linked to, as newer versions will not work. It’s important to check this
option, and then click Install. Download Git using the provided link and install it with the default settings. Next, we have to download Stable Diffusion Automatic1111. It’s not the kind of download you are used to. First, we have to open the Command Prompt. If you want to install it in the default location, that’s fine. If you want a specific location, navigate to the chosen folder and type cmd here. Here you go. Now copy and paste the code from the blog post and press Enter. And that’s it - as you can see, everything is downloaded here. Next, we have to download a
checkpoint model to this folder. I will explain what it is later on. Here is the website where you can download it, but I have provided a direct download link in the blog post. This step is quite long; it will take about 15 minutes. Run this file and it will download and set up everything. Once it’s done, you will see the URL. Copy it and paste it into your browser. And here it is - the Stable Diffusion Automatic1111 interface. If you want to have it in dark mode, add this to your URL; by default, it follows your browser’s theme. You can type something in the prompt section to see if it works. Great. The last step is to modify the WebUI file. Copy the code from the blog. Right-click on the file and choose ‘Edit’; it will open in Notepad. Replace the content with our code, save, and close. What this does is enable auto-update, make it faster, and allow access to the API.
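With the API enabled, you can also talk to Stable Diffusion from a script. As a quick sanity check, here is a minimal Python sketch - it assumes the WebUI is running locally on the default port 7860 and that your edited WebUI file really does include the API flag; the endpoint and field names are the ones the Automatic1111 API uses as far as I know, so double-check them against your install:

```python
# Minimal sketch: query the local Automatic1111 API.
# Assumes the WebUI runs at http://127.0.0.1:7860 and was started with the API enabled.
import requests

BASE_URL = "http://127.0.0.1:7860"  # default local address of the WebUI

# /sdapi/v1/sd-models lists the checkpoint models the WebUI can see
response = requests.get(f"{BASE_URL}/sdapi/v1/sd-models", timeout=30)
response.raise_for_status()

for model in response.json():
    # Each entry describes one checkpoint file found in the models folder
    print(model["model_name"])
```

If that prints a list, the API is working. We will stick to the browser interface in this video, but it’s good to know the option is there.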
Now that everything is set up, let me show you how to open Stable Diffusion the next time you start your computer. If you just use the URL, it will not work. First, you have to run the WebUI file. I like to create a shortcut and place it on my desktop for quick access. Run the file and now the URL will work. It’s the same URL every time, so you might want to bookmark it in your browser. ### There are a lot of model types available. I will cover just checkpoint models, which are the most popular and the ones you
really need; the others are optional. Checkpoint model files are pre-trained Stable Diffusion weights that can create a
general or specific type of image. The images a model can create are
based on the data it was trained on. For example, a model will not be able to create
a cat image if there are no cats in the training data. Similarly, if you only train a model
with cat images, it will only create cats. These files are heavy, usually between 2 and 7 GB. Here are 4 versions of an image
generated using exactly the same prompt. Each uses a different model. As you can see, they are extremely different. The default model is not really
good, and you shouldn’t use it. Here we have a very realistic result. This one is a bit more moody. And finally, here we have
something totally different. I think this proves that choosing
the right model is essential. I have included a few links to popular websites where you can download models in the blog post. Here is the Reliberate model. You can see it’s based on Stable Diffusion 1.5. This website is great because you can see example images generated with each model, with the prompts included, which is perfect for learning and testing. You can download the model here; make sure to place it in this folder, the same one we used in installation step 4. When you restart Stable Diffusion, you can choose the model here. You can also mix models. I will show you an extreme scenario: let’s merge the realistic model with this cartoony model. Here we can choose a multiplier; if we choose 0.5, both models contribute equally. We can also add a custom name. Click here to merge the models. You can now choose the new model from the list. Let’s generate the same images to see the difference. Here is the result - basically, we get something in between. It’s a really useful feature.
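Under the hood, the merge with a 0.5 multiplier is (roughly) just a weighted average of the two sets of weights: result = A × (1 − m) + B × m. Here is a minimal Python sketch of that idea - the file names are made up, and a real merge like the one in the WebUI also handles mismatched keys, VAE weights, and precision, so treat this as an illustration rather than a drop-in tool:

```python
# Rough sketch of a "weighted sum" checkpoint merge: result = A*(1-m) + B*m.
# File names are hypothetical; real checkpoints may also need key filtering
# and dtype handling, which the WebUI's merger takes care of for you.
from safetensors.torch import load_file, save_file

MULTIPLIER = 0.5  # 0.5 means both models contribute equally

model_a = load_file("realistic_model.safetensors")
model_b = load_file("cartoony_model.safetensors")

merged = {}
for key, tensor_a in model_a.items():
    tensor_b = model_b.get(key)
    if tensor_b is None or tensor_b.shape != tensor_a.shape:
        # Keys only one model has (or mismatched shapes) are copied from A
        merged[key] = tensor_a
    else:
        merged[key] = (1.0 - MULTIPLIER) * tensor_a + MULTIPLIER * tensor_b

save_file(merged, "realistic_cartoony_mix.safetensors")
```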
### Let’s move on to the interface. I will start with the prompts. Write your prompt here and click ‘Generate’ to create an image. Each time we get a different result, because the seed is set to -1, which picks a random seed for every generation. If we change it to 1, for example, we will get the same result every time. We also have a negative prompt section: whatever we write here will not appear in the image. Let’s exclude grass. And a collar. You get the idea.
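If you ever want to reproduce an image from a script, this prompt / negative prompt / seed trio is all you need. A small sketch using the local API again (assuming the WebUI was started with the API enabled; the payload field names are the Automatic1111 ones to the best of my knowledge, so verify them on your install):

```python
# Sketch: fixed seed + negative prompt through the local Automatic1111 API.
# Assumes the WebUI runs at 127.0.0.1:7860 with the API enabled.
import base64
import requests

payload = {
    "prompt": "a dog sitting in a garden",
    "negative_prompt": "grass, collar",  # things we do NOT want in the image
    "seed": 1,        # fixed seed -> same result every time; -1 = random
    "steps": 20,
    "width": 512,
    "height": 512,
}

r = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload, timeout=300)
r.raise_for_status()

# The API returns images as base64-encoded PNGs
with open("seed_1.png", "wb") as f:
    f.write(base64.b64decode(r.json()["images"][0]))
```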
### By the way, this is real time - I haven’t sped this screen recording up. As you can see, the images generate extremely quickly thanks to my new RTX 4090 card. By the way, V-Ray GPU is also crazy fast. You may remember the V-Ray GPU tutorial I posted a while ago; if you want to know how to use it, you will find the link in the description. Here are the render times for the 3 graphics cards I used for testing, plus 2 render times using CPUs. As you can see, GPU rendering is way faster. NVIDIA Studio is not only hardware. Such great results are possible because NVIDIA Studio cooperates with software developers like Autodesk and Chaos to optimize and speed up their software. On top of that, there is the NVIDIA Studio Driver, available to download in the GeForce Experience app. It’s more stable, which is super important. Recently I had issues with my video editing software, which was crashing all the time; the Studio driver fixed the issue immediately. ### Let’s go back to the interface. Here you can open the folder with all the
generated images; they are saved automatically. There are also text files that contain the prompts and all the settings, which is super useful. This option is turned off by default, and I highly recommend enabling it. To do so, go to the settings, scroll down, and check this option. Here are other options to save the files or send them to other tabs. If we click this icon, we can clear the prompts, and with this one we can bring the last used prompt back. Now, let’s say you use some part of the prompt very often - you can save it as a style. Give it a name, and you can reuse it later by clicking this icon; then just add a specific prompt and generate. Next, let’s cover the sampling steps. Basically, this setting controls the quality of the image: more steps mean better quality. It doesn’t make sense to move the slider to the max, though, because the render time gets longer. If we increase the steps by 15 - from 5 to 20 - the difference in quality is huge. If we increase them by another 15, the difference is barely noticeable. The sweet spot is usually between 20 and 40; it doesn’t make sense to go higher, as the difference will be tiny but you will have to wait much longer. Next, the sampling method. We have a lot to choose from. It’s quite complicated, and I haven’t really dived deep into it, as it’s not necessary. I did a test and generated an image using the same prompt and settings with each sampling method. From what I found, most people use this one. Let’s change it and generate - I also think it gives a nicer result.
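To see the diminishing returns for yourself, you can sweep the step count from a script. A sketch using the same local API as before - the field names follow the Automatic1111 API as I understand it, and the sampler name below is just a commonly used example, not necessarily the one from the video:

```python
# Sketch: generate the same seed/prompt at different step counts and time them,
# to see how quality gains flatten out while render time keeps growing.
# Assumes a local Automatic1111 WebUI started with the API enabled.
import base64
import time
import requests

for steps in (5, 20, 35):
    payload = {
        "prompt": "a cozy living room, photorealistic",
        "seed": 1,                          # fixed seed so only the step count changes
        "steps": steps,
        "sampler_name": "DPM++ 2M Karras",  # example sampler; pick any name from your list
        "cfg_scale": 7,
        "width": 512,
        "height": 512,
    }
    start = time.time()
    r = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload, timeout=600)
    r.raise_for_status()
    with open(f"steps_{steps}.png", "wb") as f:
        f.write(base64.b64decode(r.json()["images"][0]))
    print(f"{steps} steps took {time.time() - start:.1f} s")
```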
Unfortunately, we cannot simply generate high-resolution images. If we increase the size here, we will get a messed-up result. That’s because most models have a maximum resolution of 512 or 768 pixels; in this case, Stable Diffusion essentially generates 16 images and tries to stitch them together. With this model, we can do a maximum of 768 pixels. Let me show you how to create a larger image: we have to enable ‘hires fix’. You keep the resolution at 512 and use the ‘upscale by’ option - with a value of 2 we get 1024 px, with a value of 4, 2048 px. Denoising strength controls how similar the larger image is to the original: the lower the value, the more similar the image. We also have to choose the upscaler. I recommend this one; you have to download it though, and the details are in the blog post. For now, let’s use this upscaler. First, let’s use high denoising - you can see the image is very different. Let’s decrease the denoising - now we have a very similar result.
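The same hires-fix settings are exposed through the API, which makes the arithmetic obvious: base resolution times ‘upscale by’ gives the final size (512 × 2 = 1024). A hedged sketch - the field names are my reading of the Automatic1111 API, and the upscaler here is a built-in placeholder, not the one from the blog post:

```python
# Sketch: txt2img with hires fix - 512 px base, upscaled 2x to 1024 px.
# Assumes a local Automatic1111 WebUI started with the API enabled.
import base64
import requests

payload = {
    "prompt": "a modern kitchen interior, photorealistic",
    "seed": 1,
    "steps": 25,
    "width": 512,
    "height": 512,
    "enable_hr": True,            # turn on 'hires fix'
    "hr_scale": 2,                # 'upscale by' 2 -> 512 * 2 = 1024 px
    "hr_upscaler": "Latent",      # a built-in upscaler, used here as a placeholder
    "denoising_strength": 0.4,    # lower = closer to the original 512 px image
}

r = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload, timeout=600)
r.raise_for_status()
with open("hires_1024.png", "wb") as f:
    f.write(base64.b64decode(r.json()["images"][0]))
```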
Next, let’s cover the batch count and size. If we increase the batch count, you can generate 8 images at once; they will be generated one after another. It’s great because you can queue up a lot of images, go grab a coffee, and when it’s done, pick the best result. If you like an image, you can check its seed here to be able to recreate it. Here it is. Batch size does the same thing, but the images are generated at the same time; in my case, this way is quicker.
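In API terms these are two separate fields, which makes the difference easy to see. A short sketch (again, field names per the Automatic1111 API as I know it):

```python
# Sketch: batch count vs batch size through the local API.
#   n_iter     -> 'batch count': how many batches, generated one after another
#   batch_size -> 'batch size' : how many images per batch, generated together
# Assumes a local Automatic1111 WebUI started with the API enabled.
import base64
import requests

payload = {
    "prompt": "a small wooden cabin in a forest",
    "seed": -1,        # random seed so the 8 images differ
    "steps": 20,
    "n_iter": 4,       # 4 batches...
    "batch_size": 2,   # ...of 2 images each = 8 images total
}

r = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload, timeout=900)
r.raise_for_status()
for i, image in enumerate(r.json()["images"]):
    with open(f"batch_{i}.png", "wb") as f:
        f.write(base64.b64decode(image))
```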
Lastly, let’s cover the CFG scale. Higher values make your prompt more important but give you worse quality; lower values give you better quality, but the results will be more random. I think the sweet spot is between 4 and 10.
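If you are curious what the slider actually does: it is the scale in classifier-free guidance. At every sampling step the model predicts the noise twice - once without your prompt and once with it - and the CFG scale decides how hard to push toward the prompted prediction. A conceptual sketch with toy numbers, not the exact code inside the WebUI:

```python
# Conceptual sketch of classifier-free guidance, which is what the CFG scale controls.
def apply_cfg(noise_uncond: float, noise_cond: float, cfg_scale: float) -> float:
    # Push the prediction from the unconditional guess toward the prompted one
    return noise_uncond + cfg_scale * (noise_cond - noise_uncond)

# Toy numbers: a higher scale exaggerates the prompt's influence on the prediction.
print(apply_cfg(0.25, 0.75, 1.0))   # 0.75  -> prompt followed "as is"
print(apply_cfg(0.25, 0.75, 7.0))   # 3.75  -> prompt pushed much harder
print(apply_cfg(0.25, 0.75, 15.0))  # 7.75  -> extreme values start causing artifacts
```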
Now that we have the basics covered, let’s move on to the fun stuff: image to image. We can start in Photoshop. Here I want to improve the 3D people, as you can clearly see they are not real. As you remember, the maximum resolution we can generate is 768 px, so I will crop my image to this size and save it. In Stable Diffusion, let’s go to the image-to-image tab and choose the ‘inpaint’ option. Drag and drop the image into the editor. Here we can turn on the brush and paint over the areas we want to regenerate. Let’s write a prompt. I will set the size to the maximum of 768 by 768. Also, I will set this option to ‘mask only’; this way the quality will be better, as the masked area alone will be generated at 768 px and then shrunk down to fit the image. Let’s generate the image. The result doesn’t match our scene because the denoising value is quite high, so let’s lower it. I will generate a few more images and then choose the best one. Let’s place the best image in Photoshop. It’s way better, isn’t it? Especially the hair and towels. I will mask the people to make sure the transition is seamless, and then we can uncrop the image. Now we have the best of both worlds: the ease of use of 3D people and realistic results.
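This inpainting workflow can also be scripted, which is handy if you want to batch-fix many renders. A hedged sketch of the img2img endpoint - the field names, especially the ‘mask only’ flag, are my reading of the Automatic1111 API, so verify them; the image and mask files are placeholders:

```python
# Sketch: inpaint a cropped 768x768 render through the local Automatic1111 API.
# 'render_crop.png' and 'people_mask.png' are hypothetical files: the mask is
# white where new content should be generated and black elsewhere.
# Assumes the WebUI runs locally with the API enabled.
import base64
import requests

def to_b64(path: str) -> str:
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")

payload = {
    "init_images": [to_b64("render_crop.png")],
    "mask": to_b64("people_mask.png"),
    "prompt": "two people relaxing by a pool, photorealistic",
    "denoising_strength": 0.4,   # low enough to keep poses and lighting
    "inpaint_full_res": True,    # 'mask only': generate just the masked area at full size
    "width": 768,
    "height": 768,
    "steps": 25,
}

r = requests.post("http://127.0.0.1:7860/sdapi/v1/img2img", json=payload, timeout=600)
r.raise_for_status()
with open("render_inpainted.png", "wb") as f:
    f.write(base64.b64decode(r.json()["images"][0]))
```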
Let me show you another example, with greenery. We can also place a large render directly here, without cropping. This is an image I did around 4-5 years ago; let’s improve the greenery. Let’s paint over this tree, and I will add a prompt. I will also increase the size here. By the way, this image is 5K pixels horizontally. Keep in mind that the generated area is only that small; if you paint a larger area, the quality will be lower. Let’s decrease the denoising. I will show you why you don’t want to use the ‘whole picture’ option here: with it, the whole image is treated as 768 pixels. Let’s change it to ‘mask only’. Now we have a 5K image, and the generated part alone is 768 px. I found that the default sampling method works better here. With this one, the tree is too similar - there is almost no difference - so I will change it back. Now the best part: we can clear the mask and drag and drop the generated image back into the inpaint editor. Let’s paint another area and generate again. You can repeat this process. Here is my result after spending around 10 minutes on this image. I love it. The result is way more natural, soft, and photorealistic, while the shapes are almost exactly the same. The difference in the foreground trees is huge; the model wasn’t really good there.
I hope you found this video useful and that I saved you some time researching all of this stuff. If you want to learn all about architectural visualizations in 3ds Max, check out my courses. Also, here are some videos on the same topic that you might find interesting. Bye, bye.