ComfyUI: Getting started for Stable Diffusion AI Generation for Design and Architecture (Part I)

Captions
ComfyUI is an amazingly flexible node-based interface for creating complex Stable Diffusion workflows without needing to write any code. It can be used for a range of image generation workflows, along with video and animation generation, to create a variety of designs. Its flexibility comes from the variety of custom nodes that allow for unique workflows and greater control over your designs. It is a node-based variation of the well-known Automatic1111 interface, and for designers out there familiar with visual scripting it may remind you of Grasshopper or Blueprint workflows. However, even if you're not familiar with those, ComfyUI is easy to get to grips with and gives you fine control over your AI generations, and for this reason I'll be creating a series of videos covering key aspects of using it for architecture and design.

First, to cover installation, there are three main ways to do this. If you don't have a computer with high specs you can use a paid cloud service. This is an easy, fast alternative, as you can access pre-installed models and extensions; RunDiffusion is one of these services, where you can select different environments such as the classic Automatic1111 or the ComfyUI environment I'll go through today. The second method, which I use and recommend, is to install locally using the free app Pinokio, a browser that lets you install, run, and automate many open-source AI applications and models quickly and without any technical knowledge. I did a previous video on this, so please take a look. The third is also a free method but requires some technical knowledge: installing manually. This gives you the greatest control and lets you understand more of the process and what is happening. I'll briefly go through it in case you want to try it out for yourself; otherwise you can skip ahead to the ComfyUI walkthrough.

For the manual install, we will start by going to the ComfyUI page on GitHub; you can find the link below in the description. If you scroll down you'll find the "Installing ComfyUI" section with a direct link to the download. Click this and you'll get a zip file, and once downloaded you can extract it and run ComfyUI from within this folder. I'll also mention that if you're migrating from Automatic1111 you can configure the paths to link to your previous models so you don't need to download them all again. You can do this by going into the ComfyUI_windows_portable folder and finding the extra_model_paths.yaml.example file: delete the ".example" from the file name, open it up in Notepad, and configure the base_path so that it points to your stable-diffusion-webui folder, then save and close. If you don't have any models downloaded you can just ignore this step.
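If you do link an existing Automatic1111 install this way, a quick way to confirm the edit points at the right place is a small script like the one below. This is a hypothetical helper of my own, not part of ComfyUI, and it assumes the "a111" / "base_path" key names used by the stock extra_model_paths.yaml.example; check your own copy if the names differ.

```python
# Optional sanity check for the extra_model_paths.yaml edit described above.
# Assumes the stock "a111" section with a "base_path" key (names may differ
# between ComfyUI versions). Requires PyYAML: pip install pyyaml
from pathlib import Path
import yaml

def list_linked_checkpoints(yaml_path="extra_model_paths.yaml"):
    config = yaml.safe_load(Path(yaml_path).read_text())
    base = Path(config["a111"]["base_path"])          # your stable-diffusion-webui folder
    ckpt_dir = base / "models" / "Stable-diffusion"   # where Automatic1111 keeps checkpoints
    if not ckpt_dir.is_dir():
        raise FileNotFoundError(f"No checkpoint folder at {ckpt_dir} - check base_path")
    # A1111 checkpoints are usually .safetensors or .ckpt files
    files = list(ckpt_dir.glob("*.safetensors")) + list(ckpt_dir.glob("*.ckpt"))
    return sorted(p.name for p in files)

print(list_linked_checkpoints())  # should list the models ComfyUI will now pick up
```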
Then, if you go back, you can run one of the batch files to open ComfyUI; I'll be using run_nvidia_gpu.bat. You will then have a tab open up with the default ComfyUI interface.

Before I go into detail with this, I'll show you how to install the ComfyUI Manager, as it's quite useful to have installed. For this you need to go to its GitHub page, with the link below in the description. There's some info there about how to install it, but the easiest way is through Git, which lets you use command prompts to directly download and install software or models. Click Git for Windows, then download and install it for your operating system. You then just need to copy the git clone command, go over to the custom_nodes folder inside your ComfyUI folder, click in the address bar, type cmd, and a command prompt will open. Paste in the command and you can see it cloning the ComfyUI Manager into your ComfyUI folder. You then just need to restart ComfyUI, and at the bottom right you will now have the Manager installed. This is very useful for updating all of your models and installing custom nodes, and you'll be using it a lot in the future.

Now to the main interface. As expected, you have these nodes, which have a defined functionality, and they connect to other nodes through wires that you can drag and attach to the matching sockets. They're color-coded and labeled to help you match them up, and you can create new nodes by double-clicking and searching by name. The flow of the graph goes from left all the way to the right, and to run it you can hit Ctrl+Enter or press the Queue Prompt button. To get a better understanding of how Stable Diffusion works, you could take a look at the science behind it; the easy way to do this is the Stable Diffusion Wikipedia page, where you can see the denoising process used to generate images by iteratively denoising random noise, guided by the pre-trained text encoder. This is what will be happening here within the latent space, so feel free to read through it.

Now that you have an overview of the workspace, I'll go over the default setup, as it gives you the primary nodes you'll be using most often. The first one is Load Checkpoint. This is where you select your trained model, which will greatly affect your generated images; once you have models loaded in, you can select them from the dropdown. A great place to get these models is Civitai, and you can have fun exploring the creative models out there. A reliable model for creating realistic architecture images, which I often use, is called Realistic Vision. If you click it, the best approach is to read the description and the "show more" section to see how it works and which options to use, for example what kind of prompts to use for the positive and negative boxes as a base, plus some of the settings we can use later on. You can download these models from the page and then paste them into the models folder in your ComfyUI directory: go to your ComfyUI folder, then models, and you'll find the checkpoints folder. You can see I've already pasted in some checkpoints such as Realistic Vision, so feel free to download as many as you want, put them here, and switch them out; as you can see they're very large files, several gigabytes in size. Once they're downloaded and you restart, you can select them in the Load Checkpoint node, so I'll choose Realistic Vision.

Next you have the two prompt boxes, which are the CLIP Text Encode nodes; CLIP is the text encoder that turns your prompts into the conditioning the sampler follows, and these connect to the positive and negative inputs on the KSampler. As a starting point for the prompts, it's a good idea to look at the model you're using; for example, on the Civitai page the author recommends a particular prompt, so you can simply copy and paste it in, and the same for the negative one. However, since we're generating architecture rather than people, you can get rid of a lot of the terms such as "deformed iris", "pupils", "hands" and so on. Then you can add in the description of the space you want to create; for example, for the positive prompt we could write "a modern urban shopping center during the day". Normally the negative prompt is the opposite of what you want.
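It is worth knowing how the sampler actually uses these two boxes. At every denoising step the model makes one noise prediction guided by the positive prompt and one guided by the negative prompt, then combines them using the CFG scale we'll meet in the KSampler below; this mechanism is called classifier-free guidance. The function below is a minimal conceptual sketch of that combination, not ComfyUI's actual code.

```python
# Conceptual sketch of classifier-free guidance (not ComfyUI's internal code):
# the negative prompt acts as the baseline that the positive prompt is pushed away from.
def guided_noise_prediction(noise_pos, noise_neg, cfg_scale):
    # noise_pos / noise_neg: the model's noise estimates for the current step,
    # conditioned on the positive and negative prompt respectively.
    # Higher cfg_scale pushes the result harder toward the positive prompt.
    return noise_neg + cfg_scale * (noise_pos - noise_neg)

# Toy numbers just to show the behavior (real inputs are latent-sized tensors):
print(guided_noise_prediction(0.8, 0.2, 1.0))  # 0.8 -> simply follows the positive prompt
print(guided_noise_prediction(0.8, 0.2, 7.0))  # 4.4 -> amplifies the difference between the two
```

This is also why very high CFG values tend to give harsh, over-saturated results: the gap between the two predictions gets amplified.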
You can also see that there are some brackets in the prompt, which add emphasis, or weightings, to certain words. For example, if you want it to be daytime, you can put "daytime" in brackets followed by a colon and a value like 1.5, i.e. (daytime:1.5), and this will add emphasis to the daytime part of the prompt.

Next is the Empty Latent Image. Technically speaking, the latent space captures the underlying structure, or hidden relationships, within the data in a lower-dimensional space to allow for the creation of a new image. That's quite a lot to take in, but essentially this is just the starting point for your image generation: if you want to start with a photo you could put it here, otherwise you can leave it empty and just set the resolution that you want. Ideally you don't want to go above 1024 on either of the dimensions, and you can always upscale the image later in your workflow, which I'll go through in a future video.

Then we can go on to the core sampling settings over here in the KSampler. At the moment I have the seed control set to randomize, so each generation is something different, but you can always fix this and set a seed. For the steps it's recommended to have between 20 and 40, so I'll stick with 30; this also affects the time it takes to generate. CFG stands for classifier-free guidance, and the CFG scale controls how closely the result matches the input prompts. The sampler and the scheduler have a range of different options which play a pivotal role in the image generation; there's a lot of literature out there on which give the best outputs, and it's a good idea to look back at the model you're using and its recommendations. I often use dpmpp_3m_sde for the sampler name (these are generally quite good, or you can stick with the GPU variant) and Karras for the scheduler. The denoise value sets the balance between detail preservation and noise reduction in the image: a denoising strength of zero adds zero noise, so your output will look exactly like your input (if you remember the denoising diagram back on Wikipedia, this is essentially what is happening here), while the extreme value of one will completely replace your input with noise, so the input image will have no effect. Since we start with no image, one will be fine for the moment, and you'll get a better understanding of this as we go through examples.

You'll also see that we have a VAE Decode node over here, with the VAE coming from the checkpoint we downloaded at the start. The VAE converts between bitmap images and the latent space that is denoised for the image generation within the sampler: we need a VAE Decode at the end because inside the latent space there are no images we can directly understand, and any image we feed in likewise has to be encoded into a latent first.

To see this in action you can hit Ctrl+Enter or Queue Prompt, watch the flow of the process, and the final image will appear over here. There we go. It's quite useful to have this preview of the latent space during sampling; if you want to see it, go over to the Manager and, under Preview method, switch it to Latent2RGB (by default it's on Auto, which doesn't always show it).

If you want to use a base image, you can delete the Empty Latent Image, double-click, and type Load Image, then choose a file to upload. We then need to encode this image so it can be used in the KSampler: drag out from the IMAGE output and you can actually see the suggestions, for example VAE Encode. Then we can match the pink LATENT output with the latent_image input here, and you'll see there's a VAE input missing, so you match that up too.
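Because we're now feeding in a real photo, the denoise value is about to matter. A rough way to think about it: the encoded input latent is pushed part-way back into noise, and only the remaining portion of the sampling schedule is run, so low values stay close to the input and 1.0 ignores it entirely. The snippet below is a rough conceptual sketch of that relationship, not ComfyUI's exact scheduler math.

```python
# Rough conceptual sketch (not ComfyUI's exact scheduler code) of the img2img
# "denoise" setting: it decides how much of the sampling schedule actually runs
# on top of your encoded input image.
def effective_steps(total_steps, denoise):
    # denoise = 0.0 -> no steps run, output matches the input
    # denoise = 1.0 -> all steps run from pure noise, the input has no effect
    return round(total_steps * denoise)

print(effective_steps(30, 1.0))  # 30 - behaves like text-to-image
print(effective_steps(30, 0.7))  # 21 - keeps the rough layout of the input photo
```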
That's quite intuitive. Then we can hit Queue Prompt, and you can see from the generation that you have your shopping center here, but it doesn't exactly resemble the input image, as the denoise is quite high. You can decrease this to something like 0.7 and run it again, and now, as you can see in the latent space preview, it is following the input image much more closely. So you can play around with the denoise, and sometimes the CFG, along with your prompts; you can see we had this emphasis on daytime and it is indeed during the day.

This just covers the basics of the ComfyUI workspace to get acquainted with the new interface. However, we want even more control using other models and extensions such as ControlNet, and ComfyUI takes full advantage of that kind of workflow, especially since this tool offers much more customizability compared to Midjourney. I'll go through more of these steps in the next video, so I hope to see you there.
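A postscript to the "play around with the denoise and CFG" suggestion: ComfyUI also exposes a small HTTP API on the port the interface runs on (8188 by default), so once you're comfortable with the graph you can queue variations from a script instead of clicking. The sketch below is an illustration rather than something from the video: it assumes you've saved the workflow in API format (enable the developer options in ComfyUI's settings to get the "Save (API Format)" button) and that you replace the placeholder KSAMPLER_ID with whatever node id your exported JSON actually uses.

```python
# Hedged sketch: queue a few denoise variations of an exported workflow through
# ComfyUI's local HTTP API. Assumes a running local instance, a workflow saved
# in API format as workflow_api.json, and a KSampler node id looked up in that file.
import json
import urllib.request

KSAMPLER_ID = "3"  # placeholder - open workflow_api.json and find your KSampler's id

def queue(workflow, host="127.0.0.1:8188"):
    data = json.dumps({"prompt": workflow}).encode("utf-8")
    req = urllib.request.Request(f"http://{host}/prompt", data=data)
    return json.loads(urllib.request.urlopen(req).read())

with open("workflow_api.json") as f:
    workflow = json.load(f)

for denoise in (0.5, 0.7, 0.9):
    workflow[KSAMPLER_ID]["inputs"]["denoise"] = denoise
    print(denoise, queue(workflow))  # each call queues one generation in the UI's queue
```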
Info
Channel: Urban Decoders
Views: 2,951
Id: me30yQnh5jU
Length: 14min 57sec (897 seconds)
Published: Sun Feb 11 2024