Image Generation Basics with ComfyUI! (Follow Along in ComfyUI)

Captions
Hello, this is Neural Ninja. In this video I have prepared an introduction to the basics of ComfyUI. ComfyUI's strength is that you can build and share all kinds of functionality by combining nodes. Because the nodes and settings are laid out so explicitly, it is also an easy way to understand Stable Diffusion itself. I will explain the process of creating an image and how each setting shapes the result.

ComfyUI is divided into minimal functional nodes, so features that other tools split into separate UIs can be a single node here, such as the one node that loads an image. Once you understand the basic generation flow in ComfyUI, the generation process behind the remaining functions is easy to follow.

Stable Diffusion creates a specific image by gradually removing noise from a noisy image. This is the initial noise image we are given; from here, a specific image emerges as the noise is removed step by step. The component that removes noise like this is the sampler, and in ComfyUI the KSampler node plays that role.

Because the image is created by removing noise little by little from a noisy image, a completely different image results depending on that noise. What determines the noise is the seed value: noise is generated from the seed, and the same seed always produces the same noise, so you get the same kind of image.

The sampler offers several methods for removing noise. Euler, for example, is fast but slightly lower in quality, while DPM++ is slower but higher in quality, and there are many other sampler types. There is also a scheduler, which determines where the sampler focuses and how much noise it removes as it works through the steps. The sampler and scheduler are usually chosen as a pair, and the combination affects both the quality and the speed of the image. In general, use Euler + normal for quick generation; for detailed work I use DPM++ + Karras.

Steps determines how many times noise removal is performed: 25 steps means removing noise 25 times. If it is too small, too little noise is removed, producing an unclear image; if it is too large, denoising continues after the image is already complete, which is unnecessary work.

While the sampler removes noise to create an image, it relies on weight values that steer it toward a specific image. The collection of those values is the checkpoint model. A checkpoint is created from a huge number of images through a process called training, and it holds only the weights that guide generation. There are many different checkpoint models, and the type of checkpoint determines the style of the resulting image: use a checkpoint trained with weights for anime images and the denoising process produces an anime image; use one trained with weights for photorealistic portraits and it produces a photorealistic portrait. Which checkpoint model you use makes the biggest difference when creating images. Because the values are distilled from an enormous number of images, the files are also very large.

One thing to keep in mind when choosing a checkpoint is the base model version. This version determines the generation size and the prompt style, and any models you add later must be the same version. The two most commonly used base models today are SD 1.5 and SDXL.

KSampler is meant to receive the checkpoint model as input from another node.
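To make the denoising loop and the role of the seed concrete before moving on to the nodes, here is a minimal conceptual sketch in plain Python. It is illustrative only: fake_denoise is a hypothetical stand-in for the real denoising network, not anything from ComfyUI, and the point is simply that the same seed produces the same starting noise and therefore the same result.

```python
import numpy as np

def fake_denoise(x, step, total_steps):
    # Hypothetical stand-in for the real denoising model: it just pulls
    # the noisy values toward a fixed "target image" a little each step.
    target = np.full_like(x, 0.5)
    strength = 1.0 / (total_steps - step + 1)  # later steps remove more noise
    return x + (target - x) * strength

def generate(seed, steps=25, shape=(64, 64)):
    rng = np.random.default_rng(seed)   # the seed fully determines the noise
    x = rng.standard_normal(shape)      # start from a pure-noise "image"
    for step in range(steps):           # 25 steps = remove noise 25 times
        x = fake_denoise(x, step, steps)
    return x

a = generate(seed=42)
b = generate(seed=42)
c = generate(seed=7)
print(np.allclose(a, b))  # True: same seed -> same noise -> same image
print(np.allclose(a, c))  # False: different seed -> different image
```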
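In ComfyUI's saved workflow (API) format, a node boils down to its class name plus its input values, so the KSampler settings above map onto a small dictionary. Below is a sketch of the two sampler/scheduler presets mentioned in the video as Python dicts in that shape; the link values like ["4", 0] and the node IDs are placeholders for whatever nodes supply the model, conditioning, and latent.

```python
# KSampler in ComfyUI API-workflow form. Values like ["4", 0] are links,
# meaning (source node id, output slot); the IDs here are placeholders.
fast_preset = {
    "class_type": "KSampler",
    "inputs": {
        "seed": 123456789,        # same seed -> same starting noise
        "steps": 25,              # number of denoising passes
        "cfg": 7.0,               # prompt strength; 7-8 is a common default
        "sampler_name": "euler",  # fast, slightly lower quality
        "scheduler": "normal",
        "denoise": 1.0,           # 1.0 = start from pure noise (txt2img)
        "model": ["4", 0],        # checkpoint model from a loader node
        "positive": ["6", 0],     # positive conditioning
        "negative": ["7", 0],     # negative conditioning
        "latent_image": ["5", 0], # empty latent image to denoise
    },
}

# The slower, more detailed combination mentioned in the video.
detail_preset = dict(fast_preset, inputs={
    **fast_preset["inputs"],
    "sampler_name": "dpmpp_2m",   # DPM++ (2M): slower, better quality
    "scheduler": "karras",
})
```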
In ComfyUI there are two kinds of input values: values set directly inside a node, like this, and values received from other nodes. You select the checkpoint model in the Checkpoint Loader node, and by connecting the nodes you feed the checkpoint model into KSampler.

Now for the final and perhaps most important setting: the prompt. The prompt is entered as text, and when the image is generated, it is created to match this prompt. Two kinds of prompt can be entered: a positive prompt and a negative prompt. Positive and negative literally refer to elements that should be reflected in the image and elements that should be left out. A typical example:

positive: masterpiece, high quality, high resolution, beautiful girl sitting on bench outdoors
negative: worst quality, low color, black and white

If you enter prompts like these, the image is created while the noise is removed with these prompt conditions as a reference. The positive and negative prompts are received through KSampler's positive and negative inputs. They are passed in the form of conditioning, which is data holding the prompt text converted into values the sampler can handle. The node that converts text into those values and wraps them in conditioning is the CLIP Text Encode node: enter a prompt there and you can feed the resulting conditioning into KSampler. Converting the prompt text requires a CLIP model; the CLIP model is built into the checkpoint, so it can be obtained from the Checkpoint Loader. Even with the same noise from the same seed, the image is created by removing that noise according to this prompt.

The CFG value indicates how strongly the entered prompt is applied: the higher it is, the more strongly it is applied. Around 7 or 8 is usually set as the default. If it is too high, the image becomes distorted; if it is too low, the image becomes vague.

Now we provide an empty image and set the size of the image to be created, like this. The currently selected checkpoint is an SD 1.5 base model. For SD 1.5, the optimal image size is 512 pixels wide and tall; generally one side can be extended to 768 pixels, so rectangular sizes such as 512 wide by 768 tall are common. For SDXL, the base size is 1024 by 1024 pixels, and rectangular shapes are used in more varied ways.

In the sampler, images are handled in the form of latent images. Unlike an ordinary image with RGB values, you can think of a latent image as the image processed into a form the sampler works with well: each 8x8 block of pixels is expressed as 4 real (float) numbers. To convert a latent image into a regular image, you need a VAE. In ComfyUI, the VAE Decode node converts a latent image into a regular image. The VAE used by the decoder can be loaded with a VAE Loader; depending on the checkpoint model, a VAE may also be built in, in which case the model inside the checkpoint can be used. Once it is converted into a regular image through this node, you can save or view it.

We have taken an overall look at the settings used in Stable Diffusion along with the basic workflow of ComfyUI. In the next video, we will look at image-to-image, which has an almost identical structure, along with upscaling and more ComfyUI basics. I hope the video helps. Thank you.
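Since each 8x8 pixel block becomes 4 floats, the latent the sampler actually works on is much smaller than the final image. A quick arithmetic sketch of that mapping (4 channels at 1/8 resolution, the usual SD latent layout):

```python
def latent_shape(width, height, channels=4, downscale=8):
    # SD-style latent: 4 float channels at 1/8 the pixel resolution,
    # i.e. each 8x8 pixel block is expressed as 4 floats.
    return (channels, height // downscale, width // downscale)

print(latent_shape(512, 768))    # (4, 96, 64)   for an SD 1.5 portrait
print(latent_shape(1024, 1024))  # (4, 128, 128) for SDXL
```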
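Putting the whole walkthrough together, the graph from the video (Checkpoint Loader, two CLIP Text Encode nodes, Empty Latent Image, KSampler, VAE Decode, Save Image) can be submitted as JSON to a running ComfyUI server's /prompt endpoint. This is a sketch assuming a local server on the default port 8188 and a checkpoint file named sd15_model.safetensors; both are placeholders for your own setup.

```python
import json
import random
import urllib.request

# Each key is a node id; links are [source node id, output slot].
workflow = {
    "1": {"class_type": "CheckpointLoaderSimple",          # Checkpoint Loader
          "inputs": {"ckpt_name": "sd15_model.safetensors"}},  # placeholder
    "2": {"class_type": "CLIPTextEncode",                  # positive prompt
          "inputs": {"clip": ["1", 1],
                     "text": "masterpiece, high quality, high resolution, "
                             "beautiful girl sitting on bench outdoors"}},
    "3": {"class_type": "CLIPTextEncode",                  # negative prompt
          "inputs": {"clip": ["1", 1],
                     "text": "worst quality, low color, black and white"}},
    "4": {"class_type": "EmptyLatentImage",                # SD 1.5 portrait size
          "inputs": {"width": 512, "height": 768, "batch_size": 1}},
    "5": {"class_type": "KSampler",
          "inputs": {"model": ["1", 0], "positive": ["2", 0],
                     "negative": ["3", 0], "latent_image": ["4", 0],
                     "seed": random.randint(0, 2**32 - 1),
                     "steps": 25, "cfg": 7.0,
                     "sampler_name": "euler", "scheduler": "normal",
                     "denoise": 1.0}},
    "6": {"class_type": "VAEDecode",                       # latent -> RGB image
          "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
    "7": {"class_type": "SaveImage",
          "inputs": {"images": ["6", 0], "filename_prefix": "basics"}},
}

req = urllib.request.Request(
    "http://127.0.0.1:8188/prompt",                        # default local server
    data=json.dumps({"prompt": workflow}).encode(),
    headers={"Content-Type": "application/json"},
)
print(urllib.request.urlopen(req).read().decode())         # queues the job
```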
Info
Channel: 뉴럴닌자 - AI공부 (Neural Ninja - AI Study)
Views: 951
Id: P2UKCgDA_0o
Length: 8min 37sec (517 seconds)
Published: Mon Mar 25 2024