These Photos Are Not Composites. The Concept and Usage of Google's Insane DreamBooth Technology

Captions
Hello, this is JoCoding. The pictures you're seeing now look like selfies I took with celebrities, or like I took a photo of my face and edited it in. But they are not composites. The picture of me in a military uniform, the picture of me in a doctor's gown, the picture of me in a police uniform: I never took these, and I didn't composite my face or the clothes either. This image, where my face looks like a character in a 3D animation, wasn't made by a 3D modeler. Everything you see here was generated by artificial intelligence, based on Stable Diffusion. Isn't it amazing? So how did it capture my face so accurately? That's what we'll learn in this video. If you watch to the end, you'll be able to make images like these. The technology we're going to learn is called DreamBooth. Let me walk you through the table of contents: first I'll briefly introduce what DreamBooth is, then we'll prepare the data to train on, and then we'll fine-tune a model in a hands-on session.

First, a brief introduction to DreamBooth. DreamBooth, released by Google Research, is a method for fine-tuning an AI model to suit a specific purpose. For example, take my dog Nureong: if you additionally train a model on a few pictures of Nureong, the newly created model can produce images that never existed anywhere in the world, like Nureong at the Acropolis, Nureong swimming, or Nureong sleeping. It's a fine-tuning method for subject-driven generation.

Let's take a quick look at how this is possible. First, you prepare images of the subject you want the model to learn. Together with a class name, the broader concept the subject belongs to, you fine-tune the existing model the DreamBooth way. The result is a new text-to-image model tied to a word that represents your subject. If you then complete a prompt by combining the subject's identifier, written here as [V], with the class name, the new model renders the subject bound to [V] faithfully and generates an image expressing the prompt.

One thing to be careful about: [V] must not be a word that really exists in English. For example, when training Nureong, if you used a real word like "yellow" as the identifier, it would get mixed up with the existing model's concept of yellow: Nureong might come out tinted yellow, and it could also affect the generation of other yellow things. So the word used for [V] should be completely meaningless, something like "EFZLA". With a nonsense word like that, Nureong gets learned correctly.

Also, you don't train on your subject alone; there's the class name as well. The reason is that if you train only on the images of your subject, the whole class drifts toward it, and every dog starts coming out looking like Nureong. This is called overfitting, or language drift. To prevent it, images generated for the broader concept, the class, are processed together with the Nureong images and share the weights, so the model keeps its general idea of dogs and learns that Nureong is simply one of those dogs. In the hands-on part later, these class images are called regularization images; that's why they're needed, so please keep it in mind.
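To make the idea concrete, here is a minimal sketch in Python of how the two prompts and the prior-preservation objective fit together. This is not the Colab's actual code: the tensors are toy stand-ins for the model's noise predictions, and the prior weight of 1.0 is just an illustrative default.

```python
import torch
import torch.nn.functional as F

# The subject identifier [V] must be a meaningless word, e.g. "EFZLA".
instance_prompt = "a photo of EFZLA dog"  # captions the subject photos
class_prompt = "a photo of dog"           # captions the regularization images

def denoising_loss(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    # The standard diffusion objective: MSE between predicted and actual noise.
    return F.mse_loss(pred, target)

# Toy tensors standing in for noise predictions on each batch.
inst_pred, inst_target = torch.randn(1, 4, 64, 64), torch.randn(1, 4, 64, 64)
cls_pred, cls_target = torch.randn(1, 4, 64, 64), torch.randn(1, 4, 64, 64)

prior_weight = 1.0  # how strongly the class images pull against overfitting
loss = denoising_loss(inst_pred, inst_target) \
     + prior_weight * denoising_loss(cls_pred, cls_target)
print(instance_prompt, "|", class_prompt, "|", loss.item())
```

The second loss term plays exactly the regularization role described above: it keeps the class concept anchored while the first term learns the subject.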
Next, let's prepare the training data for DreamBooth. I'm going to train on my own face, so I've prepared 20 pictures of myself. Here's a tip for preparing pictures: if they all have the same background and the same composition, the model has a hard time learning. So prepare varied pictures, with different accessories, different backgrounds, and different facial expressions; that way you get a bit more variety when picking out results later. Also, the images all come in different sizes, and it's best to standardize them to 512 x 512. If you want to crop pictures all at once, there are sites like this that bulk-resize images, which is convenient. I'll leave the address in the comments. Put the pictures you've prepared in here and they're uploaded to the site. On the right, set the width and height to 512, choose how to handle the parts that don't fit, and apply the settings to everything. Scroll down, unify the image format as JPEG, and set the quality to 100%. Press "Save as ZIP" and a compressed file downloads; extract it, and you can see the images have all been cropped to the same size.
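Incidentally, if you'd rather do the cropping locally instead of going through a website, a few lines of Pillow replicate the same job. This is a minimal sketch with hypothetical folder names: it center-crops each photo to a square and scales it to exactly 512 x 512 at 100% JPEG quality.

```python
from pathlib import Path
from PIL import Image, ImageOps

SRC = Path("raw_photos")  # hypothetical input folder
DST = Path("resized")     # hypothetical output folder
DST.mkdir(exist_ok=True)

for i, path in enumerate(sorted(SRC.glob("*")), start=1):
    img = Image.open(path).convert("RGB")
    # Center-crop to a square, then scale to 512 x 512.
    img = ImageOps.fit(img, (512, 512), Image.LANCZOS)
    img.save(DST / f"{i}.jpg", "JPEG", quality=100)
```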
Now that the data is ready, let's try it ourselves. The DreamBooth paper released by Google was written based on Google's own AI model, not Stable Diffusion, so there wasn't much code that applied directly. But many people read the paper, implemented it for Stable Diffusion, and put it on GitHub. As that happened, people studied each other's repositories, optimizing and modifying the code, so there are now many repositories implementing ways to apply DreamBooth to Stable Diffusion. None of these is an official repository; there are simply many options. I'll leave all of these GitHub links in the comments. Among them, I've selected the one that's easiest to use and that you can follow without your own graphics card: a Colab-based repository made by TheLastBen. Let's run the Colab version of DreamBooth fine-tuning; I'll leave this address in the comments too.

Come here, log in to Google, and press the "Copy to Drive" button; the Colab notebook is copied to your own Google Drive. That's how this repository is set up, so let's run it here. First, the runtime: open "Change runtime type", check that it's set to GPU, and save. Then run the cells one by one. The first cell is the code that connects to Google Drive. As I mentioned in the last video, it connects to the Drive of the Google account you're signed into, and an account holding important data can be risky, so when connecting to Drive I recommend using an empty account. Run the cell, allow the Google Drive access prompt, and Colab and Google Drive are connected. The next cell sets up the environment. Just press the run button and the environment is configured automatically; press "Show code" if you want to check the details. Through these commands it downloads and installs the required files, and if you click the folder icon you can see everything downloaded here. When this completes, run the next cell too.

Next, you need to download the model you'll fine-tune. This works the same as in the last video on the Colab version of the Stable Diffusion web UI: copy the token you received from Hugging Face and paste it here, and by default it downloads the Stable Diffusion 1.5 model, which is what we'll train on. Please refer to the last video for this part. Alternatively, if you want to download a different model from Hugging Face, copy the name of the repository you want and paste it here, and it will download that model instead. Also, as I explained in the last video, if you have your own model file, you can enter its path here, or enter a Google Drive link. And when a model has compatibility problems, checking this box runs code that converts a non-compatible Stable Diffusion model into the diffusers format, so models that don't apply well to DreamBooth can be made compatible through this checkbox. I'll put in my Hugging Face token and fine-tune the most basic model, Stable Diffusion 1.5. Press the run button and the model downloads automatically from Hugging Face. Once you've downloaded the model this first time, you can skip this step on later runs.

Next comes the session. Because the DreamBooth process can train for a long time, you can save a session so that if it gets cut off in the middle you can continue training. I'll write the session name like this; it's the name used when saving and loading, and you can choose it freely. The second field is for when you continue fine-tuning later: you can put the session link here. It's my first run, so I'll leave it empty. Next is "Contains faces". This concerns the images used for regularization, which this repository provides in advance. If you select "No", it proceeds without regularization images; if you select "Female", "Male", or "Both", it fetches the corresponding saved images. I'll use regularization images, so I'll select "Male" and run it.

Before we move on, we have to go through the most important step: renaming the images that will be used for DreamBooth fine-tuning. If you look at the example, the file names are meaningless, made-up words followed by numbers, and we need to prepare our images the same way. So let's rename the images we prepared earlier. To rename many images at once, select them all and press F2, and a rename box appears. The name has to be a meaningless prompt word; I'll use "ytjcd", an abbreviation of YouTube JoCoding, and set it like this. Just type it and press Enter, and all 20 images are renamed at once, numbered in order from 1 to 20. Get them ready like this.
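If a batch rename is awkward where your files live, the same renaming can be scripted. A minimal sketch, assuming the cropped images sit in a hypothetical folder named "resized"; it mirrors the Windows batch-rename pattern, producing "ytjcd (1).jpg", "ytjcd (2).jpg", and so on.

```python
from pathlib import Path

folder = Path("resized")  # hypothetical folder holding the 512 x 512 images
token = "ytjcd"           # the meaningless instance word chosen above

for i, path in enumerate(sorted(folder.glob("*.jpg")), start=1):
    # Rename to "ytjcd (1).jpg", "ytjcd (2).jpg", ... in sorted order.
    path.rename(path.with_name(f"{token} ({i}){path.suffix}"))
```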
I'll upload the images in the next step, the instance images cell. The first option: if images have already been uploaded, checking this deletes the existing ones and uploads again. The second option lets you pull images from a Google Drive link if you put one in here. By default, you can just run the cell and upload directly. As for the crop option: we've already cut the images to size 512, so I'll uncheck it and run right away. An upload widget appears; press "Choose files", select the images, and open them. You can see the images upload successfully.

Next, training. If you ever have to redo training, check this option and run again, and it starts over. The second setting, training steps, defaults to 3,000. A good rule of thumb is to multiply the number of training images by 200; I've prepared 20 images, so I'll run 4,000 steps. In truth, the right step count depends on what you're training, so experiment: if you don't like the result, I recommend raising it in increments of 500.

The seed value is the same kind of seed you use when creating an image in the WebUI. If you leave it empty, one is chosen randomly; if you want a specific seed, put it in here. The resolution: we cropped to 512 x 512, so I'll set it to 512. The higher the resolution, the more likely memory issues become, and if you check this next option it runs slower but uses memory more efficiently; since memory issues are possible, I'll check it. And the checkbox after that makes training go faster: if you uncheck it, training takes twice as long and the model file comes out at 4.7 GB. Next, you can set the text encoder training percentage. If you lower this percentage, style transfer works better, but it requires more training steps. If you set it high, the model sticks more closely to the training images and produces stronger results in fewer steps, but it becomes hard to stylize. I'll set it to about 80.

Next, "Save checkpoint every n steps". This saves a checkpoint, meaning a model file, every so many steps. For example, with this checked at 500, since we're running 4,000 steps in total, it saves the model every 500 steps before reaching the end. These files are pretty large, and Google Drive fills up fast. If you have enough storage, checking it and collecting the intermediate models is worthwhile: even if you set the step count too high, you can fall back to an earlier model. I'm going to uncheck it to save space. I'll set everything up like this and run it.

Training then proceeds, and when it's complete, the ckpt model file is saved to Google Drive. In Google Drive's Fast-DreamBooth sessions folder, go into the session name from earlier and you can see the ckpt model file was created. Right-click it and press download. If you run Stable Diffusion on your own computer, go into the Stable Diffusion web UI folder from the earlier video and put the model you just trained inside models/Stable-diffusion; then when you run the web UI on your local PC, you can use that model. Alternatively, this Colab itself is built so you can test with an integrated web UI. Enter the session name we set earlier and run it; when the code has all run, a URL address appears. Click it, then click "Click to continue", and through the web UI running in Colab we can try out the fine-tuned model.
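If you'd rather test the downloaded checkpoint outside the web UI, the diffusers library can load a .ckpt file directly. A minimal sketch, assuming a reasonably recent diffusers version and a CUDA GPU; the file name is illustrative, and a CFG scale of 8 reflects the range that ends up working best below.

```python
import torch
from diffusers import StableDiffusionPipeline

# Point this at the ckpt the session saved to Google Drive (name illustrative).
pipe = StableDiffusionPipeline.from_single_file(
    "ytjcd-session.ckpt", torch_dtype=torch.float16
).to("cuda")

image = pipe(
    "a photo of ytjcd man",  # instance token + class name
    guidance_scale=8.0,      # CFG scale
    num_inference_steps=30,
).images[0]
image.save("ytjcd_test.png")
```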
Let's create an image with the keyword we set: "photo of ytjcd", using that token, followed by the class name, which I set to "man". "a photo of ytjcd man", that's all. Let's generate it first. It's a little broken, but it does look like me. In a case like this, to try out various settings, apply the X/Y plot script below: for the X axis I put in all the sampling methods, and for the Y axis the CFG scale, stepping through 6, 7, 8, 9, 10, to see how the images come out. Overall, I can see the images are all somewhat broken. I suspect the model is overfitted, and that the text encoder percentage was a bit high; it would be a good idea to turn that off or lower the number. But first, picking the best of these methods: with the DPM adaptive sampler, at a CFG scale of 8 or more, it looks a bit more natural. Let's keep that method, change other variables, and see how it comes out. But I made a mistake with the prompt: the class name of the regularization images wasn't lowercase "man", it was capitalized "MEN". So when putting it into the prompt, I think it's right to enter it that way.

When I entered it as "man" earlier, I think that's why it came out more strangely. Entering it this way, with the DPM adaptive sampler starting from CFG scale 8, you can see it comes out much better. Then I added a bit more to the prompt and kept varying it. The luck isn't perfect, but I think there are still a lot of faces that look like me. One more tip: with the Checkpoint Merger, merging an overfitted model with another model sometimes produces more natural results. So try fine-tuning in various ways, and I hope you can combine models and get better results. And if you generate a selfie with a celebrity's face, it can come out like this. You can also change the outfit however you want, or combine it with other models to express it in a 3D style like this; based on the face it has learned, you can create all sorts of different images.
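As a footnote to the Checkpoint Merger tip: the web UI's "weighted sum" mode is essentially a per-tensor linear blend of two checkpoints. A minimal sketch of that idea; the file names and the 0.5 ratio are illustrative, and real merge tools handle dtypes and mismatched keys more carefully.

```python
import torch

# Load the two checkpoints' weights (file names illustrative).
a = torch.load("dreambooth_face.ckpt", map_location="cpu")["state_dict"]
b = torch.load("other_model.ckpt", map_location="cpu")["state_dict"]

alpha = 0.5  # 0.0 keeps model A unchanged, 1.0 replaces it with model B
merged = {k: (1 - alpha) * a[k] + alpha * b[k] for k in a if k in b}

torch.save({"state_dict": merged}, "merged.ckpt")
```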
Info
Channel: 조코딩 JoCoding
Views: 85,711
Keywords: artificial intelligence, AI, generative AI, photo generation, image generation, DreamBooth, Stable Diffusion, Google, drawing AI, NovelAI
Id: vL1t2UcE998
Length: 16min 25sec (985 seconds)
Published: Sun Nov 20 2022