AI Fashion: Apply Clothes to Characters with ComfyUI & IPAdapter!

Video Statistics and Information

Captions
Hi everyone, and welcome back to the channel. I've seen a bunch of videos floating around on the topic of getting articles of clothing onto AI-generated characters, and there are several techniques for doing this, some better than others. So I decided to start this mini-series to explore all the options for taking an article of clothing and applying it to a character, starting with IPAdapter in ComfyUI, which is what we'll be covering today. Then we'll look at how we can use segmenting and masking to isolate parts of an image so that we change only specific articles of clothing. And finally, we'll look at the OOTDiffusion (Outfit Of The Day) collection of nodes. The applications of being able to take an article of clothing on a clean background and apply it to a character are tremendous, especially in the e-commerce space. And with that, let's get plugged in.

So today we're going to look at the IPAdapter method of dressing up your characters. As usual, I've prepared a workflow that takes an article of clothing on a white background, as we can see here, takes a reference character on a plain background, and feeds them through IPAdapter to end up with a result that looks like this. As you can see, the jacket adheres to the character fairly well. In fact, if we drag it over here as a reference, you can see that the character is wearing almost the exact same jacket. Spot on. There are a few niggling issues, such as this jacket looking like it's zipped up on the side while this one is probably zipped up the middle, but that is a minor detail. For the most part, everything else is the same: the little details of the buttons on the lapel, the double collar, even the little button details on the shoulders, which are not identical but fairly close. And that's the thing with IPAdapter: you're still using an article of clothing as a reference.
And while the final output will be very close to the original, it's not going to be 100% the exact same thing. You'll also note that with IPAdapter, we are using this character as a reference, and she's leaning up against a red wall; in the final image, the wall has changed, the pose of the character is different, and overall there are several differences from the reference image. That's the thing with IPAdapter: the images we're using are effectively prompts, so the final output is going to be very similar and very reminiscent of them, but not exact. As this series goes on, we'll compare this method with the other available methods, such as OOTDiffusion, as well as using masks to try and keep the character, pose, and background the same while changing only the clothing. Each method has its pros and cons, and we'll explore them as we continue through the series.

So, to understand what's going on: we have here the two images loaded up as references. We feed them into an IPAdapter, and I'll go into the details of how we do that shortly. From the IPAdapter, we feed into a KSampler along with some prompts to help it out; in this case, we are saying that it's a woman wearing a leather jacket standing against a wall. I then feed that into a face replacer and finally into an upscaler. Fairly standard. So what is happening inside the IPAdapter to make this work? Because I'm using two reference images instead of one, I am feeding them in in the form of embeds, and you can see here that I have a node called IPAdapter Combine Embeds. This allows me to take multiple reference images and feed them into the same IPAdapter. To achieve that, we first need to convert the two images into embeds, which we do by feeding them into an IPAdapter Encoder.
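The flow just described (two reference images, encoded to embeds, combined, then handed to the IPAdapter) can be caricatured in a few lines of NumPy. Everything below is a stand-in: random arrays instead of real CLIP Vision features, and an illustrative shape, not the actual ComfyUI node internals. It only shows the idea that "combining embeds" produces a single embedding carrying both references.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the CLIP Vision embeddings of the two
# reference images (random arrays; the shape is illustrative only).
embed_jacket = rng.normal(size=(1, 257, 1280))
embed_character = rng.normal(size=(1, 257, 1280))

# Concat-style combining: the token sequences are stacked end to end,
# so the IPAdapter receives both references inside one embedding.
combined = np.concatenate([embed_jacket, embed_character], axis=1)

print(combined.shape)  # (1, 514, 1280): both references, side by side
```

The key point is that the sampler only ever sees one combined conditioning signal, which is why the balance between the two references has to be managed upstream, at the embed stage.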
Each encoder is assigned to its respective image, and because we are converting these images into conditionings, we have here a CLIP Vision model, which takes the image, analyzes what's there, and converts it into an embedding. Then of course we have the IPAdapter model, which connects to the encoders as well as to the IPAdapter node, so that they're all using the same IPAdapter. And that's pretty much it: the IPAdapter then feeds the model into the KSampler, and we're off to the races. If, by the way, any of this is confusing to you and you are struggling to follow along, I suggest you check out my Ultimate IPAdapter Guide, which covers all of these concepts in depth.

The only other thing worth noting here is that each embed has its own weight. So if we feel that a particular image is not being referenced aggressively enough in the final image, or, vice versa, is being referenced a little too aggressively, we can raise or lower the weight and adjust accordingly.

So now that we've seen what these two reference images do, let's test them out with several others. To make things easier, I've moved the reference images next to the upscaled image so that we can see what the different inputs do to the output without scrolling all over the place. Sticking with the leather jacket, let's try a couple of other reference images. Here, for reference, we have another image of the red-haired woman standing against a white background. If you'll remember, in the previous reference image she was wearing a denim jacket and dark pants; this time she's wearing a white t-shirt and blue jeans. Let's see what happens. And there you have it. Let's have a look at the details of the image.
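The per-embed weight mentioned above can be pictured as a simple scalar multiplier applied to the embedding before it is combined. This is a toy sketch under that assumption (the real node's behavior is more involved), just to show why a larger weight amplifies a reference's pull on the result and a smaller one softens it.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-in for one reference image's embedding.
embed = rng.normal(size=(1, 257, 1280))

def apply_weight(embed, weight):
    # weight = 1.0 leaves the reference as-is; values above 1 pull the
    # output toward it more aggressively, values below 1 soften it,
    # and 0 removes its influence entirely.
    return embed * weight

boosted = apply_weight(embed, 1.3)   # reference not showing enough? raise it
softened = apply_weight(embed, 0.5)  # reference too dominant? lower it

print(boosted.shape, softened.shape)  # shapes unchanged: only magnitude moves
```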
So, remembering the reference image: white t-shirt, blue jeans, red hair, and white trainer-style shoes. Here we can see that she's wearing the leather jacket, blue jeans, red hair, and trainer-style shoes. Now, the image is not perfect; we can iterate on this a little bit and tweak things. The face is not great, and I'm not too happy with the jacket that's coming out, so this is a great opportunity for us to fine-tune.

To keep things consistent, let's now change our seed to fixed. I'm going to run it one more time to make sure that the image is fixed, and then we're going to start tweaking the weights to see what that does to the leather jacket. Great. So now we have a fixed seed, which actually couldn't have given us a better result to experiment on with the weights. We can see again that the reference person is pretty correct: white background, right hair, right t-shirt, right jeans, right shoes. However, the leather jacket almost blends into her and is white, which is definitely what we don't want. So let's go back and tweak the weight of the leather jacket. Because I moved the reference images above the upscaled one, they're not visible here, but I remember that it is the top one. So let's crank up the weight to 1.3 and see what we get.

That took quite a bit of effort to clean up and get to the result we have right now, so let me walk you through what I had to do to achieve it. The first logical step was to increase the weight of the leather jacket and decrease the weight of the person, to let the leather jacket come out a little more strongly. I stuck to numbers around 2 and 0.9, and what that resulted in was the character wearing the leather jacket. However, because the leather jacket is an upper-body piece of clothing, the image would crop the character at about waist level.
And of course, considering that the other reference image we are introducing is a full-body character, we then had to play around and balance the weights to come to a result that still gives us the full-body character with the main elements of what she's wearing (the trainers, the white t-shirt, and so on) while applying the leather jacket. To bring back that character, I increased the weight of the character from 0.9 back up through 1 to 1.4; anything lower than 1.4 and I wasn't getting the full body. I also came over to the positive prompt and added "full body" at the beginning with double parentheses; again, that helps force the model to give you the full body. While I was getting relatively good results with those weights, I still wasn't happy with the image, so I also changed the weight type and set my embed combiner to average. I believe the default is concat, which adds the two together, whereas average blends the embeds, giving them a little more even prioritization. If you want to know what weight type does, I once again recommend checking out the Ultimate IPAdapter Guide; it has a phenomenal tool that shows and explains how weight types affect the way the model produces the image. Finally, I also tested out the embeds scaling, setting it to K+V. If you want to know what embeds scaling is, I suggest you like and subscribe, because I will be covering that in a video soon.

By combining all of those tweaks and adjustments, I was able to get this reference image and this leather jacket to come together to create this image. I hope this section was particularly helpful for troubleshooting images that just don't want to behave. And now we're going to jump into one more level of complexity. So far, what I've shown you is a character standing, and all we're doing is applying one article of clothing.
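To make the concat-versus-average distinction concrete, here is a tiny toy example with made-up 2x2 "embeddings" (not real CLIP features): concat keeps both references' tokens intact side by side, while average blends them into a single set of tokens, which is one way to picture why it gives the references more even prioritization.

```python
import numpy as np

a = np.array([[1.0, 0.0], [0.0, 1.0]])  # toy "embedding" A: 2 tokens, dim 2
b = np.array([[3.0, 0.0], [0.0, 3.0]])  # toy "embedding" B: 2 tokens, dim 2

concat = np.concatenate([a, b], axis=0)  # 4 tokens: A and B both kept intact
average = (a + b) / 2                    # 2 tokens: A and B blended together

print(concat.shape, average.shape)  # (4, 2) (2, 2)
print(average)                      # [[2. 0.] [0. 2.]]
```

With concat, the downstream attention can still "see" each reference's tokens separately; with average, there is only one merged signal, so neither reference can fully dominate on its own.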
But what if we wanted to apply two articles of clothing and give the character a little more complexity? Well, just like applying a single garment with IPAdapter, we have a few different ways to do that, and today I'm just going to focus on one. This workflow is available to everyone; if you want the version with three reference images, that is available on my Patreon. But if you're following along, you can add the third reference like this: select the IPAdapter Encoder and press Ctrl/Cmd+C, then Ctrl/Cmd+Shift+V, which creates a new encoder with the models already linked in. Then we just add a Load Image node and connect it up to the new IPAdapter Encoder. Finally, we take the positive embed and bring it down to embed number three in the embed combiner. And that's it; that's all you have to do to add a third image. Of course, we now have to play around with the weights to get the result we want, but let's plug something in and see what we get.

Okay, so now what we're going to try to do is take the red-haired character and have her wear the dress and the jacket. Now, like with many things in AI, the quality of your output is dependent on the quality of your input, so I'm going to switch this image to the one that gave us the better result earlier, which was this one. That should help drive home that the image you introduce to IPAdapter makes a very big difference in what it can do. We're going to change the prompt here to "full body, red-haired woman wearing leather jacket and white dress standing against a wall". Let's not touch any of the weights or embeddings, and just run it and see what we get. Okay, so the first result is not too bad: we have our red-haired character, we have the leather jacket, and we have a dress. But the dress is not the same dress as the one in the reference image, as the bottom half is black.
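Adding a third reference just means one more embed flowing into the combiner. As a rough sketch (random stand-in arrays and a hypothetical `combine_weighted` helper, not the actual node code), the combine step generalizes naturally to any number of weighted embeds:

```python
import numpy as np

def combine_weighted(embeds, weights, method="average"):
    """Toy combine of any number of reference embeds, each pre-scaled
    by its own weight. Illustrative only, not the ComfyUI node logic."""
    scaled = [e * w for e, w in zip(embeds, weights)]
    if method == "concat":
        return np.concatenate(scaled, axis=1)  # token sequences stacked
    return np.mean(scaled, axis=0)             # blended into one sequence

rng = np.random.default_rng(2)
character = rng.normal(size=(1, 257, 1280))
jacket = rng.normal(size=(1, 257, 1280))
dress = rng.normal(size=(1, 257, 1280))

# Three references, equal weights to start; tweak from here.
out = combine_weighted([character, jacket, dress], [1.0, 1.0, 1.0])
print(out.shape)  # (1, 257, 1280)
```

This is also why balancing gets harder with each added reference: every new embed competes with the others inside a single combined signal.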
And this is what I was referring to about the weights taking over. I'm guessing that the bottom of the dress is black because in this reference image she's wearing black pants and a black t-shirt. Since we know that this image is tied to this weight, let's drop it a smidge, back down to 1, and see where that leaves us. Interestingly, by lowering the weight of the reference character, we still ended up with the same issue. So let's give the white dress a little more weight instead. Now, this is particularly challenging because the dress reference image has a character in it as well, and she's wearing black. So we want to find a point where we get enough of the dress to show in the image, ideally including the lace, while not bringing in the model wearing it. Right now it's set at a weight of 1.5; let's crank that up to 2. And there we go, we've achieved it: we now have our red-headed character wearing a white dress with a leather jacket.

Now, when adding multiple inputs with IPAdapter, the challenge obviously becomes balancing the inputs to get the result just right. If we look closely at the image, there's a little bit of spillover, and there is probably opportunity to continue fine-tuning. One of the issues is that the leather jacket actually has the print from the dress coming through, which actually looks pretty cool, but it's there. However, the details of the leather jacket remain: we have the double collar, the buttons, and it even looks like it's zipped up on the side, since the side zippers are there, so still really good attention to detail. On the dress side, we have the overall shape of the dress more or less right. The only thing is that the top of the dress is quite lacy, with lace sleeves; we don't really see that because of the jacket, but if we come in closely, we can kind of see the pattern reflected on the dress.
Now, to achieve that, I had to crank the dress up to 2.9 and the leather jacket up to 3, because what was happening was the leather jacket was coming out white. I dropped the original character reference to 0.57, because she's wearing a lot of black already, and I found that that was spilling over into the dress, making it look black, so it was important to scale that back quite considerably. On this particular occasion, with the three inputs, I found that norm average worked best as the embed combiner, with the weight type set to linear. And that's pretty much it: that's how you're able to combine multiple articles of clothing and a person and have them applied to the final image. As you can see, getting strong results takes a little bit of tweaking and fine-tuning.

There are other ways to achieve the desired results, and we're going to explore those in the next videos, specifically looking at how we can use attention masks and other forms of masking to take the reference image, which in this case would have been this character here, and change nothing except the clothing applied to it. In the third video, we will look at the OOTDiffusion nodes, which do the same thing in a slightly different manner. Finally, we'll do a final video where we compare all the methods so that you know which is the best one for your needs.

I hope you guys found the video helpful. As usual, please don't forget to like and subscribe, it really helps the channel out, and press the bell icon if you want to stay tuned for the upcoming videos in the series. If you have questions and want to learn more, please come and drop by our Discord; we're happy to answer any questions, and the community is growing every day. I've also started running a competition where you'll be able to win points, and I am planning to have those points be usable for something down the line, so don't miss out.
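One plausible reading of the "norm average" option, sketched here as an assumption rather than the node's actual implementation, is that each embedding is rescaled to unit norm before the weighted average. That way a reference whose embedding happens to have a large magnitude can't drown out the others, and only the explicit weights decide the balance:

```python
import numpy as np

def norm_average(embeds, weights):
    # Rescale each embed to unit (Frobenius) norm before the weighted
    # average, so raw embedding magnitude can't dominate the blend.
    normed = [w * e / np.linalg.norm(e) for e, w in zip(embeds, weights)]
    return np.mean(normed, axis=0)

rng = np.random.default_rng(3)
character = rng.normal(size=(1, 257, 1280))
jacket = rng.normal(size=(1, 257, 1280))
dress = rng.normal(size=(1, 257, 1280))

# The weights landed on in the video: character 0.57, jacket 3, dress 2.9.
out = norm_average([character, jacket, dress], [0.57, 3.0, 2.9])
print(out.shape)  # (1, 257, 1280)
```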
Finally, if you want to support the channel and get access to exclusive workflows, including the three-garment workflow as well as the version with the upscaler, along with a few other IPAdapter adjustments I've made, please come and check out the Patreon; there are exclusive workflows available there. Or support the channel just because, since it really helps me out. I'm always trying to do more videos and expand the topics we cover here on the channel, so your help is incredibly useful. I hope you found this helpful, and I'll catch you guys in the next one.
Info
Channel: Endangered AI
Views: 1,406
Keywords: stable diffusion, ai art, segment anything, ip adapter, stable diffusion tutorial, ai influencer, comfyui tutorial, stable diffusion tutorial for beginners, comfyui tutorial sdxl, stable diffusion img2img, stable diffusion controlnet, ai video, ip adapter v2, ai ecommerce, ai fashion model generator, ai fashion lookbook, ai clothing mockup, ai clothing model, ai fashion, ai fashion design, ai fashion model, ipadapter, ipadapter comfyui, ipadapter stable diffusion
Id: LtcEJ3Hp434
Length: 14min 18sec (858 seconds)
Published: Tue May 14 2024