How to use ControlNet in your AI Art - Stable Diffusion Tutorial 2023

Video Statistics and Information

Captions
Hey there! If you've been anywhere in the AI art space in the past couple of weeks, you will definitely have heard of ControlNet. It's a new extension for Stable Diffusion that gives you unprecedented control over your results, and it makes it possible to get results like this from input images like this, and results like this from input images like this. In just a couple of minutes you'll be able to do it too, so let's get started. In order for you to understand this video, you do need a fundamental knowledge of how Stable Diffusion works and how to use the Automatic1111 web UI, the thing that looks like this. If you don't know that, I recommend checking out my other videos and getting really familiar, because otherwise this isn't going to be very fun. I'm going to use an extension on top of all that, so I do expect some fundamental knowledge first.

What I'm going to be using today is the Aeoni Mix model, a Stable Diffusion checkpoint which, just out of the box, makes better images (in my opinion) than base Stable Diffusion, so go get and install that. You'll also want a VAE. This is a kind of partial update to Stable Diffusion that makes it a little better, and especially with this model it's essential for good results, so make sure you know how to use VAEs. I've included a guide in the video description on what they do and how to install them, so just go and install those as well.

Then we come to the extension that's the whole point of this video: ControlNet. ControlNet is an extension you can install with this link from GitHub, also in the description. Copy it, go into your Stable Diffusion Extensions tab, choose Install from URL, paste the URL in there, and click Install. Once you've done that, you can activate it in the Installed tab: just click the checkmark, Apply, and Restart UI. You'll then find a new section in your txt2img and img2img tabs called ControlNet, with all kinds of extra settings that I'll explain a little later. You're also going to activate multiple models, so you can use several ControlNet models stacked on top of each other. That's a setting in your Settings tab: scroll down to ControlNet, and under Multi ControlNet you can set it to two, three, four, or however many you want. I'm going to use two today. Again, Apply Settings and Reload UI to make sure you have several model tabs.

Now, I've been saying "models", right? What are those models? There are several ControlNet models you can install, again from another link in the description: the Canny model, the Depth model, and a bunch of others, including the Pose model. What does Canny do? It's named after the Canny edge detector, invented by a guy named Canny. It takes an image like this (this is the Wikipedia example) and detects its edges. In the context of Stable Diffusion, that means you can put in an image, convert it into these edges, and that way tell Stable Diffusion roughly what your image looks like: what the elements are and how it's composed. What the OpenPose model does is very simple: you give it an image of a person, it detects how that person is posing, and it converts that into a little image that looks like this. That way, if you have characters in your scene, it'll pose them in exactly the same way as your reference image. Like in the example I showed you, I took two pictures of myself posing in different ways, and it transferred that perfectly onto these two guys. So, as I said, I'm going to be using the Canny model and the OpenPose model.
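(The video stays entirely inside the Automatic1111 web UI, but if you want to see what these two preprocessors actually produce, here is a minimal Python sketch. It assumes the opencv-python, Pillow, and controlnet_aux packages; the filename and the "lllyasviel/Annotators" repo id are assumptions for illustration, not something shown in the video.)

```python
import cv2
import numpy as np
from PIL import Image
from controlnet_aux import OpenposeDetector  # pip install controlnet-aux

# Load the reference image (hypothetical filename).
photobash = Image.open("photobash_01.jpg").convert("RGB")

# Canny: reduce the image to its edges, so only composition (not color) is passed on.
edges = cv2.Canny(np.array(photobash), 100, 200)      # low/high thresholds
canny_map = Image.fromarray(edges).convert("RGB")
canny_map.save("canny_map.png")

# OpenPose: detect the people in the photo and draw their skeletons.
openpose = OpenposeDetector.from_pretrained("lllyasviel/Annotators")
pose_map = openpose(photobash)
pose_map.save("pose_map.png")
```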
Go ahead and download those two, and when you have them, go into your Stable Diffusion web UI folder, then extensions, sd-webui-controlnet, models, and drop the safetensors files in there. Then reload your UI, and you should be able to see them down here in your model and preprocessor dropdowns: Canny/Canny and OpenPose/OpenPose. With all that said, let's get going.

Before we even begin in Stable Diffusion, I'm going to start with a photobash, a very rough combination of photos that represents what I want in my final image. For that you can use any photo editing program you like; I'm going to use Affinity Photo, because it's a one-time payment rather than the subscription Adobe makes you do. I'm going to create a large image, 3072 pixels wide and 2048 pixels high. Why? Because Stable Diffusion was trained on 512-by-512-pixel images, and 2048 is exactly four times 512, while 3072 is six times 512. So we have a nice big landscape image. DPI doesn't matter; 300 is good for digital.

What am I going to build? I think I'll do a bit of a sci-fi scene, some kind of Stranger Things-inspired government experiment going wrong, a real action scene, so we can use the full potential of ControlNet. As a background I'm going to check a free stock photo site called Unsplash and look for something like "mission control". This kind of room is good; it's pretty much the only one it has for that, so let's download it. Let's also look for "government room"; this could be cool. I'm going to need a portal that goes wrong, some kind of machinery that's failing (this is cool), and tentacles. Let me pull all of these in; this could be a good basis for the room. The point here is to be really quick. You don't have to be detailed, and that's the cool thing about Stable Diffusion: it's going to do a lot of the work here, you just have to give it the vague vibe of what you're going for. You can adjust the perspective a little bit. You may know the image-to-image method, which I used in a lot of my early videos and which relies heavily on color information; the great thing about ControlNet is it doesn't. If you want color information you can still combine ControlNet with the image-to-image workflow, but you don't have to. What I'm going for here is the texture, the lines, the composition of the image; the colors do not matter, and that saves us a ton of prep time, so I'm not going to edit the colors much. I'm just going for content.

Okay, good, that's the background. What about the foreground? I want to put in some guys reacting to this whole thing, which is why I photographed myself in two poses, one like this with my coffee mug, running away from the whole thing, so I'm going to put that in there. What I tried to do was wear things where you can really tell the outline: instead of just a white shirt, I have this traditional Bavarian shirt with dark buttons, so ControlNet can recognize the shape of those buttons and work out that it's probably a shirt. I held a coffee cup so you wouldn't see my hands too well, and you'll see in the other picture I'm making a fist, as I am here, because hands and AI don't play well together (hands and artists don't play well together in general). And one more of me: I also have this one. The guys that were running, which I also downloaded, I'm going to put into the tentacles. We'll see.
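(A quick sanity check of that canvas math, just the arithmetic from the video expressed in Python, nothing more:)

```python
# The Affinity canvas (3072 x 2048) and the generation size (768 x 512) share the
# same 3:2 landscape ratio, and the canvas is an exact multiple of 512 on each side.
canvas_w, canvas_h = 3072, 2048
gen_w, gen_h = 768, 512

assert canvas_w % 512 == 0 and canvas_h % 512 == 0   # 6 * 512 and 4 * 512
assert canvas_w * gen_h == canvas_h * gen_w          # same aspect ratio, no float math
print(canvas_w // 512, canvas_h // 512)              # -> 6 4
```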
Okay, good. So that's a great rough start, and we'll see what ControlNet and Stable Diffusion make of it. We just export it normally as a JPEG in full resolution, and as usual I like to iterate through these images, so I'm just going to call it "1".

Now we go into Stable Diffusion and check out our settings. Unlike image-to-image, we start in the first tab, the normal txt2img tab, and I'm going to use my usual settings: sampling method DDIM, sampling steps at 20 for now, batch count of around four, and make sure your width is 768 and your height is 512, the same aspect ratio as the image we just made. Down here you'll see your ControlNet settings; you have Control Model 0 and Control Model 1. Make sure those are both enabled, and then load in your images. I'm going to load in image 1, and this will be my Canny edge detection, so set the preprocessor to Canny (it'll recognize the edges of your image) and the model to Canny as well. Weight can be 1 for now, we'll see, and make sure your canvas size is also 768 by 512. Your second unit will use the same image, but this time we want the poses of these guys, so set the preprocessor to OpenPose and the model to OpenPose as well. Make sure your canvas settings are correct here too, and the weight can also be 1.

Now we're going to prompt. This is a longer prompt that I've figured out over time, so I'll just put it in the description; you can copy-paste it in here or read it from the screen. Then we describe our scene: a NASA laboratory mission control with a portal, tentacles in the portal, two scientists wearing suits running away. Now we'll just see how far that gets us. Maybe the results will be incredible, maybe they will suck; let's try. This might take a while to load, because it has to load multiple models: not just our Stable Diffusion model, but also the OpenPose model and the Canny edge detection model. It preprocesses our images and then generates from the prompt.

So far, very promising, isn't it? Doesn't that already look pretty much exactly like what I wanted? Nothing with the tentacles yet, but otherwise, damn. Here we go: you can already see there's a portal with a lot of tech and stuff in it, and our guys worked really well, so the poses work and the suits work, with a couple of different options. I'm not seeing any tentacles yet, but that's fine; this is already a very consistent look, and the lighting is cool. Here you can see why: these are the edges it detected while generating, so you can tell it picked up all these phones and mission control elements, which is why it's so consistent. And these are the poses, so it got our guys' poses perfect. You can tell it's almost a little too constricted, and that's where the weights come in. As you may know, you can of course weight terms in your prompt, and the same idea applies to ControlNet. The weight of our Canny edge detection is, I would say, a little too high; I want it to be a little looser, so we can go down to 0.5. The poses I do like, so maybe set those down to 0.8, giving it a little bit of freedom there, but not too much. And since I'm not seeing any tentacles, I'm going to weight the tentacles in my prompt a little more. "Two scared scientists wearing suits running away" is good, and the coffee mug we're going to fix later with inpainting. For now, let's try again. Excellent! As predicted, it got a little more creative with the images, because it didn't rely on the Canny edge detection quite so much.
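(For reference, here is a rough diffusers equivalent of this two-unit setup, offered as a sketch rather than the workflow shown in the video: the base checkpoint id below is a stand-in assumption for the custom model and VAE used on screen, and the ControlNet repo ids are the standard SD 1.5 ones.)

```python
import torch
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel, DDIMScheduler
from diffusers.utils import load_image

# Two ControlNets stacked on one Stable Diffusion 1.5 model (multi-ControlNet).
canny_cn = ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16)
pose_cn = ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-openpose", torch_dtype=torch.float16)

pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",              # stand-in for the custom checkpoint + VAE
    controlnet=[canny_cn, pose_cn],
    torch_dtype=torch.float16,
).to("cuda")
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)  # DDIM, as in the video

canny_image = load_image("canny_map.png")          # from the preprocessing sketch above
pose_image = load_image("pose_map.png")

prompt = ("a NASA laboratory mission control with a portal, tentacles in the portal, "
          "two scientists wearing suits running away")

images = pipe(
    prompt,
    image=[canny_image, pose_image],
    controlnet_conditioning_scale=[1.0, 1.0],      # the per-unit "weight" sliders
    num_inference_steps=20,
    width=768,
    height=512,
    num_images_per_prompt=4,                       # batch of four, as in the video
).images
```

Lowering the Canny weight to 0.5 and the pose weight to 0.8, as described above, would just mean changing `controlnet_conditioning_scale` to `[0.5, 0.8]`.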
We still have the basics of the composition: the dudes are still in there, the tentacles are there, but it was a lot freer. A little too free for my taste; you can tell they're no longer scared, they're no longer really running away, they're just kind of posing. I do like the general vibe a lot more, though. So again, just adjust your ControlNet settings until you have a result you roughly like, and then we'll move on to the next step.

Great, so we're slowly getting there. I'm starting to see elements that I really like in this image; this monster, for example, is fantastic, and this is where we can start combining images back in Affinity. I'm going to copy this one and bring it in here (we don't have to upscale or anything yet, this is still a very rough step) and start masking. Photoshop has the exact same feature. I open a mask, invert it, and just start painting in the parts that I like: the monster is really fantastic, the portal we'll see. We have our monster, we can save this image again and bring it back in here. I like my guy here, he's kind of cool in his blue suit, so I'm going to bring him in as is, along with the monster. This is the best one so far, but I don't love this guy here, so I'm going to swap him back out for myself. We're not looking at colors yet; we're still looking at composition. This one has the portal I like, and I want to bring back the details, because I don't love the mission control in it. So what we're doing now is basically fixing the composition of my original image: it makes a little more sense now, and the perspective is better than in my original photobash. That's what we're iterating here, and we can slowly turn the Canny edge detection back up, step by step, until we have the exact composition we like. This is number three; I'm going to bring this in here, and also update the pose image if we want to (we don't have to, because it's pretty much the same thing). As for the weight, I'm now going to turn it back up to 1 and edit my prompt: the portal can be a little more important, since it isn't in every single image, the tentacles are very important, and the suits are as well. I like the blue suit jackets and white collared shirts. Let's see what that gives us. We can also turn up the sampling steps a little now to give us some more of that detail; let's go up to 40.
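(If you'd rather do that masking step in code instead of Affinity or Photoshop, a minimal sketch with Pillow might look like this; all filenames are hypothetical.)

```python
from PIL import Image

# Paste only the parts you like from a new generation over the current working
# image, using a grayscale mask (white = keep from the new generation).
base = Image.open("iteration_02.png").convert("RGB")
new = Image.open("generation_with_good_monster.png").convert("RGB").resize(base.size)
mask = Image.open("monster_mask.png").convert("L").resize(base.size)

combined = Image.composite(new, base, mask)
combined.save("iteration_03.png")
```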
In this latest round, we're getting there. They're both wearing the blue suits I asked for; this one's a little lighter than that one, and he doesn't look like he's running away yet, but the general vibe and colors I love. It looks like the experiment is going wrong, there's light coming from behind, and I'm going to add some more details that make it look like there's fire, lightning, some containment exploding or whatever. But for now, this is our base for the next steps. I'm going to upscale it, because we need a higher resolution: send it to Extras, set the upscaler to R-ESRGAN 4x, set resize to 4, and generate. This should be pretty quick. There we go. It's a pretty good upscaler; it does have those artifacts we know from upscaling with AI, but we're going to fix all that with inpainting. So what am I missing? I still want my scared face, which is important for the next steps, so I'm going to mask this image and get that right back in there, as well as the coffee mug I'm holding, because that's kind of a funny joke.

Now we get to inpainting, which is why I upscaled it, because we don't want to lose those details. This is iteration number four, and this time we're going to the img2img tab. Because I like all my settings, I'm just going to pick a random image, send it to img2img (which copies all the data over), and then replace the image with my number four. Same thing with ControlNet: I'm going to use the OpenPose unit now, and that's the only one I'll enable, because Canny edge detection has done its job with the composition. So preprocessor OpenPose, model OpenPose, and canvas width 768 again. Now we start inpainting. I want to replace my dude first, my running guy, and see if we can get him a little more stressed out. Very important: the inpaint area I want is "Only masked", so it will generate only that area in full resolution and give me back my fully scaled image. It won't treat the whole image as the maximum resolution and just try to make a tiny little edit; it's going to use its full power just to create that masked area, which is what we need. The other settings are okay: you want "Inpaint masked", you want "Original" for the masked content, and my denoising strength can go to around 0.65 or so, since I don't want to change too much. Sampling method DDIM, and Restore Faces on, because we're working on a face. And since this is a fairly vertical area, I can change my height to 768 and my width to 512; that's okay.
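(Conceptually, the "Only masked" option works roughly like the sketch below: crop the masked region out of the big image, inpaint that crop at full generation resolution, then scale it back down and paste it into place. This is an illustration of the idea, not the web UI's actual implementation, and `inpaint` here is a placeholder for whatever inpainting function you use, not a real library call.)

```python
from PIL import Image

def inpaint_only_masked(full_image, full_mask, box, gen_size, inpaint):
    """box = (left, top, right, bottom) around the masked area; gen_size = (width, height)."""
    # Crop just the region of interest and bring it to full generation resolution.
    crop = full_image.crop(box).resize(gen_size)
    mask_crop = full_mask.crop(box).resize(gen_size)

    # Run the inpaint at model resolution, so all the detail goes into this small area.
    result = inpaint(image=crop, mask=mask_crop)

    # Scale the result back to the original box size and paste it into the big image.
    patch = result.resize((box[2] - box[0], box[3] - box[1]))
    out = full_image.copy()
    out.paste(patch, (box[0], box[1]), full_mask.crop(box).convert("L"))
    return out
```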
Now I'm going to replace my prompt with "a scared man wearing a blue suit jacket holding a mug of coffee" and see where that gets us. Okay, we can see that I forgot the "running"; he's not really scared, he's just standing there casually posing with his coffee, which is not what we want, but otherwise everything worked pretty well. You can see there's now a whole ton of detail on just this guy. His shirt is getting a little blue there, and here's our action star, but the mug worked and everything else worked pretty well. So let's add some more terms, "fearful", "angry man", and try again. Much better; the expression is so much better. He's a little confused, but in general I have the mug of coffee, I have this one, and I like this guy the most: he's the most stressed, I love the mug, I love the pose, and we just need some manual editing later in Affinity. So I'm going to copy this, and you'll see immediately when I paste it in that it's the right size; it's big. That's the important trick with the inpainting: having those settings correct.

Now let's keep editing with this image, because we're going to edit some things that won't affect our guy, so we don't have to immediately jump back into Photoshop or Affinity. I'm going to choose this one, send it to img2img, and just continue my process. Because this creature is more in the background, I'm going to go for it first: I clear my inpainting brush strokes and just select all of this dude. Now change my width and height and make sure they're much smaller, because if you send a really big image back to inpainting, it will default to way too big dimensions, so you have to bring these down. This is a horizontal area, so I'm going to do 768 width and 512 height again, and I'll probably turn up the denoising strength a little bit; I'll allow it to be a little more creative, because there aren't many details there anyway. So this is now "a tentacle monster octopus coming out of a portal", and I'll add some terms like "explosion", "fire", "glass", "destruction", and just see what happens and experiment. This is a very loose process anyway, and we no longer need ControlNet here, because we don't have poses or anything; we'll handle that later. I love these. This one has the most detail, so I'm going to take this dude here and just clean it up a little bit, have a nice backup and get a nice clear image. If this were a client project, you could do some more editing down there if you wanted to. Now I'm going to try to get a nice coffee splash in there. Then, using the same procedure as before, I'm going to do this guy. Again, make sure you have OpenPose enabled, because we have a man we want to get here, and it's a more vertical area, so set the height and width accordingly and adjust the prompt. Okay. You can spend as much time as you want on this, obviously; I'm just going to leave it here for now and have it be "the experiment at GQ Magazine" and take this angry-looking boy here. The advantage of having backups is that you can fix stuff like what's happening here: this is obviously an artifact from inpainting that we can edit as cleanly as we like with a normal mask. Using this workflow you can just go through this entire image. There are a lot of artifacts down here that still look very AI-generated, and I'm going to take those on next, get some higher-res tentacles and monster stuff in here, et cetera. See you on the other side.
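(For comparison, a hedged diffusers sketch of one such inpainting pass might look like this; the inpainting checkpoint id and filenames are assumptions rather than anything shown in the video, and the `strength` argument requires a reasonably recent diffusers release.)

```python
import torch
from diffusers import StableDiffusionInpaintPipeline
from diffusers.utils import load_image

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

image = load_image("figure_crop.png")   # the "only masked" crop (hypothetical file)
mask = load_image("figure_mask.png")    # white where the man should be regenerated

result = pipe(
    prompt="a scared, fearful, angry man wearing a blue suit jacket holding a mug of coffee",
    image=image,
    mask_image=mask,
    strength=0.65,                       # the denoising strength slider from the video
    num_inference_steps=40,
    width=512,                           # vertical area, as in the video
    height=768,
).images[0]
result.save("figure_inpainted.png")
```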
Excellent! So this is where I'm going to leave it; this is my final artwork for this tutorial. You can add many more details if you want; you can see that time spent on it equals higher quality. To compare: this was my rough input, this was the composition, and this is the result. Everything I wanted is in the exact place I wanted it: the octopus tentacle monster; the dude (I added him later again, but he's almost exactly where I wanted him); and me, who I posed for with my own phone, taking pictures of myself in the same positions. All the adjustments that were made are ones I wanted made, so you can easily control the different elements of the image. I'm pretty proud of this. I hope it was informative for you, and I hope you have tons of fun doing this for yourself. This is such a promising technology, and I'm sure we'll see even more incredible developments in the coming weeks, let alone months and years; we haven't even been at this for a year. If you learned anything in this video, the best thing you can do for me is subscribe, like, and comment, joining the over 11,000 people who now subscribe to this channel. I couldn't be more grateful; thank you so much, you're making this possible. With that said, see you next time. I'm Albert Bozesan, and I hope you have fun with Stable Diffusion and ControlNet.
Info
Channel: Albert Bozesan
Views: 208,072
Keywords: ai art, concept art, stable diffusion, midjourney, dalle, open source, artificial intelligence, controlnet, auto1111
Id: dLM2Gz7GR44
Length: 24min 10sec (1450 seconds)
Published: Fri Mar 03 2023