Create consistent characters with Stable diffusion!!

Video Statistics and Information

Captions
There is one way to start with nothing and end up with a trained LoRA of a character we created completely from scratch. I will divide this process into three parts and guide you through them.

The problem with AI-generated characters is that they are non-reusable: the moment you click generate again, the character is gone forever. Unless you use already established names like Michael Jackson, or use a trained LoRA or embedding, no matter how good your definition of this character is, it will get lost. People did find a way to create the same character in different poses, though. Using CharTurner you can make a turnaround of the same character, or if you want more specific filters you can use ControlNet with duplicated OpenPose rigs; then Stable Diffusion will try to generate the same character in all of those. This might already be enough if you're making a game that only uses like three or four poses of the same character for animation, but we are going deeper.

This will be the first step of our process: creating a nice and clean character sheet. You can use the embedding I talked about before, but I didn't see much difference when not using it. What I will use is this preset of OpenPose rigs that I made in Photoshop. For the main turnaround I used the reference from Civitai, and for the smaller poses I went to PoseMy.Art and extracted them directly from there. You can create your own if you want, or use the ones that I will leave in the description. I will talk about this more later, but for now we are looking to use this space as efficiently as possible, with variation in sizes. Mix head shots and full body shots, as well as dynamic and static poses. I will also make two very similar variations that change just a few poses. We want to create a spreadsheet that has a lot of variation while keeping enough quality per pose, so don't go too small.

And with this made, we can finally start with Stable Diffusion. The character I will create with you in this tutorial will be a woman
wearing a kimono, with pink hair and blue eyes, even though I will show you that it can be done with other characters and styles, for those of you who are tired of anime. Input your OpenPose sheet into ControlNet using no preprocessor and with "ControlNet is more important" as the control mode. We are looking for a clean character sheet with a white background. Try using a high image quality (the max size will probably be 1024) and iterate using different prompts and different models. You can also play around with the CFG scale if you feel like your prompt is getting ignored. My final prompt was: white background, character sheet, one woman, BREAK, pink hair, BREAK, blue eyes, kimono, long sleeves, medium length hair, and high detail.

When you find a character sheet that you like, freeze the seed and activate hires fix to upscale the result even further, of course using the same width and height you originally had. Then set the hires upscale to 2, even though you can use more if you want, and use your preferred upscaler. In my experience the best ones are 4x-UltraSharp or Lanczos. That allowed me to create a good enough image in my experiments, like this one for the king character and this other one for the realistic character sheet, and these are the prompts I used in each case.

You may already know where we are going with this: our goal is to create a data set from this one image and then train a LoRA with it. Keep watching though, even if you already know how to do this; I will provide some cool tricks that will help you in every step of the way. For example, when you find a character that you like and want to continue with it, like this image, activate the Extra checkbox next to your recycle seed button, then increase the batch count to something that you're comfortable with; two is usually enough. This will allow us to make variations of the image. Use it at a very low strength so it doesn't change the whole character. In my experience the best parameters are about 0.15 to 0.3, making me these
variations of the same character. The reason we do this is to make our cleanup job easier, but more on that in a second.

If you aren't getting a white background no matter how hard you try, even trying different models, then you will probably have to use image-to-image. Use a reference with a white background, like the one I will leave in the description, and then also use the ControlNet model as we were doing before and prompt normally. Use a denoising strength of 1. This should do the trick, even if your result has less definition. You can also try to inpaint more poses into your spreadsheet, but it is usually pretty hard to get good results.

Now that we found the base image that we like and created some variations using the Extra option, we can try to create some new poses. This isn't easy to get right, but trying can be worth it if your initial images weren't super perfect and some poses got lost, like these ones in my case. What I will do is use the image we created and add it to ControlNet in reference-only mode, lowering the weight a little bit. Then change the first OpenPose sheet that we had for reference, adding a new one with a little variation. In my case I only changed the small poses a little bit. Run it again with the same prompt and, if possible, seed. Even if some parts of the character get lost, we will have some new references that we can use.

With this done, it's time for cleanup and some retouching. Keep in mind that for the king, for example, I did absolutely nothing, so there is no need to overcomplicate stuff. It will highly depend on how important the result is for you. For example, if you're making a comic and this is a character that will only appear in a few minutes, you can just create the OpenPose references with the poses you will need and then generate until you find the fitting character. And there you go: from this you can just cut it out and put it in there. However, if you care a lot about the character, then import everything we made
until now into a photo editing software. The first thing we want to do is look at which images have the most consistency in the character and which poses are the easiest to read. For example, in my case, after looking at the images I had, I really like the lighter blue tone on this sheet. I'm going to use it as a base and then replace some poses that might be either too repetitive or hard to understand. So I will take the full close-up portrait, and I will change this profile pose for a sitting one. I'll change these weird poses for the one with hands in her face, and I will also use the full body shot of her from behind.

Now I'm going to retouch this image to get as much consistency as I want or need, starting with the most obvious miss the AI made: the kimono on the one girl looking back. For this I'm just gonna grab the color they had and paint them in, without much concern for shadows or light. Next I'll erase the parts that aren't the main character, like this bug thingy or these other floating parts. There is also some stuff that isn't consistent between poses. For example, most back shots don't have this broom-looking knot, so I'll take one and paste it over and over again on the places that need it, maybe deforming some to give a little motion to the poses. I like this ribbon more than the one in the turnaround, so I'm taking it and re-pasting it all around.

You can also add stuff too. Since the character is very standard right now, I'd like it to have some kind of ornament on the kimono, so I just painted some fast scribbles and copied and pasted them in each pose, making sure that they were on the same spot every time, of course. Something else you can do here is change the colors. Let's say I didn't want the kimono to be blue: I can take the Hue layer, then click on the color blue and change the hue to something that matches the color I want most. In this case I'll leave it blue. I'm just going to lighten the poses that feel a little darker than the others. There is no need for you to do
all of this. Do it just for the parts that you feel are important for the character, because you could stay here all day if you wanted to. Just make sure to not overdo it though, like with details, because some things might get lost in the next step. For the realistic style image, I just added the two variations together and fixed the back. The main image had some golden parts in a few poses, so I used the second image to remove them. I could also just have painted them out, of course, like I did with the forearms that had the same problem.

Now we move on to upscaling, and this can get a little tough. These are the usual parameters that I use, in case you are interested. Using image-to-image, open ControlNet and input the image you want to upscale, the same image that you have in the image-to-image tab. Use the tile resample model and change the control mode to "ControlNet is more important". Then I also use Ultimate SD Upscale as the upscale script, leaving pretty much everything the same except for changing the upscaler to 4x-UltraSharp (of course you can use another one if you want) and using padding pixels of 42 with seams fix activated. That is what I used in the king generation and it went pretty okay; some messed up faces here and there, but that's fine. I also used this for the realistic generation, even though I used None as an upscaler, but not for any particular reason, just because I forgot.

The first upscale I did for the anime chick was alright, but it had a lot of artifacts and the eyes were a little bit off. You could clean this up and just inpaint the eyes again, but I tried re-upscaling, this time using a different upscaler, 4x-UltraSharp, not having clip skip, and activating ADetailer so it could auto-detect the faces and change them for this prompt. I also lowered the denoising strength to 0.34 and made the image get upscaled directly to 4096. Make sure to upscale this from an original image that was already pretty big. In my case, since I already upscaled by 2 using hires fix, my image is
about 1500. Point being that it might take some attempts to get the right upscale, so experiment. Also, nothing stops you from just upscaling more than once and compositing the best parts of each one together. Sometimes I upscale once for details using a high denoising strength like 0.6, and another one at 0.3 just in case there are some parts that the first upscale messes up.

If you already have the upscaled version, you can start creating the regularization images. These are, in short, references that we will use to train our model and give it more flexibility. I wouldn't have even known about this if it weren't for the help I got on Discord, but more on that later. This is an optional step, but it can help you give your characters more flexibility on the final LoRA. We want images of this same character class, so in my case it would be woman for the anime one and man for the realistic one. In my case I used cyborg, for a reason that you will see later. The aim is to create variations of the class in different poses, angles, expressions, light setups, etc.

To optimize our time the most, we can do the following. We will use an extension called Dynamic Prompts. I'll make a video on it eventually, but it is very good for things like this. You can install it from the Available tab. For now we will just use the function that allows you to create a list of concepts and then lets the generation choose one of them in every different batch you have. You will need a text file ordered in this format. I will give you some files you can use for this purpose without having to re-prompt every single generation. Just get the files and move them to extensions, Dynamic Prompts, and wildcards. I will leave one for camera angles and one for character orientation. Then prepare three prompts: full body dynamic poses, face close-ups, and upper body shots. Don't use enhancers like high detail or anything overcomplicated; we want the images to be as simple as possible, just focusing on the main class. If you for whatever
reason still have a lot of details in your image, you can use the Add Detail LoRA at a negative value like -1, and that should take care of the issue. And of course, thanks again for the tip.

We will be using the prompts from text file script. Now get your prompts here and type this gamertag-looking format: open the weird brackets, type 1 and 2 freedom signs, then two underscores, the name of the file that you're referring to (in this case camera view), and then two more underscores, and close the weird brackets. Put that in each prompt and also add the orientation one if you want; do it however you like the most. Also activate the iterate seed every time option. Calculate a total of about 60 images. Make sure to use batch count and not batch size; this will avoid bleeding. I make these images 1024 by 1024, but it kind of depends on your PC. If you want to make them 512x512, that's fine too. Don't worry about it taking some time, because this will be running in the background while we work on the next step: preparing a data set.

We will take our upscaled image, clean the parts that we need to fix up, and then separate it into different images to create a data set that will be used to train a LoRA. So let's head to it. This is the data set I created for my previous character. It was a total of 15 images, and I didn't use all the poses. Also, most of these images aren't clean at all. The most cleanup I did was inpainting the faces that looked really bad, just prompting something similar to what I needed. Something that I haven't tried that could work is using the deepfake extension Roop, using the close-up portrait image on the spreadsheet as a reference to make sure it's the same character. I did clean up some parts though, even though nothing crazy.

The first step is cleaning the parts that broke with the upscale, mixing the best parts of each upscale if you have more than one, as well as fixing some details. In this case I had to add the broom-looking stuff in some parts of the image that I forgot to add before
upscaling, but that's fine. I also repainted some parts by either copying and pasting from other poses or just painting normally with the brush. The final result I got from the anime image is this, and that was pretty fast.

So now I'm going to separate each pose into a new image to populate our data set. I just masked every single pose one by one, creating a copy of the original image. It is a tedious process but really worth it, and it is faster to do than it looks. Now, it is very important to separate the main turnaround like I'm doing here. I will only take two of the full bodies as a single image: the one on the left and the one from behind. The front and profile views I will divide into five different images. One is for the main core of the body, without the head and up to the hand; I will do this for both the profile and front view. I will also cut the front and profile view of the legs, and lastly I will also take the profile picture of the head. I won't do the same for the front view head because we already have a larger pose right here. Then just separate the rest of the poses normally, fixing the artifacts that might appear on them; I just used a healing brush or painted over them. Also, if there are other character parts in the cutouts, erase them with the mask or paint them over with white. And now we have a layer with each and every one of the poses from the original image.

The next step will be to create images from these cutouts. The big poses with more resolution I'll leave with a white background in a 1024 by 1024 file, exporting them normally as PNG. Then the full body turnaround cutouts I'll export with a white background as well, but this time in a format where the whole body fits without having to downscale it too much or at all. Finally, I'll make a quick composition for each and every one of the rest of the poses. This step is kind of optional, depending on how good of a result you want. You could do this with just a white background or with a poor Photoshop version, but I wouldn't
recommend it. I do like putting some effort into this so they look like images that were taken in a real setting. This will avoid having issues with white lines surrounding your character later on, or even the character feeling out of touch with the background. For this I just import the base cutout into a 1024x1024 file, then take out the white background, and using Stable Diffusion I generate a possible setting for the pose I'm looking at, be it a dance floor, a landscape or some concrete wall. I don't really care that much, just something quick that kind of makes sense. The only important part here is that the character doesn't get lost and the shape is well seen. If you don't know if it is, then look at the image from further away and try to see your character; it should be easy to spot.

It is very interesting to have different lighting affecting your character, and for this I go to ClipDrop Relight. Here you can import an image and relight it as you want. It does a good enough job and it is really fast to do. Just try to make the lights match the colors of the background. This next step I do is completely extra, but like always it helps to add more clarity on what the main character is: I like to blur the backgrounds a little bit. Some I blur a lot, some I don't whatsoever. I also add the reflection of the colors from the environment. This is very fast to do and helps with overall cohesiveness, but honestly, do what you feel looks best.

These are the omega sad examples I made for the realistic version. I'm kind of using this as a stress test to see what results you can get if you're not putting any effort into the process. For better results I'd recommend either upscaling the faces again or spending more time in cleanup. These are the kinda sad examples I made for the king, and these are the sure examples I made for the anime version. The anime is the one that took the most time, with about 1 hour for the full data set preparation. Something important to know is that you can create this data
sets however you want. You can, for example, use Midjourney to create character sheets that aren't so human or that are more stylized, and if you know how to draw, you could draw them by hand, or even pose a 3D character in different poses. This is just a quick template that I made and that worked for me, but it isn't the rule on how to make your images, and it has a lot of room to be optimized. So as long as you end up with a consistent data set like this one, you will be able to train an AI with your own characters, like we are about to do, because our job here isn't done yet.

It is time to train a LoRA model so we can put this character into any pose and any setting we want. These will be the final steps to make our character a reality. You will need Kohya_ss, but I'm not going to go into how to install it in this video, because if you don't already have it then you probably need to watch this video first. There I explain how to install it and how to use it, also following this data set process at a slower pace and with a more detailed explanation, so you are not missing anything. You can kinda think of it as a second part of this tutorial. Before leaving, please watch the end of this video first, because I have something very important that I would really like you to know. See you there.

For those that already know how this goes, let's speed up, starting with renaming. Put all your images that will be used for training into a single folder, then select all of them and rename them to whatever you want, but make sure all of them have the same name and format, so PNG, JPG or JPEG, but just one of them. You can also make another folder for the regularization images, like this. Now that we have our images sorted, we can caption them, a.k.a. describe them with tags. If you are using a realistic model, then caption using the BLIP format, which is with a little more fluent language. We are going for anime, so we will use the Danbooru format, which is separating everything by commas. I will use an application called the Booru Tag
Manager or something like that. It is not necessary but really nice to have. I'll leave the link in the description just in case. First I will auto-caption everything in the folder and then clean it up. Without getting into the details: find the name for the character you want to make, and it will act as a keyword to trigger your LoRA. Then we want to describe everything that isn't part of our character. Using the king's images as an example, the character name is kgnn4t, so in a white background image there is only a white background and our character. If you think the crown forms part of the character, don't add it in the text, but if you think it is an accessory that can be detached from it, then tag it. So tag everything that you want to be able to change later on. In my case I will just separate the crown.

Now that I took out all the tags describing my character that the auto-captioning made, I will add scary as the keyword. It is very important to add your keyword at the beginning of each caption, for something that we will see in a second. Another important thing is adding the word cropped on the images that are just part of a body, like these ones here where you don't see the full character. Once you have added it to one, it will appear on the right; the next time you need to add it, just double-click this one and it will be inserted automatically. This step is the same for the images that are in the realistic style. The only thing that changes is that you will use the Kohya_ss BLIP auto-captioning, and then you will try to use a more natural language style of describing. Don't worry too much.

The next step is creating the folder setup for the training. In Kohya I'll go to Dreambooth LoRA, then click on Tools and folder preparation. Add the training images folder's path, as well as the path for the regularization images, if you have them of course. For repeats I'll use 40 on the data set and 1 on the regularization. Now hit create, and then copy to the Folders tab. This will make the folders needed for
training, as well as copying the info into the training tab, unless you're like me and don't realize that you're not in the Dreambooth LoRA section, in which case you'll have to input them manually.

Here we will choose what model we train the character on. It should be one that is either the same one you used for creating the character or one that generates the same style. In my case I used Counterfeit to make the base image. It is anime style, so the model I got recommended for training is AnyLoRA; I'm gonna use that one. For the realistic one I will use Edge of Realism or Realistic Vision. If you are in doubt, you can always use the base Stable Diffusion 1.5 model.

Now for the training parameters, I change just a few things. I haven't experimented with creating LoHa or LoCon, so maybe you have better luck trying those, but I wouldn't recommend it for LoRA beginners like me. I'll just use standard. As caption extension I'll put .txt, as that is what my caption files are. You could use a fixed seed if you want, but I won't for now. For training batch size, it comes down to how many batches your PC can handle. For me it will be four at a time, so let's leave it at that; this will make the training way faster. As far as epochs go, I put 5 because that's what it usually took for the AI to understand my character. More than that it was usually overtrained, and lower than that it was undertrained. It will of course depend on your training. bf16 can be used here, but I can't because it crashes, so I'll use fp16, which is also okay. I tried AdamW and Lion, even though neither of them in 8-bit because those crash for me, but you can try them though; they are supposed to be good. For AdamW I left everything as it was. This is the optimizer that got me the best consistency, even when using ControlNet. The LR scheduler I left as it is, and I changed the warmup to 5. Most guides online tend to change the LR scheduler to constant and 5 warmup, so you can try that too. I got better results like this though. From here I'll just change the network
rank to 64, and network alpha should be half of that, so 32. For max resolution you can use 512x512 and it will be enough; no need to go 768. Enable buckets, as we have images that aren't square and that pass the max resolution. For the advanced configuration I will activate gradient checkpointing; it is supposed to make the training faster. Also activate shuffle caption, but remember to set keep tokens to 1 so the keyword that we put at the beginning (in my case it's scary) always stays there. For the captioning length you can let it be, unless you have some really long captions. Then use clip skip 2 when training an anime model, and if it is realistic then leave it at 1.

Finally, we will use another hack the wise one told me, using the sample images config. I will put 40 in "Sample every n steps". I'll use the prompt: scary, 1girl, solo, standing, looking at viewer. You can add the negative prompt if you want, with double dash n and then prompting. This will make an image every 40 steps, so we can see what is going on and where the training is at. Go to the models folder, then samples, and here we can see what the model is learning. Of course the first images will be very bad, but the idea is that it improves little by little over time. If it finishes and you like where it's going but you think it is undertrained, you can import the last LoRA you made in here and use it as a base to train upon.

Every data set will have some variations on the settings, but if you have a data set like mine, then you will 100% be able to train your character, even if not at first try. In my case I asked for help with all this training stuff from someone who knows a lot about LoRA training, and he carried me through this whole process, basically flexing on me with his results being perfect at the first try and mine being trash. He was absolutely necessary on this final part because he knows what makes a good data set and how to train it properly, so I really wanted to thank him. I think it is very important to tell you that I don't
know everything about anything, so having people like him that are willing to share their knowledge is super important. For that very reason I created a place where we can all help each other and share what we learn. I'm really happy to announce that we now have a Discord server where you can find the wise ones and even become one of them yourself. I actually created a section dedicated only to this character creation problem, where we can experiment together to find the perfect solution. I will make sections for very popular AI problems that haven't been solved yet, so if you are interested in advancements on this topic or other AI-related stuff, make sure to get in and learn with everyone in this super cool community.

As you can see, the results were pretty nice, fairly consistent and flexible. I tested it with the prompts the wise one used, which are the ones that you see on screen. They are really nice to see flexibility in facial expressions and poses, as well as settings and angles. Overall it works better than what I actually anticipated, keeping in mind that we created this from one single image. Of course, this is a very simple character; the more complex and detailed yours is, the harder it will get.

There is a very interesting idea that was presented to me by the wise one, which is to retrain again using the newly generated images to improve your LoRA infinitely. Choose the best results and add them to your data set. From here you can retrain your LoRA from scratch or work upon the one you had. Also, if you want to switch styles, for example to the realistic model, you can generate with a higher LoRA value, like in this case 1.2. Then I inpainted using original and a low denoising strength, and now we have our trained character in a realistic style. We can do this over and over again to prepare a super flexible and precise data set.

Keep in mind though that, like everything AI-related, LoRAs have their own limitations. It will be extremely difficult for them to have asymmetry, as well as complex
patterns. Of course, the more you work on it, the better chances you will have, but there are some things that at the moment are pretty much impossible, even with infinite images.

In contrast, with the realistic model I made the absolute worst possible data set: bad upscale, bad faces, no cleanup, bad subject separation, and random image patching that has no coherence with the character and no sense of scale or depth. To make it even worse, for the captioning I only used auto-caption and just cleaned up the blue and orange tags, adding the keyword of course. Oh yeah, and really bad regularization images. So being the laziest you can be preparing the data set and doing a pretty bad training job, you could end up with a LoRA of this level. It is pretty bad, but what did you expect? Of course, the better you are at training stuff, the better your LoRA could be. I'm pretty sure that someone experienced could train a pretty decent LoRA even with this data set, but as I said, everything has its limits. Creating a non-humanoid spreadsheet is very hard, as well as having accurate descriptions, or even characters that use items or have assets with them.

This method is no more than a prototype idea to solve the character consistency challenge. It still has a lot of room for improvement. For example, we could find what the optimal number of images to train a LoRA with is, as well as what types of variations and poses are the best. Then we could focus on optimizing the spreadsheet for that, getting a perfect combination of images that are the best for training and that at the same time allow the AI to create the same character consistently when creating the basic spreadsheet. The prompt also needs some work, and we could also find what the best models for this are. The dream option would be to have a ControlNet model trained so that you input a character and then it allows you to use it with other poses, expressions and whatever, but training a ControlNet model is incredibly hard and it needs a very large data set. Maybe finding a
more optimized way to train a LoRA is the easiest option, but I'm sure that with time and sharing information we will find a way. For now, this is the idea I'm presenting for this challenge. I would love to see what the results are, as well as your ideas and propositions. So if you found this video useful in any way, please subscribe. Hope to see you in the Discord, and thank you for watching.
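The captioning rule from the video (the trigger keyword goes first in every caption, so that keep tokens at 1 protects it while the other tags get shuffled) can be checked with a few lines of Python. This is only a sketch under assumptions, not part of the video or of Kohya: the function names `ensure_keyword_first` and `estimate_steps` are made up for illustration, captions are assumed to be comma-separated `.txt` files next to the images, and the step count is a rough images × repeats ÷ batch size × epochs estimate that ignores regularization images.

```python
import math
from pathlib import Path


def ensure_keyword_first(folder: Path, keyword: str, extension: str = ".txt") -> int:
    """Make `keyword` the first comma-separated tag of every caption file.

    Hypothetical helper: returns how many caption files were rewritten.
    """
    changed = 0
    for caption_file in sorted(folder.glob(f"*{extension}")):
        original = caption_file.read_text(encoding="utf-8").strip()
        tags = [tag.strip() for tag in original.split(",")]
        # Drop empty tags and any existing copy of the keyword, then put it first.
        tags = [tag for tag in tags if tag and tag != keyword]
        updated = ", ".join([keyword] + tags)
        if updated != original:
            caption_file.write_text(updated, encoding="utf-8")
            changed += 1
    return changed


def estimate_steps(images: int, repeats: int, batch_size: int, epochs: int) -> int:
    """Rough training step estimate (assumption: no regularization images)."""
    steps_per_epoch = math.ceil(images * repeats / batch_size)
    return steps_per_epoch * epochs
```

With numbers mentioned in the video (15 images, 40 repeats, batch size 4, 5 epochs), `estimate_steps(15, 40, 4, 5)` gives 750, which at one sample every 40 steps would mean roughly 18 preview images over the run.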
Info
Channel: Not4Talent
Views: 189,930
Keywords: stable diffusion, sd, stable difusion, promting, prompt, tutorial, guide, stable diffusion prompt guide, stable diffusion tutorial, ai art, ai, crate ai art, beautifull ai art, create good prompts, talk to ai, talk to stable diffusion, VariationSeed, BREAK, characters, midjourney, makecharacters, charactercreation, spreadhsheet, spreadsheet, consistentcharacters
Id: iAhqMzgiHVw
Length: 26min 40sec (1600 seconds)
Published: Thu Jun 29 2023