Kasucast #13 - Stable Diffusion: How to train LoRA for Style | LoRA Part 1

Captions
hello and welcome back. Today I'm excited to share my exploration in the world of Stable Diffusion and LoRA for style. LoRA, or low-rank adaptation, is the up-and-coming fine-tuning technique that's taking diffusion models to a whole new level. LoRA was initially developed to teach models new concepts, and so far it's mostly been used to train characters. But if you really think about it, subjects and characters are concepts, and artwork styles are concepts as well. The advantage of low-rank adaptation for fine-tuning is that the resulting files are smaller than DreamBooth models while also being more modular to a certain degree. As someone who's always been interested in the style aspect of diffusion models, I've been focusing my LoRA experiments on replicating or combining art styles. If you're curious about the theory and mathematics behind LoRA, don't worry, I'll provide a link to the paper so you can dive deeper. Personally, I like to learn abstractly through hands-on induction first before doing the deep dive, so that's what I'll be doing today.

Before we get into the procedure, let's make sure we have everything we need. You'll need an installation of the AUTOMATIC1111 web UI along with the specific models you plan to train on. I like to use the most vanilla base models, so that's the Stable Diffusion 1.5 EMA-pruned and the Anything 4.5 pruned models. Next, you'll need to prepare a dataset with text files storing captions or tags, depending on the model you're training on — trust me, this is a key component of the process, so I'll explain it in more depth later. Finally, you'll need a repository or notebook that can train a LoRA. Don't worry if you don't have a powerful GPU; you'll still be able to follow along. In fact, I'll be using two different methods to train the style LoRA models. For the first half of this video I'll be using bmaltais's user-friendly GUI implementation of Kohya's sd-scripts on GitHub — you can find the link in the description below, and he also has a YouTube channel, so please check him out as well. I'll be training the Stable Diffusion 1.5 EMA-pruned model using this method. For the second half I'll be using the Google Colab notebook version of the Kohya LoRA trainer, which you can also access through the link in the description; for that experiment I'll be training the Anything 4.5 pruned model using the knowledge I gained from my local experimental runs.

As I mentioned, I'll first be using the user-friendly GUI implementation of Kohya's sd-scripts. Don't worry, the installation process is fairly straightforward: once you git clone the repository to your local machine, simply navigate to the directory and start up the gui.ps1 file. It starts up really quickly and gives you a localhost link — port 7860 — that you can open to see everything.

Before I go any further, I'll try to be as transparent as possible about my hardware and equipment. As always, I'm using my RTX 3090 with 24 gigabytes of VRAM; if I type nvidia-smi you can see my VRAM right there. This means I don't have to worry about VRAM costs and can train my LoRA models with ease. Now, if you want to follow along with my settings but you don't have a powerful card, that should be no problem: you can adjust the settings in the Kohya trainer to make sure you don't get any errors while training. Unfortunately I don't have a weaker GPU to test LoRA training under low-VRAM conditions, but with some tweaking of the settings I'll list on screen — you should probably lower all of these, or turn them off — you'll be able to run the training locally with no errors.

Continuing on, I'll share some insight on streamlining dataset preparation. Initially I was going to train exclusively on graphic paintings such as the work of the Taiwanese artist VOFAN — these are some examples of his work. He's known in the anime art community as an illustrator who paints with light, so you can see his work isn't very anime-like except for the proportions. You might recognize his art from the Monogatari series light novel covers, as he illustrated all of them; this one is Oshino Ougi, I think.

Before we proceed with captioning, let's rename and crop the images. A useful tip is to make sure all your images are the same file type, like JPEG or PNG, to prevent any duplicate numbering during batch renaming — here's an example of what to avoid when creating your dataset. On Windows, select all images, click the rename icon, and choose a name for all of them: I'll type graphic_illust and press Enter, and voilà, all your images are renamed, with the only difference being the number in parentheses. However, there's a gotcha here. If we take a closer look, there are actually repeating numbers: this image and this image are both named graphic_illust (1), but one of them is a JPEG and one is a PNG — and the same for 2 and 2, 3 and 3, 4 and 4, 5 and 5. We want to avoid this, but how? There's a simple fix. To avoid the duplicate numbers in the parentheses, open a command prompt and navigate to the folder with the images. What I want to do is make everything in the folder a single file type, so I type ren *.jpg *.png and press Enter, and now every JPEG in this folder has a .png extension. (Note that ren only renames the extension — it doesn't actually convert the image data — but for the purpose of consistent batch numbering that's fine.) Let's verify: you can see all PNGs here, great. Now go back to the folder and press Ctrl+A to select everything.
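The same dedupe-by-renaming idea can be sketched in a few lines of Python. This is a hypothetical helper, not part of the Kohya tooling, and like the `ren` trick it only changes filenames and extensions, not the underlying image data:

```python
import os

def plan_sequential_names(filenames, prefix, ext=".png"):
    """Return (old, new) rename pairs that give every file a single
    extension and one continuous numbering, regardless of whether it
    started life as a .jpg or a .png."""
    pairs = []
    for i, name in enumerate(sorted(filenames), start=1):
        pairs.append((name, f"{prefix}_{i:03d}{ext}"))
    return pairs

# Example: a mixed folder that Windows batch-rename would number twice.
mixed = ["a.jpg", "a.png", "b.jpg"]
for old, new in plan_sequential_names(mixed, "graphic_illust"):
    print(old, "->", new)
    # os.rename(old, new)  # uncomment inside the image folder to apply
```

Run inside the dataset folder (with the `os.rename` line uncommented), this produces graphic_illust_001.png through graphic_illust_NNN.png with no duplicates.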
Then click the rename icon again, type graphic_illust, and press Enter — and now we don't have repeating numbers anymore: one, two, three, four, five, six, and so on up to 79. Of course this isn't the final dataset I used, but it is a large portion of it.

Next, move these images into a cropping tool. I'm using birme.net (BIRME) — I still don't know how to pronounce that. I found that my GPU can support a square 768-pixel resolution for training, so I'll be cropping to that resolution; make sure the tool detects the focal point, nothing special here. Sometimes your browser might not load all the images — mine stopped at 50, which we don't want; it's probably my browser's problem (I'm using Brave) — but after adding in the rest I'm up to 79, which is the correct number. Just move everything into place, and once you're done, save the zip as always.

Next up is captioning your images. Once your crops are done, head over to the bmaltais GUI at your localhost (probably port 7860), go to the Utilities tab, and find the BLIP Captioning tab. All you have to do is put in the path of the image folder to caption. As I mentioned before, if you're training on only a single artist's style you can add a token at the beginning of each file: for a prefix like that I could type "vofan style" if I were only training on VOFAN's images. But I'm training on multiple artists, so I won't add a prefix to the files. I've found it doesn't really matter whether you have a token or not because of the way I prompt, which I'll elaborate on later. You can change the max caption length; I found 75 is fine because I'm going to be editing the captions anyway. Once everything looks right, press Caption Images. If I open my PowerShell you can see it's captioning files right now; on first run it would download a .pth model file, but I've already downloaded it because I've been through this process before. We can just watch it run — it's pretty fast. Okay, the captioning is done, and if we go back to the folder, we now have a text file for each image.

The software I like to use to edit the captions is Visual Studio Code. You can split the screen: open your image first, click the split-editor icon in the top-right corner, and open your text file in the second pane. The question now is: what do we caption for? If we were training a character, all the images would be of the same character, and we'd add a token so it would be recognized — at the top, instead of "a girl", I would say "Hatsune Miku with long green hair is flying through the air", and all the images would be of Hatsune Miku. But obviously that's not the case here. We want the style to be implicitly learned, so we describe everything except the style of the image. We won't use terms like "an illustration of" or "a photo of" — we won't do things like that. Instead, we caption everything that we want to be changeable. This might be a little difficult to understand at first — it was pretty hard for me to wrap my head around as well — so I'll try to explain it as intuitively as possible.

In this image, "a girl with long green hair is flying through the air" is pretty accurate, but not accurate enough. What I would write is "a teenage girl with teal eyes and long teal hair in twin tails" — that's part one — then delete the rest and put "flying through the air" at the end. What else is in here besides what I wrote? She's wearing a white collared blouse, so add "white collared blouse"; next, "a black sweater tied around waist"; she also has "a blue dress". I also want to describe which parts of her body are exposed or not, so "exposed legs", and "teal roller skates". Next, "blue sky with white clouds", and finish off with "flying through the air". The reason I'm describing everything in this image is that I don't want the eventual LoRA to nitpick: if I don't describe the teal hair, the LoRA will learn that everything it produces should have teal hair, if that makes sense; if I don't describe the teal eyes, then all the characters will end up having teal eyes. So make sure to caption everything that you want to be able to change. Okay, that was a mouthful.

I'll do another example. This is the second image I'll attempt to correct. The caption generated by BLIP says "a woman sitting at a table with a book and a cigarette" — clearly some of these things aren't in the image, so we'll have to go in and change it manually. First I might as well just delete everything. Let's see: "a young woman with medium length blonde hair", "closed eyes". Next I'll describe her clothing: "wearing a white collared blouse", "green dress school uniform", "desaturated light brown ribbon". Now that I've described most of the character, I'll add a pose: "looking up to the right side". Next, "a book in the foreground", and lastly the environment: "blurry building in the background". That should more or less do it. I could also describe the green hair ribbon — that sounds like a good idea — so let's go back and add "green hair ribbon in hair". I think that basically covers it. It's pretty time-consuming — you have to go through all of these text files and caption them appropriately if you want a good dataset — so I'm going to go ahead and do that and then start the training.

Okay, now I'm back in the GUI implementation for Kohya's sd-scripts. Head over to the Dreambooth LoRA tab, as that's what we'll be using. Let's start with the source model: as I mentioned, I'm using the Stable Diffusion 1.5 EMA-pruned model for training, so I'll click this file icon — it's usually in my Stable Diffusion models folder — there it is. On a side note, the two checkbox options below, v2 and v_parameterization, are Stable Diffusion 2.0 options; we're using a 1.5 model — a 1.x model — so keep these unchecked.

Next up is the Tools tab. The Kohya folder structure is a little unique and has to follow a certain convention; I'll bring up an example I've already made. These are called project folders, and inside a project folder there's an image folder, a log folder, and a model folder. The image folder holds your dataset folder, and inside that there's another folder with some strange naming: it starts with a number, then an underscore and a string of words, then a space and another string of words. When you create the folder you can either do it yourself or use the Tools tab in the Dreambooth LoRA tab — I'll be using the Tools tab. The log folder holds the TensorBoard files, and the model folder obviously holds the trained LoRA models.

Okay, in the Tools tab, you start by adding an instance prompt and a class prompt. I found these don't really do anything if you aren't using regularization images — which we aren't — so you can put whatever you want here. I called the instance prompt lrgrillust, just a shorthand for "LoRA graphic illustration", and for the class prompt I put "artwork style"; it really doesn't matter what you put. Then there's the directory for the training images — put in wherever you stored them; mine are at this path. As for repeats, this is the number of times the images inside that folder will be repeated: for example, if you have 10 repeats and 100 images and only one concept folder, you'll have a thousand steps in total. I'll be using 15 repeats here. Next, add the destination training directory — this will be the project folder, so put whatever path you want; I'll call this one lora_example. Now click Prepare Training Data, and in the PowerShell you can see it says it's done creating the Kohya_ss training folder structure. Let me quickly find that — here's the new folder structure, and if we click on image, yes, it's created the dataset folder for us. Now we can press Copy Info to Folders Tab — this is pretty useful, so I'll just click it and then explain: go over to the Folders tab, and you can see we don't have to manually type in the paths I just showed you; it fills them in for us. Lastly, I'll set the model output name to lora_example_01. That should be it for the path setup.

Next up are the training parameters. If you open the training settings, these should be the defaults — however, since I have around 157 images in my dataset, these default parameters won't do very well. After extensive experiments I found that the modified settings from my configuration file gave me vastly better results, so I'll go ahead and load them up. For normal LoRA training you probably won't have as many images as I do, so you can probably raise your learning rate; I keep mine pretty low for the U-Net.
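The "strange naming" convention described above — a number, an underscore, a string of words, a space, another string of words — is just repeats, instance prompt, and class prompt glued together. A tiny sketch (a hypothetical helper mirroring what the Tools tab builds for you, not Kohya's own code):

```python
def kohya_dataset_folder(repeats, instance_prompt, class_prompt):
    """Build the dataset folder name the Kohya trainer expects:
    '<repeats>_<instance prompt> <class prompt>'.
    The leading number tells the trainer how many times each image
    in the folder is repeated per epoch."""
    return f"{repeats}_{instance_prompt} {class_prompt}"

print(kohya_dataset_folder(15, "lrgrillust", "artwork style"))
# -> 15_lrgrillust artwork style
```

So 15 repeats with the prompts used in this video yields a folder named `15_lrgrillust artwork style` inside the project's image folder.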
Next I'll briefly go through the settings. The first one is train batch size: if your GPU has below 8 gigabytes of VRAM, I suggest keeping this at two or even one. This parameter is the number of images processed in parallel at one time. Next up is epochs: an epoch is one full pass of forward and backward propagation over the dataset. You can play with this number, but I highly suggest keeping it greater than one, because each epoch essentially becomes a snapshot — a saved LoRA file. I should have 15 LoRA files at the end, one for each epoch, and you'll need these various files because they represent different points in time; by graphing them using the XYZ plot in Stable Diffusion you can gauge your results and see whether you're overfitting. Next is precision: if you can use the bf16 precision mode, use it; if it gives you errors in training, just use fp16. For the scheduler I'll be using constant — with constant you don't have any warmup steps, because, well, everything is constant. The first learning rate field up here is the default value: if your text encoder or U-Net learning rates are blank, it'll just use whatever is here.

Next are network rank and network alpha. Generally you should keep these two values the same. For higher-resolution images such as 768 by 768 pixels, it's a good idea to use a higher network rank, and for style training on a non-anime model it might even be advisable to go up to 200 or beyond. From my own experience, I started getting very bad artifacts around a network rank of 256, so I suggest not going that high — or if you do, you might have to lower your U-Net learning rate even more.

Next up is the advanced configuration. If you open this box you can see what I put there; you can just follow what I have. Since the Stable Diffusion 1.5 models were trained with clip skip 1, keep the slider at 1; on anime models such as NovelAI or Anything V3, use clip skip 2. I also enable shuffle caption. If you end up having color issues with your LoRA training, you can turn on color augmentation, but that will turn off cache latents up here — so only try that checkbox if color really becomes a big issue. All in all, these are the settings I used, so I can start training the model.

But before I do that, I want to mention one small thing: the computation for the number of steps that shows up on the console output. If you're going with a train batch size greater than one, the training steps that show up will be different from what you expect. The ground-truth total number of steps is calculated from the number of images you have — for me, 157 — multiplied by the number of repeats — 15 — multiplied by the number of epochs you specified — also 15 for me. So my final step count is 157 × 15 × 15, for a grand total of 35,325 steps. However, I have a train batch size of six, so I have to divide the total by six — remember, that's the number of images processed in parallel at one time. Dividing by six gives around 5,888 steps, so once I start training, that's roughly what should show up, and not the grand total of thirty-five thousand something steps.
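The step arithmetic above is easy to sanity-check in Python. This is a back-of-the-envelope helper, not the trainer's exact code — Kohya's own handling of partial batches per epoch can make the reported number differ by a few steps:

```python
import math

def total_train_steps(num_images, repeats, epochs, batch_size):
    """Optimizer steps the trainer will roughly report: every image is
    seen `repeats` times per epoch, and each step consumes
    `batch_size` images in parallel."""
    images_seen = num_images * repeats * epochs
    return math.ceil(images_seen / batch_size)

# The worked example from the video:
print(total_train_steps(157, 15, 15, 6))  # -> 5888 (35,325 / 6, rounded up)
print(total_train_steps(100, 10, 1, 1))   # -> 1000 (the 10-repeats example)
```

This is why the console shows thousands of steps rather than tens of thousands when the batch size is greater than one.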
Okay, here we go — it's loading my checkpoint, then probably caching latents, and then it calculates the number of steps for me. The total training time was around 12 hours, I believe — or 15 hours — but that's for 15 epochs, with such a large number of images (157), at a very high resolution (768), and a very high network rank of 200. So don't worry if you think that's the standard — it's not. LoRA training is probably around 30 minutes to two hours if you're being reasonable; I was very unreasonable.

If your training finishes without error, congratulations. Go to the project folder you specified, open the model folder, and you should have as many files as epochs — I have 15 epochs, so I should have 15 files; select all, and yes, 15 items. Now take all of this and copy it to your Stable Diffusion LoRA folder: Ctrl+C, go to the Stable Diffusion models/Lora folder, and paste it all in. I've already gone ahead and done this, but make sure it's in here, otherwise you won't be able to do the next step.

Next, it's time to analyze our LoRA models. The way I do this is to first choose a fairly standard, simple prompt along with a negative prompt. To actually activate and use your LoRA, click the red extra-networks button in the middle, then go to the Lora tab. Remember what you named your LoRA — mine was graphicillust12 — so I'll just pick one, and the LoRA shows up in the prompt in its tag format. The way I prompt using a LoRA is to cast everything using "in the style of" — to clarify what I mean, let me copy a prompt: it says "close-up portrait, female, in the style of" followed by the LoRA tag. This way I tell Stable Diffusion to load the LoRA's implicitly learned information in as a style. I'll leave it like this for now, put in a boilerplate negative prompt, change the sampling method to something I'm used to, and set the height to 640. That looks good.

Next I'm going to open the XYZ plot option in the scripts dropdown to vary the epoch version as well as the LoRA strength. Remember, the epochs are saved in the format of the LoRA name — lora:graphicillust12 — then a dash, then four zeros followed by an integer; the strength or weight of the LoRA follows after the next colon. Here I want a grid that displays not only the variation across epoch models but the change in LoRA weight as well. To do this, first select Prompt S/R — which stands for search and replace — as the X type. What I want to replace is the model number, so I'll change the 000007 in the prompt to the variable NUM, and then in the X values I type NUM, 000001, all the way to 000014 — this means that instead of NUM, it'll substitute 000001 through 000014 in turn. If you understand that logic, then what I do next should be straightforward: set the Y type to Prompt S/R as well, use STRENGTH as the variable, and change the 1 in the prompt to STRENGTH. This way, when I generate, I'll get a 14-by-10 (or 10-by-14) grid that shows me the variation along both axes. I think this looks fine.
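Typing out fourteen zero-padded epoch suffixes by hand is easy to fumble, so here's a small hypothetical helper for generating the Prompt S/R value lists (in Prompt S/R, the first entry is the token searched for, and it's also used as-is for the first row or column):

```python
def sr_axis(token, values):
    """Build the comma-separated value list the AUTOMATIC1111 XYZ-plot
    'Prompt S/R' mode expects: the search token first, then each
    replacement value."""
    return ", ".join([token] + list(values))

# X axis: sweep the epoch suffix through all saved snapshots.
x_values = sr_axis("NUM", [f"{i:06d}" for i in range(1, 15)])
# Y axis: sweep the LoRA weight from 0.1 up to 1.0.
y_values = sr_axis("STRENGTH", [f"{i / 10:.1f}" for i in range(1, 11)])

print(x_values)  # NUM, 000001, 000002, ..., 000014
print(y_values)  # STRENGTH, 0.1, 0.2, ..., 1.0
```

Paste the printed strings into the X values and Y values boxes, with NUM and STRENGTH substituted into the LoRA tag in the prompt as described above.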
the model to what you trained on so I'm already on 1.5 prune EMA only so depending on what you used you might have to put in something different here and yeah I'll go ahead and prompt okay so it looks like the grid has finished generating if I click on it yeah I think I'm going to go into the outputs folder and bring this into Photoshop so we can get a better look at everything right now this this is too small to tell if uh there's overfitting or not so here we are in Photoshop on the x-axis we have the different epochs excluding the final Epoch increasing incrementally from left to right so here we go one and then 14 at the end and on the y-axis we have strength increasing it incrementally from top to bottom so 0.121 now we can start using our eyes to determine what Epoch we should test further a rule of thumb that other users recommend is that Laura's at strength one should not be overfitted signs of overfitting could be glitches or artifacts appearing or even just straight up Abominations luckily for us or not so luckily because I kind of Brute Force this with a number of experiments we don't have any super damning results as far as I can see next up is focusing our attention at the bottom right quadrant of this image grid like so this is where we can see the effects of the Laura at higher levels of strength as well as in epochs so more epochs means the training has been progressing for a longer amount of time so around Epoch 7 is where I personally want to start analyzing the shapes are becoming more defined and at the full strength of one if you know what the video game the world ends with you is it kind of reminds me of that feeling or art style despite not having any of those images in the data set there's even this kind of hatching Mark going on so it's a nice happy accident next up if we keep increasing the epoch count you can see that there's a bit of confusion going on now so in these two images there's dripping mascara or eye shadow around epochs 8 and 
9 but it ends up disappearing in Epoch 10 here from Epoch 10 to 14 on the strength of one we start getting a very nice graphic illustration effect with it's very nice Punchy colors and it's actually doing what the data set wanted it to do or what I wanted the data set to be represented as in the Laura so this is wonderful news however around Epoch 14 which is the second to last Epoch we do get some artifacting so epochs 13 and 14. these are kind of going out of control if we take a cursory approach to results in analysis we might just throw out the last few epochs all together on account of overfitting yet if we do take a closer look the Laura on Epoch 14 at strengths 0.8 and 0.9 are doing quite well so personally and I guess based on the experimental results I want to actually go ahead and test the final Epoch 15 in the web UI on several different prompts to see if it's actually overfitting or if I just hit a bad seed number on generation so let's get to it here we are in the stable diffusion web UI as I mentioned before I want to test the final Epoch so it will be epoch 15. 
and it won't have any numbers appended to the name of it as it is the final epoch so let's go ahead and generate the same prompt on random seed with the strength of one okay so these are our results personally I am pretty happy with this if it's not as clean as you want it to be you can modulate the strength of the lore by turning it down to 0.9 or less so why don't we just go ahead and do that to see what's going on so let's keep the seed the same and turn this lower strength down to 0.9 and let's see what happens okay so same seed 0.9 in the lower strength and you can see the artifacting just disappeared from the first image or from most of the images and it's not bad of course the artifacting or the various parts can always be improved with impainting so I'm not too concerned with the text to image portion so now we have this why don't we try seeing if we can change the other aspects of the image right now it's on white here can we change it to blonde here because this is how we can tell that our lore is not being restricted if we cannot change the properties of the output with our prompt that means the Laura is overfitting which is another thing we should be looking for if this gives us blonde hair we should be good to go and yes we get blonde hair and green eyes fantastic so I think this is it for testing the Laura on characters but how will our Laura do on things that it wasn't trained on so this is the purpose of the style or how can we tell so I will prepare some prompts right now to test that okay so I will get rid of this character stuff here and I'll bring the strength back to one I'll change the seed back to random seed and instead of portrait female why don't I do an environment first so an outdoor environment um if you guys know the dolomites mountain range it's a pretty famous mountain range in Italy Northern Italy I believe let's go ahead and see if we can get some graphic illustration on the dolomites mountain range I hope stable diffusion knows 
what the dolomites are okay okay this is good this is good this is what the dolomites actually look like if they were um stylized okay so dolomites Mountain rage good how about a a study of books since we did of books not looks a study of books since this is a outdoor environment we can try an indoor environment now a study of books in the style of Laura graphic illust epoch 15. and just wait for this to keep going a study of books okay so yeah this is what um this is one way of looking at it maybe it would be better to say a library of books okay so I tried a library of books next and yeah we're getting this very graphic illustration Style almost almost like it's not cell shaded but just very Punchy colors okay lastly let's just do something completely random like have you guys seen the Korean drama called extraordinary attorney like Boo or something like that she has this interest in whales and they're always flying around so flying whales in this stuff yeah okay why not we have to go as bizarre as possible to see if Laura style is actually being implemented and we're not just hallucinating will it work okay these are actually kind of cool actually I enjoy this so if this is too strong for you you can always pull down the strength but even that strength of one there's no artifacting so I really I really like this this is nice let's just see what happens if I just pull down the strength a little it's just a experiment Okay so yeah this is still like fine for me depending on what you want the Laura is not overfitting from my analysis so far Perhaps it is I just haven't found out how to make it overfit but um yeah I think this is a good initial trial to run so next I will start with the anime style lore training which I'm sure plenty of you will be interested in as well before we start getting into the Laura style training for anime we'll have to start with the data set as always the data set I've chosen here is artwork created exclusively by the Japanese 
illustrator red juice he's also known as shiru in the Japanese art community sure also means soup so like miso shiru or tonjiru anyway I have the data set here cropped at a resolution of 768 by 768 pixels if you want to know how I prepared this data set please refer to the first part of this video so you can see that they all have the same name and the only difference is the numbering and they're all of the same file type PNG next since I assume that you probably won't be using the Koya local installation and instead using the Google collab I've changed the captioning process here to use the web extension tagger in the automatic 1111 web UI so if we go to extensions and look here I've already installed the tagger from this repository here I'll add a link to it in the description once it's installed head over to the tiger tab right here and you'll see this interface the captioning process here is different from the process I used in the first part of this video which was blip captioning since we're going to be training on a clip skip 2 model this time usually an anime model and in this case the anything 4.5 prune model you'll have to make sure that you're using Dom borders style tags you'll see what I mean soon go ahead and go to the batch from directory tab and in your input directory place the correct path so this is the path to my images if you want to add a tag in front of all the other tags such as red juice style or something like that you can type it in here a juice Style for me I personally don't use a token to reference style so I'm not sure what the best practice is on this and it seems to be working fine for the lore style if you checked my to check the first half of this video so if everything looks good to you go ahead and press interrogate so I press this button and then I'll check the console it's loading this file and it's chugging along like so and it says it's done so let's head over to the data set file or data set folder to see if the captions 
are there. Now, if we go back to our dataset folder, we can see that there are text files for each of the corresponding images. That's great; however, we're not quite ready yet. Let's open up VS Code again and I'll tell you why. Here I am in VS Code, and I've opened up a sample image from the dataset on the left. If we look at the captions for this corresponding image, we can more or less see that the tags are correct, because it says bridal veil, gloves, elbow gloves, tiara, white dress, and so on, all of which are in this image. I didn't spend that much time trying to fix all the tags or spot errors, so I may have made some mistakes in the final tagging pass. However, there is something that needs to be changed in these captions. Notice how there are character names referenced in the tags: Mikasa Ackerman, Annie Leonhart, Ymir (I don't think she's in this image), and Historia, I mean, Krista Lenz. So there are character names in here, and that's a problem for us. Remember, in the first half of this video, during the dataset captioning process, I said that we want to caption everything that we want changed. However, if we leave character names in here, that character word or token will be associated with the final style LoRA, which we don't want. We want the LoRA to comprise only a singular intrinsic concept from each image. To remedy this, simply remove the character name, if it exists, from each text file. So we don't want Ymir and Krista, and Mikasa and Annie; I think those are the other names here. We don't want any character names in any of these text files. Maybe I can find another one... not that one; I don't think that character has a name. Okay, one sec, number 54.
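The character-name cleanup described above can also be scripted instead of done by hand in VS Code. Below is a minimal sketch, not from the video: the blacklist contents and folder path are placeholders you would fill in with the character tags and directory from your own dataset.

```python
from pathlib import Path

# Hypothetical blacklist: fill in the character tags found in your dataset
CHARACTER_TAGS = {"mikasa ackerman", "annie leonhart", "ymir", "krista lenz"}

def clean_caption(text: str) -> str:
    """Drop any comma-separated tag that matches the blacklist (case-insensitive)."""
    tags = [t.strip() for t in text.split(",")]
    kept = [t for t in tags if t and t.lower() not in CHARACTER_TAGS]
    return ", ".join(kept)

def clean_folder(folder: str) -> None:
    """Rewrite every .txt caption file in the dataset folder in place."""
    for txt in Path(folder).glob("*.txt"):
        cleaned = clean_caption(txt.read_text(encoding="utf-8"))
        txt.write_text(cleaned, encoding="utf-8")
```

Running `clean_folder("path/to/dataset")` would strip only the blacklisted tags and leave everything else untouched; you still need to know the character names to build the blacklist, which is the point made above.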
Okay, in this image there's a character from the Monogatari series, the teenage version of Kiss-shot, so you don't want her name in here: delete "oshino shinobu" and also "kiss-shot". So when you're putting together the dataset, make sure you check the tags, and make sure you don't have any character names in them if you're trying to make a style LoRA. I guess it's good to know the names of anime characters if you're trying to build something like this, so be cognizant of that. Now that our dataset is properly captioned, let's head over to the Google Colab notebook, which is here. This is what it looks like, and its usage is fairly straightforward, but I'll go through it anyway, because better safe than sorry. First off, the first part says to clone the Kohya trainer, so I'm just going to press this play button. It's telling me the notebook is not authored by Google; that's okay, run anyway. The cell has finished running; if you look up here, there's a green check mark, and that's how you tell. Next is installing dependencies: click the play button again and it will start installing the Python packages. This takes a while, so let's just wait for it to get through everything. Okay, the dependencies have installed, and if you check down here it says it took 3 minutes and 39 seconds, which is a while. Next up is section 1.3, logging into the Hugging Face Hub. Make sure you have a Hugging Face account, and then make a token with the write role (w-r-i-t-e, not read). I'll just go ahead and paste my token here and press play: yes, my write token has been saved. Unfortunately, you can't use this write token, because I will have deleted it, so don't even try. Next up is mounting the drive: sure, I will mount my Google Drive, connect to Google Drive, and yes, I will allow it. It should show up on the left once it's done. Okay, it's finished and it's mounted at /content/drive. If you go to the right
or rather the left, and click on this folder button, you'll see that the drive is here. Next up is "open special file explorer", sure, and then section 2. This is where we start downloading the models we want to use. I mentioned that I'll be using Anything 4.5 pruned, so I'll just pick that, press play, and it will download it, I assume from Hugging Face. The download of Anything 4.5 pruned is done; it took 24 seconds. You can skip 2.2 because we're not using a custom model, just one of these defaults. Section 2.3 is downloading the available VAEs; since we're using an anime model, let's just press the play button on that cell. Okay, it's done. Next up is section 3, data acquisition, where you define the location of your training data. Let's see: I'm going to go with 10 repeats, I believe, and I'll just change these names; they really don't matter, as I mentioned before. I'll call this "shiru illust", like so, and if I press play on this, it will create folders on this side, so let me check where that is... it's in the DreamBooth folder, and it's created the folders in this structure, which matches the local installation. Next up is uploading the dataset; we can skip the rest of section 3 and section 4 because we did our dataset preparation and tagging locally. Next up is defining the important folders, section 5. For the project name, let's call it "shiru illust" again. For the pretrained model path, we'll have to change this to Anything 4.5, and it's not a safetensors file, it's a checkpoint. For the VAE, we have one at this location; if you want to copy the path, you can click on the three dots, copy path, and paste it there. The train folder directory and regularization folder look fine to me, and the model will be output here. I want to change the output of the model to my own drive, so I'll just click this checkbox. And now I'll
go ahead and click this to define my folder. Remember that we need the dataset images in this folder, because we skipped everything up here, so basically just copy all the files from your local dataset and drag them in here. There we go; it's uploading fairly quickly. I've also done this training locally, so I'm just going to reference the training settings I used in Kohya. Here are my Kohya training parameters: I did 20 epochs with a learning rate of 1e-4, and I used a network dim of 160. So back here, for network dim I'll put 160, and 160 for here as well. Next up is the learning rate, so I'll put 1e-4 here. Remember, if you leave the U-Net learning rate empty, it will just inherit from the base learning rate, but let's go ahead and put it here as well. We're not changing the cosine scheduler, and that seems okay. Once you've done that, remember to press the play button to activate the cell. Next up is starting the LoRA DreamBooth training. I used bf16 locally, but I don't think the Colab's GPU can handle bf16, so I'll just leave this at fp16. For resolution, you could bring this up to 768 like I did in my local training, but that will crash with a CUDA out-of-memory error, so keep it as low as you can, hopefully at 512 (it says 640 here). For the number of epochs, I'll just go with 20, and for the save-per-epochs type, change this to "save every n epochs" and set it to 1, so it'll emulate the format I had in the first half of this video: the checkpoints will be numbered 000001, 000002, all the way up to however many epochs you set. I want to do the comparison grid again, comparing epochs and LoRA strength, which is why I set this to 1. That looks fine; I want to check the cell again just to make sure the resolution is at 512.
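With repeats, epochs, and batch size all chosen, it can help to estimate how many optimizer steps a run will take before pressing play. A small sketch of that arithmetic follows; the image count and batch size in the example are placeholder values, not figures from the video.

```python
def total_steps(num_images: int, repeats: int, epochs: int, batch_size: int) -> int:
    """Estimate optimizer steps: (images * repeats / batch size) per epoch, times epochs."""
    steps_per_epoch = -(-num_images * repeats // batch_size)  # ceiling division
    return steps_per_epoch * epochs

# Example: 50 images, 10 repeats, 20 epochs, batch size 2
print(total_steps(50, 10, 20, 2))  # 5000 steps
```

A number in the low thousands is typical for a style run like this; if the estimate comes out enormous, lowering repeats or epochs before launching saves Colab time.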
Okay, if everything looks good, I'll go ahead and press play on this cell, and hopefully it'll start the training. It's doing its thing... okay, so there was an error, which is good; I make mistakes too, so let's see what it is. "Repo id must be in the form repo_name": apparently it thinks I'm trying to pull from Hugging Face, which I'm not, which means my local path must be incorrect. Let me go back to the pretrained model path... right, I didn't write "pruned". It's better to just copy the path so you don't make these kinds of silly mistakes like I do. After pasting it in, rerun this cell; I'll just rerun everything for good measure and restart this cell again. Let's see if it runs; if there are more errors, that's fine, this is a learning process. Okay, it's preparing the accelerator; the U-Net and the VAE have been loaded, and now it's downloading a PyTorch diffusers model, I think. Downloads finished, VAE loaded. Also, see how clip skip up here is at 2? That's because it's an anime model, and those were trained on clip skip 2, not clip skip 1 like the previous portion of this video. It's caching latents, and it seems to be proceeding correctly. If it starts doing the epochs, then we can rest easy and just leave it to do its thing. It should say something like "epoch 1" with this number of training steps, so let's give it some time. Okay, steps at zero percent, and it's chugging along. It'll probably take around an hour to an hour and a half, so I'll just leave it and let it do its thing. Just a small update: the training is progressing smoothly, as expected, and you can see that it saves a checkpoint to your Google Drive every epoch, which is what you want, so let's be patient and it will be done soon. The training has finally completed; it took around an hour and 15 minutes and finished at 2:29 AM
and it's around 2:40 right now for me, JST. If you're wondering where your files are, head over to your Google Drive: you'll have a DreamBooth folder under My Drive, and everything should be in the output folder there. I've already gone ahead and done two other trial runs with the Colab notebook, which is why I have two other folders here; I just renamed those folders from "output" to these names. If you click on the output folder, you can see all of your epochs, and your final file is the one without a number appended to it. If everything looks to be in order, select everything, right-click, and download; you'll get a zip file. Unzip it and copy everything into the Lora folder of the stable diffusion AUTOMATIC1111 web UI. I'm going to do that, and then I'll see you in the web UI. I'm in the web UI now and I've already loaded in the LoRAs; mine are called "shiru illust" and so on, and I'm just going to choose the final epoch, the one without a number, and run it to see if it's working. First I'll run without the LoRA, generate one image, and then fix the seed to that. So this is the seed, and this is what the image looks like: a simple blonde girl. Now I'll add in the LoRA, keep the seed, and copy it over here, so this should give me a stylized version of what I have here. Okay, so this is what we get, and yes, you can see that the image does resemble redjuice's style. How about this: bring the batch count up to four, use a random seed once again, and see what we get. Okay, so this is the generation; it's a little bit rough, but not really, and it seems to be working. Now we just have to do what we did before, which is create the image grid. But before that, let's see if we can change parts of this image: I'll change "blonde hair" to "red hair", and hopefully it'll change the hair color. If not, we have a problem. Okay, so it looks
like we can change the hair color, though it affects the rest of the image too: it changed the dress to red as well. That's fine; we just have to be more specific with our prompt later. So let's go ahead and start creating the grid. Go back to Script, X/Y/Z plot, and let me copy and paste the values I used before. Remember that I actually have 20 epochs for this one, so I'll have to go up to 19, like so, and for strength I'll copy the same thing over once again, turn the batch count back to one, and make sure everything is properly named: NUM and STRENGTH. That looks good, so we can go ahead and generate. Here is the image grid in Photoshop. If we follow the same process as before and look at the last few rows to see what's happening: around epoch 9 to 10 is when the style really starts to kick in, around epoch 14 is where we see a drastic change, then somehow it doesn't really take at epoch 15, and when we get to epoch 19 the style is fully baked in. So I might as well just use the final epoch once again, since it seems that might give the best results. Okay, I'll go back to stable diffusion and use the last epoch to do some generations. Back in the web UI, I have my LoRA loaded in; it has no number on it, so it's the final epoch. I'll do some generations just to see what I'm getting, and yes, it does seem to be taking to redjuice's style pretty well. Sure, it might even be a little underfitted; it's possible I could keep training. We'll see. So what else can we do? Well, one of the powerful things about LoRA is that you can combine it with other LoRAs. If you looked at the thumbnail of this video, you saw that I had Lucy from Cyberpunk: Edgerunners in redjuice's style, so why don't I just try that here? Let me go ahead and paste this in. In the original generation for the thumbnail I was using my own LoRA, trained locally, and not the
Colab one, but I want to go ahead and see if the Colab version performs just as well. Okay, here we go, combined. Ideally you add the LoRA weights up so they equal one; this is a good principle to follow to make sure you're not overcooking your image. There you go: Lucy in redjuice's style. Let's do some more batches just to see whether it's working. The generation has finished, and yes, it does seem to be working. However, you can see that you could probably go a little harder on the training, maybe doing more repeats or more epochs, because I do get the feeling it's a little underfitted. If you go ahead and try my locally trained version, which was trained at a resolution of 768, we can punch that in here, like so, and I do think you'd get more of the style baked in, because this feels a little underfitted. Okay, so this feels a bit better, but that might also be because I'm using a VAE. Here we go: this is with my own LoRA, and there's not too much of a difference, but it is there. Now I'm going to do a second example using the original Colab LoRA with another character LoRA. Both of the character LoRAs were made by Lykon, the person who made DreamShaper on Civitai, and this time the character will be Aqua from Kingdom Hearts. Let's do a little preliminary generation and see what we get. Okay, yep, that looks like Aqua to me. Change the batch count to a higher number just to see if I'm getting consistency. Okay, here we go, the generation has finished: Aqua, and it does look like redjuice drew her. And if you want that desaturated look that redjuice has going on, you might have to unload your VAE, so I'm going to try with none first and see if that changes anything, because this does feel a little saturated. You can always turn saturation down on your own, but let's go ahead and see. Okay, I unloaded the VAE and ran it on a new random seed, and you can see that Aqua is generated here, but you see these
pink or purple straps? They're not taking in some of these images, so you either have to lower the style LoRA strength or increase the Aqua character LoRA strength. But more or less, I think that's it. I went back and did a generation with a batch count of four on Lucy once again, this time with the VAE (variational autoencoder) turned off, and I can see more of that redjuice style in there now. Anyway, all the LoRAs trained and displayed in this video will be available for download on my Civitai page, linked in the description. Hopefully I went over everything you'll need to train a LoRA for style. Of course, I'm still experimenting as well, but I do feel the process I took gave me satisfactory results, and I hope it will for you too. Once again, thank you for watching, and I'll see you all next time. Bye. [Music]
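The "weights should sum to about one" rule of thumb for stacking LoRAs can be expressed as a small helper that builds the `<lora:name:weight>` tags used in web UI prompts. A sketch follows; the LoRA file names are placeholders, and the warning threshold is my own choice, not something from the video.

```python
def combine_loras(weights: dict[str, float]) -> str:
    """Build <lora:name:weight> prompt tags, flagging weights that don't sum to ~1."""
    total = sum(weights.values())
    if abs(total - 1.0) > 1e-6:
        print(f"warning: weights sum to {total}, consider rebalancing")
    return " ".join(f"<lora:{name}:{w}>" for name, w in weights.items())

# Placeholder names: a style LoRA combined with a character LoRA
print(combine_loras({"shiru_illust": 0.6, "lucy_cyberpunk": 0.4}))
```

Shifting the split (say 0.4 style / 0.6 character) is exactly the strength adjustment described above for getting the straps to take while keeping the style.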
Info
Channel: kasukanra
Views: 76,761
Keywords: ai art, stable diffusion, digital art, machine learning, midjourney, photoshop, character art, concept art, character design, textual inversion, automatic1111, webui, how to make ai art, kasukanra, hypernetwork, workflow, dreambooth, ShivamShrirao, nerdy rodent, style, dreambooth style, training dreambooth for style, redjuice, vofan, anime illustration, anime art, training LoRA for style, LoRA, low rank, LoRA: Low-Rank Adaptation of Large Language Models, artstyle
Id: 7m522D01mh0
Length: 75min 18sec (4518 seconds)
Published: Tue Feb 21 2023