How to Train, Test, and Use a LoRA Model for Character Art Consistency

Captions
I think we have gone through a lot of content around using base models, using just the tool, using Invoke. There's a lot to consider when you start thinking about training your own model: how you compose the data set, how you teach the machine effectively what the different concepts are, and how you tease out and structure exactly what it is that it's going to learn. [Music]

When we're working with customers, one of the first things we do is talk about model strategy: what are you using the model for? I think that's the number one thing you have to start with when you're creating a model. You need to ask: what is this going to do for me? What am I looking for this model to do? What tools do I need in my pipeline?

What you're really trying to do is teach the machine your language. You're trying to teach it to understand what you mean when you say a certain thing. So you can train a better understanding of your prompt terminology for general purposes, but you can also teach new things to the model.

The analogy that I used with an artist I was talking with the other day was this: think of the prompt as coordinates and the model as the map, or the landscape. The prompt tells you where on the map to go, but if there's nowhere there to go, the prompt's not going to work. If I'm prompting for something that doesn't exist inside of the model, those coordinates don't lead you anywhere interesting. So part of this whole question is: do I just need better coordinates, a better prompt, to get to where I'm looking to generate, or do I actually need to train new content inside of the system? And these can be done at the same time. If you watched the previous training video that we did, we had another analogy,
another explanation for it, and I think a lot of this is really just starting to understand what you can do when you do have that capability to train the model.

So I'm going to start talking today about one specific use case of model training. If you have a ton of artwork or a ton of intellectual property, say you've made a game before and you have all the character artwork and you're just trying to train on top of that, that is not what we're going to talk about today, because that's relatively well-trodden territory. We know we can train good models when we have a lot of data, when we have a robust set of data to generate from. What happens when we're just starting out and we're trying to think about how to build a model for our use case? How do we build something from scratch if we're just coming up with a new idea, or just trying to figure out how to get started with creating a custom tool that helps us in the generation process but isn't ultimately taking a lot of IP and consolidating it into a prompt term? That's what we're going to focus on today: this idea of crafting a tool, crafting a LoRA that is usable. It's in the context of a character for this use case, but you can apply a lot of the same principles to your own process. Feel free to ask questions along the way.

I'm going to jump into training first and show you a model that I already created, and talk through the process. I'll share my screen so you can see this. The process that I took was: I found a way to generate a roughly consistent character in Invoke, and I created this data set. It's entirely synthetic. The filtering is that I picked the things that matched what I was going for and left those in the data set, and anything that didn't really fit I removed. That's what you can do with synthetic data sets: you're
serving as a discriminator in some sense. If I'm prompting for a very particular style and I generate 10 images and only five of them match the style, those five go in the data set. And if I train something on that and I can generate more, then my data set grows and I can build a better model.

I'll show you this character that I created. This first example is probably the worst one as far as consistency goes. We could debate whether I should take this one out, but I like this style, so I left it in. I'll go through the captioning, but I want you to get a feel for the features of this character that I'm trying to capture. This guy's got more of a mustache and a rough beard. This guy's got kind of a comic book style. You can see a little bit of the same characteristics: beard, a pretty sharp nose, longer (not long, but medium-length) wavy hair. And I'm capturing this type of concept in different contexts.

Someone said they can't see the screen. Just confirming, can people see the screen? Screen is fine? Okay, cool.

So there are features here that I'm trying to capture. You'll notice that I've got a bunch of different styles: there's this comic-booky style, there's this more painterly style, I even have one here that's much more illustrative and detailed, and this one's more of a 3D rendering style. This is another example of one that I probably could take out, but I'll tell you why I left it in. What I'm trying to do is show it this character in different contexts, and this is part of the strategy for teaching a model something: you are trying to create enough diversity so that it sees what is the same across the data set. Now, the way that I've captioned this is I've
captioned it all "a picture of z43 care" (short for character) "dressed in a green coat, brown shirt," yada yada. This one should probably have a style associated with it, so I'm looking at the data set and seeing ways that I could improve it. But the consistent piece is "a picture of z43 care." If I go over here to this one, I see "a picture of z43 care" there too. Every single piece of data in this data set has that trigger phrase: it's a picture of this character. And what we're trying to do is include enough detail about all the other stuff that should not be associated with our trigger prompt. We are trying to train it to understand what this character is. We are not trying to train it to understand a specific style or to associate our character with a specific style; we want our character to be able to be generated in any style.

The thing that I'll call out is this: if I had created all of the data for this character in this style, and this was the only thing in my data set, it would create a relationship. It would understand the character in the context of this painterly domain and assume that whenever I ask for this character, it needs to be a painting. What I want to do is create a diverse, context-independent understanding of what exactly this character is, so that it's more useful as a tool and I can do more things with it. If I'm only ever able to prompt for this with a specific set of clothing and a specific style, it's going to be a lot more limited than if it really understands this character deeply and I can do whatever I want with it.

So you're trying to find a diverse data set that shows the concept you're trying to train in different contexts. If it's a character, you probably want different attire, different clothes. If you can get even different genres, you're in a really good
spot. Different styles, those things help it really understand that character. That's not to say that if you don't do that it'll be terrible. It's a spectrum, and we're just talking about ways that you can structure your data set in order to improve that quality.

I'll take a couple more peeks at some of the rest of the data. The reason I left this one in, even though this guy on the left really doesn't look as much like our character, is that I was trying as hard as I could to get something that looked like our character with a different expression, because this guy's got this stoic-looking expression throughout all of these and I was trying to get some variation. I probably could take it out, but I left it in. There's also one in here where he's screaming, and I think it's a pretty bad one as well. Maybe I'll find it here. This one: he's kind of angry, but it's bad because he's got like three rows of teeth, so that was a little bit janky. You can even see I tried to create a photorealistic version of it.

So if we're playing the game as humans of "what are the features that we see in common between all of these different characters," it is likely that we are finding mid-length hair (it's not a buzzcut), there's a beard component, and he's got this stronger, broader frame and face. Those are the characteristics that I see in common across all of these things that I've generated, and that is effectively what this LoRA now generates.

I'll admit that I didn't spend the most time perfecting this LoRA, because I just wanted to create something really quickly for us to talk about, but we'll show what happens from this. I don't think this data set is particularly great. I think it's okay. I think the LoRA that
came out of this is useful to an extent: it is more useful at guiding generation towards this type of character. I don't think it's perfect by any means, and I'll show you where it fails, and I think it's kind of predictable where it fails. But we'll jump into the tool and start looking at that.

I am going to read a couple of questions before I do that, just to make sure that we are answering everyone's questions. Someone wanted to drop in some topics. They want to get really good at: detailed captioning, creating consistent data, structuring for captioning, understanding variables and structuring data, training details, style and subject training, things to do and not to do, and understanding the training user interface. We're not going to spend as much time in the training user interface today. There is a video that we recorded recently that goes through that, and we've done a little bit of data captioning and things like that. If people have questions about what I just showed, this is probably going to be the last time that we spend a whole lot of time in the training UI or the captioning piece, so let's go ahead and get those in and we can answer them. Then we'll talk about other downstream questions: how do we use these things that we've generated, what are the use cases, and how should we think about how we create these and improve them over time?

Someone called out that having diversity in style, background, time of day, lighting, clothing, accessories, and composition all lead to a more flexible LoRA. Yes. I think this person is one of the more machine-learning-minded people in the audience, so they're relatively well versed in all this.

Someone asked: wouldn't image-to-image with ControlNet be good for creating pose and expression variations? Maybe. You could do that. If
you had a ControlNet and an image-to-image process, you probably could strategically create a pretty decent set of expressions and character poses so that you've got a diverse data set. I didn't do that; I just quickly created a synthetic data set, but we can talk about how that might work as well. It's definitely a tool in your toolkit if you want to do that.

And somebody called out that the interesting part is when you use this initial LoRA that you train to improve the synthetic data to train the next LoRA. We'll talk about that as well, and the strategy of using this stuff.

So let's jump into Invoke. I've got my character LoRA data set here, and you can see it a lot more clearly because we can see all of it at once in the gallery. I'll show you what it does now. This is the LoRA. I think I actually misnamed the LoRA, but the trigger is "z43 care," as we can see in the training data set, and we want a picture of this character. We'll take out some of this other stuff and see what we can do to get some prompts in. I'll show you how I created the data set after this. We'll do "wearing a jacket," and maybe we'll just use my synth art style: "wearing a jacket, standing in front of a wooden wall." I've got the weight set at about 0.56. I played around with it and felt like that was the right spot. When you train a LoRA, you're not always going to get a LoRA that is just perfect at a weight of 1. A lot of times, the reason custom models you download come with very specific instructions is that the person has taken the model, run it through its paces, and figured out that what they were going for is best realized at about this setting.

Somebody said they want me to move the camera. I don't know what that means. In any case, somebody
asked about this trailing negative. This is a down-weight, so this is not an accidental negative here. This is me down-weighting the synth art style, and it's mostly just to make sure that it's not over-indexing on anything there. We'll generate this and see what we get. Maybe I'll generate three. I think it's just loading up the model now, so it'll take a second.

I'm expecting that there will be some inconsistencies. You see, we've got this person here that's not right. I think it might be the style. Let's try "old concept art, watercolor." And we really want to get this character, so let's up this weight to make sure that this is this person, and we'll see if we can get this working. There we go. Yeah, base model bias is definitely an issue there.

So we've got this character. We've upped the weight of our LoRA so that we're actually prompting for our character. If you use this specific term without the LoRA (and I didn't test this out before I used this prompt trigger; I probably should have), it gives some weird sci-fi futuristic armor kind of stuff, so it can push things in weird ways. We've got our character: he's got mid-length hair, he's got that open-jacket look with a tie on the inside. This one's just a leather jacket, standing in front of wood. So we've got the character. There's a little bit of a style, and this might just be the "watercolor and ink" here. We should change this to "painterly oil painting."

Somebody called out that part of what's helping is "standing against a wall." A lot of the training data, if we look back (and I probably should have had more diversity in this prompting), has this kind of brick wall
component, so it definitely does better with him in front of a brick wall. Let's try putting him in front of an open background. I think there might have been some examples in our data set where he's on a beach. Let's try something that's not in the training set and see how this fails. Let's push this around and see what makes this harder to prompt for. We'll say "wearing a jacket in a forest scene, painterly oil painting," and we'll generate two of those.

Someone called out that one thing they're learning is that using the first version of your LoRA is going to be really helpful in figuring out what you need to do when you go back and train version two of your LoRA. That is very, very true.

I think we already start to see a little bit of where this is kind of falling in on our style, although this one's okay. In the forest scene, this character tends to be generated without a beard. A beard was in our training data set, but this is a different domain. Again, as someone called out: if it's in the training set, if we stay inside the core of the training set (in front of a beach, in front of a wall, wearing this jacket-type thing), it's going to be able to do that a lot better. If you take it outside of that context, and this goes back to the word "generalize," it's going to struggle to understand that relationship as well. That's why you want to compose as diverse a data set as you can. This is why more data is better. And when I say more data is better, I don't mean 200 images of the exact same thing. If I had 200 images of this guy standing in front of a brick wall, it would be a really good LoRA for generating a guy like this in front of a brick wall, but it wouldn't really generalize. It wouldn't
be able to use that anywhere else. So we're seeing some of the challenges here in the beard concept, at least, generalizing. Now, this guy does have some of the characteristics of our character: he's got mid-length hair, he's got a strong nose. I would say he falls more into the manly-man kind of look, with slightly broader shoulders and that kind of stuff.

So the question is: what do we do here? We're not getting our concepts. Well, we can supplement. I can say "short beard" and see if that helps bring back the beard concept. In this case I think it does. We're reminding it: okay, you still need that short beard concept. And the short beard does bring back some of the elements that we'd seen in our training data set and helps push it back towards that character, although I'd say this beard is a little bushier and longer than our core character's. I guess some of these beards are a little bit longer, and those are all shorter. I think it's reasonable enough. I think the nose is really well captured, and a lot of the facial features are captured. He's got that wrinkle on his forehead. So that prompt did help bring that back.

This goes back to that concept of the map and the coordinates. We have trained the coordinates for "a picture of z43 care" to go to this region where we get this guy who's got this sharp, angled face and broader shoulders and this open jacket and a beard. When we go outside of that place on the map, in the sense that he's not in a place where we have normally seen him before, not in front of an ocean or in front of a brick wall, he's like,
now he's out in the forest, that's taking it off the area of the map that we've seen before, and maybe pushing it into an area where, even though it has a lot of the characteristics we trained, there's not as much beard. So by adding the beard back into the prompt, it's taking us a little bit more into that region of latent space where there's the beard concept. I know that this all gets a little bit woo-woo; I'm going to tell you to bring out your crystals and pray for a good picture after this one. But if you think about this as nudging it in the right direction, I think you'll have a really good mental model for how to navigate this and how to think about what we need to do to create a better data set. Now that I've got this data set working, I could use this to create more data in the forest.

What's another area that isn't in the data set? We'll see what happens when we do "in a spaceship, sci-fi attire." I don't even know what sci-fi attire is going to give us, but we'll give it a shot. What's going to be different about our character? I predict our hair might change a little bit. And it did. Why? Because I think hairstyle maps pretty heavily to the genre you're in. You don't typically see as many punk hairstyles in the forest, for example. You don't see mohawks in the forest. Similarly, in space, we've got maybe a little bit more of a crew cut, or at least tight, commander-looking hair; it's combed and groomed. He's got a little bit more of a prominent mustache. And someone in the audience did comment that zero gravity is not a friend of hair. That's true; that's why you don't have really long hair. Every time I go to space my hair gets all
over the place. So we've got a lot of the characteristics of our character here, but there's definitely this space element that has nudged us out of our core concept of this guy into a similar but generally different character. Within a domain, in space, these characters share a lot of similarities. But between domains (and by domain I mean, in this context, just where that character is, the place in the world), there are similarities but there are definite differences. The type of beard here is a lot more of a woodsman beard, and the type of beard here is more commander: it's got some of that white-gray thing. That's a piece that's very interesting to call out.

So if we think about what we're doing here: we've created our first data set, and our first LoRA is useful for creating more data that aligns to this general concept. Now the question you have to ask is: what are you actually trying to do with this? Is this a tool that is meant to replicate a very specific character? Because if that's the case, you want to generate as much consistency as possible, as quickly as you can. This might be good enough if you're going to be sketching out the character and it just needs to fill in the beard and the style and all that kind of stuff. If you're trying to synthetically align it, what you're trying to do is filter down to only the examples that look really, really close, and drive more diversity in your data set. So I would take this guy out. I would take this guy out. I could ask myself whether these really fit the vibe I'm going for; I think they're a little bit more of a villain-esque-looking guy rather than very friendly. Maybe that's the character we've created. We'll take these out. And I guess the question we have
to ask ourselves is: do we think the faces look right? Do we want to fix those? Do we want to try to iterate on that? I'll talk a little bit more about that, but I know we've got a lot of questions here, so I'm going to try to come back to some of them.

A suggestion that came from the audience: take notes on whatever changes you make between retrainings. If you want to be really methodical, change one parameter at a time to see how the output changes. It takes more time to train very methodically, but it helps you understand what's happening. If you change your captions or your data set composition, doing it one by one means that you don't have to ask yourself "how did we get this result?" because you changed four or five things and now it's completely different. Doing it little by little can be very helpful.

Someone asked: what does "overfit" mean? The term overfit just means that the model has learned a concept too well, to the point where it's unable to generalize it to other things. In this case, it would be like if every time I asked for this character it either gave me a guy in front of a brick wall or nothing like him. It's super overfit: it can only do that one thing. And it probably looks bad, too. A lot of the time it looks overbaked and distorted.

We've got some other questions. When multiple characters and objects are trained, how do you combine them in a project to interact with each other? [Music] Let's say you've got two characters and a set of props and you're trying to add them together. There are a lot of ways that can go wrong. There are also some techniques that you can use to try to get that to work well. If you have two LoRAs, LoRA A and LoRA B, LoRA A has a certain character and LoRA B has a certain character,
the challenge there is that those two might compete with one another when you're doing a generation. You don't have a LoRA of both these characters together in scenes; you have two LoRAs that are trained on individual concepts of the character. When you're prompting, it might be confusing to the model: it might think you're trying to take characters A and B and jam them together. It's not necessarily isolating them. If you want the two characters to coexist, you typically want to train the model with those characters coexisting in some scenes. What you could do (and this will be easier in future releases) is create the two LoRAs and then create new synthetic data that has both characters in it, so character A and character B are in that scene, and then train that into the model as well. That way the LoRA is doing more than just one character: it's handling character A, character B, and both of them together. That can also be hard to tease out. I wish this process were as simple as "push a button and you get a perfect model." It's a lot of really learning how to craft this tool for your process and your team. But once you get the hang of it, you start to see: okay, I can nudge it and get more and more useful tools.

Somebody called out that I do have some sci-fi data in the training set, like this one and maybe this one. There are some things people have called out that got pulled out. This spaceship background: you can see elements of that in the background. So I've actually nudged it towards that concept: if you're in a spaceship sci-fi vibe, you might be wearing orange and white armor of some sort, and you might be in this domed or circular background. This all again goes back
to this: when you train it, it's learning these relationships whether you intended them to be there or not. So someone has a good eye to catch the fact that those were in the training data set and actually came out.

Someone commented that they would buy that this is the same character and that the face visuals are pretty consistent. I buy it too. I think it's really just a question of what level of fidelity you're looking for. I certainly think that if we were to do another round of training with this really consistent character, we would have an even better LoRA, because we'd have more context there. But I do think there are some differences: this character and this character look a little bit different to me. There's something there; it's just not quite the same. Overall, though, I think you can iterate towards that and get them there.

Catching up on the questions: someone said that telling everyone we don't live in a perfect world is not the content they came here for, which is probably true. I should either post clickbait or positive vibes.

So we've talked a lot about what to do with the LoRA. Let's talk about how I got to this consistent initial data set. There are a couple of little tricks that I used to get a more specific and consistent character without a LoRA. I'm going to take my z43 character off, I'm going to take this off, and maybe we'll do [Music] a mohawk character now. I'll show you my trick. This is just one way to do it. I'll show you a couple of different ways. Let's do version one. Actually, let me see if I could remix this guy and get a picture that has "extremely long hair" and "clean shaven." So I'm actually pushing it away from our character, but I'm going
to see if I can get it to do something that gives me a cool new face that I can use. We'll generate those. Oh, he's in a spaceship with sci-fi attire. Okay, I forgot that part was at the front of the prompt; I didn't pay attention. But it's kind of cool. This guy looks pretty cool, right? He's a very Fabio look. Somebody else called out Fabio. He's got a real vibe. Hello, Fabio.

Okay, I think I like this guy's face better. I'm going to clean it up a little bit on the canvas real quick before we use it. We'll take this, take it down a little bit, use an IP adapter to keep it consistent, and then inpaint that face to give it a little more detail, so that we have a really good reference face to use. We like the details. Someone said the man is setting impossibly high beauty standards for men in space. It's the ultimate challenge.

With the second one, it's a little bit more orange than I think I like. I don't want to mess around with this too much; y'all are probably waiting for me to get a move on, so I'll just accept his orangish face. What I would have done is gone in and done some coloring and lightened up the skin tones, but we'll deal with this slightly sunburned face. Someone said he's got a kind of oily face. I would fix that up, but I just want to show one way of doing this.

So we've got this guy's face. One thing that we can do is use the IP face adapter. This is not Face ID; this is just the face adapter, and what that means is it's not going to be perfect at regenerating this face. But think about this as coordinates again, going back to that map reference: coordinates that roughly get to the same place, face-wise. If you leave the weight too high, what it ends up doing is it just pastes the entire
face onto the character in a weird way. So I typically keep this pretty light. I'll leave this at a weight of maybe 0.35, and I'll bring the end step percentage down. I want it to guide the generation into the basic structure of the face, but not just paste all the details in, so I really pull back towards the latter half of this. And we'll try generating another character. Actually, I'm trying not to mess up here. Pull this guy here, it's back down to 0.35; pull this down to 55%. Okay, take the LoRA off, and I'm going to do a picture of a man, extremely long hair, clean shaven.

So now we don't have any of the context from our LoRA. This is a completely new generation. We're just using the face that I generated in an IP adapter, and we're going to see what we get. So now we've got a really good-looking Fabio guy. He doesn't have the same style; it's separated from our LoRA, so we've really gone back to the base model. And because we're pulling back on the end step percentage here, it's really just guiding some of the basics. Man, I wish I could get my hair to do that. This guy looks somewhat aligned, right? It looks pretty good.

I imagine if I changed this from extremely long hair, we might see some more face differences. I think there's probably a relationship here between this long hair and this facial structure. We probably would see some bigger changes if we move this to a different domain. So I'm going to change the hair to a mohawk, and we're going to say short goatee, in a forest, plaid shirt. We're really jamming some people together here. Now we're going to see what happens with this. This may be weird, right? We're taking this face, we're pulling it out of the domain that we generated it in, and we're going to see what we can get out of it. I think the face is going to be a little bit different. Someone said cyberpunk
lumberjack. It's less of a mohawk and more business in the front, party in the back. I think this is the wildest hairstyle I've ever seen. It's really trying to jam this long hair into the concept of the mohawk.

Anyways, when we go into different domains, you now see that it has a big impact on the facial structure, right? This guy looks like he's hanging out in the woods, but he doesn't look like our super handsome Fabio with long hair in space. These are different characters, even though we're using the face adapter.

Someone said to increase the strength of the IP adapter and see what happens. I'll show you what happens: it kind of looks like it pastes his face on, probably with this weird look. It's actually not too bad; we'll see how that lands. It definitely impacts the hair. It's ignoring my mohawk pretty much entirely, and it's also taking the short goatee off. It's really forcing this face. I mean, you can see where it's jamming in a lot of those details. Now, to be fair, it helps me get this consistent across different domains; I've got more of this facial structure in by doing that. But it also brought in a lot of that rosy stuff that I didn't like, and I'm not really getting different variations now that I like. This gentleman makes me uncomfortable to look at. Someone said he looks like he's from The Walking Dead; I think that's a pretty good fit.

So this doesn't have as much variation as I would like. This is where we can start asking ourselves: how do we inject enough prompt context to make things consistent across domains? IP adapter is one example of that. I really wish we could get more of that mohawk in, but we're losing a lot of the face there. Another way to get this kind of facial feature without a face reference, and then we can look at combining these things together, is using a fake
character name that's really, really long. And then you start to get into some interesting stuff, like: what are names, what do they mean, and how do they relate to one another? But if we do Jonathan James, I don't want to use all J's, Abraham Goldman Brink, we'll see what that gets us. So I'm not using the face; I've taken the face off. I've got this character name. It might even ignore this super long name; it might ignore all of this stuff here. We'll see what we get.

I'm going to generate, and because I'm using the same name I should be able, fingers crossed, to see more consistency, because I've created this place, a very, very narrow coordinate of facial features. It's kind of like the average of all of these names. We've got, I mean, some facial consistency there. Not a lot, but some. However, we got a lot more of this lumberjack mohawk that I was going for. Jingleheimer Schmidt.

If, and this is a big if, I were drawn to this character, I could also then take a snap of his [Music] face, and I'll take that out and replace that with my IP adapter. Maybe I even want to zoom in more on the face, because the thing that we're trying to capture here is not the hair or anything else, it's just the face. So we've got that face, and we'll do one more with this mohawk just to see if we can get something that's a little bit more consistent. It took his hair down a little bit. Let's bring this down a little bit and see. You can see what happens: his face is a little bit too similar, too controlled. So we're going to use that and generate another. We're getting some consistency here. We've got some good facial features that are staying consistent, right? The haircut is changing a little bit, but we've got some good things.

Now let's see what happens when we use the prompt that we had here: picture of a man, extremely long hair, sci-fi attire, oil
painting in front of this, that we were using. Again, this is all imperfect, but what you're trying to do is find more tools that get you to the same character in different contexts. In this case I think we've got some facial features that are relatively similar. He's not perfectly clean shaven, but it does look like maybe this guy, 5 or 10 years younger, in space.

Someone said to increase the strength of the adapter. If I increase the strength of the adapter, he's going to have facial hair. This is the challenge that we're trying to combat here. I think my prompt has clean shaven in it. We can increase the strength. Oh, this guy's clean shaven. I mean, not too bad.

Someone asked, and no, I have not trained a LoRA on this character name of Jonathan James Abraham Goldman Brink Johnson. What I'm doing is creating this very strong set of coordinates that is the average of all of these names in a very specific order, because the model has learned that if you give somebody a name, they should be kind of consistent. So if you have this very, very long name, there's this place that it gets to, which would be the assumption of what that person might look like, and you can just hit that. It's a fake name. It's just a bunch of random names jumbled together, but it's jamming those all in. And now we've got this tool that, in combination with a very low-weight face adapter, lets us come in and start getting some amount of consistency with our space Fabio.

So we've taken the space Fabio that always looked like this, and now we've created this character that has, I think, a lot more realistic beauty and nuance, right? We're not setting too-high beauty standards. I feel like this person might be real. And so now I've got this person
in this sci-fi attire. What if I want, let's see, a pompadour, goatee, in a dive [Music] bar, and we'll do a black leather jacket? Okay, so we've still got this guy, we've still got the name, but now we're trying to get him in a different context. He doesn't quite have a pompadour; we'll see if we can get that pompadour up. He does have our facial features. He looks kind of sad; maybe that's what happens when you go to a dive bar.

Someone asked where it's pulling the American flag on the uniform from. People in space, astronauts, often have a patch with the American flag. So again, it has all these relationships that have been baked in, and that's how it interprets or understands things. This is why training a model is really, really good: you can break those relationships, you can break those contexts, and really push it in a direction that you can control.

So we've got this guy. The pompadour is coming; it's not quite there yet. He also looks a little bit older. The nice thing is, if you find a composition that you like, you can take it to the Unified Canvas and try to fix it, try to make it a little bit more perfect. So let's say we want to come in and do some face fixing here. Take this down, and we could take up our [unclear] here as well and try one with a little bit higher strength. So now we've got our guy in a dive bar, and maybe we like that.

Again, I've done all of this in the same style, but if I were really looking to get a more flexible LoRA, I'd want to try to create a lot of variation for this. If I was only ever going to generate in this style and use this LoRA for this style, it maybe doesn't matter; I can keep it all in the same style. It really goes back to your objectives. What am I trying to do? A: what am I trying to train into this, what type of tool am I creating, and how is it going to be useful to me?
B, or two (I don't remember if I started with A or one): you want to ask yourself what you do not care about. What type of variation do you not need? If you don't need variation across styles, if it's all going to be in the same style, you can generate only in that style and get variation that way. If he's only ever going to be wearing a black jacket, that's just his character, he always has a black jacket, then you don't need different attire, for example. What you really want to focus on is variation where you need it. Where do you need the variation? I need to get my black jacket pompadour guy in forests, in spaceships, in bars, wherever I want him to be. I need to have him in a black jacket, but he needs to be wherever that is. And that's how you train it.

Now, the other thing I'll call out is composition. In this we only have this top-half portrait. I'm going to delete that guy; it's an awkward photo. If we were trying to use this to train a full-bodied character, we'd have a hard time. In my data set, when I did this originally, you'll see that I was also getting that same problem. I was getting top-half portraits, and then I went to this white background look.

This is where I've used a technique that I showed in the past. Go to the Unified Canvas, and we'll just put a white background. We'll do a grayish blob in the center. I'm giving a very, very rough shape because I'm basically going to do a full image-to-image on this. And we'll say standing on a white background, pompadour, goatee, blah blah blah. We'll take out in a dive bar, black leather jacket, take this down a little bit, and see if this gives us what we want. No. [Music] Decrease our strength a little bit. Again, this goes back to: it's going to try to force this top half, and we're
really trying to fight that. This might be not enough denoising strength. No. Try portrait in the negative prompt as well, and maybe we'll try character. Someone said full frame photo is an alternative, but I think this one might be working. Well, we got something. I don't know if this is what we were going for, but we got something. Yeah, a little bit. We'd probably want to go in and fiddle with that. Full frame photo did not work for us; full body shot did. Try another full body shot. His legs are looking a little weird; he's got some janky legs. This looks a little bit better; at least the proportions look a little bit better from the preview.

We've got that, and then we'll go in and iron out the face. Take out full body shot and do just this. We can leave the denoising strength pretty high, because there's this white background that helps contain the structure a little bit, and because we're coming in closer, we're going to help generate that face with a little more consistency. So now we've got our character, full body. I probably would want to do something like a ControlNet here to get better proportions; this looks a little weird to me. People are commenting on his physique, and I agree: he's got a little bit of weirdness down here in the lower half, and his feet are a little bit [unclear]. Someone says they're disappointed in the lack of faith in my gray drawing skills. They didn't like my blob. But you get the idea.

Now we've got this character in a full body pose. We can save that out, and obviously we'd want more contexts; we'd want to switch things out and get more of this character. But then we would come back and caption this. We'll just call this guy JJ Abraham Goldman, and caption with that consistently in the data set. That's his name. And in this one we would probably say something to the effect of full body shot, wearing a black trench
coat, describing him: black pants, brown shoes, and what style we would use to describe this. This one is more of a character portrait. Maybe I'd even describe this as a little bit more oil painting; it's a photorealistic oil painting type of thing. We might include this one, and we might say young JJ Abraham Goldman in space, on a spaceship, wearing sci-fi attire, or whatever it is. I'd probably call out the American flag patch, just so that it's helping break that relationship between "when I'm in space" and "I have a patch".

If you prompt for all of the different things, what you're doing is making it very clear that those are separate concepts and they're not linked together. If I just prompted this guy in sci-fi attire, it might strengthen the relationship between the patch and the sci-fi attire, because it's like: okay, he's in sci-fi attire and there's a patch there; he didn't prompt for a patch, so that must mean the sci-fi attire is the thing that has a patch. It's building those relationships even if you don't think about it. You're implying that those relationships exist.

I know that we're at time, so I'll be respectful of everyone and drop off, but hopefully this was an interesting exploration of this notion of consistency and generating synthetic data sets for the purposes of training. You would use this if you were just starting off with a character and you wanted to drive towards building a tool that you could use in future creations. If you wanted a tool for prompting for this type of character, you might use it early on in the concept pipeline, build some stuff with it, and then use that later on in concepting to train the model even better on that character. Again, this goes back to: your first version of the LoRA is not your final version. It's just the first iteration, and you're going to iterate on that and train
it and build on that data set over time.

So yeah, I'll leave it there. If you all have more questions, someone mentioned we do have a models and training channel in the Discord. You can go to the models and training section and talk about this stuff. We have open source scripts for running training, and, sneak peek, coming up in the future on the hosted product we'll have a very robust training solution for professional studios who are trying to do this at scale, trying to manage lots of data, and trying to manage this type of training across teams and improve the quality of their results. So feel free to reach out if you've got questions, but take a look at the Discord. If you're watching this after the fact, we'll include some stuff in the YouTube details for the repos as well as the Discord, so you can join in and ask questions. It was a lot of fun. We'll see you guys around. See you.
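The captioning approach described near the end of the stream (one consistent character name in every caption, plus an explicit mention of everything visible, including incidental details such as the flag patch) could be sketched roughly as below. This is only an illustration under stated assumptions: the `ImageNotes` fields, the file names, and the comma-separated caption layout are hypothetical and are not taken from Invoke's training scripts, so the output would need adapting to whatever caption format a given trainer expects.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass, field

# Caption name used consistently across the data set (the name used in the stream).
CHARACTER_NAME = "JJ Abraham Goldman"


@dataclass
class ImageNotes:
    """Hand-written notes about what is visible in one training image (hypothetical schema)."""
    filename: str
    framing: str                                       # e.g. "full body shot"
    setting: str                                       # e.g. "on a spaceship"
    attire: list[str] = field(default_factory=list)
    details: list[str] = field(default_factory=list)   # incidental items, e.g. a patch
    style: str = ""
    age_prefix: str = ""                               # e.g. "young"


def build_caption(notes: ImageNotes, name: str = CHARACTER_NAME) -> str:
    """Join the notes into one caption that always contains the character name."""
    subject = f"{notes.age_prefix} {name}".strip()
    parts = [notes.framing, subject, notes.setting]
    if notes.attire:
        parts.append("wearing " + ", ".join(notes.attire))
    # Naming incidental details keeps them as separate concepts, so the
    # patch is not silently absorbed into "sci-fi attire".
    parts.extend(notes.details)
    if notes.style:
        parts.append(notes.style)
    return ", ".join(p for p in parts if p)


def setting_coverage(dataset: list[ImageNotes]) -> Counter:
    """Count images per setting to show where the data set has variation."""
    return Counter(notes.setting for notes in dataset)


# Two example entries based on the images discussed in the stream.
dataset = [
    ImageNotes(
        filename="full_body.png",
        framing="full body shot",
        setting="standing on a white background",
        attire=["black trench coat", "black pants", "brown shoes"],
    ),
    ImageNotes(
        filename="space.png",
        framing="character portrait",
        setting="on a spaceship",
        attire=["sci-fi attire"],
        details=["American flag patch"],
        style="photorealistic oil painting",
        age_prefix="young",
    ),
]

captions = {notes.filename: build_caption(notes) for notes in dataset}
coverage = setting_coverage(dataset)

for filename, caption in captions.items():
    print(f"{filename}: {caption}")
```

The coverage count is a small aid for the question raised earlier about where variation is needed: if the character should work in forests, spaceships, and bars, each of those settings should appear in the count.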
Info
Channel: Invoke
Views: 1,499
Id: ej7ruT7aF04
Length: 61min 59sec (3719 seconds)
Published: Thu Apr 11 2024