Build Your Own ComfyUI APP!

Video Statistics and Information

Captions

Hello everyone, this is Matteo, and today we are going to explore a different way of using ComfyUI. To do that I've developed Comfy Dungeon, a very simple demo that generates D&D character portraits. Let's see how it works very quickly.

On the left I have four tabs with all the options. In the first one I can change the number of results, the steps, and the overall style. I can increase the results to four and roll my new characters, and of course I can zoom in. Since I want to iterate through a lot of options, I'm going to select a fast model that basically uses LCM. In the second tab I can define the background and the mood of the picture; for example, I can place her in the mountains and let's say that the mood is cunning. In the third tab I can finally describe the character itself. Gender is of course a slider: I can try to make her more masculine and with a bigger body structure. I also want dark blonde short hair. Let's see what the male counterpart looks like. It's not perfect; I didn't have much time to test all the combinations, and a lot of work would be needed to make it really useful, but it's still pretty fun to use. We can also try the anime version, or the anime accurate one. I've also included a cinematic option, but it doesn't work very well, to be honest. It's okay for humans and elves, maybe, but not so much for anything else.

Anyway, the application is nothing special, but even in this state it makes ComfyUI very easy to use and accessible to anybody. And of course it could be improved with IPAdapter, InstantID, upscalers, detailers, and whatever comes to mind. So buckle up, because I'm going to teach you how to build an application like this, and there's no need to be scared, because it's actually pretty easy.

Let's start by demystifying this thing. First of all, the application runs on top of ComfyUI. It's actually an extension you can install like any other. If I go to the address bar, you'll see that the URL is my ComfyUI address followed by /dungeon. If I remove dungeon, I get the plain old ComfyUI
interface. You can go right now to my GitHub repository, download this demo, put it into the custom_nodes directory, and start generating D&D characters. But the cool thing about Comfy Dungeon is that it doesn't need Python: you don't need to understand ComfyUI's inner workings, and you don't need to know how to write a ComfyUI extension. 99% of this application is written in JavaScript and HTML, and the actual logic is just a handful of lines of code.

So now that we know that this is just a glorified interface on top of ComfyUI, we can take one of the generated images and drop it into the work area to check the workflow. This is really all there is to it. If I take an image generated with a fast model, the workflow of course changes, and we have a third workflow in case the user adds some custom text, with an additional prompt and a Conditioning Concat. So we need a strategy to handle all these options.

Okay, let's start from the beginning. First of all you need to activate the developer tools: go to the settings and check Enable Dev mode. You should now have Save (API Format) in the toolbar. For this tutorial we are going to build a very simple application that you can also use on your phone, so we start by designing the base workflow. I want it to be super fast, so I'm using an SD 1.5 checkpoint connected to an LCM LoRA, together with all the pipelines. In the KSampler I select eight steps, CFG 2.1, the LCM sampler, and the SGM scheduler. Okay, looks good. This will be our base workflow. Nothing special; of course you can make it as simple or as complex as you want.

Now I save it in API format and open it in a text editor. As you can see, it's nothing more than a JSON file. If we analyze it we can easily spot all the nodes. At the top of the tree we have the node ID; in this case it is 3, but it could be anything. The class type tells us that this is the KSampler. If I scroll down I can easily spot the positive prompt with ID 6 and the negative with ID 7. Since we are going to customize the prompt on the fly, I'll remove
the text from the positive and leave an empty string. The negative for this demo will be fixed, so I just write low quality, blurry, dark, horror, and naked, because YouTube is so sensitive. I can save this and get back to ComfyUI.

Now, since I might want to increase the CFG, I'll also add a RescaleCFG node, connect the pipe, and set the guidance scale to 2.8. Now I can use a higher CFG without burning the image too much. Let's save this for the API and open it in the text editor. If I scroll all the way down I find a new entry with ID 11: that is our RescaleCFG node. I'm copying this node and adding it to our previous JSON file. If I save this and drop it into ComfyUI, you'll see that we did indeed add the node to the workflow, but it's not actually connected to anything. This is fine, because this node will be optional and we are going to switch it on and off programmatically.

Okay, now I need to go to the custom_nodes directory inside ComfyUI. Here I'm creating a new directory that will host our demo. I'll call it fastgen, but it can be whatever you want. Inside this directory I create another one called web, and this will be the root of our web app. Here I need to add one more directory called js for the JavaScript files. Inside this folder I'm copying the workflow that we made before and renaming it to something like base workflow.

Now I go back to the extension root directory and create a new file called __init__.py. Remember I told you that 99% of the application is JavaScript? Well, we are now going to cover that 1% of Python. This init file is automatically called by ComfyUI at startup, and we are going to use it to configure the route so that we can access the application through an easy-to-remember URL. I'm opening the file and pasting in this code that I prepared earlier. There's really nothing special to see here: on this line I'm setting the base URL, which will be /fastgen, and on this one I'm setting the route for our JavaScript. And that's it; that's all the ComfyUI-specific code you need to
know. We need one last file, which is of course the home of our application. Inside the web folder I'm adding an index.html file, and now the structure for our app is ready. Let me open the index file and put some basic HTML in it. I'm making the background dark so the image will pop better, and inside the body I'm adding some text to see if it's working. Okay, now I can restart ComfyUI and try to go to the URL that we just set up, /fastgen, and it worked. Great.

Now let's make this page into an actual app. I take the index.html, remove the title, and add a text area for our prompt and an image tag for the result. I give them an ID so they are easy to find from JavaScript: I'll call the text area prompt and the image maingen. Okay, I'm also adding some base styles and we are ready to go. This won't win any web design competition, but it's functional.

Next, JavaScript. At the bottom of the HTML I'm including the JavaScript file, and inside the js folder I can create a new app.js file. I open it in the text editor, and now we need some basic functions. First of all I create an asynchronous sandbox. This is needed because we are going to send requests to the ComfyUI server and wait for it to reply. This is easily done by enclosing all of our code inside these two lines. Next I need a client ID generator. This will create a unique ID so the server knows who it is talking to. The function is a bit cryptic, but all it does is generate a user identifier in a format that ComfyUI likes.

Next I'm using fetch to load the JSON file. Our workflow is very simple, so we could actually embed it directly into the JavaScript, but for the sake of tidiness I'm using an external file. I can also add a console log to see if the JSON is loaded correctly. Let's go back to the browser and see if it works. Now if I check the console I should see the workflow converted into an object. There you go, super easy. If I click on the arrows I can see the values. Okay, perfect, it is coming out nicely. Now we establish the
connection to the WebSocket server. Fortunately, modern browsers support the WebSocket class. All we need to do is send the client ID that we created earlier to the socket server, and the connection is ready.

Okay, now we need to listen to the server and check if there are any messages coming. This is done with an event listener. In the first line we listen to the message event. The server talks in JSON, so next we convert the JSON response into an object. Then we check the type of the message that we are receiving. There are many of them, but at the moment we only care about executed, which is sent when the image has actually been generated. Our app will only handle one image result, so I'm extracting index zero and then storing all the image data in a constant for ease of access.

Now I need to send this data to the HTML image tag. I'm adding a constant that stores the image element and finally updating the source property of the tag to actually display the image. To get the generated image I'm using a ComfyUI API: if I send a request to /view with the right parameters, ComfyUI will kindly send back the image. The r option is just there to get around the browser cache.

Okay, we are almost there. We only have to figure out a way to send the prompt to the server. For that we need to send a POST request to the web server. Super easy: I use the fetch function again, but this time I need to set the method to POST because we are sending data, not receiving it. The data is called prompt, but it's actually the full workflow, not just the text prompt, and then we need the client ID.

Since we are dealing with super fast generations, I want to see the results as soon as I start typing. There are many ways to do that; probably the simplest is with a timer. This code checks the text area every 500 milliseconds, and if there are changes it sends the request to ComfyUI. The only thing worth noting is these two lines: this is where we update the workflow on the fly before it is executed. If I take the JSON file back, you'll see that the
node ID 6 is the positive prompt and the actual value is under inputs, text. So in the code I first select the ID, then inputs, and finally text. Same for the seed: this is ID 3, inputs, seed. Let's see in the JSON what that is: 3 is our KSampler, and of course seed is the seed for the random number generation.

Okay, all that's left to do is to save and try: scenery of an alien planet. Okay, it works. Remember, the first generation might take a few more seconds, but then it will be super fast. Let's add high quality, detailed. Or with another subject: cyberpunk cyborg woman, closeup, short hair, Battle Angel Alita, detailed, high quality, anime. Okay, cool. From the command line I can see that the generation takes about 300 milliseconds, so I'm changing the timeout from 500 to 360. I could make it faster; the workflows are queued, so there's no risk of flooding the server, but it wouldn't make much sense, as the image would be displayed for just a fraction of a second and we wouldn't see much.

One last thing. If you remember, we added the RescaleCFG node that we are not currently using, so I want to add a checkbox that will trigger the CFG rescaling. In the HTML file I just need to add the input, and in the application I first need to select the HTML element and check the input value to toggle the CFG rescale. This time I'm changing node ID 3, which we know is the KSampler. The value is the first element of the model key. Let's see what it is in the JSON file. So the KSampler is ID 3, and model is an array with two values. The first one is the ID of the next node in the model pipeline. By default it is 10, which is the LoRA loader, but we want to connect it to the RescaleCFG, which is ID 11, so here I set that value to 11. I'm also increasing the CFG to 2.8. If the checkbox is not selected, then I set everything back to the previous values. Easy peasy. Let's give it a try: I select the rescale and type Batmobile on the streets of an old town, cobblestone, detailed, high quality, cinematic.

Now, since I know you want to use this on your
phone in the secrecy of your room, I'm going to tell you that if you run ComfyUI with the --listen and --enable-cors-header options, like this, you can access your app from anywhere inside your network. There are some security concerns, but for home users it's fine as long as the PC hosting ComfyUI is not open to incoming internet connections. So let me grab my phone and see if it works. I'm activating the CFG rescale and trying something like superhero, high-tech black gold costume, detailed, 4K, cinematic.

Well, that's it. In just 80 lines of code we have the base for our app ready, and we didn't need any Python skills. You can check my Comfy Dungeon code to find more options and also to get some ideas for on-the-fly prompt engineering. I know it's not for everybody, but I believe this is how ComfyUI should be used: to help less tech-savvy people take advantage of the technology.

Regarding Comfy Dungeon, as I said, it's just a tech demo. The code is very messy and I apologize for that, but it wasn't really meant to be an actual full-featured application. Now that I've made it, though, I think that it's pretty fun to use, and if you like it I could clean it up and improve it. I'd like, for example, to make group pictures, so you could create your party's portraits first and then compose a group shot with all the characters. Well, yeah, we'll see. Let me know what you think.

Okay then, I hope you enjoyed this kind of content. As always, I'm looking forward to your feedback. That's all for today. See you next time, ciao.
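
The workflow patching described in the captions can be sketched roughly as follows. This is a minimal illustration and not the code shown in the video: the node IDs (3, 6, 10, 11) and the CFG values come from the captions, while the tiny `baseWorkflow` object, the helper names `patchWorkflow`, `buildPromptRequest` and `viewUrl`, and the sample values are assumptions made for the example. The real exported workflow contains more nodes and inputs.

```javascript
// Assumed stand-in for the workflow saved in API format. Only the fields the
// captions mention are included; the real file has many more nodes.
const baseWorkflow = {
  "3": { class_type: "KSampler", inputs: { seed: 0, steps: 8, cfg: 2.1, model: ["10", 0] } },
  "6": { class_type: "CLIPTextEncode", inputs: { text: "" } },
  "11": { class_type: "RescaleCFG", inputs: { model: ["10", 0] } },
};

// Return a patched copy of the workflow: set the positive prompt and the seed,
// and optionally route the KSampler's model input through the RescaleCFG node.
function patchWorkflow(base, { prompt, seed, rescale }) {
  const wf = JSON.parse(JSON.stringify(base)); // deep copy, leave the base untouched
  wf["6"].inputs.text = prompt;                // node 6 = positive prompt
  wf["3"].inputs.seed = seed;                  // node 3 = KSampler
  if (rescale) {
    wf["3"].inputs.model[0] = "11";            // take the model from RescaleCFG
    wf["3"].inputs.cfg = 2.8;
  } else {
    wf["3"].inputs.model[0] = "10";            // back to the LoRA loader
    wf["3"].inputs.cfg = 2.1;
  }
  return wf;
}

// Build the options for fetch("/prompt", ...). As the captions note, the field
// is called "prompt" but carries the whole workflow, plus the client ID.
function buildPromptRequest(workflow, clientId) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ prompt: workflow, client_id: clientId }),
  };
}

// Build the /view URL for one image entry taken from an "executed" message.
function viewUrl(image) {
  const params = new URLSearchParams({
    filename: image.filename,
    subfolder: image.subfolder,
    type: image.type,
  });
  return "/view?" + params.toString();
}
```

In the browser, a timer could call `patchWorkflow` whenever the text area changes and pass the result of `buildPromptRequest` to `fetch("/prompt", ...)`. Keeping the patching in a pure function that works on a copy means the base workflow loaded from the JSON file is never modified.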

Info

Channel: Latent Vision

Views: 16,619

Id: anYHG37fUg4

Length: 19min 7sec (1147 seconds)

Published: Thu Mar 14 2024