Building a Python API for Comfy UI with Gradio

Captions
Hello everyone. In our last video we discussed Gradio; in this video we are going to apply that knowledge and build a Python API for ComfyUI. We are going to use Gradio as our frontend, which will use the API to connect to ComfyUI running locally on my machine, but this can be done even if ComfyUI is running on a server on the internet. For those unfamiliar, ComfyUI is a frontend for Stable Diffusion that uses a node-based workflow, and Stable Diffusion is a technology that uses AI to generate images. When we talk about an API, or application programming interface, it lets us bridge the gap between two different software applications: we can establish communication between one application and another even if the two are built on different technologies. Since we are using Gradio as our frontend, we can use essentially any device to connect to the backend - a smartphone, a tablet, a laptop, or a desktop PC. We send the prompts via our API to the backend, the ComfyUI workflow executes, generates an image, and sends that image back to the requesting device.

I will start with a very simple example, just connecting Gradio to the ComfyUI backend. In example number one we have only an output and a Generate button: when we click Generate, the request goes to the workflow, which generates an image that is displayed in our Gradio frontend. For the second example we build on that and add a text prompt; this serves as our positive prompt, since Stable Diffusion requires a positive prompt as input, which goes through the model, and the model generates an output image from it. For our final example we change from a text-to-image workflow to an image-to-image workflow, so we can add an image as our input, and I've added some example sliders and prompts that you can use as a reference in case you want to build on it. When we click the Submit button, the image is uploaded to the server - in this case my local computer, but if ComfyUI were running remotely it would be uploaded to that server - and then ComfyUI takes that image as input and generates an output for us.

Before we begin, I'd like to give a big thank you to Musty for being my very first subscriber to this channel outside of family members, and thank you all in advance for watching this video. Let's get started.

The first thing you want to do is create a project directory. In my case I've created a folder called code, and inside it I have a virtual environment. If you did not follow my previous video to create a virtual environment, run python -m venv venv and press Enter to create it, and to activate it run venv\Scripts\activate (your virtual environment name, then Scripts, then activate). As for the dependencies, you will need to install requests, pillow, gradio, and numpy. Next, create a new file; you can call it app.py or main.py - I will go with app.py. I will start by creating the Gradio interface, as it is fairly easy. You need to import gradio - here I am importing gradio as gr - then create the interface. The interface requires a function, inputs, and outputs. For the function I'll just name it generate_image and, for now, simply pass. I'll leave the inputs blank, and for outputs I'll put "image", just so we have something on the screen. I will call this interface demo.
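
As a reference, here is a minimal sketch of this starting point; generate_image and demo are the names used in the walkthrough, and the empty inputs list matches leaving the inputs blank:

    # (after creating/activating a venv and pip install gradio requests pillow numpy)
    import gradio as gr

    def generate_image():
        # Placeholder for now; the ComfyUI call is added in the next steps.
        pass

    # No inputs yet, a single image output, and a Generate button.
    demo = gr.Interface(fn=generate_image, inputs=[], outputs=["image"])
    demo.launch()
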
Then I do demo.launch() to launch the Gradio application. Next, go into the terminal and run python app.py. Once your Gradio application is running, click on the URL, or copy the link and open it in your browser, and you should see an output box with a Generate button just below it.

Now that we know our Gradio interface is working, let's look at ComfyUI and see how we can get the API from it. In case you don't have a ComfyUI installation, go to the GitHub page, scroll down to "Installing ComfyUI", and click the direct link to download. You will get a zip file; extract it, open a terminal, navigate to the path where you extracted the ComfyUI installation, then run the run_nvidia_gpu script if you have a GPU, or the run_cpu version if you don't. This will install ComfyUI; give it about 5 to 10 minutes. Once ComfyUI is ready, it opens a web page where you can see a basic workflow. This is a text-to-image workflow and it is what we'll use for our first example. One important thing: in your Load Checkpoint node you will see a checkpoint name; make sure you have downloaded a Stable Diffusion checkpoint and stored it in the models folder under the ComfyUI directory. If you already know ComfyUI and have generated a couple of images with it, go ahead and modify these nodes. Personally, I've changed the checkpoint, the positive prompt, and the negative prompt (just some safeguard text in there); I've changed the KSampler step count to eight - only because I'm recording and don't want to wait long for an image to generate - I've changed the CFG scale, and, just so we know these images were generated by our API, I've changed the Save Image filename prefix to test_api. I'll do a quick test to make sure everything is working by clicking the Queue button; in my terminal I can see whether the image is generating - it loaded the model and the progress bar tells me the image is being generated. We'll rely on the same thing when we use our frontend to generate images. As we can see, it generated an image; the quality isn't really important here since I'm using a low step count. If you navigate to your ComfyUI installation and go into the ComfyUI output folder, you should see your images there - I have the test_api image, version one, and it's the same image.

The way I found out about the API and tested my theory was to open the developer tools (press Ctrl+Shift+I on your keyboard), go to the Network tab, click the little icon that clears the network history, make sure recording is enabled, then click the Queue Prompt button (you can cancel it afterwards). Go into the first entry - it should be "prompt" - and double-click it; you will see it opens a URL with localhost, the port on which ComfyUI is running, and then /prompt. Back in the Network tab, if you go into Payload and look at that prompt, you will see that everything listed as inputs there corresponds to a node in our workflow. Using this, we can write a request and send it over using this type of payload. Let me show you how to get an example of it: close the developer tools, click the little gear icon on the right side, and check the "Enable Dev mode Options" setting.
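
For orientation, the request seen in the Network tab looks roughly like this. This is a sketch of the shape, not an exact capture: the "prompt" key and the default port 8188 are what a stock ComfyUI install exposes, while the node ids and values below are only illustrative and will mirror your own workflow:

    # Rough shape of the request the browser sends when you click Queue Prompt.
    # POST http://127.0.0.1:8188/prompt   (8188 is ComfyUI's default port)
    payload = {
        "prompt": {
            "3": {"class_type": "KSampler",       "inputs": {"seed": 5, "steps": 8}},        # truncated
            "6": {"class_type": "CLIPTextEncode", "inputs": {"text": "positive prompt"}},    # truncated
            # ... one entry per node in the graph
        }
    }
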
Go all the way down and click the Close button; this adds a Save (API Format) option to your menu. I will be using this workflow; if you're using a different workflow, load it first, then click Save (API Format). This saves a JSON-format file - click Save As. To make it easier to work with, I'm saving this file in my project directory. Back in Visual Studio Code, open the workflow_api.json file. In here we can see the inputs: the seed, the step count, the CFG value - these are the values I changed in the ComfyUI base workflow, and they are reflected as JSON here. Similarly, scrolling down I can see my checkpoint, the positive prompt, and the negative prompt; a little further down I can see the filename prefix, and somewhere in there the image size as well - width and height - and all of these correspond to the nodes in the workflow.

Going back to my Python file, I need a way to load this JSON file. At the top I import json - json is part of the Python standard library, so there is nothing to install. Then I open the workflow_api.json file in read mode, naming it file, and use the json library to do json.load on the file. Just to test, I'm going to comment out the code below, save the result, and print it. Pay attention that I did not use loads with an "s": json.load is for a file object, while json.loads is for a string, and here the file is already in JSON format. We run python app.py and confirm that, yes, the workflow loads correctly.

The next thing we want to do is send this workflow over to ComfyUI to have an image generated. If I go back to the ComfyUI interface and show you the network prompt entry once again, this prompt is a POST request, so when we make our API request we have to make a POST request to this URL; I'm going to create a variable at the top for the URL. To keep things organized, I'll create a function called start_queue, which takes the workflow as input. Inside it I create a variable, p, which holds the entire workflow we see at the bottom here. In order to send the request over to ComfyUI it needs to be in a particular format: the first thing we've already seen is that it's a POST request; the second is that the data needs to be encoded as UTF-8. So I encode the payload into UTF-8, import the requests library at the top, and make a POST request to the URL, with the data being our UTF-8-encoded prompt. Just to clean up the code, I move the with open block inside our generate_image function, and to test whether it works I call start_queue, where the argument is the workflow we loaded - I'll rename the variable to prompt. To recap: we import our libraries at the top; we send a POST request to this URL - if you are connecting to a ComfyUI instance running on a remote server, replace this part with the server's IP address or URL and make sure the port is correct, while the /prompt part stays the same - then we build the payload, encode it into UTF-8, and send it as a POST request. At this point we can test.
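
Putting that together, here is a minimal sketch of the loading and queueing code as described. The URL assumes ComfyUI's default local port, and the {"prompt": ...} wrapper follows the payload shape observed in the Network tab:

    import json
    import requests

    URL = "http://127.0.0.1:8188/prompt"   # replace host/port if ComfyUI runs remotely

    def start_queue(prompt_workflow):
        # Wrap the workflow in the payload shape the /prompt endpoint expects,
        # then send it as a UTF-8 encoded POST body.
        p = {"prompt": prompt_workflow}
        data = json.dumps(p).encode("utf-8")
        requests.post(URL, data=data)

    def generate_image():
        with open("workflow_api.json", "r") as file:
            prompt = json.load(file)    # json.load (no "s") because we pass a file object
        start_queue(prompt)
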
I have my windows arranged here: on the left side is Visual Studio Code, with the code at the top and the terminal at the bottom; the top right is the ComfyUI terminal, where you can see "Prompt executed"; and at the bottom right the ComfyUI web UI is running, showing a queue size of zero. I'll go into the terminal for the Python Gradio file and run python app.py to start my Gradio application. Once the application has started, I click on the link. This is my Gradio application - pay attention to the ComfyUI terminal as well as the web interface. I click the Generate button and, as you can see, it says "got prompt" and "Prompt executed in 0.0 seconds". Let me do it again and pay close attention: it says "Prompt executed in 0.0 seconds", and if I click it again it shows the same thing. This is because ComfyUI is very efficient: if the last seed is the same as the current seed, it won't execute anything - it just gives you the same output. That is the reason we keep getting "Prompt executed in 0 seconds": it isn't generating anything. If you want to see a new generation, go into workflow_api.json, scroll up until you find the seed under inputs, change the seed number to anything different, restart your Gradio server, and click Generate. Now you can see the ComfyUI terminal showing "got prompt" and a progress bar. Once the prompt has executed, if you go into your ComfyUI output directory you will see the generated image there. One thing to note: the web UI did show a queue size of one, if you paid attention, but you won't see the output image in the Save or Preview node. So our Gradio application is able to communicate with ComfyUI, but it isn't really doing much yet: all we have is a button that queues an image, and we still have to go into the JSON file to update any of the values. That is what we're going to fix next.

First, let's see how we can output the image directly in the Gradio application, so we don't have to go into the output folder to look for the latest generated image. Go back to app.py. Since we will be working with a folder - the ComfyUI output folder - we need a way to access it, so at the top I import os. I'm going to create a new function, just to keep things organized, called get_latest_image, and it takes a folder as input. Next I want to get all the files in that folder - the folder will be the output folder - and keep all the images from it. This next part may be a little complicated for beginners, but I'll explain it: it's a list comprehension where we go through all the files returned by os.listdir and keep only the ones with a .png or .jpeg extension, putting them in a list saved as image_files. The reason I'm doing this is that I want to order the files by their last-modified date: the os module provides a getmtime function, and using it as the sort key we can sort the list so the latest file comes last. I also make sure there is at least one image there: if the folder is empty - meaning the user has never generated an image - the latest image is set to None; otherwise I take the last image in the folder.
Then I use os.path.join to build the full file path: we join the folder - our base folder - with the last generated image's file name, and lastly I return this latest image. Going into the generate_image function, just after start_queue I call get_latest_image, and I have to pass in the path to my ComfyUI output folder. So I go to the top, create a variable called OUTPUT_DIR - the output directory - which is simply my ComfyUI output folder path, and use that variable in my get_latest_image call. There is a small mistake here, but I'll leave it in to show you what I mean. I close my Gradio server, restart it, and see what happens: when I click Generate it gets a new prompt and it completes, but in my Gradio frontend nothing happens. Let me add a print statement so you can see exactly what get_latest_image is returning. I close the server, restart it, go into the Gradio application, click Generate, and at the bottom you can see we have the file path. Let me change the seed to two, just so we get an actual generation, and look at the bottom: as soon as I click Generate I immediately get a file path, but the image has not finished generating yet. This means our application starts the queue, sends the request, and, without waiting for the image to generate, tries to grab the latest image - which is obviously the wrong one: it's grabbing the fifth image, but the latest image is number six.

To fix this, we have to tell our application what the last generated image was before sending the POST request, and then keep checking until we see a new image that does not match the previous one. So let me write the code: I create a variable called previous_image and use get_latest_image to get its file path - this runs before starting the queue. Then, after we've sent the queue over to ComfyUI, we start a while loop, and on each pass I try to get the latest image, so it just keeps checking whether there is a new image in the output directory. The check is simply whether this latest image is not equal to the previous image: the previous image is from before the queue, the latest image is from after the queue, and if the two are not the same, we know we have a new image. I return this new image, and since this is inside a function, the moment I return it breaks out of the while loop and out of the function as well. This while loop would otherwise run as fast as your computer allows, so to add a little delay I import time and call time.sleep for 1 second; this way the loop runs once per second. Let's close the server, restart the Gradio application, refresh the page, and click Generate. We can see we have a new prompt; on the right side the prompt got executed in 0 seconds, but our Gradio application is still showing as pending - it is stuck inside the while loop and will never break out, because the previous image will always equal the latest image, since ComfyUI did not execute and generate a new image. Again, this is because the seed number is the same as the previous run. So I'm going to close the Gradio server - I'll have to kill the terminal for this.
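
Here is a sketch of that polling approach, building on the earlier sketch. OUTPUT_DIR is a placeholder for your own ComfyUI output path, and os.path.getmtime is my reading of the "last modified" ordering described above:

    import os
    import time

    OUTPUT_DIR = r"C:\ComfyUI\output"   # adjust to your ComfyUI output folder

    def get_latest_image(folder):
        files = os.listdir(folder)
        # Keep only image files, then sort by last-modified time so the newest is last.
        image_files = [f for f in files if f.lower().endswith((".png", ".jpg", ".jpeg"))]
        image_files.sort(key=lambda f: os.path.getmtime(os.path.join(folder, f)))
        latest_image = os.path.join(folder, image_files[-1]) if image_files else None
        return latest_image

    def generate_image():
        with open("workflow_api.json", "r") as file:
            prompt = json.load(file)

        previous_image = get_latest_image(OUTPUT_DIR)   # snapshot before queueing
        start_queue(prompt)

        # Poll once a second until a file newer than the snapshot shows up.
        while True:
            latest_image = get_latest_image(OUTPUT_DIR)
            if latest_image != previous_image:
                return latest_image
            time.sleep(1)
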
I'm not going to fix this issue just yet; instead, I change the seed number to three, which is different, restart my server, refresh my Gradio application, and click Generate. This time you can see Gradio is waiting, and the terminal on the right shows that it got the prompt and is executing it right now. Once it completes we can see a new image, number seven, and the Gradio application shows the same image - if I open the file, it's the exact same image on both sides. So we were able to grab the newly generated image from the ComfyUI output folder and show it in our Gradio application.

In case you're wondering how to fix the issue of the repeated seed: you would have to go deeper and implement the same efficiency check that ComfyUI uses. Somewhere in your application you would cache the previous seed and then check whether the seeds are the same; if the seed was modified, go ahead and generate a new image, and if the seed is the same, just return and do nothing. However, even that implementation is flawed, because in ComfyUI you can modify anything after the KSampler - for example, if you modify the filename, which comes after the KSampler, ComfyUI will generate a new image and save it under the new name even if the seed is the same. That is why I'm not going to fix it: there are a lot of little things you would have to handle, and it's outside the scope of today's video, which is about building a Python API for ComfyUI - we're not making an efficient Python API here, just a working one.

So let's continue with example number two, where we have some inputs. I'll start by adding the positive prompt to our interface; right now it is bare - we only have the output image and a Generate button. Going back to app.py, in the interface you'll see we have inputs, and right now it's an empty list. Adding the positive prompt is very easy: all we have to do is say we want a text box as our input - I'm just going to put "positive" as the label. At the top you see that we load the workflow, and if we look at the workflow JSON it's pretty similar to a Python dictionary: we have node 3 here, and a little further down we see node 6, which has inputs, and inside inputs we have text, which is the positive prompt. If we modify this value, we'll be able to type any positive prompt and send a new POST request to ComfyUI using it. So let's implement it. Right now we load the JSON, which gets converted into a dictionary called prompt, so all I have to do is access prompt["6"]["inputs"]["text"] and overwrite it with prompt_text. This prompt_text comes from the text box we added, so I also rename the generate_image parameter to prompt_text: the text entered under inputs gets passed to generate_image as prompt_text and replaces the value here. So if I type "monster" as my prompt, "monster" gets substituted here. Let's test by closing the server, restarting it, and refreshing - if your Gradio application isn't refreshing, you can press Ctrl+F5 to force a clean refresh. We can see the prompt_text box, which is the positive prompt. Let's try "cat", click Submit, and we can see Gradio is waiting, ComfyUI says "got prompt", and the progress bar starts.
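
A sketch of that substitution, assuming the positive prompt lives under node "6" as in the workflow shown here (your node ids may differ), with the matching Gradio text input:

    def generate_image(prompt_text):
        with open("workflow_api.json", "r") as file:
            prompt = json.load(file)

        # Node "6" is the CLIPTextEncode node holding the positive prompt in this workflow.
        prompt["6"]["inputs"]["text"] = prompt_text

        previous_image = get_latest_image(OUTPUT_DIR)
        start_queue(prompt)
        while True:                         # same queue-and-poll logic as before
            latest_image = get_latest_image(OUTPUT_DIR)
            if latest_image != previous_image:
                return latest_image
            time.sleep(1)

    # One text input, passed to generate_image as prompt_text.
    demo = gr.Interface(fn=generate_image, inputs=["text"], outputs=["image"])
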
The image is generating, and there we go - we got a cat, and at the bottom we can see number eight, which is a new file. Going back to the workflow API JSON, you can do the same thing for the negative prompt, you can modify the filename prefix, and you can modify any of these values in the same way: access prompt with whatever node number you need. Let's do another example - say you want to modify the step count. We do prompt["3"] - that first node, number 3 - then go into inputs and then steps, and that needs to be set to something, so at the top of my function I add a step_count parameter and pass it in here. I also need to modify my interface: right now I have one text box, which refers to prompt_text, so I add one more, which will refer to the steps. Save it, close the server, restart it, refresh, and right away you can see I have my positive prompt at the top and the step count below it. I'll enter four, click Submit, and on the right side we can see the terminal taking that into account - four steps, and there we go. At this point you pretty much have all the knowledge needed to work with any text-to-image workflow: if you need to modify something, just add it as an input and change the corresponding value in the prompt dictionary.

Next I want to show you how to use an image as an input and do an image-to-image workflow. For that we go back to ComfyUI and load an image-to-image workflow. The only things you need to change for a basic image-to-image workflow are: delete the Empty Latent Image node and add a VAE Encode node; for its pixels input, drag it out and use a Load Image node; and for the VAE you can use the one built into the checkpoint or load a separate VAE. So I've loaded an image and changed my checkpoint; my VAE is there, the VAE Encode is there and connected to the latent input. I also need to modify the denoise - I'll set it to about 0.5 - and then I have the text prompt, which we've already seen. Once I have the workflow, I click Save (API Format), click Save As, and, because I want to keep using the workflow_api.json name, I replace my previous workflow JSON file with this new image-to-image workflow. If we open the workflow in Visual Studio Code, we can see the previous seed is there; I'm going to change the step count - I forgot to change it in the workflow - to 8, the CFG to 6.5, and the denoise is at 0.5, which is good. We have a text prompt, but we can modify the text prompt from the Gradio application; since I don't have a field for the negative prompt, I'll modify it directly here. If we go all the way down, we can see that node 11's inputs have image set to test.jpeg - this is our Load Image node, and it tells ComfyUI which file to use as the source for our image-to-image. Pay attention to the test.jpeg file name; this is important. For now I just want to make sure my workflow works correctly, so I close the server and start it once again using the new workflow_api.json file.
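
For reference, the step-count tweak and the Load Image node just mentioned look roughly like this; node numbers "3" and "11" are from this particular workflow, and the exact field layout of the LoadImage entry is an assumption, with the "image" value being the part that matters:

    def generate_image(prompt_text, step_count):
        with open("workflow_api.json", "r") as file:
            prompt = json.load(file)

        prompt["6"]["inputs"]["text"] = prompt_text       # positive prompt (node 6)
        prompt["3"]["inputs"]["steps"] = int(step_count)  # KSampler steps (node 3)
        # ...queue and poll as before

    # Two inputs now: the positive prompt and the step count.
    demo = gr.Interface(fn=generate_image, inputs=["text", "text"], outputs=["image"])

    # In the saved image-to-image workflow_api.json, the Load Image node appeared as
    # node "11"; the "image" value is the file name ComfyUI looks for in its input folder:
    #
    #   "11": {
    #       "class_type": "LoadImage",
    #       "inputs": { "image": "test.jpeg" }
    #   }
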
Go into the Gradio application, refresh, modify the prompt slightly, click Submit, and check my ComfyUI terminal: the step count is right, we got a new image, and comparing it against the input image we can see it is different. The denoising strength is maybe a little too low, but you can see details in the input that are not in the output, so it took this input image and generated a new image through an image-to-image workflow. Now, what we want is an input image box in Gradio where we can drag an image - or maybe take a picture with our phone - and then use that image in the Gradio application, send it over, and have ComfyUI generate a new image from it. Let's implement that.

Because we want to use an image as an input, we change the inputs in the interface to "image". If I close the server and restart it, you'll see right away that we now have a place where we can drag and drop images or click to upload, and if I opened this Gradio application on my smartphone I would be able to use my camera to take a picture. Our Gradio application doesn't yet know how to handle the image, so I modify generate_image to take an input_image parameter and work with that - we'll remove the positive prompt and step count inputs for now. Going all the way to the top where we have our import statements, we now make use of the two remaining libraries: I import numpy as np, and I do from PIL import Image. Once I have my workflow loaded, I use Image.fromarray and pass in the input image; this gives me my image. The reason for this is that when the Gradio application loads an image, it loads it as a NumPy array, and in order to get a proper image - a Pillow image - we have to convert it from an array into an image using the Pillow library. Once I have the image, I do a simple calculation to resize it down to around a 512-pixel resolution. The reason is that uploaded images may be of much higher resolution - for example, my smartphone can take pictures at 4K, and sending a 4K input image over to ComfyUI is very taxing on the system - so I want to scale it down to roughly 512 by 512. First I take the image and get its minimum dimension, which can be either the width or the height; then I compute 512 divided by that size to get a scale factor; then the new size is each side multiplied by that factor, which downscales the image. I'm calling this new image resized_image. Once I have the resized image, I need to save it - back in the workflow, test.jpeg is the name ComfyUI will look for, but since I'm making a video here I'll add an underscore API, test_api.jpeg, just so we know this is the new file. To save it we simply call resized_image.save and pass in ComfyUI's input folder - in my case the input folder inside the ComfyUI directory - plus the test_api.jpeg file name. This will save the file.
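
A sketch of that image-handling step: the 512-pixel target and the test_api.jpeg name follow the walkthrough, INPUT_DIR is a placeholder for your own ComfyUI input folder, and the Load Image node should point at the same file name:

    import numpy as np
    from PIL import Image

    INPUT_DIR = r"C:\ComfyUI\input"   # adjust to your ComfyUI input folder

    def generate_image(input_image):
        with open("workflow_api.json", "r") as file:
            prompt = json.load(file)

        # Gradio hands the uploaded picture over as a NumPy array; convert it to a PIL image.
        image = Image.fromarray(input_image)

        # Downscale so the shorter side is about 512 px; 4K phone photos are overkill.
        min_side = min(image.size)
        scale = 512 / min_side
        new_size = (round(image.width * scale), round(image.height * scale))
        resized_image = image.resize(new_size)

        # Save it into ComfyUI's input folder under the name the Load Image node refers to.
        resized_image.save(os.path.join(INPUT_DIR, "test_api.jpeg"))

        previous_image = get_latest_image(OUTPUT_DIR)
        start_queue(prompt)
        while True:
            latest_image = get_latest_image(OUTPUT_DIR)
            if latest_image != previous_image:
                return latest_image
            time.sleep(1)

    demo = gr.Interface(fn=generate_image, inputs=["image"], outputs=["image"])
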
The next step - getting the latest image - is already handled, and then we start the queue, which will use this new file as its input and do the rest. So I go down, close my server, restart my Gradio server, refresh the Gradio application, and choose an image - I'm using this bottle as the input image. If I look at my workflow API, the positive prompt is still the default "beautiful scenery nature glass bottle landscape, purple galaxy bottle" example. I click the Submit button, the Gradio application is waiting, the ComfyUI terminal shows "got prompt" with the progress bar, and we just wait for it to complete. So we had this input image, and it took it and turned it into... whatever this is. If I go into my ComfyUI installation folder, under ComfyUI/input, one of these files is test_api - the one saved by our Gradio application using the Pillow library. Essentially, what we did is take the Gradio image and save it into ComfyUI's input folder, and this will work even if it's a tablet or a phone taking the pictures.

On the right side here I've connected my phone to my computer so you can see its screen; I had to disable my camera for that. I close my Gradio server, go all the way down to demo.launch, and inside the launch function set share=True in order to use the Gradio share link. I restart the server - this time it may take slightly longer, because it's actually creating a public URL - and here I have this public URL. If I copy it, go to my mobile phone, open the browser, and paste the link, you can see we have the exact same Gradio application: at the top the input image, then the output. On the phone I tap the "drop image here" area; this asks for access to my camera as well as photos and videos. I tap Load, then Camera, which requests camera access, and now it's asking me to take a picture - so I take a picture, accept the photo, and now this is my input image. I have my ComfyUI terminal here; we click Submit, and as you can see it says "got prompt", with eight steps running, so the prompt got executed. On the right side it warns that Gradio may break, but that's okay, because at the bottom we can see the output.

I think this is a good stopping point for today. We saw how to use an API to connect a Gradio application with ComfyUI; we saw how to add different types of inputs - the positive prompt as a text input, as well as a step count, which is a number, although you can use a text input for that too; remember that we are sending the information across the network, so pretty much anything can be an input. We also did an image-to-image example, including one where we used a smartphone camera to take a picture and then used it as the image-to-image source to generate from ComfyUI. Hopefully this video was helpful and gave you the knowledge and the fundamentals you need to build your own project using Gradio. Thank you for watching this video until the very end - it was a very long one, so hopefully you got something out of it. Please consider liking the video, it helps me out, and subscribe to the channel if you want to watch more videos like this. I'll see you in the next one.
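
For reference, the public-link change mentioned near the end is a single argument on the existing launch call; a minimal sketch:

    # share=True asks Gradio to create a temporary public URL (startup takes a bit
    # longer while the tunnel is set up), so a phone on another network can reach the app.
    demo.launch(share=True)
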
Info
Channel: Code Crafters Corner
Views: 3,854
Keywords: python, api, gradio, comfyui, stablediffusion, aitutorial, Code Crafters Corner, Sharvin
Id: yspMVTL08Rc
Length: 38min 26sec (2306 seconds)
Published: Sat Nov 25 2023