ComfyUI Node Use Llama 3 To Help You Prompt Awesome Image And Animation

Video Statistics and Information

Video
Captions
Hello everyone. Today we are going to talk about the latest update of a ComfyUI custom node called ComfyUI IF AI Tools. We have covered this tool before: it lets you hook up any locally hosted large language model on your machine, and it now also supports LM Studio, OpenAI's ChatGPT, and a Claude 3 API key, so those language models can work together with the Stable Diffusion models we are always playing around with.

This ComfyUI node recently received a major update with many new features. They have added text-to-speech nodes, a Whisper node for speech-to-text, and a DreamTalker node, which I'm looking forward to testing in the next video; it is a talking-avatar feature that takes your audio plus one image and makes that character talk.

The core features of IF AI Tools were covered in previous videos on this channel, so you can check those out: how to install it, how to build a workflow, and how to let a large language model read an image and return Stable Diffusion style text prompts that you can use for images and videos. What's new is an Ollama-supported language model that I talked about on my large language model channel. Llama 3 was released just recently this month, and it is surprisingly capable for its size compared with competing AI models out there. The author has fine-tuned Llama 3 on a 50k dataset specifically for Stable Diffusion prompting, which is really cool, and I believe it can also work for other kinds of diffusion models: once CLIP Vision turns the image into text the model can understand, there should be no problem using the resulting prompts with other diffusion backends.

So today we are going to use that fine-tuned version of Llama 3, the SD prompt maker Q4_K_M quantized build. We copy the download command, run it in Ollama, and then update the config in IF AI Tools. I also see some new features coming up here, such as agent styles that change how the model replies to your prompts.

On our desktop, the first step is downloading this Llama 3 model with Ollama. Open a command prompt and type "ollama serve", which boots up the Ollama server. If you need guidance on installing Ollama and the basic setup, you can refer to my other videos, specifically the Ollama-installation-on-Windows video on my other large language models channel. Once the server is running, open a second command prompt to download and run the model. By default, you type "ollama run" followed by the ImpactFrames Llama 3 SD prompt model tag (copy the exact tag from the model page); this is the fine-tuned model by ImpactFrames, the author who created this model and the custom node we're discussing. You can also download plain Llama 3 in Ollama, but that one is different: that is the default Llama 3 released by Meta AI.
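If you prefer to script this first step, here is a minimal Python sketch of the same Ollama setup described above, using subprocess to call the Ollama CLI. The model tag in it is a placeholder, since the spoken name in the video is ambiguous; check the exact tag on the author's Ollama page before running anything like this.

```python
# Minimal sketch of the Ollama setup described above (assumes the `ollama`
# CLI is installed and on PATH). The MODEL tag is a placeholder -- verify the
# exact tag on the author's Ollama model page before using it.
import subprocess
import time

MODEL = "impactframes/llama3_ifai_sd_prompt_mkr_q4km"  # placeholder tag

# Start the Ollama server in the background (skip this if `ollama serve`
# is already running in another command prompt window).
server = subprocess.Popen(["ollama", "serve"])
time.sleep(3)  # give the server a moment to start listening

# Download and cache the fine-tuned model; only needed the first time.
subprocess.run(["ollama", "pull", MODEL], check=True)
```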
And of course you can check out my Llama 3 release videos where I go over the model itself; I'll link all of these videos in the description below. By default the smallest size is the 8-billion-parameter model; there is also a 70-billion-parameter model, and an upcoming 400-billion-parameter model will be released. Let's not focus on those larger models for now: realistically, the 8B model is sufficient for Stable Diffusion prompting. Just remember to use the command above to download the fine-tuned Llama 3 so it can work along with the ComfyUI IF AI prompt tools.

So copy the command, press Enter in the command prompt window, and the download of the Llama 3 model starts. This is only required the first time, when you don't have the files yet; afterwards you only need to boot up Ollama and run the fine-tuned Llama 3 by typing the same command. Once it is downloaded you don't even need to open a second window any more: running "ollama serve" in the background is sufficient when connecting with ComfyUI. Wait for the download to complete; once it's done you'll see the layers report as completed, and the model will start letting you ask questions for prompting, which indicates success. You can ask a simple question to test whether this large language model is working, for example "what is life", and see if it responds appropriately. OK, it's working, so we can close this command prompt window and bring back the other one; there you can see the one API request we just made logged.

Now let's go play with ComfyUI. This is the workflow from my previous video, and there is something new in the IF image-prompt section after the update: different personas that make the chat AI respond in different manners. For what we're doing we'll use the IF prompt maker IMG persona, which is specialized in prompting; essentially it pre-structures how the large language model will respond. I won't go too deep into the language-model side in these videos. Previously this node was using LLaVA; refresh the browser window and, yes, there we go, we get the newly listed models, so this should now be using the ImpactFrames Llama 3 IF AI fine-tuned model with the IF prompt maker IMG persona. The port number may sometimes differ from mine, because the first time I set this up it was using LLaVA, but it is auto-generated for you, so don't worry about that. You can also set the temperature here, and there is a randomized negative prompt option: you select what kind of negative prompt you want to use, for example the default "bad dream" style presets, or something simple like a landscape negative prompt if you're generating landscapes, to specify more details for that subject.

So that covers the recent updates of the ImpactFrames AI tools; let's generate once. First make sure you are connected to the right Ollama instance: scroll back to the beginning of the "ollama serve" output and you'll see a line from the system saying it is listening on your local IP and a port number; copy that address and port.
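Incidentally, that IP and port are the same address the ComfyUI node talks to. Instead of typing a test question into the second command prompt, you can also hit the running server directly over HTTP with a quick sketch like the one below, assuming Ollama's default address 127.0.0.1:11434 and the placeholder model tag from before.

```python
# Sketch: ask the locally served model for an SD-style prompt over Ollama's
# HTTP API (default http://127.0.0.1:11434). MODEL is a placeholder tag.
import json
import urllib.request

MODEL = "impactframes/llama3_ifai_sd_prompt_mkr_q4km"  # placeholder tag
payload = {
    "model": MODEL,
    "prompt": "a futuristic megastructure in an urban city, seen from the top",
    "stream": False,  # return one JSON object instead of a token stream
}
req = urllib.request.Request(
    "http://127.0.0.1:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])  # the generated prompt text
```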
Now go to your custom nodes and paste that IP address and port number into the IF image-to-prompt node. Once you select the engine, it will show you the large language models already downloaded in your local instance; again select the fine-tuned Llama 3 model, and choose the IF prompt maker IMG profile (the plain IF prompt maker works too, both are fine, but I like the IMG one for making text-to-image prompts). The image-to-prompt node is the one that takes an image input: the language model runs the picture through CLIP Vision, understands what is in the image itself, and then generates a prompt based on it, so that is the one we use in the second step.

Again copy the IP address and the port number. This node is the basic one that goes from text to an image prompt, and it's going to be a fantasy-art prompt. Update all of these fields, because the values shown were saved by the previous version of this custom node, so you have to re-select them in the updated fields. You have many other styles as well; let's choose "epic", I'd like to see how that turns out. For the style prompt pick fantasy art, and for the negative prompt you can pick something matching, like fantasy art as well. Set the temperature to around 0.7, which is good enough; if you set it too high the large language model gets overly creative with its response, so just set it to 0.7 (there is a small illustration of what temperature does further below). For max tokens, keep the defaults; whatever values are there are good enough. For the seed number you can just click randomize and it will generate a random seed; if you want to stay consistent with the same style, you of course need a fixed seed instead. For this demo I'll just click randomize. There is also a "random temperature" toggle that, when enabled, uses randomized values based on the seed; I'll keep it off for this first run.

Let's try to run this, and disable the other custom node groups so we focus only on this part first. The default text prompt is "an ancient megastructure, small lone figures in the foreground"; maybe we can try something different: a futuristic megastructure, in futuristic urban cities, the tallest building in those cities, seen from its top. And since we are doing a fantasy-art prompt, it will lean into futuristic styles. As you can see, I'm typing a very simple sentence without any Stable Diffusion style prompt tricks; the node will turn it into a proper text prompt that we'll use to generate an image in the second step later. Click run and see what we get. In the meantime you can check the second command prompt window, which will be streaming something like this; keep that window running in the background, and remember to have your ComfyUI on as well, so basically you need two command prompts running at the same time plus the ComfyUI web browser working in front. Once generation finishes, the result shows up: "best quality, masterpiece, epic cover art, a futuristic megastructure towering above..." and so on, appearing in the show-text node here.
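Here is the temperature illustration mentioned above: a generic toy example of the standard softmax-with-temperature idea with made-up scores, not IF AI Tools or Ollama internals, just to make the "0.7 is a sensible middle ground" advice concrete.

```python
# Toy illustration of sampling temperature (generic, not IF AI Tools code):
# low temperature sharpens the distribution, high temperature flattens it.
import math

def softmax_with_temperature(logits, temperature):
    scaled = [x / temperature for x in logits]
    m = max(scaled)                               # subtract max for stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.2]                          # made-up scores for 3 tokens
for t in (0.2, 0.7, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(f"temperature {t}: {[round(p, 3) for p in probs]}")
# At 0.2 the top token dominates (predictable), at 1.5 the choices are nearly
# uniform (overly creative); ~0.7 is a reasonable middle ground for prompting.
```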
As you can see, I also got an error message, but that is fine: I haven't connected the text string output to anything yet. We need to connect this string to the second part, the sampling group that does Stable Diffusion text-to-image. It is a very basic text-to-image process: we load the checkpoint and so on, all of which is explained in my previous videos about this workflow, so I won't go too deep into the text-to-image steps here; I assume you are already familiar with ComfyUI. What we do is connect the text string to the positive conditioning (there is a small API-format sketch of this wiring further below). That is how the IF prompt-to-prompt node is used to create images from plain English words: it generates a full text prompt for you, so it becomes very easy to get a good image, as if an AI assistant were helping you inside ComfyUI. The more information you provide, the more detailed and descriptive the prompt it writes, and it will create some stunning images for your videos, storytelling content, or whatever you're making.

Lastly, we have the SVD group, Stable Video Diffusion. I don't particularly like Stable Video Diffusion, but let's try it. The first thing is to fix the seed number; if you don't want to regenerate the image again, a fixed seed lets the same image carry into the SVD group. I hadn't done that before the second run, so it generated one more time, and actually this one looks even better, with more content in the image and different buildings around the tower. Let's see what happens in the Stable Video Diffusion pass and hope it's not just camera-motion video; otherwise we can take this image and the IF AI Tools text prompt over to AnimateDiff to create some motion there instead. OK, the video is generated, and this one looks better: very reasonable movement, not just a moving camera. The building actually moves, the clock at the top holds together (SVD often deforms elements like that), and the structure stays very consistent on every frame without breaking on the small details, though only the clouds are moving, drifting in the other direction. Still, it looks better.

We can try something else. Here I have a morphing workflow that comes from an OpenArt community post; credit and a shout-out to its author, ipiv. He created this workflow for animations: very simple, it takes a few images and creates morphing transitions from one image to the next. We can do similar things in this workflow, though I have added a little to the original: we use the QR Code motion ControlNet for the movement, and for the samplers I switched to KSampler Advanced with a second resampling pass, which is a bit different from the original download, which uses two KSamplers. Just edit it yourself if you want; I won't go further into that, and if you have questions about the original workflow you can always ask its author. Anyhow, we'll continue with the large language model prompt assistance; I'd call this the prompt-assistance group.
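Here is the API-format sketch of that text-string wiring mentioned above, for readers who drive ComfyUI from a script rather than the GUI. The "IF_PromptMkr" class name and the checkpoint filename are placeholders (export your own workflow with "Save (API Format)" to see the real class_type and node ids), and the graph is deliberately trimmed to the nodes relevant to the connection, so it is illustration only.

```python
# Rough sketch of the GUI wiring in ComfyUI API-format JSON: the prompt node's
# text output feeds CLIPTextEncode's "text" input, i.e. "convert widget to
# input" plus dragging the noodle. "IF_PromptMkr" and the checkpoint filename
# are placeholders; the graph is trimmed, so don't queue it as-is.
import json
import urllib.request

workflow = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "juggernaut_sd15.safetensors"}},      # placeholder file
    "2": {"class_type": "IF_PromptMkr",                                  # placeholder class
          "inputs": {"input_text": "a futuristic megastructure, seen from the top"}},
    "3": {"class_type": "CLIPTextEncode",
          # ["2", 0] = output slot 0 of node "2" (the generated positive prompt);
          # ["1", 1] = output slot 1 of node "1" (the CLIP model).
          "inputs": {"text": ["2", 0], "clip": ["1", 1]}},
}

req = urllib.request.Request(
    "http://127.0.0.1:8188/prompt",                                      # default ComfyUI address
    data=json.dumps({"prompt": workflow}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req)  # left commented out: the trimmed graph is illustration only
```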
We are going to copy this group and bring it over here. So now I have the prompt-assistance group created, and it will control all of the work here: the data flows from our positive prompt and negative prompt outputs into our conditioning. In the conditioning nodes we don't need the typed text prompt any more; we convert the text widget to an input, just like in my workflow here, for both the positive and the negative. Then we connect the string values: the positive string goes to the positive prompt, and the negative string goes to the negative prompt. That is how easy it is to integrate the IF AI Tools custom nodes into existing workflows that use text prompts.

I'll keep the same text, the futuristic megastructure, and we'll create an animation like this one, except with a megastructure instead of a house, and with movement even better than Stable Video Diffusion, because, as subscribers know, I don't like Stable Video Diffusion that much. Don't get me wrong, I will use it, just not too often. So click here and we'll have four images generated; keep the pixel size small, so width and height are small, giving us four small images. Each of them is taken from this batch image list and routed to its own IPAdapter: we have four images, therefore four IPAdapter batch nodes, each handling one image, and each is assigned to a range of frame numbers via masks. The batch size here is 96, so we have 96 empty latent frames, and those four images get filled into those 96 frames. The arrangement of image frames is processed by the Fade Mask Advanced custom node, which drives the animation together with the QR Code motion ControlNet, turning it into an animation like the lightning one I did previously.

Click run and you will see what happens: we get four different masks across the 96 frames, each shown in a different color, representing which part of the frame range is handled by which IPAdapter; the second one covers the next section of frames, and so on. So the four images will animate from the first to the second, then the third, then the fourth, and we can expect smooth transitions between them (a small sketch of this fade-mask idea follows below). This is the first sampling output from the KSampler, and it comes out pretty fast, faster than video-to-video animation, because there is less to generate; it is very straightforward to use four images to produce motion across those 96 frames. The second output comes after the hires detailer, the second sampler. The third is the upscale; I'm using just a very simple, basic upscaler, and we can see the quality is better. These dimensions are actually really good for formats like YouTube Shorts or TikTok short videos; I see a lot of trending videos using this kind of motion-animation style rather than restyling other people's footage into AI-styled videos with video-to-video.
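Returning to the fade masks mentioned above, here is a small, generic Python illustration of the idea, not the Fade Mask Advanced node's actual implementation: each of the four images gets a triangular weight curve across the 96 frames, so at any moment at most two adjacent images are active and one crossfades into the next.

```python
# Generic illustration of per-image fade weights across 96 frames (not the
# actual Fade Mask Advanced code): each image peaks at its own frame and
# fades out by the neighbouring image's peak, producing a smooth morph.
NUM_FRAMES, NUM_IMAGES = 96, 4
span = (NUM_FRAMES - 1) / (NUM_IMAGES - 1)              # distance between peaks
peaks = [i * span for i in range(NUM_IMAGES)]           # ~0, 31.7, 63.3, 95

def weight(image_index, frame):
    # Triangular ramp: 1.0 at this image's peak, 0.0 at the neighbours' peaks.
    return max(0.0, 1.0 - abs(frame - peaks[image_index]) / span)

for f in (0, 16, 32, 48, 64, 80, 95):
    print(f"frame {f:2d}:", [round(weight(i, f), 2) for i in range(NUM_IMAGES)])
# At every frame the non-zero weights belong to at most two adjacent images,
# which is what the coloured mask ranges in the workflow correspond to.
```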
And that is the third output; as you can see it's way better, with more detail and higher definition, everything improving at each stage. The last stage is the frame interpolation (VFI) node, which multiplies the number of frames and makes the motion between them smoother. Let's wait for this one and see the result. OK, all four results are generated, and the final one looks like a busy city: all the lights turn on and off, with fast, time-lapse-style movement, like a sped-up recording of the city's development. So that whole thing is generated from a list of four images, and we used only one sentence to drive it; the large language model is the fine-tuned Llama 3 SD-prompt model, and the checkpoint is Juggernaut SD1.5. Actually, we could use something other than Juggernaut to see the difference in styles. You can see how easy it is to create prompts, make animations, and create images like this one, and even use this to streamline Stable Video Diffusion, as we talked about. Those commercial AI video generators that use Stable Video Diffusion as the back end are mostly just trying to offer a nicer user interface; I don't think we need them any more if we can build a convenient workflow ourselves, run it locally, and do better than the service I'm talking about, with no restrictions or anything like that.

So there you go: this is the new update of the IF prompt-to-prompt custom nodes, and it also works with the image-to-prompt nodes in the same custom node package. There are more things coming up from the author, and next I'm going to test the DreamTalker custom nodes, so I will see you in the next one. By the way, check out the video featured on the GitHub page; it's one of the videos I did before, it's right here somewhere... yes, right here. Check it out if you have not installed the tools yet: go through those videos first, then come back to the current ones for the updates. See you guys in the next videos, have a nice day, bye.
Info
Channel: Future Thinker @Benji
Views: 4,665
Keywords: ComfyUI, custom node, ComfyUI IF AI Tools, Llama 3, prompt, image, animation, text-to-speech, Whisper node, DreamTalker node, fine-tuned model, Sd_prompter_maker_Q_4_K_M, download, setup, integration, beginner, experienced user, creative, stunning, comfyui, ai art, ai, learn comfyui, animatediff comfyui, Llama 3 ollama, LLM With Stable Diffusion, ComfyUI Node Use Llama 3
Id: yR2Y9G71w6E
Length: 23min 7sec (1387 seconds)
Published: Fri Apr 26 2024