Hi everyone. I'm Vishnu Subramanian, founder of JarvisLabs. In this video, we are going to see how anyone can create beautiful images without knowing much about prompting. To do this, we are going to use something called IF Prompt to Prompt. We'll get to the technical details a bit later, but before that, let's see what can be done here.
Right here, all I am giving is the word "woman", which is probably the most commonly used word for generating images. See how the prompt gets enhanced, and how the model is able to generate a really good-looking image of a woman without any of the fancy prompts that we would otherwise be writing. And there are different ways to steer it: you can add a style like neonpunk, for example, and it gives a completely different style of image. Just with small tweaks, we can control how the image is generated. How you can achieve this on your own instance is what we will be looking at in this video.
Before we do this, we need to understand an important tool called Ollama, which is an open source tool. If you are using JarvisLabs, it already comes as a template, so you can just click and point it at your local UI tool; I'll probably be making another video on that soon. But in this video, we will not be using the Ollama template. Instead, we will use the ComfyUI instance and install Ollama in it. I'll leave the actual steps for that in the description so that you can try it.
If you're using your own workstation, the steps should be similar, or you can go to the Ollama website; they have instructions for Windows, Mac and Linux machines. We provide Linux machines at JarvisLabs, so you can use the Linux steps to get Ollama installed on your cloud instance. Once you do that, you have to start ollama serve, which brings up the server, so let's do that. We will also need to download and run a particular model. In this case, I'm downloading one called gemma:2b. Just to make things quicker, I've already downloaded this model so that we don't waste time.
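For reference, these are roughly the commands involved on a Linux machine; the installer line is the one from the Ollama website, and the exact model tag may vary, so treat this as a sketch:

    # Install Ollama on Linux (one-line installer from the Ollama website)
    curl -fsSL https://ollama.com/install.sh | sh

    # Start the Ollama server (it listens on localhost:11434 by default)
    ollama serve

    # In a second terminal, download the model we will use
    ollama pull gemma:2b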
Once I'm here, I can talk to it, just like how we interact with a platform like ChatGPT: can you give me a prompt for generating a beautiful mountain? If you look at the response, it's not very specific to the way we write prompts for Stable Diffusion.
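If you want to try the same thing from the terminal, this is roughly what it looks like; you can either chat interactively or call the Ollama REST API directly, and the question is just an example:

    # Chat with the model interactively
    ollama run gemma:2b
    # then type: Can you give me a prompt for generating a beautiful mountain?

    # Or send a single request to the Ollama API
    curl http://localhost:11434/api/generate -d '{
      "model": "gemma:2b",
      "prompt": "Give me a prompt for generating a beautiful mountain",
      "stream": false
    }'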
That's where we use another tool, a set of custom nodes for ComfyUI called IF AI tools, which is an amazing piece of work by an individual contributor. It's open source software. What it allows us to do is use Ollama, and the multiple LLMs available through it, with our ComfyUI software. So here is what we will do: let's clear the workflow and start with the default one. Before we do this, let's move everything to the side so that we have enough space to build. For the model, I've been playing with the Juggernaut model, which is an SDXL model, so let's use that.
If you're doing this for the first time, you also have to install the custom nodes. For that, you can go to Install Custom Nodes and search for "IF AI tools"; it's already there. In my case, I have already installed it; the pack is called ComfyUI-IF_AI_tools. You install it and restart, and you can do all of this from the ComfyUI Manager as usual. If you're using JarvisLabs, the ComfyUI Manager comes pre-installed for you.
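If you would rather install the node pack manually, the usual ComfyUI pattern is to clone it into the custom_nodes folder and restart; I believe the repository lives at the URL below, but double-check the exact name on GitHub:

    # Manual install of the node pack (repository URL assumed; verify on GitHub)
    cd ComfyUI/custom_nodes
    git clone https://github.com/if-ai/ComfyUI-IF_AI_tools
    # Restart ComfyUI so the new nodes are picked up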
Once it's done, let's add the IF AI tools node so that we can talk to the different models available in Ollama. It's listed under ImpactFrames. Let's start with the node called IF Prompt to Prompt. So we've added the node. This node talks to localhost on port 11434, which is the default port Ollama runs on. Let's pick the gemma model and leave everything else as it is.
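If the node can't reach Ollama, a quick sanity check is to query the server on that port yourself; these are standard Ollama endpoints, and I'm assuming the node talks to this same server under the hood:

    # List the models Ollama has available locally
    curl http://localhost:11434/api/tags

    # Send a minimal generation request to the same server
    curl http://localhost:11434/api/generate -d '{"model": "gemma:2b", "prompt": "hello", "stream": false}'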
Then, instead of typing the positive and negative prompts ourselves, we right-click on the prompt box and choose "convert text to input". Once we do this, we can connect the generated text to the input here, and we do the same for the negative prompt. Let's minimise these nodes so that we have enough space to play around. I think this should be enough for us to generate the image.
Let's bring the output over here so we can see things. This is very simple, but what if you want to understand what is actually happening? For that, there is another node in the same ImpactFrames pack called IF Display Text. There is also a Save Text node, which you can use if you want to save the prompts that are being generated. Let's connect one display node to the response and one to the negative, and let's bring in one more IF Display Text node and connect it to the question.
Now let's run the prompt again. This time you can see the positive prompt that is being generated. Let's remove this and just say "cat": give me an image of a cat. What is happening here is that the Gemma model, through the node, is applying settings and styles to the prompt and adding a lot more detail to the prompt we gave. That is what makes the generated images look so good. We get much more professional-looking images, or at least a starting place, and you can always remove the words you don't want. These prompts are not perfect, but they give you a better place to start.
But what if I don't want to start with a prompt at all, and I want to start with an image instead? To do that, we will not use the IF Prompt to Prompt node; instead, we will use IF Image to Prompt. Let's pick that, delete the old node, and make the connections as usual. Let's keep this a bit further away for now. Since we have already seen how the prompts get changed, let's remove the display nodes. Now let's connect the positive prompt here and the negative prompt here: I expand this node and connect it to the positive prompt, then minimise it, expand the other one and connect it to the negative prompt. Once this is done, we can attach an image to the node.
Let's add a Load Image node and use the car image that we have, and see if we are able to generate the image. I don't think so. The reason it doesn't work is that not all models are good at understanding images. For that, we have to use a model called llava. Most LLMs only understand text; llava is a multimodal model, so it is capable of understanding images as well. You can run it with the same Ollama tool, which is an open source tool that lets us run many large language models, as well as multimodal models like llava, even on your laptop.
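To follow along, you need llava pulled into Ollama first. If you want to test it outside ComfyUI, the Ollama API accepts base64-encoded images for multimodal models; car.png here is just a placeholder for your own image, and base64 -w0 is the Linux (GNU) variant:

    # Download the multimodal model
    ollama pull llava

    # Optional: send an image to llava through the Ollama API
    curl http://localhost:11434/api/generate -d "{
      \"model\": \"llava\",
      \"prompt\": \"Describe this image as a detailed Stable Diffusion prompt\",
      \"images\": [\"$(base64 -w0 car.png)\"],
      \"stream\": false
    }"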
So now we have the image loaded and we have chosen llava as the model. Let's see if the workflow is able to generate an image of a car. All right, now we have got the image of a car. That's it, guys, I hope you enjoyed the workflow. If you face any challenges, write them in the comments and we will try to answer, or you can also join our Discord group. Please hit the like button and subscribe to our channel for more such videos. Thank you. Bye bye.