ComfyUI - Learn how to generate better images with Ollama | JarvisLabs

Captions
Hi everyone. I'm Vishnu Subramanian, founder of JarvisLabs. In this video, we are going to see how anyone can create beautiful images without actually knowing much about prompts. To do this, we are going to use something called IF prompt. We'll get to the technical details a bit later, but before that, let's see what can be done here. Right here, all I am giving is the word "woman", which is probably the most used word for generating images. See how the prompt is getting enhanced, and how the model is able to generate a really good-looking woman without any of the fancy prompts that we would otherwise be giving. And there are different ways: you can say, for example, "neonpunk", and it gives a completely different style of image. Just with certain tweaks, we can control how the image is generated.

How you can achieve this on your own instance is what we'll be looking at in this video. Before we do this, we need to understand an important tool called Ollama, which is an open-source tool. If you are using JarvisLabs, it already comes as a template, so you can just click and point it to your local UI tool, which I'll probably be making another video about soon. But in this video, we will not be using the Ollama template; rather, we will be using the ComfyUI instance and installing Ollama in it. Right. The actual steps for that I'll leave in the description so that you can try it. If you're using your own workstation, the steps should be similar, or you can go to the Ollama website. They have steps for Microsoft Windows, Apple Mac, and Linux machines. We provide Linux machines in JarvisLabs, so you can use these two steps to get Ollama installed on your cloud instance. Once you do that, you have to start ollama serve, which actually starts a server. So let's do that.
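
Once the server has been started, it can help to confirm it is reachable before wiring anything into ComfyUI. The following is a minimal sketch, assuming Ollama is listening on its default address `http://localhost:11434`; the function name `ollama_is_running` is hypothetical and only for illustration.

```python
import urllib.error
import urllib.request

# Assumption: Ollama listens on its default address; change this if yours differs.
OLLAMA_URL = "http://localhost:11434"


def ollama_is_running(base_url: str = OLLAMA_URL, timeout: float = 2.0) -> bool:
    """Return True if an HTTP server answers with status 200 at base_url."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a malformed URL all count as "not running".
        return False


# Usage (run on the machine where the server was started):
#   print(ollama_is_running())
```

If this returns False, the server is most likely not started yet or is listening on a different port.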
And we will also need to run a particular model, or rather download one. For example, in this case I'm downloading something called gemma:2b. Just to make things quicker, I've already downloaded this model so that we don't waste time. Once I'm here, I can talk to it, like how we interact with a platform like ChatGPT: "Can you give me a prompt for generating a beautiful mountain?" If you look at the response for this, it's not very specific to how we use prompts for Stable Diffusion. So that's where we use another tool, a node for ComfyUI called IF AI tools, which is amazing work done, probably, by an individual contributor. It's open-source software. What it allows us to do is use Ollama, and the multiple LLM models available in it, with our ComfyUI software.

So now let's get rid of this by clearing the workflow, and let's start with the default one. Before we do this, let's move all these things to the side so that we have enough space to build. Let's pick a model. I've been playing with the Juggernaut model. Sorry, it is SDXL. Let's use this. If you're doing it for the first time, you also have to install these custom nodes. For this, you can go to Install Custom Nodes and search for "IF ComfyUI". It's already there; in my case, I have already installed it. This one is the ComfyUI IF AI tools. You install it and restart. You can do it from ComfyUI Manager as usual; if you're using a JarvisLabs instance, ComfyUI Manager is pre-installed for you. Once it's done, let's go and add the IF AI tools node so that we can talk to Ollama's different models. It's part of ImpactFrames.
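
The same kind of chat with the model can also be done from a script through Ollama's HTTP API, which is what a ComfyUI node has to talk to under the hood. Here is a minimal sketch, assuming the server runs at the default `http://localhost:11434` and that `gemma:2b` has already been downloaded; the helper names `build_generate_request` and `ask_ollama` are hypothetical.

```python
import json
import urllib.request

# Assumption: default Ollama address.
OLLAMA_URL = "http://localhost:11434"


def build_generate_request(prompt: str, model: str = "gemma:2b") -> dict:
    """Build the JSON body for Ollama's /api/generate endpoint."""
    # "stream": False asks for one complete JSON reply instead of a token stream.
    return {"model": model, "prompt": prompt, "stream": False}


def ask_ollama(prompt: str, model: str = "gemma:2b", base_url: str = OLLAMA_URL) -> str:
    """Send the prompt to a running Ollama server and return the generated text."""
    body = json.dumps(build_generate_request(prompt, model)).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/api/generate",
        data=body,  # supplying data makes this a POST request
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.loads(resp.read())["response"]


# Usage (needs the server running and the model downloaded):
#   print(ask_ollama("Can you give me a prompt for generating a beautiful mountain?"))
```

As in the terminal session, the raw answer tends to be conversational rather than a ready-made Stable Diffusion prompt, which is the gap the IF AI tools node fills.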
Let's start with something called IF Prompt to Prompt. Okay, so we added a node. This node is talking to localhost on port 11434, which is the default port on which Ollama runs. Let's pick a model called gemma, and let's leave everything else as it is. Here, we are not going to type the positive prompt or negative prompt ourselves. So we'll right-click and say "convert text to input". Once we do this, we can connect the text input here. And we'll do the same for the negative prompt. Let's minimise these things so that we have enough space for playing around. I think this should be enough for us to generate the image. And let's bring it here so we can see things.

So this is very simple, right? But what if you want to understand what is happening? To do that, there is another node in this ImpactFrames pack, called IF Display Text. There is also another node called Save Text, which you can use if you want to save the prompts that are being generated. So let's connect this to the response and this to the negative. And let's also bring in one more node: ImpactFrames, IF Display Text. Let's connect this to the question. Now let's run the prompt again. Now we can see the positive prompt that is being generated. Okay, let's remove this and just say "cat": give me an image of a cat. What is happening here is that the Gemma model and the node are applying certain style settings to the prompt, and also adding a lot more detail to the prompt that we have. That is what makes the generated images look so beautiful.
So we are able to create a lot more professional images, or at least it gives us a starting place. You can probably remove the words that you don't want. These are not perfect, but this gives you a better place to start. But what if I don't want to start with a prompt? What if I want to start with an image? To do that, we will not use this particular node, Prompt to Prompt; instead, we will use Image to Prompt. Okay, now we pick this. Let's delete this. And let's do the connections as usual. Let's keep this a bit far away for now. Since we have seen how the prompts are getting changed, let's remove them. And let's connect the positive prompt here and the negative prompt here. So I have to expand this, then connect this to the positive prompt. Now let's minimise this, expand this, and connect it to the negative prompt. Once this is done, we can attach an image to this. Let's say Load Image, and let's use the car image that we have. Let's see if we're able to generate the image, which I don't think we will.

Okay. The reason it doesn't work is that not all models are good at understanding images. To do that, we have to use a model called llava. It is a multimodal LLM. LLMs are usually single-modal, that is, they understand only text. llava is capable of understanding images as well. You should also be able to run it with the Ollama tool, which is an open-source tool that allows us to run multiple large language models, or other large multimodal models like llava. You can also run them on your laptop. So now we have this image, and we have chosen the model as llava. Let's see if the workflow is able to generate images of a car. All right, so now we got the image of a car. That's it, guys. Hope you enjoyed the workflow.
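
The image case works the same way at the API level, except that the picture travels along with the request as base64 text and the model has to be one that understands images, such as llava. Below is a minimal sketch, assuming the default address `http://localhost:11434` and that `llava` has been downloaded; the helper names, the question text, and the file name `car.png` are hypothetical.

```python
import base64
import json
import urllib.request

# Assumption: default Ollama address.
OLLAMA_URL = "http://localhost:11434"

# Hypothetical question; any wording that asks for a description would do.
DEFAULT_QUESTION = "Describe this image as a detailed prompt for Stable Diffusion."


def build_image_request(image_bytes: bytes, model: str = "llava",
                        question: str = DEFAULT_QUESTION) -> dict:
    """Build an /api/generate body that carries one base64-encoded image."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {"model": model, "prompt": question, "images": [encoded], "stream": False}


def image_to_prompt(path: str, model: str = "llava", base_url: str = OLLAMA_URL) -> str:
    """Read an image file, send it to a running Ollama server, return the description."""
    with open(path, "rb") as f:
        body = json.dumps(build_image_request(f.read(), model)).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=300) as resp:
        return json.loads(resp.read())["response"].strip()


# Usage (needs the server running, llava downloaded, and an image on disk):
#   print(image_to_prompt("car.png"))
```

A text-only model such as gemma has no use for the image data, which matches the failed first attempt in the walkthrough.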
If you face any challenges, you can write them down in the comments and we will try to answer. Or you can also join our Discord group. Please hit the like button and subscribe to our channel for more such videos. Thank you. Bye bye.
Info
Channel: JarvisLabs AI
Views: 4,239
Id: i9DFP9W66bM
Length: 9min 26sec (566 seconds)
Published: Sat Apr 13 2024