ComfyUI for Everything (other than stable diffusion)

Captions
What can you do with ComfyUI other than running Stable Diffusion? I have collected almost 30 different use cases: things like image-to-text, how to create a caption, how to create sound effects directly from an image, and some simpler functions like effects, filters, or quick image enhancements. I will show all of the things I found myself using over the last month and how you can make your workflow even more complex with these functions, so you don't need to go to any other software besides ComfyUI.

Let's start. The first one is LLaVA. If you don't know what LLaVA is, it's basically an image-to-text model that can understand what is happening in an image, and you can ask questions about that image. We have our main LLaVA module here; you need to download the model files to be able to use it. We can write our prompt in this part and connect our image. Let's say we want to use this image, and we ask something like "describe the location of the image and what is happening." If we run it, we get something like: "The image features a serene scene with two small wooden cabins situated at the edge of the forest. They are positioned next to each other, overlooking a picturesque lake or pond." It's pretty accurate, and you can also ask other, more detailed questions, like what the buildings are made out of ("the buildings are made out of wood"), or we can try a different image; there is really no limit to what you can ask. Let's load this image now and ask, say, what the style of the image is: "described as a structure with many windows and a unique design, it appears to be a residential or commercial building, possibly office space" — and the answer gets cut off, because the maximum tokens is set to 40. We can increase it, maybe to 300 tokens, and then it will (or may) create a longer output; it changed itself to 200, so apparently 200 is the maximum, and now we get a much longer output. With the temperature you can control how creative the model should be.
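The max-tokens setting simply truncates the output, while temperature rescales the model's output distribution before sampling. A tiny NumPy sketch of temperature scaling (this is the general principle behind the widget, not code from any ComfyUI node):

```python
import numpy as np

def temperature_softmax(logits, temperature):
    """Scale logits by 1/temperature before the softmax.

    Lower temperature -> sharper, more deterministic distribution;
    higher temperature -> flatter, more "creative" sampling.
    """
    z = np.asarray(logits, dtype=float) / temperature
    z -= z.max()                 # subtract max for numerical stability
    p = np.exp(z)
    return p / p.sum()

logits = [2.0, 1.0, 0.5]                    # toy next-token scores
cold = temperature_softmax(logits, 0.2)     # nearly one-hot
hot = temperature_softmax(logits, 2.0)      # much closer to uniform
```

At temperature 0.2 almost all probability mass lands on the top token; at 2.0 the alternatives stay plausible, which is why higher temperatures read as more creative.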
That's the first one; let's go to our second workflow. In the second module I collected a couple of different ways to remove the background of objects from any image. In this case I will use this picture. In the first two workflows we don't have control over which object to keep and which to remove: the model decides on its own what the main element of the image is, and removes the background around it. In the second one we can choose between a couple of different models; some are for general purposes, as you can see, and this one, for example, is focused on human segmentation, which can be nice. Let's keep this one for now. The third workflow is slightly different from the first two, because in it we can actually prompt for what we want to keep in the image. Let's say we want to keep this armchair: we type "armchair" and generate. The first one decided to keep the armchair plus this part of the coffee table and the frame in the background; the second kept just the armchair with this piece of wood; and the third tried to keep only the armchair, but it also removed this part. We can try to fix that with the threshold value, or prompt it with more detail, but as we can see, quality-wise it is not as good as the first two. It is more flexible, though, so it's up to you which one makes more sense for which use case. Let's try to keep the poster on the wall.

One of the main reasons I really like ComfyUI is that it's just an empty canvas, a tool for us; you can do almost everything with it. If you want to use it to take notes or create mood boards, that's not really possible in ComfyUI, but its brother Scrintal can help you with that. Scrintal is a visual note-taking platform where you can place cards on your board and connect ideas together to document them better. Inside each card we can add lists, images, PDFs, videos, and a few more options as well.
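The threshold knob on the prompt-based background remover is easy to picture: segmentation models output a soft per-pixel confidence map, and the threshold decides which pixels survive into the alpha channel. A minimal stand-in (plain NumPy; not the node's actual code):

```python
import numpy as np

def mask_from_probs(probs, threshold=0.5, hard=True):
    """Turn a soft segmentation map (values 0..1 per pixel) into an
    8-bit alpha mask. Raising the threshold keeps only pixels the
    model is confident about - the knob the workflow exposes."""
    probs = np.asarray(probs, dtype=float)
    if hard:
        return (probs >= threshold).astype(np.uint8) * 255
    # soft matte: keep the confidence itself as the alpha value
    return (probs * 255).astype(np.uint8)

soft = np.array([[0.9, 0.4],
                 [0.6, 0.1]])          # toy 2x2 confidence map
alpha = mask_from_probs(soft, threshold=0.5)
```

Tightening the threshold to 0.7 would also drop the 0.6 pixel, which is exactly the "it removed this part" behavior: borderline pixels fall on one side or the other of the cutoff.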
They have lots of templates for different kinds of use cases, like research for a blog post, or how to take lecture notes depending on the topic. Here are my notes for this video on the board; you can see it's really similar to ComfyUI and super flexible in what you can do with it. I have placed all of the resources, materials, and custom extensions I used in the video on this Scrintal board, and you can find the link in the video description. Thank you, Scrintal, for sponsoring this section of the video; if you want to try it out, you can use the DESIGN10 code for a discount.

Okay, let's continue with our third module, which is video-to-mask: how we can remove the background of a video. In the first node we have our load-video component, where we choose the video to load; in this case I have a video of a guy dancing, so we can segment him from the background. In this part we can choose the frame rate — we can keep it at 24, reduce it, or raise it — and we can set a limit on the number of frames. Say you don't want to wait for the whole thing while you are testing: we can set it to 30 frames, and it will only process the first 30 frames of the video. Then we remove the background from all of the frames; in this case, instead of the model we used previously, I'm going to use the U2-Net human segmentation model. Then we merge all of the frames back together to create our video. Let's run it. Of course, depending on the length of the video, this will take considerably longer, because we are repeating the same process for each frame; in this case, 30 times. You can see our guy completely cut out from the background, plus the alpha-channel version, which you could run through ControlNet to create animation videos. Since this is only the first 30 frames, it's not very long; let's try 90 frames at 15 fps, so we can do the whole video.
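The frame-rate and frame-cap settings boil down to picking which source frames get processed. A small sketch of that selection logic (my own illustration of the arithmetic, with hypothetical parameter names, not the video nodes' implementation):

```python
def select_frames(total_frames, source_fps, target_fps, frame_cap=0):
    """Pick which source-frame indices to process when the workflow
    reduces the frame rate and/or caps the frame count.
    frame_cap=0 means no cap."""
    step = source_fps / target_fps        # e.g. 24 -> 15 fps: step = 1.6
    indices = [round(i * step) for i in range(int(total_frames / step))]
    indices = [i for i in indices if i < total_frames]
    if frame_cap:
        indices = indices[:frame_cap]     # "only the first N frames"
    return indices

# a 10-second clip at 24 fps = 240 frames:
test_run = select_frames(240, 24, 24, frame_cap=30)   # quick test pass
full_run = select_frames(240, 24, 15)                 # whole clip at 15 fps
```

The quick test pass touches only 30 frames; dropping to 15 fps processes 150 of the 240 frames, which is why both knobs cut the segmentation time roughly proportionally.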
In this one we can see all of the individual frames that were generated, and we end up with a video like this. So you have total flexibility over the settings: the fps, the number of frames, and how you want to segment it. Let's continue with our fourth module, which is the LLM part, our text-generator part. I wanted to show a couple of different workflows for running different types of LLM models. The first one runs totally locally on your computer, or on your server if you're using one. In this case we are using the VLM Nodes extension: we can easily install models and then choose which one to use. I have four of them installed right now — Mistral, and this one, which is trained especially for Stable Diffusion prompts. It is not perfect, but I can see how you might want to use it, so let's try it. We have two prompt options: one is the system prompt and the other is the normal prompt. In the system prompt you can specify the purpose, what kind of output you are after, and then here we can write something like "create a prompt for a building in the desert covered with sand." Again, similar to LLaVA, we have a couple of options for the maximum tokens, the temperature, and a bunch of other settings. Let's run it, and we have our prompt; there's a bunch of different things happening in it, but I think it's a pretty decent one for a model running locally. Let's try another one, maybe Mixtral: we get a prompt that actually reads like an image-generator prompt. It's a decent prompt, I think we can totally use it. So let's go to our second option. The node is called "generate Stable Diffusion prompt with LLM." This one is a bit different, because we are using another service to run our LLM — so it's not running locally — through a platform called open t.
aai. In the parameters, the last one lets us choose which model we want to use, and there are actually lots of them to try, like the one we used previously; they even have GPT-3.5 Turbo, Gemini Pro, Claude, etc. Some of them are marked as free to run, and I think the rest, for example GPT-3.5 Turbo, have some kind of limit tied to your account; so far I've been able to use it without any issues, so you can check it out. All you need to do on this platform is copy your API key, go to the config file, and paste your key there; after that it should work. Again we have our system prompts here, and at the top we can type our prompt, so I will just use the same ones as before. Let's try one of the free ones, Mistral, and run it: we get a prompt similar to what we got here, because it's the same model. Now let's try GPT-3.5 Turbo, for example. We get our response back with "style: realism"; I think because of this system prompt it's creating these categories, so maybe let's remove that and send it again. Now, with the updated prompt, we get a detailed, nice prompt. You may wonder why you should use GPT here instead of the chat version directly. I see all of these extensions as components of a complete workflow: if you have components like LLaVA, which can extract information from your images, plus LLMs, background removal, and the other things we are going to cover, they become tools we can connect together to create something much more complex. Let's continue. In the last one we have a node from Mixlab, the ChatGPT node. Here you can directly place the API key you get from OpenAI, and then use GPT-3.5 Turbo, or the newest version of the model, directly inside ComfyUI. You can get your API key from the OpenAI account page, place it here, write your prompt, and send it; it should work without any issues. So that's our LLM part: running locally, on a third-party server, and directly from OpenAI.
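All three LLM options ultimately speak the same OpenAI-style chat API, which is why these nodes expose the same widgets. A hedged sketch of the request body those widgets map onto (the model name, prompts, and the commented endpoint URL are placeholders, not the exact values from the video):

```python
def build_chat_request(model, system_prompt, user_prompt,
                       max_tokens=200, temperature=0.7):
    """Assemble the JSON body for an OpenAI-style /chat/completions
    call - the same fields the LLM nodes expose as widgets."""
    return {
        "model": model,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
    }

body = build_chat_request(
    "gpt-3.5-turbo",
    "You write Stable Diffusion prompts.",
    "A building in the desert covered with sand.",
)
# Sending it would look roughly like:
#   requests.post(f"{base_url}/chat/completions",
#                 headers={"Authorization": f"Bearer {api_key}"},
#                 json=body)
```

Because local servers, third-party platforms, and OpenAI itself all accept this shape, swapping providers mostly means changing `base_url` and the key in the config file.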
Let's go one more step, to the upscale-and-enhance workflow. The one on the top is image contrast adaptive sharpening. It's a pretty simple and nice component: it removes the blur on top of the image and adds more texture, but not so strongly that the sharpening becomes obvious. Below it we have our upscale-image-using-model node, and there are lots of models you can find and use; the RealESRGAN ones are some of the most popular, or this 4x-Foolhardy one, or 4x-UltraSharp. Let's use the UltraSharp model here and run it. I added an image-compare node so we can see the difference between the before and after versions. First let's check the sharpening: if you zoom in a bit we can see the details a bit more clearly — for example, the right side is before, the left after. Of course there is no dramatic change, but I think it's sometimes worth doing; we can see the blur disappearing without the image itself changing. The bottom one, the upscale, should be a bit more dramatic, because we are upscaling the image four times. From right to left we see the upscaled and the normal versions: the one on the left is upscaled, the right is the original. This upscaler is not a super fancy one either; it's a simple, quick one to use in between different steps of a workflow. It's not so great as a final upscaler, because it isn't adding any more detail to the image.

In the next module we have some image filters we can use to add different touches to our images. The first is the channel shake, which creates this kind of effect by shifting the RGB channels of the image; you can set the distance and the angle you want to shake the channels by, and we get an image like this. You can think of all of these modules as more like Photoshop-style filter effects than anything like crazy image generation, but I think they are good to have.
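The sharpening step can be pictured with classic unsharp masking: subtract a blurred copy from the image and add the difference back. This is a rough stand-in for what a sharpening node does (the real node adapts the strength to local contrast; this NumPy sketch uses a fixed amount):

```python
import numpy as np

def unsharp_mask(img, amount=0.5):
    """Sharpen by adding back `amount` times (image - blurred image).
    Flat regions are untouched; edges get extra local contrast."""
    img = img.astype(float)
    # 3x3 box blur via shifted copies of an edge-padded image
    padded = np.pad(img, 1, mode="edge")
    h, w = img.shape
    blurred = sum(padded[dy:dy + h, dx:dx + w]
                  for dy in range(3) for dx in range(3)) / 9.0
    return np.clip(img + amount * (img - blurred), 0, 255)

# toy image: dark left half, bright right half
img = np.zeros((5, 5))
img[:, 2:] = 200.0
out = unsharp_mask(img)
```

On the bright side of the edge the result overshoots above 200, on the dark side it undershoots (and clips at 0), while flat areas stay exactly as they were — which matches the "removes blur without changing the image" behavior described above.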
In an open-source platform like this, they let us do many different things; I can imagine ComfyUI becoming a tool like Grasshopper, with lots of different possibilities, but of course focused more on AI tools that non-coders can interact with. Next we have a filter for watercolor, then motion blur, and this one, which is a bit more interesting: it creates a depth map and then applies a depth blur, and we can choose where to apply the blur — basically how close or how far away. That effect is better shown on something like this image, I think, but let's check all the other ones first. The watercolor gives this kind of effect; of course we are not actually generating a watercolor-style version of the same image, it's more about the filters and the edge lines around the objects. And this one is a nice motion blur in a horizontal direction. Now let's try the depth one. For the depth image I want to focus on the farthest part, so I will set this to zero. First we create a depth map of our image and then use it to apply the blur: as you can see, the parts closer to us get blurred and we stay focused on the far wall, like this. And of course you can flip it the other way, so the farthest part becomes blurry and we focus on this part. I think this is also a good filter for portrait images: say we have something like this image and we want to blur the background, we can do exactly that. Maybe this is too much blur; we can also control how strong the blur should be, so let's reduce it a bit and run again. Smoother. Or the other way around: we blur the girl and the background stays in focus. Let's change the image, and I will turn off these parts and open this last one. The last one is a bit different, because in it we adapt the colors of the original image we want to use — in this case, this one — to match a reference image. Let's run it and see what kind of effect we get with these colors.
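A common way to do this kind of color adaptation is to shift each channel's statistics toward the reference image's, in the spirit of Reinhard-style color transfer. This NumPy sketch is my own simplified stand-in for the node, with a hypothetical `strength` blend parameter:

```python
import numpy as np

def match_color(source, reference, strength=1.0):
    """Shift the source image's per-channel mean/std toward the
    reference image's. strength=0.0 keeps the original, 1.0 matches
    fully; values in between give the 'reduce it a bit' behavior."""
    src = source.astype(float)
    ref = reference.astype(float)
    out = src.copy()
    for c in range(src.shape[2]):
        s_mean, s_std = src[..., c].mean(), src[..., c].std() + 1e-6
        r_mean, r_std = ref[..., c].mean(), ref[..., c].std()
        out[..., c] = (src[..., c] - s_mean) / s_std * r_std + r_mean
    return np.clip(src + strength * (out - src), 0, 255)

# toy images: a mid-gray source and a bright reference
src = np.full((2, 2, 3), 50.0)
ref = np.full((2, 2, 3), 200.0)
matched = match_color(src, ref)            # pulled fully to the reference
halfway = match_color(src, ref, 0.5)       # "too much, let's reduce it"
```

Lowering `strength` is exactly the "you can definitely see the effect, but it's a bit too much, so let's reduce it" adjustment: it linearly interpolates between the original and the fully matched result.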
So it changed to this; we can see the same vibe. It is up to you whether you like it or not, but I think it's a really quick way to change an image. Let's try this colorful one and see how it affects the image. You can definitely see the effect, but it's a bit too much, so let's reduce it and run again; now we get something close to a color correction on the image. So those were the more dramatic filters: the channel shake, the watercolor, motion blur, depth blur, and color adaptation. Next we have simpler enhancement features. The first one is the filter adjustments, where you can change the brightness, contrast, saturation, and sharpness, add normal or Gaussian blur, or enhance the details — basically something similar to sharpening. The second applies some film grain on top of the image. And the third: if you have LUTs for adjusting the colors and the style of the image, you can upload a bunch of them to your ComfyUI folder and then choose one here to apply its effect. I found these ones free online, and I will attach them in the video description so you can get the same ones. Let's run it to see all of their effects. Usually what I do is add an adjustment filter at the end of my workflow to fix the final image once I like it; or, if you want to keep the same color style across the different images you generate, you can apply the same LUT to all of them for a consistent look. Let's compare. The first one is not such a dramatic change; as I said, it's adjusting the contrast and increasing the saturation. The second is more about the film grain on top, which is a bit clearer in this part. And the third is the LUT color adjustment; this changes more than the first two. This one is for vibrant colors; maybe let's change to a warm filter and run it, and we end up with an image like this.
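At their core, LUT-based color adjustments are just table lookups: every input pixel value is remapped through a precomputed table. A minimal 1D version in NumPy (real .cube LUT files are 3D tables over full RGB triplets, and this toy "warm" table is my own invention, but the remapping principle is the same):

```python
import numpy as np

def apply_lut_1d(img, lut):
    """Remap every 8-bit pixel value through a 256-entry lookup table.
    NumPy fancy indexing applies the table to the whole image at once."""
    lut = np.asarray(lut, dtype=np.uint8)
    return lut[img.astype(np.uint8)]

# a toy "warming" LUT: lift everything by 20, clipped at 255
warm = np.clip(np.arange(256) + 20, 0, 255)
img = np.array([[0, 100],
                [200, 250]], dtype=np.uint8)
warmed = apply_lut_1d(img, warm)
```

Because the table is computed once and then applied per pixel, the same LUT gives an identical color treatment to every image you run through it — which is why it works so well for keeping a consistent look across generations.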
Let's continue to a more exciting workflow now: here we have text-to-audio and image-to-audio, where we combine a couple of the different models we've covered so far. Let's do the first one. In this one we have AudioLDM, a generative model for creating audio from text. You give it a prompt here and it generates the audio; right now it's set to 10 seconds, and we have familiar settings like the CFG guidance scale, the seed number, and the file extension type. We have a prompt, "city life," here; maybe let's change it to "a bistro on a busy Sunday, people talking in a loud room." Let's run it, and we get a sound effect like [Music] this. This is super cool on its own, but it becomes even more exciting for me when we chain three models one after another. In this one we have our LLaVA image-to-text model to describe our image. What I want to do here is image-to-audio, so we could, say, create image-to-video and image-to-audio at the same time and then combine them into a nice video with its own sound effects. So we have a render like this, and I'm passing it to LLaVA to describe the image, its location and environment. Then we use the answer we get from LLaVA as the prompt for our local LLM, our local chatbot, with the system prompt "you are an advanced sound-effect producer AI; suggest the possible sounds for this space; suggest a one-sentence-long prompt for the sound effects." And we pass the answer from the chatbot directly to our generative audio model. Let's see what we end up with. First, LLaVA: the scene is set in a park, people walking around enjoying the outdoors, several individuals in the area, some closer, some further away. Then our Mixtral model said: "create a layered soundscape of ambient chatter, distant laughter, and casual footsteps on various surfaces such as gravel, pavement, grass, and leaves." I think that's pretty decent, and I can totally imagine these sounds coming from this image. So let's try another image: this interior kitchen view.
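The three-stage chain (image → caption → sound prompt → audio) is really just function composition; in ComfyUI each stage is a node wired to the next. A sketch with stub functions standing in for the actual models (the function names and return strings are placeholders, not real APIs):

```python
def describe_image(image):              # stand-in for the LLaVA node
    return f"caption for {image}"

def suggest_sound_prompt(caption):      # stand-in for the local LLM node
    return f"one-sentence soundscape for: {caption}"

def generate_audio(prompt):             # stand-in for the AudioLDM node
    return f"audio rendered from '{prompt}'"

def image_to_audio(image):
    """Wire the three stages together, exactly like connecting the
    nodes' outputs to the next node's inputs on the canvas."""
    return generate_audio(suggest_sound_prompt(describe_image(image)))

result = image_to_audio("park_render.png")
```

The point is that each component has one well-defined input and output, so any of them can be swapped (a different captioner, a cloud LLM, a different audio model) without touching the rest of the chain.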
I wonder what kind of sound effect our chatbot will suggest for this space. Let's see: "soft footsteps approaching and stopping in the large modern kitchen, followed by the sound of chairs being pulled out and placed back under the kitchen island." I'm pretty happy with that, to be honest; I wasn't expecting a prompt like this for the sound effect. Let's see how well the audio model follows it. It's not bad, but also not super dramatic, because there isn't much happening — basically just footsteps in an interior room — but I think you get the idea, and you see the potential of how we can use it. So let's continue with the next one. This one is similar to the layer effects inside Photoshop, where we can create a drop shadow on a specific object: in this image we take the sofa, remove the background, and place it back with a drop shadow; of course it is up to you how you use it after that point. Or we can add a stroke around our object, apply an outer glow, or reduce the image opacity. Let's see their effects. This is the image-opacity reducer: nothing super fancy, but sometimes it can be a nice option for blending different elements together; I'm using this color as the background color for now, so we can see it. The outer glow looks like this; we can change all the colors — the light color, the glow color — and the brightness. This one adds a stroke around our object in a specific color, in this case red. And here's our drop shadow. The next one is more of a color-palette generator from an image, so maybe we can use it to create some kind of mood board. In the first one we get the main color from the image; in the second, the average color of the entire picture; in the third we get the color palette as a pixelated layout, so we can see the blues here representing the building and the greens over here, placed according to their location in the image; and in the last one we say we want nine colors as a color palette from this image.
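Dominant color, average color, and a small palette can all be sketched with a few array operations. This is my own rough approximation of what such palette nodes compute (real nodes likely use proper clustering such as k-means; here similar shades are merged by coarse quantization):

```python
import numpy as np

def palette_stats(img, n_colors=4):
    """Return (dominant color, average color, palette) for an RGB image.
    Dominant = most frequent color after binning channels into 32-value
    buckets, so near-identical shades are counted together."""
    pixels = img.reshape(-1, img.shape[-1]).astype(int)
    average = pixels.mean(axis=0)
    quant = (pixels // 32) * 32 + 16          # snap to bucket centers
    colors, counts = np.unique(quant, axis=0, return_counts=True)
    order = counts.argsort()[::-1]            # most frequent first
    dominant = colors[order[0]]
    palette = colors[order[:n_colors]]
    return dominant, average, palette

# toy image: 12 reddish pixels, 4 bluish pixels
img = np.zeros((4, 4, 3), dtype=np.uint8)
img[:, :] = [250, 10, 10]
img[3, :] = [10, 10, 250]
dominant, average, palette = palette_stats(img)
```

The dominant color reflects the red majority while the average is a blend of both, which is exactly why the video shows them as separate outputs: an average can be a muddy color that appears nowhere in the image.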
These are the palette colors used in this image. Let's test another one, maybe this one: we can see the most dominant colors, their average, one palette placed by location within the image, and one general color palette, which is pretty accurate, I think. So let's continue. Now we have our image-to-3D generator. This one uses the Stable Zero123 model, and we are basically creating six images of this space from different angles; you can also use a 3D-viewer component here to be able to see it in 3D. This is more on the experimental side; especially for a picture like this, an architectural design, I don't think it's super useful yet, but we are definitely getting there, especially with the new video models. If you check them one by one: this is like a top view of the space — it's pretty accurate, though of course nothing is happening in the background, because we don't have any information about that part, but the part we do see is pretty accurate. This one is from behind, I guess, and this is from the other side; it's pretty cool that it applied the same facade pattern to this part too, from behind and on the side. I can totally see use cases for object design or character design, but in this case it's more experimental. Next, I placed a remove-inpaint workflow. This is a simpler workflow, just to remove some objects from a given image with a given mask for the location. It's not a common Stable Diffusion inpainting workflow: we are not writing any prompt or other settings, just the image and our mask. So if you right-click and open the mask editor, let's try to remove one of the windows — say this one, and this one — and let's say there is a weird line here, let's remove that as well, and save it. And now if we run it, it will
remove these windows for us and try to blend the area with the rest of the image, which in this case looks pretty good, except this part, where you can see some mistakes. Let's try another one: let's remove one of the chairs. It did a pretty good job here; I'm a bit surprised, to be honest. Let's copy this node and place it here so we can try removing something else: let's remove this light in the middle, and maybe this plant, to give it a more challenging task and see how it goes. Yeah, pretty good; in a couple of steps you can change the image into a version like this.

The last workflow I want to show is how we can write text directly on top of an image inside ComfyUI and then combine images to create a grid view — maybe for a comparison, or to show a couple of images together. Say we have two versions of the same design, one with black concrete like this one and the other with a timber facade, and we want to write the material on each image as a small caption: "timber facade" on this one and "concrete facade" on that one. We are using two different text nodes; both are really similar, the main difference being that in one you can set margins, line spacing, etc., and in the other you can directly add a shadow to the text. In both you can choose different fonts, or upload your own fonts. Let me turn off this part first and run it: you can see the text rendered on its own and on top of the image. So this is how we can type text directly onto an image; I prefer this draw-text component, which is a bit easier to use than the other one. Then you can decide where the text should go on the image using the sliders: right now we are saying 10 pixels to the right and 15 pixels down, so if we increase this one the text moves toward the middle, like this, and the same goes for y.
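Those x/y sliders are doing a simple composite: the rendered text layer is pasted onto the base image with its corner at the chosen offset, clipped at the borders. A NumPy stand-in for that placement step (my own sketch, not the node's code; for simplicity it overwrites pixels rather than alpha-blending):

```python
import numpy as np

def paste_at(base, patch, x, y):
    """Composite `patch` onto a copy of `base` with its top-left corner
    at column x, row y, clipping whatever falls outside the image -
    the same idea as the text node's x/y offset sliders."""
    out = base.copy()
    h, w = patch.shape[:2]
    H, W = base.shape[:2]
    x0, y0 = max(x, 0), max(y, 0)
    x1, y1 = min(x + w, W), min(y + h, H)
    if x0 < x1 and y0 < y1:
        out[y0:y1, x0:x1] = patch[y0 - y:y1 - y, x0 - x:x1 - x]
    return out

base = np.zeros((10, 10), dtype=int)     # toy "image"
patch = np.ones((2, 3), dtype=int)       # toy "text layer"
placed = paste_at(base, patch, 3, 4)     # 3 px right, 4 px down
```

Increasing x or y just slides the patch across the base image, and the clipping means a caption pushed past the edge is cropped instead of raising an error.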
Once you're done with both of them, let's combine them, first using an image batch and then the create-image-grid node, where you can set the border color, border thickness, and how many columns you want. Once we combine them all, we get a grid with our images in two columns, like this one.

So these are all the workflows I wanted to share with you. This is a slightly different video, trying to show what you can do inside ComfyUI: it is not just a Stable Diffusion user interface, but much more than that. I can totally see it becoming a tool of its own, not just for Stable Diffusion, but for all the new open-source developments around AI and similar tools. If you are a non-coder, this is an amazing way to tap almost the full potential of these new technologies and their new models, so I definitely suggest you take a look. I have prepared two videos so far: how to install it locally on your computer with a single-click installation, and the same for RunPod, how to use it in the cloud if you don't have a suitable computer. Feel free to check them if you want to learn how to install it. Otherwise, thanks for watching up to this point; I hope you liked the video. Let me know which of the workflows was your favorite and which you think you will use. I will include all of the workflows and the necessary extensions, tools, and models in the video description, and you can find the direct template on my Patreon page. Thank you for your support, and see you in the next video.
Info
Channel: Design Input
Views: 4,915
Keywords: architecture, render, AI, rhino, 3d, cad, modeling, unreal engine, tutorial, design, parametric, computation, parametricism, 3d print, planfinder, ai, arch, automatic, generator, Learn, Architecture, Visualization, lumion 2023, ray tracing, Interior Design, midjourney, stable diffusion, sketch rendering, sketch to render, ai art, artificial intelligence, ai image generator, midjourney ai, residential project, house design, home design, 3d modeling, 3d model, architecture student, ai design
Id: fUcDAExxndQ
Length: 32min 52sec (1972 seconds)
Published: Tue Mar 19 2024