Realtime Image Generation with SDXL Turbo

Video Statistics and Information

Captions
Oh my God, let's try that again, that was so cool. "A city landscape, thousands of people walking through it, in a futuristic world at night time, with [Music] neon lights in the background, robots amongst the crowd of people, and flying cars." So now if I just go back, if I just hold backspace, you can see it changing as I go back in time. Oh, that is awesome. That is cool. That could be like a time lapse or something.

So Stability AI just released SDXL Turbo, and that is Stable Diffusion Turbo, which allows you to generate images much faster, and I want to take a look at it right now. So here we are, we're on Google Chrome, we've got SDXL Turbo. This was the release that they did, and it goes through some things here, but we're going to skip that. I just finished downloading some of the models, so I did the fp16 one and then this one as well. The easiest way for me to get this up and running was the ComfyUI examples, and now that it's done downloading I want to try it out, because I saw some impressive things on Reddit for generation speed and I'm super excited.

So let me just go ahead and pop open the ComfyUI window, and then I can launch it. What I'm going to do is download this image and then drag it into here. So after downloading it, let's see, we've got our SDXL Turbo example. All right, cool, so here we go. This is just ComfyUI, pretty general stuff, but if we go ahead and click Queue Prompt it'll create an image. So let's go ahead and jump this through. It looks like it already selected SDXL Turbo there, and that was pretty fast. Let's try that again. Instead of "sunset", let's say "nighttime". Put that in there, see... okay, wow, that is blazing quick.

This is only a 512 x 512, it looks like. Okay, so let's try 1024 by 1024. Queue Prompt. Oh, that's a little broken. Okay, well, let's try that again. That's super quick, holy moly. 112 by 112 isn't working, or, well, 1024 x 1024 isn't working, so I'm going to slap
back to 512 and regenerate this here. I don't know why it's creating this there, but taking a look at the model card, it says it was trained off of SDXL Base. Okay, so it looks like 512 x 512 is what it is. All right, let's just continue trying it out in 512, but let's go ahead and do something different. Let's say "beautiful landscape", all right, and then Queue Prompt. See, "at night". So that's not bad. That's a little blurry though, so not as clean as I would like. Oh, that's a little bit better. But yeah, that is a little bit blurry, and there are probably some additional steps that I could do, like an upscaler.

But before that I want to try one more thing: apparently you can do this in real time, and I want to try that. So there's this Extra Options setting, and if I Auto Queue, it should allow me to start making generations while I'm typing in real time. So let's go ahead and try that out. "Night with a sun"... hold on, maybe I've got to queue. Ooh, okay, so you can see the queue size is going 0, 1, 0, 1 over there, and whoa, oh, that's cool, that is awesome. "Mountains in the"... that is crazy. Hold on.

Okay, let's try to describe a person walking through... oh my, that is crazy, what the heck. "Person walking through the street with a"... that is so fast, that's mind-blowing. What in the world. "A person walking through the street with a suitcase in hand, facing the camera." Okay, I didn't get "facing the camera". "With a suitcase in hand in", let's say, "a snowy town." That is awesome, that is super awesome. I can see this being very, very useful. As you can see, it's not as coherent, but also I'm just typing into here and it's generating, which is unheard of. I do have a 4090, of course, so take that with a grain of salt depending on what type of GPU you might have. Regardless though, holy, real-time image generation.

Let's try an anime, see, "girl running through the sand with a sword", let's say "sword in hand". And it's generating a sword. Try a staff. Now of course
these generations aren't that good, but real-time generation as you're typing to get images is wild. I've probably said that so many times already, but this is kind of cool. Let's try "landscape scenery with sprawling mountains in the background and a dragon flying in the sky". Oh man, that is crazy, that is awesome, I love that.

Let's try a jar, "a jar with a universe inside of it", and let's see, "a jar with a civilization growing up inside of it", let's say "human civilization". Oh, that's creepy. With a human in it, that is creepy, and I don't know what's happening there. So yeah, let's back up a little bit from that.

Let's try a different noise seed. Okay, so let's run back through a person walking through a city: "a man walking through a city, head hanging down while holding a hat", say "looking down", "beautiful 4K". Okay, so yeah, the cohesiveness of this, I don't think... my prompting is probably just terrible. "A suitcase." Okay, so, wow.

Well, let's try to see if I can do the regular one that isn't fp16. And then, okay, we've got SDXL Turbo running, the non-fp16 one, so let's see if this makes things a little bit better. So, see, "a man walking through a city with", sorry, I'm cutting off the image, "a suitcase in hand, meeting with a friend". Interesting. So yeah, their faces are kind of messed up.

What else should we try? There are so many different things. Let's try "a room with a computer, a large L-shaped computer desk in a futuristic"... that is cool, did you just see that? Look at how it changed. "In a futuristic"... that is crazy, look at how that changed, that is so awesome. I know the quality isn't there, but you can see that happening in real time. "In a futuristic world, cyberpunk style, at night time, with LEDs, with neon lights in the background." Oh, that is cool, wow. Oh my God, let's try that again, that was so cool. "A
city landscape, thousands of people walking through it, in a futuristic world at night time, with [Music] neon lights in the background, robots amongst the crowd of people, and flying cars." I probably have too many things in here. Try "with thousands of robots walking through it in a future world at night time". That is awesome. So now if I just go back, if I just hold backspace, you can see it changing as I go back in time. Oh, that is awesome. That is cool. That could be like a time lapse or something.

And here we are. It's cool that it's kind of starting at, like, this kind of seems like an older town, you know, maybe an older era. And if we type like "future"... wait, what the heck was that? Oh, I think I was typing like "football". Yeah, "future"... what is that, one of those walking things from Star Wars? I'm missing the name. "Futuristic robot." Oh, that is awesome. I can't reiterate enough how cool this is to have it running in real time. I will have to try this out on my 3060 to see if it is able to keep up with this, but this is completely game-changing. Well, if we can get this image upscaled, that would be even better. Real quick.

Okay, so I just added a quick little upscaling workflow, and I put the images side by side, so let's go ahead and try this out. I do think it is going to be a little bit slower. Yeah, it's definitely slower because of the upscaling that needs to occur. "With a universe inside of it." Okay, it is much slower, but if you take a careful look, the quality is much higher. You can take a look at the mountains in here versus the mountains here: they're much crisper in this upscaled image over here, but it is much slower. So let's go ahead and do "jar with the universe inside of it and a landscape surrounding the jar". Yeah, so that is much slower. And just to give you an example of how much slower, we can mute this one, and then
let me just backspace, and so here you can see it updating in real time, compared to if I unmute this again and then start backspacing: it's now processing a little bit slower. But I believe that's probably just going to get better with time, so we'll see how that improves in the future. The upscaler will have to be as fast as the inference model, which is crazy, because normally the diffusion model is the one that's taking the longest, the bottleneck of the process, but it looks like the upscaling model is now the new bottleneck of SDXL Turbo.

So this was just something that I found today, and I thought it was pretty neat and I really wanted to try it, so I downloaded it, and hopefully you guys found this as interesting and as awesome as I did, because this is pretty fantastic. Real-time image generation, seeing your ideas come to life in real time just by typing, is kind of fantastic to see. So that's going to be it for today's video. If you guys liked the video, please consider liking and subscribing. If you're a member of the channel, thank you for supporting, and I will see you all later.
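The closing point about the upscaler becoming the new bottleneck can be illustrated with a small simulation. This is only a sketch under stated assumptions, not how ComfyUI's Auto Queue is implemented: it assumes one image is generated at a time, that each new job picks up the most recently typed prompt and skips stale ones, and that the stage timings (50 ms for diffusion, 200 ms for upscaling) are made-up numbers rather than measurements from the video. The names Stage, rendered_prompts, and typing_events are hypothetical helpers introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    millis: int  # hypothetical per-image cost, NOT a measurement from the video


def millis_per_image(stages):
    """Stages run back to back, so one image costs the sum of all stages."""
    return sum(stage.millis for stage in stages)


def bottleneck(stages):
    """Return the slowest stage, which limits how quickly images can appear."""
    return max(stages, key=lambda stage: stage.millis)


def typing_events(prompt, ms_per_key):
    """One (time_ms, prompt_so_far) pair per keystroke, typed at a steady pace."""
    return [(k * ms_per_key, prompt[: k + 1]) for k in range(len(prompt))]


def rendered_prompts(events, latency_ms):
    """Simulate an assumed latest-wins queue with a single worker.

    events must be sorted by time. When the worker becomes free it renders
    the newest prompt typed so far and drops any older, unrendered ones.
    """
    rendered = []
    free_at = 0
    i = 0
    while i < len(events):
        start = max(free_at, events[i][0])
        # Skip ahead to the newest prompt that exists when the worker starts.
        while i + 1 < len(events) and events[i + 1][0] <= start:
            i += 1
        rendered.append(events[i][1])
        free_at = start + latency_ms
        i += 1
    return rendered


turbo_only = [Stage("diffusion", 50)]
with_upscale = [Stage("diffusion", 50), Stage("upscale", 200)]
events = typing_events("a jar", ms_per_key=100)

fast = rendered_prompts(events, millis_per_image(turbo_only))
slow = rendered_prompts(events, millis_per_image(with_upscale))
print(len(fast), "images without upscaling:", fast)
print(len(slow), "images with upscaling:", slow)
print("bottleneck:", bottleneck(with_upscale).name)
```

With these assumed timings, every keystroke gets its own image when only the diffusion stage runs, while the upscaled pipeline renders only some of the intermediate prompts, which matches the slower, choppier updates seen in the side-by-side comparison.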
Info
Channel: Jarods Journey
Views: 12,795
Id: 6oFvDfEK9So
Length: 14min 10sec (850 seconds)
Published: Thu Nov 30 2023