Image2Video. Stable Video Diffusion Tutorial.

Video Statistics and Information

Captions
Hello, you lovely bunch of people! Today I'm going to show you how to take a still image like this and turn it into a video like this, or an image like this and turn it into a video like this. This is Stable Video Diffusion, and it's free. It can take any image, whether it's prompted or a regular photo, and turn it into a cool video. Look at these cute little birds, just look at these amazing results. Oh, and stick around: at the end of the video I'm going to show you an AI art contest with up to $13,000 in prizes. So, scientists were keeping an eye on the Earth's turn, but after 24 hours they got bored and called it a day.

Before we get to using this, let me show you a little bit about what it is. Stable Video Diffusion is released by Stability AI, and it's their first model for generative video. Not surprisingly, it's based on the image model Stable Diffusion. As you can see here, you've got a couple of examples of the model working; all of these were made with an image input, and it's adaptable to numerous video applications, like multi-view synthesis from a single image: it can take this and turn it into, well, sort of a 3D model, and get it to spin so you can see it from all angles. There are two models available, and we're going to look at both of them: one for 14 frames and one for 25 frames, so that's basically how long the generation will run. Here's a comparison using something they call "win rate". That's a loosely based comparison, because you're actually asking people what they think is best. But among the people they tested with, Stable Video Diffusion came up on par with or on top of the competitors, Runway and Pika Labs. Obviously, if Runway or Pika Labs had made this comparison, they would probably have skewed it so that they won, so take it with a grain of salt.

Now, here are two other examples, and these are from Comfy, so this is already implemented into ComfyUI, and you can download these workflows. I'm going to link them in my text and image Patreon guide, which you can find in the description below; everything is explained there in more detail and all the files are available. I have to say, these look pretty sweet. By the way, if you need ComfyUI installed, check my channel for that; I might even link it in the top right corner right now. If you drag and drop the workflow, you'll get a pretty simple workflow. For many of my videos I create a text and image guide that I put up on my Patreon for my subscribers, where I go in depth about the settings and how to set everything up, so if you need something more detailed, check out the guide on my Patreon.

We have a Load Image node where we just upload an image, and the preferred resolution here is 1024x576. Then you can set your frame count, how much it moves, the frame rate, and all of that. You'll need to load the SVD models; here I have the SVD 14-frame and SVD 25-frame checkpoints, and I'll show you in a second how to get them. Then we run this into an SVD_img2vid_Conditioning node and VideoLinearCFGGuidance, through a KSampler, and you can see the output here. The default workflow has a different video combine node, but it's basically the same thing; I'm using VHS Video Combine instead because it gives me more options for formatting and things like that.

Now let me show you how to get those models. If you go to the links in the description, you'll get to the SVD model cards. They'll look something like this: under Files and versions you'll find svd_xt.safetensors. The XT is the 25-frame version, so you just download that, and in the other link you have the non-XT version, svd.safetensors, which is the 14-frame one. What I did when I downloaded them was put them in my models/checkpoints folder and then rename them: svd.safetensors I renamed to SVD 14 frames, and svd_xt.safetensors I renamed to SVD XT 25 frames. If you're using ComfyUI, put them in models/checkpoints, and they should show up; it even says so: "put your checkpoints here".

It's possible to load other images as well. Let's say we take this one here of me standing around, which is a different format, and let's also do a square one, so a size this model is not trained for, with the 14-frame model. If I queue this up, it starts working. I'm using an RTX 4090, so I have a lot of VRAM, but this can be done with an 8 GB card. If you don't have a card that can support that, I recommend checking out ThinkDiffusion, where you can pay for cloud GPU power. As you can see, even with these other resolutions, one vertical and one square, which is not optimal at all, we still get an output that kind of works. It's not fantastic, but neither is the input: we have a camera pan, I'm fairly stationary, and you can see the background moving.

As for the sampler, I've tried a couple of different ones, and in my experience Euler works pretty well. I think it's one of the better ones for Stable Video Diffusion, compared to my general favorite, Karras, for example. Obviously this is a new seed, but let me quickly show you: I've found that Euler is a good default that you should probably stick with. You can experiment a little, but if it gets out of hand you can go back to Euler; it actually gave me better results than Euler a. We had some luck with this one, so it kind of works; we're actually getting a little better movement here, with me moving a little along with the camera, so that works pretty well.

Now I'm creating an image that we can take into Stable Video Diffusion: "a closeup portrait of a warrior woman, long hair and full knight armor". Let's see if we can get that. These are all fairly simple renders; I'm not using a specific anime model or anything like that, but I think it'll be quite okay to work with. I'll take the second image, drag it to my desktop, go back into ComfyUI, and load it up. We'll raise the motion bucket ID a little to get a bit more motion, and queue. In our output we have some movement; it's basically just a zoom, but you can clearly see the character is separated from the background. Let's go a little crazy: raise the motion bucket even further, raise the augmentation level as well, and queue a second generation, which should give us a lot more movement. That's basically what happened, but the image kind of broke down. So we go back, decrease the augmentation level again, raise the motion by a lot, and queue it up again. Now the image shouldn't break, and we should get a lot more motion, and as you can see, that's what happened: our character looks good again. She's not moving herself, but the background is moving a lot. It's all a little hit and miss what you'll get for the motion, because you can't prompt for it, but play around with it and see what you can do.

If you don't have 8 GB of VRAM or more, go into ThinkDiffusion and launch the ComfyUI app, or if you want a quick one, launch the Turbo one. Startup takes a minute or two, and then you're inside ComfyUI, where you can load the workflow, and it looks the same as in my local install.

If you feel that these workflows are too simple for you, there are more options. OpenArt has a library of workflows for ComfyUI, and you can even sort them by category. If we search, for example, for SVD, we'll see all the workflows available for SVD, and if we select this one as an example, we can see all the nodes, a quick representation of what it would look like inside your ComfyUI. If you download it, you just drag and drop it into ComfyUI. You'll probably get a message like this, or even more, so go into the Manager, choose Install Missing Custom Nodes, select this one (or whatever you have), and press Install. That will probably be the case for any new workflow you find and want to use. It says here: "To apply the installed custom node, please restart ComfyUI." Sometimes it works just by pressing this little restart button; if that doesn't work, you'll have to stop your machine and relaunch it, and that's the same for your local version. Once you've restarted, all your nodes should be in working order, and if they're not, you'll have to install the custom nodes manually.

Let's load a new image, take this one, and make sure an SVD model is selected. With SVD XT, set this to 25 frames, leave the rest at default, and generate. What basically happens is we get one combine from the original, and then we run sort of a hires fix that upscales the image a little. This is just one of many SVD workflows out there. Our first generation has come in; this is the non-upscaled, non-hires-fixed one, and we have a pretty good zoom with the character and background separated. If you don't want the background colors broken up like this, you'll have to change from GIF to H.264, for example. Now it continues on to what the author of this workflow calls an AI upscale; we'll see that in the Video Combine as well, as soon as it's finished. And here's our final result. Let me zoom in a little bit. This is pretty okay. Let's see how big this is: 1432x800, so you could probably upscale this quite a bit and still have it looking decent.

So, OpenArt is holding this ComfyUI workflow contest, which is what I mentioned earlier in the video. It has a total prize pool of up to $13,000, split into a lot of categories. Each category will have three winners at $500 each and up to five honorable mentions at $200 each. There are five categories: art, design, marketing, fun, and photography, and then a lot of special awards, like best workflow with IPAdapter, best workflow with AnimateDiff, best workflow with Stable Video Diffusion, and it goes on and on. So if you have a workflow ready, you can upload it for a chance to win $500, and that's not too bad. Bear in mind that the workflows will be available on OpenArt if you decide to compete in the challenge, so if you don't want them available to the public, this probably isn't for you. It's fairly simple: go to contest.openart.ai, click "Upload workflow", and you'll get to a page where you can upload a ComfyUI workflow. Drop your workflow right in, make sure the box to participate in the OpenArt workflow contest is checked, name the workflow, "Sebastian super mega great video workflow 30k", sure, why not, describe what it's about, drag and drop a thumbnail for your workflow, perhaps an image that you've created, upload and publish, and you're in the contest. Good luck!
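As a quick sanity check on the two model variants discussed above, here's the clip length each one gives you. This is a minimal sketch; the frame rate of 6 fps is an assumption based on a common default in SVD workflows, so match it to your own node settings:

```python
# Clip duration for the two SVD checkpoints.
# NOTE: fps=6 is an assumed default; adjust to your workflow's frame rate.

def clip_seconds(num_frames: int, fps: int = 6) -> float:
    """Return the playback length of a generated clip in seconds."""
    return num_frames / fps

svd_14 = clip_seconds(14)  # svd.safetensors, the 14-frame model
svd_xt = clip_seconds(25)  # svd_xt.safetensors, the 25-frame (XT) model
print(f"14-frame model: {svd_14:.2f}s, 25-frame (XT) model: {svd_xt:.2f}s")
```

At 6 fps the 14-frame model yields a clip of roughly 2.3 seconds and the XT model roughly 4.2 seconds, which is why the XT version feels noticeably longer.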
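The download-and-rename step described above can also be scripted. This is a minimal sketch; the target filenames are just the descriptive names used in the video, and the checkpoints directory is an assumption you should point at your own ComfyUI install:

```python
from pathlib import Path

# Map the downloaded Hugging Face filenames to clearer local names,
# mirroring the manual renaming described above. The new names are
# illustrative -- use whatever makes the frame count obvious to you.
RENAMES = {
    "svd.safetensors": "svd_14_frames.safetensors",        # 14-frame model
    "svd_xt.safetensors": "svd_xt_25_frames.safetensors",  # 25-frame (XT) model
}

def rename_checkpoints(checkpoints_dir: Path) -> list[str]:
    """Rename any downloaded SVD checkpoints found in the folder."""
    renamed = []
    for old, new in RENAMES.items():
        src = checkpoints_dir / old
        if src.exists():
            src.rename(checkpoints_dir / new)
            renamed.append(new)
    return renamed
```

You'd call this with something like `rename_checkpoints(Path("ComfyUI/models/checkpoints"))`, the default checkpoint location in a ComfyUI install.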
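The trial-and-error with the motion bucket ID and augmentation level above can be summarized as a rough heuristic. The thresholds below are illustrative guesses from experimentation, not official values; the only firm reference points are ComfyUI's defaults of 127 for motion_bucket_id and 0.0 for augmentation_level in the SVD_img2vid_Conditioning node:

```python
# Rough guide to the two motion knobs in SVD_img2vid_Conditioning.
# Thresholds are illustrative, not official -- tune per image, since
# motion can't be prompted for directly.

def describe_settings(motion_bucket_id: int, augmentation_level: float) -> str:
    """Heuristic description of what a settings combination tends to produce."""
    if augmentation_level > 0.3:
        return "high augmentation: lots of change, but the image may break down"
    if motion_bucket_id > 150:
        return "strong motion: big camera/background movement"
    if motion_bucket_id < 80:
        return "subtle motion: mostly a gentle zoom"
    return "balanced: moderate movement, image stays coherent"
```

This mirrors the sequence in the video: raising augmentation broke the image, while keeping augmentation low and raising only the motion bucket gave strong movement with an intact character.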
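Switching the output from GIF to H.264 can also be done after the fact on saved frames with ffmpeg. This sketch just builds a standard ffmpeg command line; the frame filename pattern and fps are assumptions you should match to your ComfyUI output folder:

```python
# Build an ffmpeg command that encodes numbered PNG frames as H.264.
# The frame pattern and fps are assumptions -- match them to your output.

def h264_command(pattern: str = "frame_%05d.png", fps: int = 6,
                 out: str = "output.mp4") -> list[str]:
    return [
        "ffmpeg",
        "-framerate", str(fps),  # input frame rate
        "-i", pattern,           # numbered input frames
        "-c:v", "libx264",       # H.264 encoder
        "-pix_fmt", "yuv420p",   # widest player compatibility
        out,
    ]
```

You could run it with `subprocess.run(h264_command(), check=True)` from the folder containing the frames; `-pix_fmt yuv420p` is what keeps the result playable in browsers and most video players.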
Info
Channel: Sebastian Kamph
Views: 23,999
Id: HOVYu2UbgEE
Length: 12min 23sec (743 seconds)
Published: Sat Dec 02 2023