How to Make AI VIDEOS (with AnimateDiff, Stable Diffusion, ComfyUI, Deepfakes, Runway)

Video Statistics and Information

Captions
AI video is one of the hottest trends in tech: deepfakes, animated videos, video-to-video generation, text-to-video. Today we're going over a primer on how you, too, can get familiar with the latest technology here and make your own videos, similar to my last AI short on how AGI takes over the world. Check it out if you haven't; there's a lot of AI art in it, and I'll show you how I made some of it. Shall we jump right in?

Now, there's an easy way and a hard way to do this, and both are interesting. The easy way is simply using a service like RunwayML.com, which I'll show you towards the end of the video. Both are based on Stable Diffusion, which is an open-source project, so the hard way is really just running your own Stable Diffusion instance on your own computer. Since I'm using a Mac today, I'll be using a hosted version of Stable Diffusion; there are a number of services, and the one I'll use today is RunDiffusion.com. If you have a Windows machine, you'll be able to run this natively; if you have a Mac, there are some new projects lately trying to build in better support. Essentially, we'll be using AnimateDiff and Stable Diffusion with ComfyUI to generate the AI videos. So what is AnimateDiff? It's essentially a framework for animating images. Then you have Stable Diffusion, which is the text-to-image AI generator, and ComfyUI, which is a node-based editor for these projects.

To get started, I'll be using RunDiffusion, which is essentially Stable Diffusion in the cloud, fully managed. The first thing we want to do is select a UI for Stable Diffusion, because Stable Diffusion by itself is just a command-line interface, so there are all these UIs built for it. We'll be using ComfyUI, which is node-based, drag-and-drop. So I select that, choose the medium hardware tier and the ComfyUI software (I'll be using the beta release), and click Launch, and it starts up the machines.

What we'll be doing today is video-to-video AI generation, where we modify the style of an existing video. If you take a look at this guide, there's a video-to-video ControlNet JSON file that you can download; there will be a link in the description below so you can follow along. So now we have ComfyUI with Stable Diffusion loaded. I'm going to clear the workspace and drag in the JSON file. This is ComfyUI, basically: you can see all these workflows and processes that the images flow in and out of, which refine the images, and each of these nodes has all these different parameters you can set. Essentially, it starts with an input image, and we'll be able to load a video or a set of images. I'm going to load a video, so I'll rewire the image input for this node to the video node, and then in my file manager I can upload the video, video.mp4. This is the clip I fed into it: it's me typing, looking sharp. Then I reference it here in the path, so it's /mnt/private/video.mp4.
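(As an aside: if you'd rather script this stack than drive it through a GUI, AnimateDiff is also usable outside ComfyUI. Below is a minimal text-to-video sketch using Hugging Face's diffusers library and its AnimateDiffPipeline; the model IDs are the publicly hosted repos, and note this is plain text-to-video, not the video-to-video ControlNet workflow used in this walkthrough.)

```python
# Minimal AnimateDiff text-to-video sketch with Hugging Face diffusers
# (pip install diffusers transformers accelerate). Any Stable Diffusion 1.5
# checkpoint works as the base model.
import torch
from diffusers import AnimateDiffPipeline, DDIMScheduler, MotionAdapter
from diffusers.utils import export_to_gif

# The motion adapter is the "AnimateDiff" part: motion weights trained
# to animate a frozen SD 1.5 image model.
adapter = MotionAdapter.from_pretrained("guoyww/animatediff-motion-adapter-v1-5-2")

pipe = AnimateDiffPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # base text-to-image checkpoint
    motion_adapter=adapter,
    torch_dtype=torch.float16,
).to("cuda")
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config, beta_schedule="linear")

output = pipe(
    prompt="a cyborg male robot typing on a keyboard, cinematic lighting",
    num_frames=16,
    num_inference_steps=25,
    guidance_scale=7.5,
    generator=torch.Generator("cpu").manual_seed(42),
)
export_to_gif(output.frames[0], "robot_typing.gif")  # frames[0] is a list of PIL images
```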
Now if I click Queue Prompt, I can see it's starting to bring in the video, but there are some errors here, highlighted in the red boxes. If I open the server manager, I can take a look at the log, and it's just not able to find some of these checkpoints. A checkpoint is essentially a snapshot of a pre-trained model; it's a way for you to style the type of images you want, and there are all sorts of different checkpoints. There's a Disney/Pixar cartoon style, for example, trained on Disney/Pixar cartoons. You'll also notice there are SDXL models; SDXL is a different type of model, which I'll explain a little later, but it's effectively an incompatible format, so you don't want to pick one for this example.

Then we're going to pick a VAE here. Now, I don't know what the VAE does, and this is part of the problem with this workflow: it is quite complicated. We can see it's also going to generate this line model, so maybe it's doing some edge detection on the frames and using the line art to determine some of the motion. You can trace where the inputs and outputs flow: they go into some KSampler nodes, and then there's a prompt here where you can change the subject matter, so we'll say it's "a cyborg male robot typing." Now if I click Queue Prompt, it starts generating, and we can see what we get. Checking the server console, I can see it's generating the frames: zero out of 25. We can take a look at the result, and apparently it created this animated GIF; yeah, it looks Pixar-style. What you can do is come into this node, change the output to MP4 format, and re-queue. The cool thing about ComfyUI is that it's smart enough not to reprocess the entire workflow, just the last node that changed, so the process is faster this time. We can see the MP4 file here; this is the final AI video it generated for us. It looks all right.

There's also a nice website, civitai.com, which has a bunch of pre-trained art styles you can use to generate your own videos. For example, say you want anime style: there's a model there known as Dark Sushi Mix, which was trained on anime styles. You can just copy its URL, and RunDiffusion actually has support for bringing in Civitai models: all you have to do is click the Civitai button, search for that URL, and download the model into your workspace. Now I come in here, change the checkpoint to Dark Sushi Mix, click Queue Prompt, and it runs through the whole workflow again, and we get a different stylization, in anime style.
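(Side note: everything Queue Prompt does can also be driven programmatically. ComfyUI exposes a small HTTP API; if you export the workflow in API format (enable dev mode, then "Save (API Format)"), you can queue it from a script. A minimal sketch follows; the filename and the node id "6" are placeholders you'd look up in your own exported JSON.)

```python
# Queue a saved ComfyUI workflow over its local HTTP API (default port 8188).
import json
import urllib.request

# Workflow exported via "Save (API Format)"; the filename is a placeholder.
with open("vid2vid_controlnet_api.json") as f:
    workflow = json.load(f)

# Patch the positive prompt before queueing. "6" is a placeholder node id;
# find the real id of your CLIPTextEncode node in the exported JSON.
workflow["6"]["inputs"]["text"] = "a cyborg male robot typing"

req = urllib.request.Request(
    "http://127.0.0.1:8188/prompt",
    data=json.dumps({"prompt": workflow}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
print(urllib.request.urlopen(req).read().decode())  # response includes a prompt_id
```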
Now, the other way to do this, the arguably easier way if you don't want to run your own nodes, is to simply use RunwayML.com, which is essentially a hosted version of Stable Diffusion (I think they even helped develop Stable Diffusion). They have something new here, Gen-2, which generates video from text, images, or both. Typically what I'd do is first bring in an AI-generated image. I generated my images using Midjourney, but you can also use Runway or DALL·E or any number of AI image generators. If you're using Midjourney, you go into the Midjourney Discord and use the /imagine command to generate an image, like "imagine: a dog on a beach, photorealistic style," and you can give it an aspect ratio of 16:9. It generates this for you, and it shows up in your Midjourney feed; you can see all these other images I generated earlier. Download the image of your dog, bring it into Runway, and give it some camera motion: you can zoom out, roll, or pan, and they even have something known as Motion Brush, where you just select the area of the image you want to animate and decide whether it should move closer, further, left, or right. Save that, click Generate, and the nice thing about Runway is that it's fairly fast: after two minutes or so we have a dog here just smiling on the beach with some motion, and you can even extend the clip by another four seconds if you like. You can also give it a text description of how you want the scene to animate, and it will take that into account. So that's another way to create AI videos, and it's a good way to animate photographs or even memes.

Aside from Gen-2, Runway also has Gen-1, which is video-to-video generation, similar to the AnimateDiff example we did earlier. For this, I can bring in the same video I had earlier of myself typing, give it a style reference or even a prompt, so we'll say "a cyborg machine robot," and generate. Runway actually has style previews as well, which is a great artist tool; I think ease of use and UI are increasingly important for these AI tools, especially for more creative types. We can see a quick preview of the styles and then generate the video for the one we want. So here's the video generated with RunwayML: a simpler process, but perhaps less customizable than running your own nodes.

A few other interesting tools: if you're looking to create deepfake videos, say an announcement that we've achieved AGI, one thing I found useful is Wav2Lip, where you can just upload a video and a voice sample and it syncs the lips in the video to the audio, pretty much plug-and-play. There's even a GitHub project I just found, sd-wav2lip-uhq, which is apparently an improvement on the original using some Stable Diffusion techniques, but the version I used was the original Wav2Lip tool. They also have a hosted version at synclabs.so, and I like these hosted versions because I don't necessarily have the time to play with all these tools: you just upload a video and an audio file, click a button, and it mostly trains and generates everything for you.

For the audio track, if you're looking to clone a voice and have it say something, what I'd use is this tool on replicate.com. Replicate essentially hosts machine learning models, and here I've loaded the tortoise-tts model (you can just search for it): it generates speech from text and clones voices from MP3 files. Type in the text you want it to say, upload a sample of the voice, click Run, and it generates the audio file. If you're having trouble with that, you can also try ElevenLabs (elevenlabs.io), which also generates AI voices and offers it as a service.
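(For the Wav2Lip step above, here's roughly what running the original repo looks like when driven from Python. A minimal sketch, assuming you've cloned github.com/Rudrabha/Wav2Lip into ./Wav2Lip and downloaded the wav2lip_gan.pth checkpoint per its README; the input file names are placeholders.)

```python
# Drive the original Wav2Lip inference script from Python.
import subprocess

subprocess.run(
    [
        "python", "inference.py",
        "--checkpoint_path", "checkpoints/wav2lip_gan.pth",  # pretrained GAN weights
        "--face", "video.mp4",   # the talking-head video to re-lip-sync (placeholder name)
        "--audio", "voice.wav",  # the audio track to sync the lips to (placeholder name)
    ],
    check=True,
    cwd="Wav2Lip",  # assumes the repo was cloned into ./Wav2Lip
)
# The synced video lands in results/result_voice.mp4 by default.
```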
And then finally, before I close, I thought I'd show you the latest in Stable Diffusion: Stable Diffusion XL Turbo, which is essentially real-time image generation. The original model was Stable Diffusion; after that they came out with Stable Diffusion XL, an improvement on the model with better human anatomy and more accurate image generation; and just recently they released SDXL Turbo, which is real-time text-to-image generation. They have a sample website, clipdrop.co, where you can play with it yourself. So I can type "dog on a beach" and it generates it very quickly, and I can change it to "a cat playing with a ball" and it regenerates almost instantly. If you're curious, you can also run these workflows yourself: if you go to the ComfyUI GitHub you can see the examples, and they have an SDXL Turbo workflow you can download and open in ComfyUI. Just download it, bring in the SDXL Turbo checkpoint, and click Queue Prompt, and it runs through much faster. And there you have it, an image of a fox in a bottle; I can also change it to say "a cat in a bottle" and it regenerates very quickly using the SDXL Turbo model (a scripted version of this is sketched below).

So there you have it: a basic primer on AI video and AI art generation. Personally, one of my favorite and maybe the easiest tool to get started with is RunwayML.com; they have text-to-video generation, video-to-video, image-to-image generation, image expansion, subtitles, and all these different tools. Let me know about any other interesting tools in the comments below, or any questions you might have. Hope you enjoyed the video; see you in the next one. Thanks, bye.
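(The scripted version mentioned above: SDXL Turbo also runs in a few lines with Hugging Face diffusers, using the public stabilityai/sdxl-turbo checkpoint. A minimal sketch; the model is distilled for single-step sampling, so classifier-free guidance is turned off.)

```python
# Minimal SDXL Turbo text-to-image sketch with Hugging Face diffusers.
import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/sdxl-turbo", torch_dtype=torch.float16, variant="fp16"
).to("cuda")

# SDXL Turbo is distilled for single-step sampling; guidance is disabled.
image = pipe(
    prompt="a cat playing with a ball",
    num_inference_steps=1,
    guidance_scale=0.0,
).images[0]
image.save("cat.png")
```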
Info
Channel: TechLead
Views: 118,868
Id: dFJxwl-azEA
Length: 10min 30sec (630 seconds)
Published: Sun Dec 03 2023