The BEST AI Video Model Is Out & FREE!

Captions
So the AI video generator that we have all been waiting for is here. It's awesome, and you can use it today. It's not Sora, which was killed by Vidu (according to a bunch of YouTubers, anyhow), and then Vidu was killed by Google's Veo, and then Veo was killed by Kling, which we just saw recently, but you had to have a Chinese mobile phone number in order to access that. But all that is in the past now. I've had access to this new model for a couple of days, so we're going to run through the whole thing, plus I've got at least one exclusive piece of information for you. Okay, let's dive in.

So this new model is from Luma Labs, who I have covered on the channel in the past. They had Genie, which was a text-to-3D generator. Well, now they have released their new AI video model, Dream Machine, and I mean, this thing is insane. Not only can Dream Machine do text-to-video, but it can also do image-to-video, which is something we have not seen from Sora as of yet. We're going to take a look at examples that I generated from both sides in just a minute, and we'll go over what it's really great at and where it's still kind of lacking.

For some quick technical specs: Dream Machine generates at 1280x720, and the clips are around 5 seconds. As for generation time, on the website they say less than 120 seconds. I've definitely not waited as long as 2 minutes, so it is faster than that. The UI is dead simple, which to be honest is actually kind of refreshing right now. There is a little tick box for enhanced prompt, depending on the length of your prompt, if you just want to give it something fairly simple. I'm just going to run the Sora Tokyo woman prompt and let's see what we get. And the Tokyo woman prompt indeed gets us the Dream Machine version of that prompt. Now, I never really like to compare one-to-one against Sora prompts, so we're going to modify this prompt in just a little bit and take a look at where I think things are much more interesting. And then, obviously, if you
want to use an image reference, you simply hit the little photo button here and then upload a photo.

Now, I'm always a fan of starting off with text-to-video, namely because that gives you a really good idea of what the model is capable of. So, kicking off with "a cinematic action scene, a hitman, bald, wearing a black suit, in an abandoned factory in a shootout against other assassins" yielded this, which is super cool. I mean, very dynamic, very action-packed. The second version from that same prompt (per prompt you get two generations) yielded this as a result. And yes, while there is some decoherence and a little bit of morphing, I think there's so much action-packed, dynamic movement happening in here, with a handheld camera and everything, that I really don't mind.

Is it perfect? No, it is not. It still does weird AI video stuff, but it does so at a much higher quality, and honestly that makes the results that much more funny. This one actually wasn't bad. This is "a beautiful pirate woman crosses her arms while standing on the deck of a pirate ship." Longtime channel viewers, you may know where we're headed when we hit image-to-video. And while I don't think that this shot is bad, we do end up, midway through, cutting to a reverse angle. That said, I've got to admit, as our pirate woman kind of moves towards him, she's kind of rolling up her arms there, like, this dude's getting punched. We'll be rolling back to that pirate ship scene in just a little bit, because I've got a pretty cool hack for you.

Another quick text-to-video example: this is "a young man walking alone on a beach, foggy sky full of dark clouds, soft sad atmosphere, noise, video shot by retro camera." I mean, this is 100% a shot from a music video, very clearly sung by someone with a British accent, likely about having never seen the sun. But to note that atmospheric prompts like foggy sky full of dark clouds and the soft sad
atmosphere definitely do play a pretty major part when you are using text-to-video. Again, because I want to set expectations and not cherry-pick here, this was another version from that same prompt.

Interestingly, in terms of that enhanced prompt, I wanted to see what would happen if we turned it off. So here's another Sora example: I took Paul Trillo's massive, long block of text that he used as a Sora prompt for his music video and ran that, and we get some fairly comparable results. Granted, our shots are not necessarily as long as Paul's are, and I didn't modify the prompt through the various shots as Paul did, and to be honest I only ran it about three times, whereas Paul ran his about 700. But I will say that even though these are only 5-second snippets, it definitely shows, in my opinion, that this model is on a Sora level.

Now, I do have a trick coming up in just a minute to show you how you can extend these shots and probably get them up to something more like 1 minute, but my method is more of a hack workaround. I did talk to Luma about adding extensions in, and they said the model itself is capable of pushing out as far as they wanted it to, but obviously things start to break down. What they said is that characters will just kind of end up standing still and there won't be a ton of motion when you start pushing past that 10-second mark.

Finally, before moving over to image-to-video: yes, of course I had to run Will Smith eating spaghetti, and these were the awesome and hilarious results. So yes, clearly, this model still does not pass the Smith test.

So, sliding over to image-to-video, which is the thing that I think most of you are probably going to be interested in. The first one that I ran was an image that I generated for another project. This is a synth playing a synth. I thought it was funny, and running that through Luma gets us this result
which is super impressive. There is a little bit of morphing maybe going on in her fingers (fingers playing piano is kind of tough for AI video), but more important than that is the fact that the background stays very coherent and the character doesn't end up morphing out. I'm actually surprised at the level of detail that it keeps on the synth suit as well. With previous video generators, I think you would just see a lot more decoherence, shifting, and morphing going on, given that level of detail. The other thing that I want to shout out is her facial expressions. Granted, there isn't a ton of it, but she's a synth, she's not going to be extremely emo. She does kind of have a little bit of stank lip going on there as she's playing, so whatever she's playing, maybe it's a Daft Punk song, it's definitely got some funk in there.

And of course I'm going to run one of the channel's favorite recurring characters, Dutch football player Daniëlle van de Donk dressed as a pirate, and by far this is definitely the best output of that image that we have seen yet. I will definitely say that the Luma camera AI for sure is shooting for a very specific audience here. And I will say that giving specific actions to characters, which we actually have not seen as of yet, can result in a little bit of weirdness. For example, taking the Daniëlle shot again and giving it the prompt to have her cross her arms (that's the pirate example that we saw in the text-to-video portion), yeah, it definitely loses some stuff. We definitely get a lot of morphed AI fingers and hands, and the arms kind of fold in and become one weird David Cronenberg sausage there. That said, that is just one output. I'm sure that if I spent a lot of time rerolling that shot, we probably could have gotten something a lot better.
That said, you can get some really good results by prompting action. For example, in this image-to-video output, the prompt was "a young blonde princess turns and looks towards the camera and smiles." I did give some descriptions, like she's in a garden full of flowers and birds, a close look, a castle in the background, fantasy movie style. It definitely followed directions here, in that our gal turns and she definitely does smile towards camera.

In terms of camera direction, it was a bit of a mixed bag. Sometimes if I asked it to pan, tilt, dolly, or zoom, it would. And then other times, like in this case where I took a Batman image and was hoping it would rotate around to show Gotham, it instead just gave me a hard cut. So it kind of cheated. It works, but it was a cheat. We're going to roll back to Batman in just a second, but here's another example of kind of cheating the directions. In this one I had prompted for "a wizard holding an orb, the camera zooms in on the orb and transitions into an epic fantasy battle scene." Obviously, what I was looking for here was the camera to go directly into the orb and the reflection to kind of turn into an epic fantasy battle scene. What we ended up getting, of course, was that you kind of move into it and it hard cuts over to, well, I don't know if I'd exactly call that battle epic. It is funny that in my head I was thinking epic fantasy battle scene, like a Peter Jackson, Lord of the Rings level battle sequence, and what we got was: this is what you're getting on a budget, this is like 32 extras on a sound stage. But what is really impressive there is that it did manage to do a transition like that, and it tonally stayed consistent with our initial image reference. There was no reference for our battle sequence.

Rolling back over to Batman, I decided to take one of the most
iconic shots from all of modern cinema, the opening of The Dark Knight, and for the text prompt I thought it would be interesting just to take literally the line from the script: "A man on the corner, back to us, holding a clown mask. An SUV pulls up in front of him. The man gets in." This was the result. And while I do not think that Christopher Nolan is worried about his job at all here (it is not the actual film at all), it did more or less follow directions. And to be fair, that one was cherry-picked. Here were some of the other examples that I ended up generating from that initial image. This one's pretty good. This one I thought was really funny, with a chauffeur getting out, like, "Your car is here, Mr. Joker." Interestingly, running one of the script pages without the image reference, just as straight text-to-video, yielded this as a result, which I'm super impressed with. I think that this is actually very cinematic. It doesn't necessarily look like it's from The Dark Knight, but it looks like it could be from some heist movie.

Something that I was curious about was what would happen if you fed in an actual photograph. So this is a photo of a younger me at San Diego Comic-Con, meeting Scott Ian, the guitarist from Anthrax, on the convention floor. Taking this image and running it through Luma, we end up with, well, this, which to me is actually super hilarious. I mean, AI me definitely is super pumped to be there. It definitely loses the coherency in my face, but I think it does register the excitement that I was feeling meeting Scott Ian.

Now, in terms of shot extensions: yes, it can totally be done, using the old final frame trick. What you would do is, at the very end of your clip, simply take that last frame, save it out as a screenshot, and then feed it back into the AI video generator with a different prompt. So for our text-to-video version with our
pirate woman who crosses her arms, we ended up with this. To be honest, this is still part of the same shot: he turns and moves, and now this is the secondary shot. So that now becomes a 10-second shot. Will that work for every shot? Well, it may or it may not. I did end up running that Sora old mining town shot and tried to extend that. The problem is that with this particular shot, we do get a lot of decoherence and morphing, the sun kind of goes down, and you can definitely see where that transition takes place too, with that hard snap. But I think that with some adjustment, some rerolling, and some planning, you can definitely pull off a minute-long sequence if you wanted to.

I have a lot of exploring to do with this model, as well as trying things out like bashing it into the Krea upscaler that we took a look at last video, and I'll definitely be taking everything that I learn about it and putting it together as one big ultimate tutorial lesson. So if you haven't had the chance to subscribe, I do invite you to do so. Anyhow, go get started on your projects. I cannot wait to see them. I thank you for watching. My name is Tim.
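
For anyone who would rather script the final frame trick than take a screenshot by hand, here is a minimal sketch in Python. It is an illustration under assumptions, not something shown in the video: it assumes the ffmpeg command-line tool is installed, the file names and the helper functions `last_frame_command` and `plan_extension` are hypothetical, and uploading the saved frame back into Dream Machine is still done manually through the website. The 5-second clip length is the figure quoted in the video.

```python
import math
import shlex

# Clip length quoted in the video ("the clips are around 5 seconds").
CLIP_SECONDS = 5


def last_frame_command(clip_path, frame_path):
    """Build an ffmpeg command that saves the final frame of a clip as an image.

    Hypothetical helper: it only builds the argument list; run it with
    subprocess.run(cmd, check=True) if ffmpeg is available.
    """
    return [
        "ffmpeg", "-y",
        "-sseof", "-1",   # input option: start reading 1 second before the end
        "-i", clip_path,
        "-update", "1",   # keep overwriting one image, so the last frame wins
        "-q:v", "2",      # high JPEG quality
        frame_path,
    ]


def plan_extension(target_seconds, clip_seconds=CLIP_SECONDS):
    """Return how many clips and final-frame hand-offs reach the target length."""
    clips = math.ceil(target_seconds / clip_seconds)
    return {"clips": clips, "handoffs": max(clips - 1, 0)}


if __name__ == "__main__":
    # Show the command for one hand-off and the plan for a minute-long sequence.
    print(shlex.join(last_frame_command("shot_01.mp4", "shot_01_last.jpg")))
    print(plan_extension(60))
```

At 5 seconds per clip, a minute-long sequence works out to 12 generations chained by 11 final-frame hand-offs, which is why some rerolling and planning is needed to keep each transition from showing a hard snap.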
Info
Channel: Theoretically Media
Views: 134,591
Id: JmSHU2FZ8II
Length: 12min 43sec (763 seconds)
Published: Wed Jun 12 2024