AI Generated Videos Are Actually Getting Out of Hand

Video Statistics and Information

Captions
With AI-generated videos becoming this good, aside from not being able to tell what is real and what is not anymore, another major threat we all have to face is that you can now get Rickrolled in unimaginable ways. Like, you think that's the Eiffel Tower? Nah, you're getting Rickrolled. Oh, you think those are clouds? Guess what's below them: you're getting Rickrolled. You think this is the distracted boyfriend meme? Nah, he's actually the Rickroll. Right now you can pretty much make the most absurd things in the world, like connecting memes back-to-back, or clips that look cursed but somewhat reasonable. It's like that Craiyon era all over again, with the most uncanny AI-generated images, except now it's time travelers stopping famous videos from happening.

For those who are not really in the AI tech bubble, wonder why this is new or exciting, and just watched that Drew Gooden video: well, you're not too wrong about that, but let me give you a quick rundown on what's different and what's new this time. Back in February 2024, Sora from OpenAI was announced as one of the most jaw-dropping pure text-to-video generators we have seen to this date. What I mean by "pure" is that the AI can generate videos from only text. There have been various other AI video generators, but those are either not as impressive or fundamentally work on a different level, like the ones that need a reference video underneath, which in some cases is called video-to-video generation. The difference between the two is kind of like watching a kid learn how to draw from scratch versus them showing you some artworks they have traced. So the former is definitely more impressive, but the latter still provides a consistency that pure generators cannot achieve. For example, if we map out the 3D environment first, then run a Blender camera to film it from a 2D angle and use AI to sync up these perspectives, it may not be pure text-to-video anymore, but this method does provide a more consistent world compared to any other
pure text-to-video generators. The only thing stopping you from doing that right now is a ready-made toolkit that lets you do it easily, which brings us to today's sponsor, Storyteller.ai.

They have developed a ready-to-use toolkit that lets you do exactly what I just described, with editing timelines, a 3D sandbox, and built-in generators. From the developers of FakeYou.com, Storyteller is a generative video tool for creatives who want precise control over their video outputs. It's inspired by both physical filmmaking and game-engine machinima. You can think of Storyteller as Figma or Canva for video creation, and its creator-in-the-loop system lets you build sets, populate them with actors, and gives you precise control over all the action and animation. For the 3D environment, you can easily build a reusable one yourself or pick a preset from the featured stages, and you can fully customize your sets by adding props from the menu or importing your own 3D assets, animations, audio, and even characters. Each character can be given animations, and you can even use motion capture to precisely control your character's body and facial movements. But if that's too much of a hassle, you can simply direct your cast to walk, dance, fight, and much more. Then, when you're ready to generate your video, you can choose from a variety of artistic styles ranging from realism to anime, and once your video is complete, you can come back and edit it again in the 3D editor at any time, similar to the concept of a reshoot but in a completely different manner. If you encounter any problems or want to share your ideas or results, definitely join their Discord community, where other passionate creators are also playing with this new tool. They're currently in a closed beta and normally require signing up for the waitlist, but if you join right now you can get access as part of an early sneak peek. Check out Storyteller.ai
now using the link down in the description, and thank you, Storyteller.ai, for sponsoring this video. You can also learn more about the differences between AI video generators in this old video of mine.

So, in the last few weeks, a few pure AI video generators were announced one after another, probably thanks to the fear of missing out, which is a pretty common occurrence in the commercial AI and research space, as companies try to outshine one another or fear becoming obsolete. Some people say these new AI generators have come close to, or even surpassed, what Sora can do, but I think that's up for debate, as we don't know whether the results shared by OpenAI are representative of what Sora can actually generate in the first place. As we learned a few months after Sora was announced, some of the newer, cooler demos were actually post-processed and edited. It is obvious that they want to impress the Hollywood folks, but that's definitely a questionable move. Speaking of which, the first-ever AI commercial was made by Toys"R"Us, which is pretty neat, I guess. They might have used Sora, but who knows. And with Sora's release date still completely unknown, it only took other companies four months to figure out how to use the new model architecture, called diffusion transformers, that made Sora so good. I've also mentioned in my older video that this diffusion transformer architecture behind all these improvements is not exactly new. While most researchers a year ago knew this architecture was the next big thing, it was only proven useful after OpenAI's Sora and Stable Diffusion 3 were trained on variants of it, so those are basically the successful lab rats that might have cost tens of millions of dollars.

Anyways, this recent AI video boom all started thanks to the release of Kling, published by a Chinese social media company called Kuaishou, which is similar to TikTok, and the demo previews they provided were just really good, especially for
the close-up shots of body movements, like when they are riding a bike, riding a horse, eating, eating, eating, eating, and eating. Wow, that's a lot of eating going on. In general, an AI generator's capabilities are highly dependent on what it was trained on. Chinese mukbangs: that's all there is. Okay, jokes aside, it can generate Chinese girls pretty consistently and show people holding chopsticks without any crazy deformations. Can any other model do that? Probably not. What is even crazier is that they let people use it completely for free; it's just that it's only available through their mobile app, it's all in Chinese, and it's a bit complicated to sign up. You can refer to Theoretically Media's video to learn how to sign up if you want to try it. Never mind, the web app is available now at the time of editing. On the other hand, the biggest downside of this model is that it is Chinese-based, so if you do want to use it, you probably have to pair it with ChatGPT when prompting the model. This is not foolproof either, as some sentiment will still be lost in translation. On the flip side, though, Kling is able to generate up to 2 minutes of video at 30 FPS, extend videos up to 3 minutes, and lets you choose an initial input image. I'm not sure if all of these functions are available to everyone; this info is probably just them sharing what their model can do, but the initial input image should be available.

Next up we have Luma's Dream Machine, which was released 5 days after Kling. This was another big but pleasant surprise, as Luma Labs was previously known for their NeRF and 3D Gaussian applications, where you can take a series of 2D images and reconstruct a 3D environment out of them, so seeing that they have been cooking up a text-to-video model is definitely something I didn't see coming. The quality of their previews looks around the same level as Kling but definitely has a different vibe to it. You can roughly guess what a model is primarily trained on, and what it should be good at, based on what kind of demos
they promote. Moreover, from what Luma AI has posted on their social media, it feels like its generations have a more cinematic theme; from the color tones to the camera movements, it looks less like stock video and doesn't have the TikTok vibes. It can even do some pretty cute animations, or work in the style of anime. But how the community uses it has been drastically different from what they intended. Later on, when they introduced the keyframe function, which essentially lets you pick the starting and ending frames of a generated video, things started to get wild: time travelers stopping Vines from happening and memes being connected to each other. This is also how I made the video intro, and the internet is definitely rediscovering the Craiyon era once again, thanks to Luma.

However, on the free tier you are limited to 3 generations per day and 30 per month, with video lengths of up to 5 seconds. The queue is pretty long too, so you have to wait a long time for your generation to process, even though the actual generation itself probably only takes a few minutes. For me, I had to wait around 15 to 20 hours to get two of them done. I don't know exactly how long it took because I went to sleep, so my friend helped me out. Shout out to Quay, the most cracked 3D researcher I know. You can check out his video if you want to learn NeRF or 3D Gaussian splatting implementations down to every nook and cranny, like implementing them in CUDA; while it is in Chinese, there are English subtitles too if you really do want to learn. Anyways, if you do want to skip the queue, the price is currently sitting at, uh, 30 bucks a month, which is pretty expensive. For that you also get 150 generations, no watermark, and a commercial use license. There are other tiers too, with the main difference just being more generations per month, while the keyframe and video extension functions are available even on the free tier.

And just as I thought Runway was really going
to stay there and take the L, 6 days after Luma's announcement, Runway announced a preview of their Gen-3 Alpha model. Then, 10 days later, they opened access to everyone, of course with a catch: you had to pay 15 bucks a month to even try it. This time, however, Gen-3 Alpha is coming in strong, with some of the best clarity and consistency of the bunch, and a lot of POV or drone-flying cinematic footage. From a bird's POV to an ant's POV, I think they want to demonstrate strong generalization on the realism front. The movements of the animals are not as awkward either. Apart from that, the close-up shots of human faces are by far the most realistic, with the skin and the eyes being the most convincing of the three models. It kind of works with animations and anime; realistic fantasy content looks pretty good too, I guess, or does that count as horror? However, Gen-3 Alpha currently doesn't have a function where you can choose a starting frame, but that might just be a matter of time, as all the previous Gen models had this function, and they have also promised more structural and motion control to be added further down the line. But with the current 15 bucks, you only get 62 seconds of generation every month; if you do some quick math, that's basically three times more expensive than Luma AI. There is also a native upscale function, though, which is pretty neat.

But if you don't want to pay and sign up for yet another account, you can alternatively look at Open-Sora, an open-source implementation that aims to replicate Sora by OpenAI. They released a new version, 1.2, on June 17th. Open-source models like this usually come with a better description of the model and documentation on how it's trained, so more specific details are shared too, like the fact that it can generate 720 x 1280 video up to 4 seconds long. So if you really want to learn how exactly text-to-video works, definitely check out their GitHub, and just look at how long the prompts they use are. Maybe that's the new standard for a
well-made prompt to generate something reasonable-looking. Anyways, aside from the official demos released by all those companies, the custom generations people post online speak louder about the models' quality, unless the users are not prompting them well. But usually these companies have something in the backend to enhance the prompt you enter anyway; image generators like Midjourney do this, and Luma AI also has this function, which you can enable, as you can see here. While it is pretty difficult to nitpick when they're all getting better and better, there are still general vibes that make one feel better than another. Somewhere down the line, though, the functionalities these companies offer might be the sole determining factor for a lot of people choosing between these tools.

As of right now, these models are roughly equally good and equally bad at the same time. They are all still terrible at anatomy. Well, that's to be expected, as AI image generation still can't handle anatomy well either, so when human bodies are flipping upside down, the model has no chance of making sense of that. There is definitely also a lack of data in that department, and synthetic data from 3D simulation might be the solution, if generalization gets better. On the other hand, fine-grained and complex world interactions are still out of the question. Things like generating handwriting and drawing out letters correctly are impossible. Simulation might have a chance of saving these too, but it feels much harder than the previous problem. However, pouring liquid has definitely gotten better. So these AI video generators are still far from being world simulators, but creative-wise, they can still provide some pretty interesting touches, like how this guy made an insane AI-generated music video with a wide range of different tools, which let the current weird vibes of AI-generated videos blend in perfectly. But for them to get even better in the realism aspect, or even become an actual
simulator, video models probably need to achieve some decent reasoning capabilities too, so they can generalize physics and learn from the videos within the training data. And if you want to keep up with the latest AI research, definitely check out my newsletter, where I publish research breakdowns on many cool papers that I don't have time to make videos for. Thank you for watching. A big shout out to Andrew lelas, Chris Leu, Alex Shay, Dean, Alex marce, mulim FAL, Robert zasa, and many others supporting me through Patreon or YouTube. Follow me if you haven't, and I'll see y'all in the next one.
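The transcript above repeatedly credits the "diffusion transformer" architecture behind Sora and Stable Diffusion 3. One core idea OpenAI described for Sora is operating on "spacetime patches": a video is cut into small blocks spanning a few frames and a small pixel area, and each block becomes one token the transformer attends over. A minimal NumPy sketch of that patchify step, assuming nothing about any specific model (the function name and patch sizes here are illustrative, not taken from Sora or Open-Sora):

```python
import numpy as np

def patchify_video(video, pt=4, ph=16, pw=16):
    """Split a video array (T, H, W, C) into flattened spacetime patches.

    Each output row is one token covering pt frames x (ph x pw) pixels,
    the kind of unit a diffusion transformer attends over. In this
    simplified sketch, dimensions must divide evenly by the patch sizes.
    """
    T, H, W, C = video.shape
    assert T % pt == 0 and H % ph == 0 and W % pw == 0
    # carve out a grid of (T/pt, H/ph, W/pw) patches
    v = video.reshape(T // pt, pt, H // ph, ph, W // pw, pw, C)
    # bring the three grid axes to the front, patch contents to the back
    v = v.transpose(0, 2, 4, 1, 3, 5, 6)
    # one row per patch, flattened
    return v.reshape(-1, pt * ph * pw * C)

# toy example: a 16-frame 64x64 RGB clip -> token sequence
clip = np.random.rand(16, 64, 64, 3)
tokens = patchify_video(clip)
print(tokens.shape)  # (4 * 4 * 4, 4 * 16 * 16 * 3) = (64, 3072)
```

In a real model the patches would come from a compressed latent video rather than raw pixels, and each flattened patch would be linearly projected to the transformer's embedding size; this sketch only shows why longer or larger videos simply mean a longer token sequence.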
Info
Channel: bycloud
Views: 31,421
Keywords: bycloud, bycloudai, gen-3 alpha, gen3 alpha, ai video, ai video generator, ai video generator free, ai video generator app, ai video generator tutorial, ai video generation, free ai video generator, sora, open sora, opensora, dream machine, luma AI, luma dream machine, luma dream machine meme, ai video memes, how to make ai generated video memes, ai generated video memes, memes connected in a video, how to make ai video memes, KLING, kuaishou kling, kuaishou, kuaishou ai
Id: z4fai9N8HtQ
Length: 13min 2sec (782 seconds)
Published: Mon Jul 15 2024