OpenAI Shocks the AI Video World - Sora Changes Everything

Video Statistics and Information

Captions
OpenAI just changed everything, again. All these videos you're seeing right now are fully AI generated. I don't usually try to push out news videos like this, but this is the most excited I've been about an update from OpenAI, or any AI tool, ever, and that's saying a lot. This is better than any other AI video platform, and it's not even close. It's called Sora. It was just released today, and it's not just text to video. A few hours after they released those demos, they also released a research page that showed image to video, video to video, and, actually my favorite part, this feature of connecting videos, like this drone flying through these ruins then changing into a butterfly flying underwater. For some reason, but it looks amazing.

They're supposedly allowing select creators to test it right now to get feedback, and they said it's to give the public a sense of what AI capabilities are on the horizon. I mean, the possibilities with this are pretty intense, honestly. I'll get into some of my overall thoughts and potential consequences and opportunities later, but let's jump into some more examples first.

This movie trailer is from a single prompt, and it maintains the scene and character consistency across multiple shots. It has this dynamic camera movement and handheld shots, plus just how realistic the people are. And then there's a couple drone shots here that are just indistinguishable from real drone shots. Same with this longer one of an SUV driving on a dirt road. I mean, this scenery just looks so good. Or this cat walking through a garden is so true to the way a cat actually moves, with that light-footed stealthiness, plus the physics of the fur and whiskers bouncing. And this is a 27-second shot. It can generate up to 1 minute, as opposed to the 4-second, morphing, kind of slow-motion clips we see from everywhere else.

But now let's look at some more complex scenes. This one in particular really blew me away. First off, just all these buildings being viewed coherently from all angles as it passes, but at the same time there's all these reflections that interact with them properly, with the lighting and refraction, especially if you go frame by frame right here as it crosses this building and see the scene it's actually reflecting. Honestly, other video generators would struggle just generating that scene on its own, with multiple faces and hands, let alone as a reflection interacting with an even more complex scene.

I'll jump through a couple quick ones here. All these people celebrating Chinese New Year is an unbelievable amount of people to be tracking, and they're all pretty coherent. And this flythrough of a historical town has all sorts of horses and people walking and carrying things, all while viewing these buildings from different perspectives. Or the ability to generate so many completely unrelated videos across all these TVs at once. I mean, that's just crazy to me. Then most of the wildlife shots look essentially perfect. I really like this crab interacting with an octopus.

A lot of this is photorealistic, but it can do other styles too. So we have some 3D animation shots that are amazing, then this papercraft one is awesome. And this dancing one is a great spot for comparison, because just the other day I was generating videos in Runway. It was of a cult of barn owls performing secret disco rituals. Don't ask. But my point is, just look at the difference: these slow-motion dance moves that start to morph even in just a couple-second clip, and some of these just have no faces. The comparison isn't even remotely close.

And then I want to focus in on this one for a second. So it has two just incredibly detailed pirate ships that are viewed from all angles, while simultaneously having multiple difficult physics simulations, like the flags waving and then the liquid. Then it also knows to apply tilt shift to make it seem like it's miniature. And that just highlights how much is going on in this and how time-consuming it would be to do this in any other way.

Now, those are all amazing, but let's look closer at the shots that involve people. That's where things get really difficult. So first, it can do a close-up of an eye incredibly well. It even captures the little micro starts and stops your eyes do when you move them from left to right. They're called saccades. But let's zoom out some more. Here's a gray-haired man pondering the universe. This is amazing, even just the subtleties in his wrinkles as his face makes minor expressions, and the different speeds he blinks, and then there's really accurate refraction in his glasses.

But let's zoom out even more. So this one's actually the generation Greg Brockman used in his announcement, and it does an incredible job with reflections and this realistic walk cycle, and then proper physics on the purse and clothing and her earrings. That's all with a pretty complex background too. This is miles apart from anything we've seen before. But there are some issues with this. Like, you're not going to be convinced this is a real video, if you're paying attention at least. You see the feet kind of sliding across the ground, or if you look at any of the people in the background, they're good, but some are a little weird. So there's certainly limitations.

So here's another one that has a lot of people in it while the camera is spinning, which is really pushing the limits. If you watch this bottom left corner, there's a kind of trippy perspective shift here, and then some weirdness in the hands and movements of the people. Overall they look great, and I can't stress enough how far beyond this is from anything else we've seen, but it is still, of course, not perfect, especially with people. But that's just when I'm analyzing them to find something weird. If I was just scrolling social media, I would think almost every one of these is real. And as the saying goes, this is the worst it will ever be.

And all of those examples are at openai.com/sora, but they also released more examples of all the other stuff it can do under the research section of their site. I'll link to that as well.

Before I jump into that: I assume everyone watching this video uses ChatGPT to some extent, but one thing that can be a bit of a struggle is knowing how to implement it in your daily life, especially at work. That's why I've partnered with HubSpot to share a free resource bundle to help. It will be linked in the description. It covers topics across all different industries, with tips and guides on how to use ChatGPT in those industries, with specific examples and applications. I mean, the number one problem I hear from people regarding AI is they see these videos about ChatGPT or other AI tools and say, "That's really cool, but how can I actually use it to help me?" You know, to save time or solve a problem. One example section here is called "100 Ways to Try ChatGPT Today," with 100 sample prompts you can use or modify for whatever career you have. And there will actually be five PDFs. That section's in the one called "Supercharge Your Workday with ChatGPT," but go through all the other ones in there too. Again, they are completely free. Just use the link in the description to go download those. I've been happy to partner with HubSpot to bring these free resources to people who watch this channel.

Onto the Sora research, so I'll link to that as well. They call it "Video generation models as world simulators," which, that title starts to feel a little weird. They say it's a promising path towards building general purpose simulations of the physical world, and this goes, to some extent, into how it was trained and how it works in these first sections. But let's look at some more examples. It does demonstrate a lot more text-to-video capabilities, with some really solid generations involving people, but since we've seen a lot of that, let's jump into the image to video.

Just like with text to video, their image to video is far beyond any other video generators. So these are examples of animating DALL-E images, and they are all amazing, I think this one with the surfers being the most complex. And just like with the text to video, it does a really good job with the physics, even in a really difficult example like this.

It can also extend videos forward and backward in time. This is an example where all three videos were extended backwards, so they all lead to the same ending.

This video to video is really impressive. So it starts with the example of "change the setting to be in a lush jungle" and does an amazing job, but you can click on this and it will be a drop-down to select from, with all these different options. So we've got "make it underwater," and how about "change it to winter," and we can add dinosaurs, switch it to claymation. We've got a bunch of others here. So this is just next level when you compare it to something like Gen-2 or Kaiber. Kaiber can do a bunch of stuff that is pretty different from this, which I really like. We'll have to see once we can actually play with this, but either way, this is mind-blowing.

Then this next section is probably my favorite part. So it can connect videos, where it interpolates between the two input videos to create seamless transitions between them, even with entirely different subjects and scene compositions. So the center video is the one that combines them, and for each of these it finds just a really creative and seamless way to merge them, like this drone shot where it puts this snow diorama thing on the back side of the building. Then we've got this slow morphing between the chameleon and this cool bird. It looks just awesome while it's transitioning. And I'll just let these other two play. Got this SUV turning into, I think it's a cheetah, and that historical Gold Rush town transforming into this underwater scene. Both just super cool.

And it is also an image generator. Now, I wonder if this will be replacing DALL-E at some point. I assume it will, because honestly, just a screen grab from a lot of these videos looks better than if you ran that prompt through DALL-E. Right, this is what I got when I copied the prompt of the woman in autumn. I mean, this is just way better than DALL-E.

And if we move along, it talks about some of the emerging capabilities of training at scale. Similar to how it is with LLMs, there's just unexpected capabilities that emerge when you train a model like this. They say some of that has led to its ability to generate dynamic camera motion that shifts and rotates while people move consistently throughout 3D space, and the object permanence as these people walk past this dog. Like, it'll be completely blocked from the frame at some points, but it still persists and will be there after they walk past.

Interacting with the world is another capability that no other model has been able to really do. So it can still only affect the state of the world in simple ways, but these examples of painting and actually leaving the brush strokes on the canvas, or leaving bite marks on a burger, this is amazing. This is a huge step forward. They also do show some limitations, like glass shattering. They did show some more limitations on the main site where the physics was off, like this person running backwards on a treadmill, these wolf pups just kind of appearing out of nowhere, some weird interactions with this chair, and a few others. Again, not perfect.

And it also has just this mention of simulating digital worlds, where it can control a player in Minecraft while also rendering the world and its dynamics in high fidelity. The ways this will be applied in video games in the future is going to change that whole world too.

Honestly, I'm still processing a lot of this. It is such a monumental breakthrough. I wasn't expecting to see text to video like this until maybe the end of the year at the earliest. They're working on putting some guardrails around this, which they should, but I mean, there's only so much you can do. You know, people have complaints about the censorship within ChatGPT, which I get, but when it comes to videos, which are generally made with the intent to share, it just gets a lot more complicated. We already have fake videos all over the place. Some are of relatively harmless things, like fake landscapes or fake weather phenomena. Then there is the more sinister side of fake videos spreading around, of war victims or political figures. Up until this point they're usually pretty obvious to anyone that's close to this stuff, but maybe not to everyone else. Sometimes they seem to trick millions of people. But as these tools keep getting more sophisticated, that will just accelerate, and they'll also become impossible to detect.

As far as videographers and filmmakers and YouTubers, there will be a lot of possibilities that open up. I think creating just amazing B-roll really easily is going to be awesome. I personally avoid stock footage almost completely because it just feels generic, but if I can generate something that I kind of imagined for each occasion, that's just a lot more interesting to me and engaging for the audience. Overall, this completely opens the door for anyone to create visual stories. In the same way one person can write a book, one person could be able to create an entire movie. This just provides more tools to create with, and new types of content will emerge. There will be bumps along the way, with AI content farms and deepfakes and misinformation. I'm not worried at all about being replaced. I think that no matter how good any of these tools get, people will always be more interested in seeing things that people do themselves. There's a lot of uses for this to help people in the process, but not to replace people entirely. Also, I think as more generated content comes out, there will just be a pendulum swing in the other direction, with people craving more authentic human content. People are consuming more content than ever, so really there will be demand for both.

There's a lot of random thoughts at the end there. This is a much more off-the-cuff video than usual. This update was amazing, and if you want to stay up to date along the way with everything that happens in AI, make sure to subscribe here and check out futurepedia.io. We curate all the latest and greatest AI tools. You can find the best tool for any use case, use the chatbot to help guide the process, you can save your favorites, then you'll be given personalized tool recommendations every week. There's been a ton of updates there, so go check it out if you haven't for a while. Thank you so much for watching, and I'll see you in the next one.
Info
Channel: Futurepedia
Views: 176,601
Keywords: sora, openai, text to video, image to video, ai video, cinematic ai video, video generator, dalle, dalle 3, text-to-video, text to image, ai video generator, ai content creation
Id: ZFwzaAU8xFA
Length: 12min 18sec (738 seconds)
Published: Fri Feb 16 2024