Hyper-Realistic AI Video is Here: Full SORA Break-Down

Video Statistics and Information

Captions

OpenAI has announced a revolutionary new AI video algorithm. Just take a look at these videos coming out of this AI video generator. These videos are longer, more detailed and more beautiful than ever before, so without further ado, let's dive in and explore the next revolution in AI video together.

So what makes this revolutionary? Well, Sora allows us to create videos up to one minute long. Before this, we were using videos that were only four seconds long, so this is a huge increase in the length of the videos that we can get. The quality coming out of Sora is unparalleled: the detail, coherence and resolution are absolutely incredible. And most of all, the videos do not degenerate as they get created. Many of the existing AI videos will start off well for the first second and then suddenly you're looking at a strange blob on the screen. And we're getting a whole host of aspect ratio varieties.

So let's take a look at some of the best examples coming out of Sora. Here we can see a large orange octopus wrestling with a crab on the bottom of the ocean. Now, what's really interesting is that there are highly detailed elements to this video which are maintaining coherence. It's also excellent how the AI video generator understands that an octopus has tentacles and that a crab has legs with joints, and it keeps that in mind when animating the movements of these two creatures. Now, if we look at this, "a flock of paper airplanes fluttering through a dense jungle, weaving around trees as if they were migrating birds", this is a pretty mesmerizing and beautiful video showing some of the abstract potentialities of this tool.

Now, where things are taking an absolutely huge step forward is in photorealism. You can see that the human is generated to an almost lifelike level, that it's maintaining the coherence of its features, and even adjusting the eyes when the cat puts its paw into the face. This is working well for a diverse set of image types, everything from landscapes, as you can see here in
this rugged coastal scene, all the way through to more Pixar-style generated animations. Now, I'd like you to pay particular attention to the detail of the fur in this example, especially how the light glints on top of each individual strand of hair on this little creature. Now here we have this mesmerizing peacock. You see the eye is slightly glinting, and the feathers are reacting very naturally to the undulating nature of this animal's neck. And look at the water here: the water is so incredibly lifelike. The ripples in this coffee cup are absolutely astounding, how the bubbles are being generated and created in an abstract and varied way that seems completely natural.

Now, I am absolutely blown away at how effectively it is rendering humans and human movement. This has been a challenge consistently for AI video generators, and here you can see that this is an incredibly complete human. Even the hands are coherent in this situation, which is another aspect of human anatomy that AI generators have been struggling hugely to get right. It's incredible how effective the AI generator is. You can see here that even the shadow of this individual is being cast in a natural and correct way, though I would draw your eye to something that is not quite right here: you can see that the ground beneath her feet seems to be moving at a different speed to that at which she is walking, giving the slight effect that she's on an escalator or a moving floor. So it's a huge improvement, but obviously there is still room to go that next step to something that completely matches reality.

So Sora has displayed a few different ways that you will be able to generate videos using this model. The first is the text prompt, where you'll enter an exact description into the generator and get a video out. For example, here is the prompt that was used to generate this video of some woolly mammoths: several giant woolly mammoths approach, treading through a snowy meadow,
their long woolly fur lightly blows in the wind as they walk, snow-covered trees and dramatic snow-capped mountains in the distance, mid-afternoon light with wispy clouds and a sun in the distance creates a warm glow, the low camera view is stunning, capturing the large furry mammal with beautiful photography, depth of field.

So the other way that they are showing that you can create videos in Sora is image to video. What this allows you to do is upload a still image, even a photograph or an image from another AI art generator, and it will create a video based off that. What it does is it understands what is in the scene and how it would relate to the other objects, and then, using its insights, it will animate it according to the most logical paths. So for example, if it's a human, they're likely to be walking; if it's a car, it's going to be driving; and the way that they move is very different.

So you might be a little bit skeptical about the quality of these videos, because they are such a huge improvement from what we've been using, especially after Google's demonstration of their altered chatbot, which purported to show a chatbot being able to watch a video and immediately tell you what was in it, when actually that was entirely fabricated. OpenAI have made a point to say that "all videos on this page were generated directly by Sora without modification", which gives us hope that this is actually as good as they say it is. And another piece of evidence we have for that is that Sam Altman was on Twitter, and he was taking live prompts from users and creating videos based on those prompts in minutes. Just take a look at this one with golden retrievers.

But I've been looking through all of the examples available, and there is a variety in quality. For example, here is a very well rendered video of a basketball that goes through a hoop and then explodes. Everything looks pretty good here. However, in this video which I was looking at, of five gray wolf pups, the first thing I noticed is that sometimes they
seem to generate out of thin air. If you look here at the start, there are three that seem to then turn into four and five out of nowhere. They seem to split and multiply, which is a little bit weird. However, what they say in the release paper is that Sora, the model, understands not only what the user has asked for in the prompt but also how those things exist in the physical world, and this gives us a lot of hope that actually it's understanding the reality of our existence. It's not only thinking in pixels and colors; it's really thinking about objects and their relationships to each other. And a little fun fact is that Sora is the Japanese word for sky, and if you are learning Japanese, there's the character.

So the key evolution here in how OpenAI has achieved this dramatic increase in capabilities is that they've said that "by unifying how we represent data, we can train diffusion transformers on a wider range of visual data than was possible before, spanning different durations, resolutions and aspect ratios". Here they're talking about the algorithm that's used to generate these videos. This is a diffusion algorithm, which is the same that is used in other AI art and generative technologies such as Midjourney, Stable Diffusion and ChatGPT. Essentially, they have thrown a huge amount of computing power at creating the same relationships between the individual pieces of information within video, so that it can coherently predict what is a sensible image to generate after one frame. And it undoubtedly creates a huge number of relationships within this video, understanding how the colors, the composition, the lighting and the nature of the objects all relate to each other, and it has some understanding of all of these different elements. Sam Altman recently asked for $7 trillion to continue training AI models on vast and complex problems such as this, and that's 7 trillion.

So the key things here are the coherence and consistency of the images, the increased duration, the high
resolution and the variable aspect ratios that are taking this technology to new levels.

Now, it's interesting to talk about duration, because some of the best-in-class video generator models that are available for public use right now have a limit of 4 seconds, and many people have commented on my recent videos which explore these models about the limitation of 4 seconds, saying that a 4-second video is actually just a GIF or an image. Well, I think it's very interesting if you watch any movie and see how often they cut after 1 to 4 seconds, and you will see that more than 90% of all shots are 1 to 4 seconds in length. Even if you watch any YouTube video, you're likely to see even shorter cuts. So an entire film is actually made up of short 4-second shots, but having a minute-long shot does give us a huge amount of possibility.

Now, talking about resolution, they don't give us any specific details on the quality of the videos coming out, but if you download the examples from their site, you can see that they are being exported in 1920 by 1080. Now, it's also very easy to use existing technology to upscale these to at least 4K, and what's interesting about 4K is that beyond that the human eye actually cannot discern any increase in detail, so there is going to be no issue with the resolution related to these videos. And when we looked at one of the vertical video resolutions, it came out at 576 by 832.

Now, diving through some of the technical aspects of this release, I came across this very interesting section which described how the model can also take an existing video and extend it or fill in missing frames, and that is going to give us a whole host of other possibilities. First of all, we can give it two images and tell it to create a video that connects these two different scenes. So you could take an image of a man on one side of a room and then show it another image of a man on the other side of the room, and it will animate between these
two frames. Now, this will give us a lot of possibility for creative exploration. The other possibility of this is obviously to vastly slow down any video you give it, so it can create missing frames in videos, allowing us to create ultra-rich slow-motion videos of any existing video. And the other interesting possibility coming from this model is the ability to extend your videos, so you can give it a short clip, even of your own videos, and it will add more frames and make it longer. Now, this can be great if you're working on a project and you are trying to fit the right clips into a certain scene.

It's very interesting that Sora has also released the prompts used to create each of the examples it shows on its release. Here is one of the example prompts that it shows. Now, it's very interesting to see that this fits in with the approach of natural language. However, from my experience with prompting, it seems that there are quite a lot of redundant phrases inside of this prompt, and I would love to get my hands on this and really give it a much clearer set of directions, because the description says "the use of warm colors and dramatic lighting further enhances the cozy atmosphere of the image". It's quite verbose, it's quite expansive. It adds this sentence, "further enhances the cozy atmosphere of the image". This is a very subjective description of what you might get out from the video rather than being a specific, articulate prompt, which is what I would imagine would work better with this system.

And to give you a little overview of how I would break down prompting on any AI video generator, and I specifically think that this would help on this version of Sora, it is to think about it in terms of three elements: subject, style and motion. The subject is the content, the objects, the characters of your piece. The style is how you want it to look: the medium, the lighting, the composition, the colors. And finally, the motion is: how is the camera moving and
how is the subject moving, how is the character moving? And if you break it down into these variables, you can much more accurately, articulately and comprehensively take control of how you organize and define your work.

So because Sora has released these videos and also the prompts that have been used to create the videos, I've been able to take these and put them into the best-in-class video generators available right now. I've done this so that we can compare exactly how much of a step up this is. So I took a still image from this monster scene from the Sora model and used it as an image-to-video input inside of PixVerse, which is what I think is one of the best video generators available out there right now. So this is the Sora version, and you can see there is wonderful detail on the strands of hair, there is consistency with the fingers, and I even admire how the wax on the candle starts to melt down. It's not an exact replication of how you would imagine wax to move, but it does at least understand that a candle dissolves. Is that the right word? Melts is better, I suppose. And now let me show you the PixVerse version, and you can see here the contrast is absolutely incredible. So first of all, you can see that we've lost a lot of detail in the hair; it's suddenly become a mass of fur mangled together. You can also see that the coherence in the hand has been lost, that it's impossible to see exactly how many fingers this individual has. And it's a really incredible improvement.

Let's take a look at another example. Here we have the peacock from Sora, and we compare that to taking the image, putting it into PixVerse and animating it, and you can see that PixVerse does a lot better on this example. However, the video is only 4 seconds long, and essentially the peacock barely moves; you get a slight bit of fluttering in the feathers. Whereas here in the Sora version, the peacock is showing its whole body, it is rotating, and the video is much longer. It's also able to render and understand the
different sides of the animal. You can see here we've got the side view and a very consistent and accurate view of the front of this beautiful little peacock.

So we'll also take a look at this human. This is the PixVerse version, and you can see here his facial features start to float a little bit; it's almost like his head is made of Jell-O or a rubbery substance, and his fingers seem as if they are printed on the book rather than being actual real human hands. Whereas in the Sora version, you can see that there is a lot more consistency in the man's face. His eye is not floating around, and I would also say that the quality of lighting is much brighter and more inviting. You can see it's quite dull and dead compared to here, where there is a real vividness, a drama and a lightness to the rendering.

Now, I also went ahead and put the prompt directly into PixVerse to see what we get out. You can see that I put in the prompt from the animated Pixar guy, which is the very furry little dude here, and once I put that into PixVerse, I got out this quite horrifying rendering of a very frightening candle. One thing that you can do right now with AI video generators is actually generate your images in a specific AI art tool and then take them into an AI video tool to animate them, and by separating the process like this you can get much better results. I did this just to prove that the technology is still feasible right now: I created this little guy inside of Midjourney and then animated it in PixVerse, and you can see we get something a lot closer to the example. But I would still say that this example from Sora blows the one that I've made out of the water. Kaboom.

So I have got some videos on my channel about some of the existing AI art generators that you can get your hands on, including PixVerse, which is entirely free right now, and Leonardo animation, and I recommend checking those out.

Some of you might be wondering: what does Midjourney
think about all of this? Well, Midjourney has revealed that it is working on its own AI video tool, and because I believe that Midjourney currently offers the best quality and experience for generating AI art, I would put a fair amount of money on them creating an effective, coherent and class-leading tool for creating AI video. There is a weekly discussion on Discord from the team at Midjourney, and this is where they talk very candidly about the future of Midjourney, and in one of their recent discussions they said training for new video models is to commence, which could potentially be ready in a few months. So we're expecting this to come out shortly as well.

So there is a huge amount of excitement around the AI video space, as OpenAI are releasing a revolutionary model and we're going to see other AI companies also releasing competing models of a similar if not better quality. Hold on to your hats, ladies and gentlemen, because things are about to get wild. We're going on another roller coaster ride of rapid AI technological advancement, and if you're new here, then why not subscribe, because I will be keeping you up to date with all the latest developments.

Inside of their release, they have noted that "our text classifier will reject text input prompts that request extreme violence, sexual content, hateful imagery, celebrity likeness or the IP of others", and essentially it will follow the exact guidelines that are set out to users of its DALL·E 3 model. This means there are a lot of fun things that you won't be able to do with the OpenAI tool: essentially anything related to violence, sensual content, anything suggestive or hateful, or using any celebrities or the inspiration of other artists. You cannot say "in the style of Leonardo da Vinci, Pablo Picasso or Samson Vowles". And it goes on to discuss its other limitations, which I have also noticed: that it may struggle with accurately simulating the physics of a complex scene, and with cause and effect. For example, a person might
take a bite out of a cookie, but afterward the cookie might not have a bite mark. So essentially the limitations of Sora are that you can't have any fun, people will probably not have the right number of fingers, and it doesn't understand logic. You can see this in some of the examples here. If you look closely at this little guy, he starts off with four fingers and then suddenly a fifth one appears out of nowhere. He just grows a fifth finger, as you do. It's pretty handy, actually. Whoa. And of course, as we mentioned, the self-generating baby pups.

So as for when we get our little hands on this magical piece of kit, well, that has not been revealed. They have alluded to the fact that they will be developing a product and releasing it to the general public like us, but before that, they say, "we'll be taking several important safety steps ahead of making Sora available in OpenAI's products". One of the things they will be looking into is adding C2PA metadata to any video created with Sora. What this does is prove that this was made with an AI generator, and the idea is that this will help with the provenance discussions of digital media on the internet, because obviously with AI video of this quality it's very easy to generate misleading videos that can influence people's decisions.

As we have seen in recent months, there are a number of examples. Firstly, in Pakistan, one of the leading opposition politicians is imprisoned, and he is currently being represented by an AI video version of himself. We've also seen videos of the Ukrainian president Zelensky appearing to ask his troops to put down their weapons, when this was an AI fake. And even in Turkey, there was a sexually explicit video released of the opposing party's leader that was suggested to have been created with AI. All of these are leading us to some quite challenging situations, and as the technology evolves so rapidly, it's only going to get more complicated. But I'm very excited because of the
possibilities that this is opening up, which means that we can be creating our own videos, our own stock images, our own stories. We can be generating and creating entire worlds with video. Before, you would need a multi-million dollar studio to create a Pixar-like video, and now it's possible that you can do it on your own in your bedroom with just a computer.

Beyond this, the possibilities for leveraging this technology to make money are immense. With every technology come new opportunities to make money, and I have a video exploring some of the different ways that you can take AI video and start creating your own income from it. Now, one way people are making money already with AI is with AI influencers, and this is going to add a whole new realm of possibility for people creating digital identities online, in that they can now create realistic, consistent videos that are attributed to those identities. And of course there is the possibility to create sensual, suggestive or intimate videos down the line. Obviously with this model they will be refraining from creating sensual or explicit imagery, but obviously there are going to be competitors who come up that will provide that service. And the key here is that we've got a proof of concept. It will not take long for other competitors to release models that are of a similar quality, and that is going to be very exciting.

I'd also like to quickly mention my latest course that I've released, and it's about designing and building fonts with AI. I developed a process that allows you to leverage generative AI to generate real, usable fonts, and this is a very interesting use case for creating a very usable product. In the course we go through the whole process of how to design your own fonts and also how to sell them effectively online. I've been selling digital products to designers and creatives for the last seven years, and in this course I teach you everything I've learned. So if you're interested in a fun
project where you will learn something and create a product that has the possibility of earning you a side income, check out the course, where I've got a little discount code for everyone who watches this video and made it to the end. Congratulations for making it to the end; this is your reward: a discount code to get the lowest price possible for the course, in the description below.

And if you'd like to do me a little favor, I'm very curious: what did you find most interesting in this video? If you learned something, or you particularly enjoyed a section of this video, do let me know in the comments. It certainly helps me understand how I can make these videos better and also where I can improve, because I'm really looking to grow and become a better man. Most of all, thank you very much for making it all the way to the end. Thank you for watching; it's a delight to have you here. I look forward to seeing you in the next video. There will even be a little video here that you can watch if you want to continue down the AI rabbit hole with me. Anyway, thank you for watching, and have a delightful day.
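
The subject, style and motion breakdown described in the captions above can be sketched as a small helper that assembles the three elements into one prompt string. This is only an illustration under stated assumptions: the `VideoPrompt` class, its `to_text` method and the comma-joined output format are hypothetical names and choices made for this sketch, and nothing here calls Sora, PixVerse or any other real API.

```python
from dataclasses import dataclass


@dataclass
class VideoPrompt:
    """Hypothetical container for the three prompt elements named in the video."""

    subject: str  # the content: objects and characters of the piece
    style: str    # the look: medium, lighting, composition, colors
    motion: str   # how the camera and the subject move

    def to_text(self) -> str:
        # Join the non-empty elements into one comma-separated prompt string.
        # The comma-joined layout is an assumption, not a documented format.
        parts = [self.subject, self.style, self.motion]
        return ", ".join(p.strip() for p in parts if p.strip())


prompt = VideoPrompt(
    subject="several giant woolly mammoths in a snowy meadow",
    style="photorealistic, mid-afternoon light, shallow depth of field",
    motion="low camera tracking backwards as the mammoths walk forward",
)
print(prompt.to_text())
```

Keeping the three elements as separate fields makes it easy to change one of them, for example the camera motion, while holding the subject and style fixed when comparing outputs between generators.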

Info

Channel: Samson - Delightful Design

Views: 7,685

Keywords: ai, samson vowles, Ai video, Sora, open ai, ai news, ai video news, video, best ai video, ai video generator, ai video, text to video, artificial intelligence, stability ai, ai art, ai generated video

Id: c0wao4uJG6s

Length: 25min 44sec (1544 seconds)

Published: Sun Feb 18 2024