Google's NEW AI 'Dreamix' Takes the Industry By STORM! (NOW UNVEILED!)

Captions
Google is once again showing why it's one step ahead of all major competition when it comes to certain AI features. Recently, Google's AI team released a research paper documenting their step into text-to-video, and honestly the results are completely mind-blowing. Take a look at this.

So you can see that this is Google's new text-to-video. Essentially, what we have here is the input video and then the generated video. If we read the description, it pretty much describes exactly what's going on, but I do need to provide some more context. It says: "Given a video and a text prompt, Dreamix edits the video while maintaining fidelity to color, posture, object size and camera pose, resulting in a temporally consistent video." So here it turns the monkey, which you can see on the left, into a dancing bear, which you can see on the right, given the prompt "a bear dancing and jumping to upbeat music, moving his whole body."

So essentially this is text-to-video. Now, you might be thinking this isn't basic text-to-video, because most basic text-to-video is where you simply enter a text prompt and get a video out. But this is something that many models have actually struggled with, and I'm going to show you more examples from Google's Dreamix as to why it's really good. I'm also going to be comparing it to Runway's Gen-2, another text-to-video tool that has gained some popularity recently.

So let's take a look at some more text-to-video examples from Google, using the different models they have in this software. You can see right here another example: Dreamix can generate videos based on image and text inputs, so it can actually instill motion into a static image. This is very different from plain text-to-video because it allows another level of manipulation. You can see right here that this is just an image, but they've changed it with the prompt "an underwater shot of a sea turtle with a shark approaching from behind."

This is definitely really good, because everybody knows about Midjourney and how good it is at image generation. Imagine you could take your image from Midjourney, put it into Google's Dreamix, and simply say "a futuristic landscape, cinematic shots." I'm pretty sure that with just images combined with this tool we could be generating some full-scale movies, which would be really cool. That's why I say this is far more advanced than people think. Google, as you all know, is working on Bard and many other projects such as PaLM-E, but this is definitely going to change the game once it gets integrated. I do hope they release it soon, because this next example and the further ones in the video are truly breathtaking.

This one is pretty cool. It says: "Given a small collection of images showing the same subject," which here is the same Lego toy character, "Dreamix can generate new videos with the subject in motion." In this example, given a small number of images of the toy fireman, Dreamix is able to extract the visual features and then animate it to lift weights while maintaining fidelity and temporal consistency, which just means that it looks normal from frame to frame and not strange at all. This is really good because it can then be applied to many other things. For example, maybe you want to generate something moving with a Midjourney character, or maybe you want to pre-visualize something. There are honestly many different applications, but I think the toy fireman lifting weights is really good, because the generated video looks pretty realistic compared to some of the other AI text-to-video platforms that exist right now.

Now what I want to do quickly is compare this to something that was recently released, which is Runway's Gen-2. If you don't know what Runway's Gen-1 and Gen-2 are, essentially this is text-to-video. Gen-1 was where you had a video and then a driving image, which you are seeing on screen, and the driving image essentially renders that video in a similar style to that image. You can see right here that one's Lego and that one's some kind of fire thing. Gen-1 has a massive Discord, which is what they were all talking about. It's something that does very well at what it is capable of, but I do think that in the instance we're looking at, Google's text-to-video does a lot better.

But you can see that here with Gen-2 what we have is many different outputs that don't need a driving image, and I think that if this company is able to simply focus on this, it's going to be at Midjourney level soon, where you're going to be able to get complete footage from a single prompt. That is what you can see right here: an apartment, an extreme close-up of an eye, and these are generated without driving images. Driving images are essentially images which prompt the scene to have a certain specific style.

For example, let's take a look at this right here. You can see that this is the driving image, and this was given the prompt "a low-angle shot of a man walking down a street illuminated by the neon signs of the bars around him." What's good about this is that if you want to prompt this in a certain way, or you want a certain style, you can use that image in order to craft that story, which does make this in some aspects much more effective.

But like I said, there are some other examples that I want to show you from Google Dreamix in the video editing aspect. You can see right here the input video is of them cooking some onions, and then they added the text prompt "stirring noodles in a pot," and you can see right here that they are stirring noodles in a pot. When it comes to the video editing aspect, these examples that I'm showing you right now showcase how powerful this new software is. You can also see that it says "moving through a field on a wooden path with fire on all sides," and you can see that, from the input video, the generated video is honestly really good. I'm not taking shots at Runway here. I think they've done something absolutely insane, but this does look like it is a mere step ahead of Runway, so it'll be interesting to see where the software goes.

You can also see that the input video produces some outstanding results in the final generated video in this example, where "an old pickup truck is carrying wood logs" definitely looks really realistic, and it shows multiple applications where this stuff can be used. You can also see right here that it says "a small brown dog and a large white dog are rolling a soccer ball on the kitchen floor," and you can see that's generated from the prompt about these two animals. It just goes to show that even with an input video you can quickly edit stuff in real time with this software, and I just wonder how much this is going to develop.

This one right here is another great example of how you can use natural language to increase a video's capabilities by just having these things on screen, and honestly this is truly great because it shows us what kinds of examples we're going to be getting in the future when this stuff is fine-tuned. This one right here is really cool as well: it's "a beach with palm trees and swans in the water," and comparing the input video to the generated video, you can see that it does look pretty realistic, definitely one of the more realistic ones. Some of them may have some small artifacts, but you have to remember this is new software.

You can see right here that this one even manages to do the water reflections really well. We can see that "an orangutan with orange hair bathing in a beautiful bathroom" does look pretty realistic compared with what I would expect if I saw an orangutan bathing. You can also see right here "a deer rolling on a skateboard." This one isn't as accurate, but it's still very interesting to see how the models put certain pieces together and what the final output is. We can see that the input video is there, and the output video definitely looks interesting.

Now we need to take a look at image-to-video, because this is the section which is even more crazy. Take a look at this: the input image is on the left, the generated video is on the right-hand side, and the text prompt is "a camel walking in the sand dunes." Honestly, this looks pretty much perfect. I mean, it doesn't look like the highest quality, but it definitely looks great. We have another input image right here: you can see that we just have something that looks like a Christmas tree, and then we have "Bigfoot walking in the snowstorm." Honestly, these examples that I'm about to show you are truly incredible. They do look as realistic as possible, and it shows that the video generated from their input image looks about as realistic as you might expect at this early stage.

Now we have the input image and the generated video of "the emperor penguins returning to their home," and you can see that this one definitely looks pretty accurate, which is pretty nice. We have one here which showcases how you can actually zoom out on certain images and create different landscapes and different perspectives. It's really interesting to see the dynamic fluidity of these models and just how accurate they are at depicting exactly what we want with the text prompt and the input image. You can also see right here "a unicorn running in the foggy forest, zooming out." That definitely does look really realistic and really accurate.

I only have one question right now, which is: when is Google going to release this to the public? This is better than a lot of things we have seen on the internet, and if they release this soon, I'm pretty sure it is going to take the entire industry by storm, because something like this isn't currently available, especially as accurate as this is. Just take a look at this one right here: "a grizzly bear walking around the lake, large birds flying around in the sky." We can see that grizzly bear moving in the environment, and it definitely looks real. When it comes to generated videos like this, it's truly incredible.

This one right here is one of my favorites. It's "a time lapse of black bean plants sprouting," so you can see right here that if you wanted to generate a time-lapse kind of video, it even gets the little shakes that plants make, which you usually see in those kinds of videos. It just goes to show that the models being developed today are definitely going to be better a year from now, two years from now. So you can imagine what kinds of AI we are going to have perhaps 10 years from now, 20 years from now. Are we going to be living in a completely different world where content is just completely automated? I guess we're going to have to see.

One of the last ones that I wanted to talk about here is how input images can generate a much higher quality video. As you can see here, "a bear walking" combined with these input images honestly provides us with some of the best results.

Let me know what you all think about Google Dreamix. Is it something that is really good, or is it something that you don't think is that impressive? I truly think this is impressive, and honestly the next few years in AI video generation are going to be truly incredible.
Info
Channel: TheAIGRID
Views: 490,935
Id: i5sO-BOFVoo
Length: 10min 11sec (611 seconds)
Published: Sun Apr 16 2023