Google’s AI: This Should Be Impossible!

Video Statistics and Information

Captions
This idea sounds like science fiction, but by the end of the video, you will see that this makes perfect sense. So, what is it? Well, consider image inpainting. It is amazing. What can it do? Well, when we cut out a part of an image, then, bam! It fills in the void with plausible information. This is image inpainting. And now it works on video too. But it gets even crazier. Image outpainting also works. Whoa! What is that? Well, we can essentially extend the image in any direction and once again, fill it in with plausible data.

Now please note the choice of words here: in both cases I said it fills it in with plausible data. Data that could be there. But synthetic data nonetheless. Now here is an insane idea from Google’s researchers. What if we would take these, and fill them in not with information that could have been there, but with information that was actually there. Filling in with reality. That is of course, impossible, right? Well, look. Oh yes. It seems like it is impossible.

If we try to complete this image with previous techniques, for instance, Stable Diffusion, we get something that is plausible, you know, the hat continues, the Post-its also continue, that is good, but still it is likely not the real thing. So, can I get the real thing?

Well, let’s think together. What if we are trying to outpaint a historical building? Wait a minute, that is the key! If we try to fill in information for something that we have other photos for, it might be possible. Let’s give it a try. This is the incomplete input photo, and here are our other photos. Now note that this is still quite hard. We can’t just copy it. The angles are different, the lighting is different, lens distortion is really different. But, in the age of AI, let’s see if it can be done. And…oh wow! Look at that! Perfection.

And it can do it for a variety of scenes over and over again. It appears to work pretty much everywhere.
Well, it does not work everywhere. I’ll tell you about it in a moment. But, all this is absolutely amazing.

But still, wait. How do we know how real these photos are if there is nothing to compare to? Well, let’s make sure that there is something to compare to. Let’s take a real photo, cut off the top, and now we know exactly what should be there. Stable Diffusion does not know. Paint by Example, a paper from almost exactly a year ago, does not know at all. But the new technique called RealFill, this one knows. Look. That is incredible. Almost pixel perfect reconstruction. My goodness. What a time to be alive! Now note that this is not a copying machine, it has access to information about the room, but it has to understand which part is missing, and what that part would look like from this angle. So it fills in reality after all. And it does it over and over again with breathtaking accuracy.

Now, I noted that it is still not perfect. I mean, all of these look nearly perfect. So where are the issues?

Ah. Of course. Text. It’s always the text. Every time. We finally left behind the age of AI systems generating mangled, incorrect hands, mostly, but text is still a challenge. I am fairly sure that this is something that will be possible just one more paper down the line. And can you imagine what will be possible two more papers down the line? My goodness. We can already do a pretty good reconstruction from just one image. Not even a set of images. One image. This is supposed to be a failure case. If this is a failure case, bravo, sign me up right now!

So, adding a little more information to the AI by reusing already existing images. That was the crazy idea, which in hindsight, makes perfect sense. What a brilliant paper. Loving it! And one more thing. I have a little daughter and when she was a baby, we could not really afford a good smartphone to take better images of her.
However, there are a lot of pictures, and I was thinking that over my lifetime, there will surely be an AI that will be able to upscale those not great images to a higher resolution version. And it should not just fill in things that could be, but with things that are really there. And finally, we are here. I can’t believe it. And all it needs is one to three photos. And as a family, we have thousands of photos of ourselves to learn from. So good.

This was Two Minute Papers with Dr. Károly Zsolnai-Fehér. Subscribe if you wish to see more.
Info
Channel: Two Minute Papers
Views: 213,086
Keywords: ai, google, google ai
Id: bD_HyxHMHPo
Length: 6min 5sec (365 seconds)
Published: Thu Oct 19 2023