This new AI system has looked at 20 million
images. So what has it learned? It learned to do something meaningful with
these unusable, super pixelated images. Dear Fellow Scholars, this is Two Minute Papers
with Dr. Károly Zsolnai-Fehér. So, the goal here would be that a pixelated
image goes in, and out comes what? Well, out comes an image with more details. This is what we call super resolution and
this is what a previous technique was capable of. The result is fair, it is a little less blurry, but I don’t feel that it has added too much detail to the image. But a newer technique from 2023, from less than a year ago, whoa. Now we’re talking. Tons of detail just appeared! The feathers look great, but the eye. My goodness, the eye is a bit creepy. Not a fan. And now, let’s see if the new technique
can do even better…and…oh my, look at that! That looks fantastic! If you thought the feathers were good before,
look at them now, and finally, the eyes are completely different, and much more realistic. And if we run the same test for a person, previous techniques aren’t even able to recognize that this should be a person. But the new technique. That is incredible.
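(For the technically curious, here is a minimal Python sketch of what the task itself looks like, not the paper's method. The file names and the 4x factor are placeholder assumptions, and the bicubic call stands in for the kind of classical upscaling that stays blurry, while a learned model would be dropped in its place.)

```python
from PIL import Image  # Pillow

# Hypothetical input file: a tiny, heavily pixelated photo.
low_res = Image.open("pixelated_bird.png")

# The classical route: bicubic interpolation. The output is 4x larger but
# still blurry, because interpolation cannot invent detail that is not in
# the input; this is roughly the "previous technique" level of result.
upscaled = low_res.resize((low_res.width * 4, low_res.height * 4),
                          resample=Image.Resampling.BICUBIC)
upscaled.save("bicubic_4x.png")

# A learned super resolution model would replace the resize call above and
# hallucinate plausible high-frequency detail (feathers, hair strands)
# instead of merely smoothing between existing pixels.
```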
So, what other incredible things is it capable of? Well, let’s see together through 6 examples. One. Have a look at this car. We have an unusably pixelated image here of a car, and we know the technique is good, so it’s going to get the car right, I am sure, but
are you thinking what I am thinking? Because I am thinking license plate. Can it do anything with the license plate? Not a chance, right? Let’s see…and…whoa! Look, we got something. Now I will immediately say that there is so
little information in the input image that this new data might very well be completely
made up. If not, bravo! But even if it is, I love the fact that the
AI knows that this is roughly the kind of information that belongs there. Really cool. Two, how about trying to remix old video game graphics? That works too. And if we could do that in real time, we might not even need to store all of these detailed textures, just small, coarse ones, if we know that we can super resolve them into something way better on the fly. Three, landscapes. It not only gives us the clouds up there,
but also the trees in the background in considerable detail. And if it can create so many new details,
I am thinking about just drawing up a really crude draft of an image and having the super resolution AI do the heavy lifting. That would be so good. Also, not far away at all. Four, animals. Just look at that. In the area here, we get a really sharp image,
we can see the fur almost down to the individual strands, and in the background, we even have
a subtle depth of field effect as it goes out of focus. Fantastic. Five, human faces. Now this one puts on a clinic on how to do this. Wow. Here, it is even better: we get a result down to the level of individual hair strands. Some of them in the foreground look a little
plasticky, and the teeth are not great either, but everything else. Unbelievable. Just think about an internet where images and videos are compressed down to take a hundred times less space, because every device will soon be capable of performing this kind of super resolution. And six, vintage photos and movies. This one has tons of those signature compression artifacts, and the image is also probably zoomed in. Almost unusable. I don’t think I would want to watch this
video. And now, hold on to your papers, Fellow Scholars,
and look at this. Holy mother of papers! I am just looking and looking, and I can’t
believe that this is possible. Interestingly, the paper discusses that the
AI has also learned from 100k negative prompts. What does that mean? This means that it was given counterexamples, badly done images, to teach it not only what to do, but also what not to do.
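(A quick, hedged sketch for the technically curious: the paper's exact training setup is not shown in this video, but negative prompts commonly act at sampling time through classifier-free guidance, where the model's prediction is pushed away from the "what not to do" condition. Everything below, the denoiser stand-in, the embeddings, and the guidance scale, is a placeholder for illustration, not the paper's actual components.)

```python
import torch

def guided_prediction(denoiser, latent, t, positive_emb, negative_emb, guidance_scale=7.5):
    # Two passes: one conditioned on the desired description ("what to do"),
    # one on the negative description ("what not to do").
    eps_pos = denoiser(latent, t, positive_emb)
    eps_neg = denoiser(latent, t, negative_emb)
    # Move away from the negative prediction, toward the positive one.
    return eps_neg + guidance_scale * (eps_pos - eps_neg)

# Toy stand-in denoiser so the sketch runs end to end (not a real network).
toy_denoiser = lambda x, t, emb: 0.1 * x + emb.mean()

latent = torch.randn(1, 4, 64, 64)      # noisy image latent
positive = torch.randn(77, 768)         # e.g. an embedding of "sharp, detailed photo"
negative = torch.randn(77, 768)         # e.g. an embedding of "blurry, low quality"
eps = guided_prediction(toy_denoiser, latent, t=10,
                        positive_emb=positive, negative_emb=negative)
print(eps.shape)  # torch.Size([1, 4, 64, 64])
```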
And this is a good paper, so they also tested how much of a difference this makes. So let’s see. Goodness, it makes a huge difference. But we are not done yet, because we haven’t
even talked about my favorite feature. Text prompts. Wait, what can text prompts possibly do here? We already have an image, just repaint the
image, right? Why do we need the text? You see, this is great: if we invest an additional minute to describe what is supposed to be in the scene, it creates these absolutely incredible results.
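(If you want to try text-guided upscaling yourself: the system in this video is not packaged in this exact form, but the openly available Stable Diffusion x4 upscaler in the diffusers library works on the same principle. The prompts and the input file below are illustrative assumptions, not the paper's own settings.)

```python
import torch
from PIL import Image
from diffusers import StableDiffusionUpscalePipeline

# Openly available 4x text-guided upscaler (not the system shown in the video).
pipe = StableDiffusionUpscalePipeline.from_pretrained(
    "stabilityai/stable-diffusion-x4-upscaler", torch_dtype=torch.float16
).to("cuda")

low_res = Image.open("pixelated_portrait.png").convert("RGB")  # hypothetical input

# The text prompt describes what is supposed to be in the scene;
# the negative prompt lists what the result should avoid.
upscaled = pipe(
    prompt="a sharp portrait photo, detailed hair, natural skin",
    negative_prompt="blurry, low quality, compression artifacts",
    image=low_res,
).images[0]

upscaled.save("upscaled_4x.png")
```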
Or we can tell it that, look, this should be a bicycle, and now it is. But it gets better. Two, we can even use the text prompts to bend
reality itself. Yes, in this case, we can even specify that
the woman’s hat should be made out of suede or denim. That’s good. And now I wonder if we can push this further? Oh yes! Yes we can! We can easily add a mustache, and I can’t tell that it has been added, I don’t think anyone can. Or we can even choose whether our subject should be resolved into a young or an old man. How cool is that? Bending reality. What a time to be alive! And, good news, an online demo of it is either coming soon or may already be available by the time this video appears. So, hopefully, let the experiments begin!