You hear it everywhere. DLSS this, DLSS that. But what is it? And why is it doing the impossible? This AI-based technology is being talked about in the media a great deal, and it is supposedly a way of dramatically speeding up computer games and virtual worlds. Does this really work? Well, when Jensen Huang, NVIDIA’s CEO, said that soon every single pixel on your screen for these games will not be rendered, but generated, presumably by an AI, that really got my attention. That sounds insane, and if hearing this you think
that this cannot possibly be true, I don’t blame you for a second. We can’t just create all this footage; there has to be a proper system that can compute reflections, diffuse interactions, and all that. Right? Well, not quite. In 2018, I had a little project where I wanted to generate photorealistic images of material models quickly. Now, generating an image through ray tracing took 40 to 60 seconds, and that was a bit too long for my taste.
So I tried to write a neural renderer that could take just the description of a material model and immediately, in real time, synthesize what it would look like. And it was able to do all this not in 60 seconds, but in 3 milliseconds. Yes, it was 20,000 times faster.
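If you are curious what a neural renderer like that can look like in code, here is a tiny sketch. Note that everything here, the layer sizes, the parameter count, even the idea of using a plain MLP decoder, is a made-up stand-in for illustration; the actual architecture from that project is not described in this video.

```python
# A toy neural renderer: a material description vector goes in,
# an image comes out in a single forward pass, no ray tracing.
# All names and sizes here are illustrative assumptions.
import torch
import torch.nn as nn

class ToyNeuralRenderer(nn.Module):
    def __init__(self, num_material_params=10, image_size=64):
        super().__init__()
        self.image_size = image_size
        self.decoder = nn.Sequential(
            nn.Linear(num_material_params, 512), nn.ReLU(),
            nn.Linear(512, 512), nn.ReLU(),
            nn.Linear(512, image_size * image_size * 3),
            nn.Sigmoid(),  # keep pixel values in [0, 1]
        )

    def forward(self, material_params):
        pixels = self.decoder(material_params)
        return pixels.view(-1, 3, self.image_size, self.image_size)

renderer = ToyNeuralRenderer()
material = torch.rand(1, 10)   # one hypothetical material description
image = renderer(material)     # milliseconds instead of a full render
print(image.shape)             # torch.Size([1, 3, 64, 64])
```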
However, it was limited to this very scene. So, how did this research field improve in the last
5 years? Do we have anything more usable now? Oh boy, if I could tell you. Well, meet DLSS. Deep
Learning Super Sampling. This is a system where hardware and software work together to
create an incredible experience where games and virtual worlds really run faster
than what should be possible. So, how? Well, as of DLSS version 3, get this, by generating 7 of every 8 pixels that appear on the screen. Yes, more than 85% of the pixels are just generated, not computed by, for instance, ray tracing or other traditional techniques.
That sounds flat out impossible. But it is possible. The first step is running super resolution in real time. This means that a coarse image goes in, and an AI works out which details would be missing if we pretended that this were a fine image instead, and then it synthesizes those details.
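To make the super resolution step a bit more concrete, here is a minimal sketch of the idea in PyTorch. DLSS’s real network is proprietary and far more sophisticated, so treat this only as the general recipe: upscale the coarse frame, then let a learned model add the missing detail.

```python
# Toy super resolution: upscale a coarse frame, then predict the
# missing detail as a residual. This is the general idea only; the
# actual DLSS model and its inputs are not public.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToySuperResolution(nn.Module):
    def __init__(self, scale=2):
        super().__init__()
        self.scale = scale
        # A tiny network that guesses the details the coarse frame lacks.
        self.detail = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1),
        )

    def forward(self, coarse):
        # Pretend the coarse image is a fine image...
        upscaled = F.interpolate(coarse, scale_factor=self.scale,
                                 mode="bilinear", align_corners=False)
        # ...and synthesize the details that are missing.
        return upscaled + self.detail(upscaled)

model = ToySuperResolution()
coarse = torch.rand(1, 3, 540, 960)  # e.g., a 960x540 render
fine = model(coarse)                 # a 1920x1080 output frame
print(fine.shape)                    # torch.Size([1, 3, 1080, 1920])
```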
Second, optical flow. We can look at two adjacent frames of the video game and try to estimate what has happened between them, and where things are going. With this, entire intermediate frames can be synthesized, creating the illusion that the game is running more smoothly than our hardware can actually run it.
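Here is a small sketch of how an intermediate frame can be synthesized once the motion between two frames is known. I am assuming the flow field has already been estimated (DLSS 3 uses dedicated optical flow hardware for that part), and this crude one-direction warp ignores the occlusions that a real system has to handle.

```python
# Toy frame interpolation: warp frame A halfway along the estimated
# optical flow to synthesize the in-between frame. Real systems warp
# in both directions and handle occlusions; this is only the idea.
import torch
import torch.nn.functional as F

def synthesize_middle_frame(frame_a, flow):
    # frame_a: (1, 3, H, W); flow: (1, 2, H, W) pixel offsets from A to B.
    _, _, h, w = frame_a.shape
    ys, xs = torch.meshgrid(torch.arange(h, dtype=torch.float32),
                            torch.arange(w, dtype=torch.float32),
                            indexing="ij")
    # Move each pixel half of the way toward where it goes in frame B.
    x_mid = xs + 0.5 * flow[:, 0]
    y_mid = ys + 0.5 * flow[:, 1]
    # Normalize coordinates to [-1, 1], as grid_sample expects.
    grid = torch.stack([2 * x_mid / (w - 1) - 1,
                        2 * y_mid / (h - 1) - 1], dim=-1)
    return F.grid_sample(frame_a, grid, align_corners=True)

frame_a = torch.rand(1, 3, 270, 480)
flow = torch.zeros(1, 2, 270, 480)       # stand-in for the estimated motion
middle = synthesize_middle_frame(frame_a, flow)
print(middle.shape)                      # torch.Size([1, 3, 270, 480])
```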
Combining the two, starting from a small pool of pixels that are actually computed, both image quality and smoothness can be improved at the same time. What a time to be alive!
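If you are wondering where that 7 out of every 8 pixels figure can come from, here is a quick back-of-the-envelope check. I am assuming performance-mode numbers here, rendering at half resolution in each dimension and generating every other frame entirely; the exact ratio depends on the quality setting.

```python
# Back-of-the-envelope for "7 of every 8 pixels are generated",
# assuming half-resolution rendering plus frame generation.
super_resolution = (1 / 2) * (1 / 2)  # render 1 in 4 pixels per frame
frame_generation = 1 / 2              # fully generate every other frame
computed = super_resolution * frame_generation
print(computed)        # 0.125 -> only 1 in 8 pixels is computed
print(1 - computed)    # 0.875 -> more than 85% are generated
```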
But it does not stop there. It gets better! They have also announced DLSS 3.5, where, hopefully, all of this can be improved even further through something they call ray reconstruction. Oh my goodness, this is going to be so good. But, how? Well, normally, when we perform ray tracing, we
simulate the path of millions and millions of rays, and bounce them around in a scene.
And if we can’t afford to wait hours at a time for an image, we will have to settle for a really noisy image. Like this.
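To see why fewer rays mean more noise, here is a tiny Monte Carlo sketch for a single pixel. The "rays" here are just random samples of the pixel's true brightness, but they show the key property: the error only shrinks with roughly the square root of the sample count, which is why clean images take so long.

```python
# Why few rays give a noisy image: a Monte Carlo estimate of one
# pixel's brightness. Error shrinks roughly as 1/sqrt(samples).
import random

def estimate_pixel(samples, true_value=0.5):
    # Each "ray" returns a noisy observation around the true brightness.
    total = sum(random.uniform(0.0, 2 * true_value) for _ in range(samples))
    return total / samples

random.seed(1)
for samples in (1, 16, 256, 4096):
    print(f"{samples:5d} rays -> estimate {estimate_pixel(samples):.3f} "
          f"(true 0.500)")
```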
Denoising techniques exist that try to guess what the missing information could be, but they are not perfect. Not even close. Some important details between objects can be missed entirely, and reflections can still be a lot less clear than in a full simulation. And unfortunately, we have a problem: it gets even worse.
This image will undergo an upscaling step, where these errors get magnified even more.
Denoising and upscaling are two separate steps typically done by two separate models, so
why not do both of them with the same AI model? And now, ray reconstruction enters the scene.
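Here is a structural sketch of that difference. The models below are made-up stand-ins, since ray reconstruction's actual architecture is not public; the point is only that a single network trained end to end can avoid the upscaler magnifying the denoiser's mistakes.

```python
# Two separate models versus one combined model, as a structural sketch.
# Everything here is an illustrative stand-in, not NVIDIA's pipeline.
import torch
import torch.nn as nn
import torch.nn.functional as F

def separate_pipeline(noisy, denoiser, upscaler):
    clean = denoiser(noisy)   # step 1: guess the missing information
    return upscaler(clean)    # step 2: upscale it, mistakes included

class CombinedDenoiseUpscale(nn.Module):
    # One model that denoises and upscales in the same step, so it can
    # be trained end to end on the final, upscaled result.
    def __init__(self, scale=2):
        super().__init__()
        self.scale = scale
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1),
        )

    def forward(self, noisy):
        up = F.interpolate(noisy, scale_factor=self.scale, mode="bilinear",
                           align_corners=False)
        return up + self.net(up)  # detail synthesis and denoising together

model = CombinedDenoiseUpscale()
noisy = torch.rand(1, 3, 540, 960)
print(model(noisy).shape)  # torch.Size([1, 3, 1080, 1920])
```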
Ray reconstruction learned from a ton more information than the previous models: it was given 5 times more training data. And look at this step. This one is right on the money. This is tailored to retain high-frequency information, those fine details, to make sure that before the upscaling step, we have the highest-quality information available. So, is it better than
what denoisers did before? Oh my goodness, look at that! We finally get back that shadowy region that was previously filled in with incorrect information by the denoisers. Yummy! So good! And that high-frequency detail retention can
also be witnessed here. Loving it. Now note that, ideally, we would be talking about a peer-reviewed research paper where we can see all the weak points and failure cases; that is my home turf. This is not a research paper, so it is harder for me to find the flaws, so please bear in mind that it may have weaknesses that were not shown here. And in
the meantime, this technology is being handed out to millions and millions of people all
around the world, and it is incredible. So, this will be available only for the
shiniest, newest graphics card owners, right? Well, hold on to your papers, Fellow Scholars, because here comes the best part: no! This comes to all RTX graphics cards, even
the older ones that you can pick up for a couple hundred bucks. This can breathe new life into aging hardware, which is highly appreciated. Note that the particular game or app you’re running has to be ready for DLSS 3.5 for all this magic to happen. Now, I am a light transport researcher
by trade; I dream in rays of light and ray tracing, if you will, so I am incredibly happy to see this technology make it into the hands of real users in the real world in such an incredible way. And make no mistake, this is not perfect by any means: there are games where it does not appear to work very well, and some people prefer not to use it at all, anywhere. However, note that this is still nascent technology, and it can already
synthesize more than 85% of the pixels on our screen, which is something that I thought would only
be possible in a science fiction movie. Maybe not even there. I am absolutely stunned by these
results, even with their weaknesses. And don’t forget to invoke the First Law of Papers. The First Law of Papers says that research is a process. Do not look at where we are; look at where we will be two more papers down the line.