NVIDIA’s DLSS 3.5: This Should Be Impossible!

Captions
You hear it everywhere. DLSS this, DLSS that. But what is it? And why is it doing the impossible? This AI-based technology is being talked about in the media a great deal, and it is supposedly a way of dramatically speeding up computer games and virtual worlds. Does this really work?

Well, when Jensen Huang, NVIDIA's CEO, says that soon every single pixel on your screen for these games will not be rendered but generated, presumably by an AI, that really got my attention. That sounds insane, and if, hearing this, you think that this cannot possibly be true, I don't blame you for a second. We can't just conjure up all this footage; there has to be a proper system that can compute reflections, diffuse interactions, and all that. Right? Well, not quite.

In 2018, I had a little project where I wanted to generate photorealistic images of material models quickly. Generating an image through ray tracing took up to 40 to 60 seconds, and that was a bit too long for my taste. So I tried to write a neural renderer that could take just the description of a material model and immediately, in real time, synthesize what it would look like. And it was able to do all this not in 60 seconds, but in 3 milliseconds. Yes, 20,000 times faster. However, it was limited to this one scene. So, how has this research field improved in the last 5 years? Do we have anything more usable now?

Oh boy, if I could tell you. Well, meet DLSS: Deep Learning Super Sampling. This is a system where hardware and software work together to create an incredible experience, where games and virtual worlds really run faster than should be possible. So, how?

Well, as of DLSS version 3, get this: by generating 7 of every 8 pixels that appear on the screen. Yes, more than 85% of the pixels are just generated, not computed by, for instance, ray tracing or traditional techniques. That sounds flat-out impossible. But it is possible.

The first step is running super resolution in real time. This means that a coarse image goes in, and an AI works out which details would be missing if we pretended that this were a fine image instead, and it synthesizes those details.

Second, optical flow. We can look at two adjacent frames of the video game and try to estimate what has happened between them and where things are going. With this, entire intermediate frames can be synthesized, creating the illusion that the game is running more smoothly than our hardware can manage. Combining the two, with a pool of very few pixels that are actually computed, both image quality and smoothness can be improved at the same time. And the arithmetic checks out: if super resolution renders only one pixel in four and frame generation renders only every other frame, then just one pixel in eight is computed traditionally, and the remaining 7 in 8, or 87.5%, are generated. What a time to be alive!
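To make the frame generation idea a bit more tangible, here is a toy Python sketch: it estimates a single global motion vector between two frames and warps halfway along it to synthesize the in-between frame. This is only a cartoon of the optical flow idea under a constant-velocity assumption, not NVIDIA's method, and all names in it are made up for illustration.

```python
# Toy frame interpolation: estimate a global motion vector between two
# frames, then synthesize an intermediate frame by warping halfway.
# A cartoon of the optical-flow idea behind frame generation,
# not NVIDIA's algorithm.
import numpy as np

def make_frame(x, size=64):
    """A black frame with a bright 8x8 square at column x."""
    f = np.zeros((size, size), dtype=np.float32)
    f[28:36, x:x + 8] = 1.0
    return f

def estimate_shift(a, b):
    """Estimate the translation from frame a to frame b via
    phase correlation (FFT-based cross-correlation)."""
    corr = np.fft.ifft2(np.fft.fft2(a) * np.conj(np.fft.fft2(b))).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    # Wrap large positive shifts around into negative ones.
    if dy > a.shape[0] // 2: dy -= a.shape[0]
    if dx > a.shape[1] // 2: dx -= a.shape[1]
    return dy, dx

frame0 = make_frame(x=10)   # object at column 10
frame1 = make_frame(x=18)   # object has moved 8 pixels to the right
dy, dx = estimate_shift(frame0, frame1)

# Synthesize the in-between frame by warping frame0 halfway along
# the estimated motion vector (constant-velocity assumption).
mid = np.roll(frame0, shift=(-dy // 2, -dx // 2), axis=(0, 1))
print("estimated motion:", (int(-dy), int(-dx)))           # expect (0, 8)
print("object column in mid frame:", np.argmax(mid[30]))   # expect 14
```

A real system estimates a dense flow field, one motion vector per region rather than one for the whole frame, and has to handle occlusions and disocclusions, which is far harder than this single global shift.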
But it does not stop there. It gets better! They have also announced DLSS 3.5, where, hopefully, all of this can be improved even further through something they call ray reconstruction. Oh my goodness, this is going to be so good. But how?

Well, normally, when we perform ray tracing, we simulate the paths of millions and millions of rays and bounce them around in a scene. And if we can't afford to wait up to hours at a time for an image, we have to settle for a really noisy image. Like this. Denoising techniques exist that try to guess what the missing information could be, but they are not perfect. Not even close. Some important details between objects can be missed entirely, and reflections can still be a lot less clear than in a full simulation. And unfortunately, it gets even worse: this image will then undergo an upscaling step, where those errors get magnified even more. Denoising and upscaling are two separate steps, typically done by two separate models, so why not do both of them with the same AI model? You will find a toy sketch of this single-model idea after these captions.

And now, ray reconstruction enters the scene. It learned from far more information than the previous models: it was given 5 times more training data. And look at this step. This one is right on the money. It is tailored to retain high-frequency information, the fine details, to make sure that we have the highest-quality information available before the upscaling step. So, is it better than what denoisers did before? Oh my goodness, look at that! We finally get back that shadowy region that the previous denoisers filled in with incorrect information. Yummy! So good! And that ability to retain high-frequency information can also be witnessed here. Loving it.

Now, note that ideally, we would be talking about a peer-reviewed research paper here, where we can see all the weak points and failure cases; that is my home turf. This is not a research paper, so it is harder for me to find the flaws, so please bear in mind that it may have weaknesses that were not shown here. And in the meantime, this technology is being handed out to millions and millions of people all around the world, and that is incredible.

So, this will be available only to owners of the shiniest, newest graphics cards, right? Well, hold on to your papers, Fellow Scholars, because here comes the best part: no! This comes to all RTX graphics cards, even the older ones that you can pick up for a couple hundred bucks. This can breathe new life into aging hardware, which is highly appreciated. Note that the particular game or app you're running has to be ready for DLSS 3.5 for all this magic to happen.

Now, I am a light transport researcher by trade; I dream of rays of light and ray tracing, if you will, so I am incredibly happy to see this technology make it into the hands of real users in the real world in such an incredible way. And make no mistake, this is not perfect by any means: there are games where it does not appear to work very well, and some people prefer not to use it at all, anywhere. However, note that this is still nascent technology, and it can already synthesize more than 85% of the pixels on your screen, which is something I thought would only be possible in a science fiction movie. Maybe not even there. I am absolutely stunned by these results, even with their weaknesses.

And don't forget to invoke the First Law of Papers, which says that research is a process. Do not look at where we are; look at where we will be two more papers down the line.
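Since the captions ask why denoising and upscaling should live in two separate models, here is a minimal PyTorch sketch of the single-model alternative: one small network that takes a noisy quarter-resolution render and produces a clean, 2x-upscaled image in a single pass. NVIDIA has not published ray reconstruction's architecture, so the class name, layer choices, and sizes below are purely illustrative assumptions.

```python
# Toy "joint denoise + upscale" network, illustrating the single-model
# idea behind ray reconstruction. Every name and layer choice here is
# an illustrative assumption, not NVIDIA's actual architecture.
import torch
import torch.nn as nn

class JointDenoiseUpscale(nn.Module):
    """One network that maps a noisy low-resolution render directly to
    a clean image at 2x resolution, instead of chaining a separate
    denoiser and a separate upscaler."""
    def __init__(self, channels=32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            # Produce 4x the output channels, then rearrange them into a
            # 2x-larger image (sub-pixel convolution via PixelShuffle).
            nn.Conv2d(channels, 3 * 4, kernel_size=3, padding=1),
            nn.PixelShuffle(upscale_factor=2),
        )

    def forward(self, noisy_lowres):
        return self.body(noisy_lowres)

model = JointDenoiseUpscale()
noisy = torch.rand(1, 3, 270, 480)   # stand-in for a noisy low-res frame
clean_hi = model(noisy)              # denoised and upscaled in one pass
print(clean_hi.shape)                # torch.Size([1, 3, 540, 960])
```

The design point this sketch illustrates: because a single network sees the noise and the missing resolution together, it has a chance to preserve high-frequency detail that a separate denoiser might blur away before the upscaler ever sees it, which matches the claim made in the captions above.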
Info
Channel: Two Minute Papers
Views: 384,298
Keywords: ai, nvidia, nvidia dlss, nvidia rtx, dlss 3, dlss 3.5
Id: hr85Lc_WT38
Length: 8min 28sec (508 seconds)
Published: Tue Sep 19 2023