Introduction to DirectX Raytracing

Slides and more info: http://intro-to-dxr.cwyman.org/

Captions
All right, let's get started. Welcome to Introduction to DirectX Ray Tracing. While you were filing in, before I started, I was showing a number of live sample tutorials that you can download online and play with; we'll talk about them a bit during this course, so these are things you're going to be seeing, and they were running live on the GPU up here. This course is being live-streamed, so photography and recording are encouraged for those of you in the audience today. Resources for the course are on this webpage: intro-to-dxr.cwyman.org.

So, welcome. Hopefully I don't have to cover this, but let's ask why we're here and why we have this course. The key is that Microsoft announced DirectX Ray Tracing at GDC earlier this year, and I don't know about yours, but my Twitter feed blew up as a lot of people learned and relearned, tried and retried, ray tracing, and I saw this image I don't know how many times in my feed; it's the one I showed you again here, running live, before we started. What we have now with DirectX Ray Tracing is a widely used API with ray tracing support built in. This is great: coding in it is great, it allows straightforward sharing of GPU resources, and as you've seen this week and at GDC, there are a number of compelling demos suggesting you can actually use this in real production.

Of course, I do want to caution you that we're not yet at fully path-traced renderers, and I hope some of you in the audience and on the live stream might help with the research we need to make this usable in every game and everywhere else. Right now we only have enough computation for a few rays per pixel. As you saw in those demos while I was moving around, you might get something like the image on the left, which is quite noisy, when we'd really like something more like the image on the right, fully converged. So we need additional research, beyond what has already happened, to reconstruct images from low sample counts. And of course, yesterday hardware acceleration was announced.

Let's talk about the course goals. If I think about who's sitting in the audience, there are roughly two classes of people: students, faculty, and first-time SIGGRAPH attendees with little or no DirectX experience; and also a number of game developers and others with lots of rendering experience who are asking what ray tracing means for their raster application and how to think about integrating it. Those are two very different audiences, and I had a hard time putting this course together in a way that satisfies both. The plan is that the first half of the course targets the first category, those with little DirectX experience who might be asking how to get started with ray tracing, and maybe what ray tracing even is. Pete is going to give an introduction to ray tracing, and hopefully I'll help you understand the new DirectX shading and the mental model of GPUs in this new ray tracing pipeline. Then I'll walk through a number of incremental shader tutorials (you can download the code, and there are much more extensive tutorials online) to help you build a simple path tracer.
These tutorials are very simple; they're not optimized in any way, and I don't pretend they're optimal code. They're there to give you easy access to a ray tracer, a path tracer, that you can experiment with. In the second half we're going to go lower level and focus on the second part of the audience: we'll talk about the lower-level C++ host-side API and help you understand the basics of how DirectX has changed for ray tracing, understand the mental model for programming the CPU for DX ray tracing, and share experiences integrating DXR into existing raster pipelines. So those are my course goals.

Who are we? I'm Chris Wyman, a principal research scientist at NVIDIA; before joining NVIDIA I was a faculty member for ten years. Pete Shirley, who's going to talk next, is a distinguished research scientist at NVIDIA; you probably all know him as the author of multiple widely used textbooks, including Ray Tracing in One Weekend, which is where that sphere scene came from. Shawn Hargreaves is from Microsoft and will talk about the low-level API part; he's a principal development lead on Direct3D at Microsoft, and he worked on numerous games and assorted APIs, both at Microsoft and before coming there, prior to working on Direct3D. And Colin Barré-Brisebois is a senior software engineer at SEED, Electronic Arts; he was previously at Warner Bros. games and has an impressive list of games to his credit. He blogs a lot and shares his ideas at GDC, I3D, HPG, SIGGRAPH, and elsewhere; I highly recommend looking at the slides from his HPG keynote this year.

Today's schedule: I'm welcoming you here, then Pete will give an overview of what ray tracing is, then I'll switch back and talk about the basics of DirectX shaders for this new ray tracing pipeline and walk through some tutorials in an incremental, step-by-step fashion. We'll have a break, which we might also use for Q&A, and afterwards Shawn will give an introduction to the host-side code, after which Colin will talk about his experiences integrating DirectX Ray Tracing into existing raster pipelines. Then we'll have some more Q&A.

Before I hand you off to Pete, some thoughts. Is real-time ray tracing here? It has always been "five years away," and I was always a skeptic that it would actually happen; now I'm convinced it is happening. Of course, we're not going to match film-quality renderings any time in the near future, so we still need some R&D work here, and I think this is really important for those of you thinking about what to do next. We need to think about how to develop things like hacks that make the image look good in 16 milliseconds ("hack" is not a derogatory term here; think of those used commonly all over raster today) and ask what those same techniques are in the ray tracing world. And a reminder: all the images I've been showing here are rendered from the tutorial code available on the web page, though these are converged, not at one sample per pixel.

Putting on my NVIDIA hat for a moment: we're talking a lot more about ray tracing throughout the week; we have a sponsored session later today and tomorrow morning, and things at the expo booth. We also have some resources online, including research code for some reconstruction filters, in addition to more polished solutions in our GameWorks products. One last plug before I get off the stage: there's a call for proposals for Ray Tracing Gems; if you're doing ray tracing work and want to publish it in this book, please see the information here.
All right, thank you very much. I'm going to pass you off to Pete Shirley, who's going to give you an overview of ray tracing. [Applause]

Thank you. My agenda in this talk is to let you know that the terminology around ray tracing is a little messed up. There are 40 years of literature, and we haven't been very good about being consistent. Saying "I do ray tracing" is kind of like saying "I do addition": it's a primitive, but sometimes, in some contexts, it means something much more specific. So I want to give you a bit of a decoder ring during this talk as well. (I don't know how to use Chris's laptop. There we go.)

This is the paper that really gave ray tracing its entry into SIGGRAPH and graphics in general. I was in college, and this paper switched me out of physics and into computer science. I reread it in preparation for this talk, because this is what people often mean, probably as a default, when they say "ray tracing." The amazing thing is that this paper is not very dated; if you read through it, there are very few cues that it wasn't written in the last year, and the basic code Turner Whitted wrote 40 years ago is pretty much what Chris was showing, with a few changes, which I'll talk about. A quiz for you: there is exactly one mistake in this paper. I won't say what it is, but if you find it, I'll tell you if you're right; it's a small one. (Chris, I really don't know how to use your spacebar. Thank you.)

Okay, so what I just showed you is Whitted ray tracing. When someone says "ray tracing" they might mean Whitted ray tracing, which is his algorithm, and it's shiny stuff. But then everybody changed their ray tracer four years later, when Rob Cook (at Lucasfilm at the time, Pixar now) made these amazing soft-shadow pictures. He basically started calling rand() in the ray tracer; that's the change. If you call rand() in the ray tracer, and there's not much more change than that, people call it Cook-style ray tracing or distribution ray tracing. And then Kajiya came along and said you can also call rand() on the diffuse surfaces and get diffuse interreflection; people call that path tracing, or Kajiya-style ray tracing. Often people will leave off the modifier when they're really talking about one specific member of this family, and you can just ask them which one they mean. People who do ray tracing all the time have to stop and ask, "what do you mean?" My field can take credit for that: we had 40 years to clean up the terminology, and we didn't.

So here's Whitted-style ray tracing: a ray comes in, it sees something shiny, and it takes a specular bounce; there's a nice little vector formula for the specular bounce. Here is Cook-style ray tracing, and what it does is say: when I hit a mirror, maybe I'll scatter around that mirror direction and make things fuzzy, and when I send a shadow ray, I'll send it to an area source, not a point source. It's a very brute-force approach, and it needs more than one ray per pixel, which is one of the big costs and why this didn't take off for a long, long time. Note that this technique was invented in the film industry, and yet it took 20 years before the film industry started using it all the time; there was simply a speed gap. That picture is actually from 1984, and even today it holds up; it's an impressive picture. People often allude to this 1984 picture, and it's a good exercise to try to duplicate it, because it's got a lot of things going on in it.
But you only have to implement an infinite plane and some spheres. One reason it looks so rich is that there's a good environment map, so all the reflections in the balls come from an environment map, not a model.

And then for Kajiya: if you take a Cook-style distribution ray tracer and modify it into a Kajiya-style ray tracer, all you really do is call rand() on the diffuse bounces. You change two lines of code, but it's kind of a different mentality, because now the rays are incredibly divergent and the program is super slow; you need a lot of rays per pixel, so this has taken off even more slowly. Denoising has been a huge thing for that. Here's an example of denoising. It's an idea that has floated around the graphics industry for a long time, but it has only really started being used in production recently; the movie people have been using it for a while, particularly Disney, and it works implausibly well. I think this is the free lunch that has helped bring ray tracing quickly to the forefront lately: everyone assumed you need a lot of rays per pixel, and it turns out denoising (I shouldn't say is easy) works better when done right than anyone would expect. I personally still find it surprising that the image on the left can be turned into the image on the right using a time sequence, and that it can be done interactively. So there's a convergence now: the hardware is getting a lot faster at tracing rays, and this algorithmic improvement by the industry has just given you an extra factor of 10, just like that. Factors of 10 don't come along very often anymore.

So one question is: is ray tracing going to take over the universe? I would predict no, and the reason is that both of these are perfectly good ways to get stuff on the screen. Rasterization is amazingly fast and we're very good at it. What it basically does, and most of you are familiar with it, is throw triangles at the screen, and they stick if they're visible and in front; the z-buffer keeps track of visibility, so you can do it in any order. A very clever idea back in the day. Ray tracing is an equally brute-force algorithm that just takes pixels and throws them out into the environment and asks, "what triangle do I see first?" They're duals of each other, really; they're not that different. It's really about which order the loops go in, but computationally they end up being quite different, both in speed and in how they walk through the computer. This is a slide Chris gave me, because he understands rasterization a lot better than I do, for wrapping your mind around what each of these really brings to the table for a visibility problem.

I'm actually just going to zoom in on one part of this (you can read the slide, but there's a lot of complexity here): the acceleration structure. Neither of these algorithms ends up being simple in real life, because in a game engine you might have a hierarchical z-buffer or some culling volumes; you're going to have trees somewhere that make your program complicated. So it's really not simple like a straight z-buffer rasterizer. And ray tracing sounds so elegant (you loop through the environment), but you need to stick a tree on that thing too, and then the model changes frame to frame, so that tree needs to change. Really, under the hood, ray tracing and rasterization both get complicated when you actually try to make them fast, and they have different trade-offs; each is good at different things.
The thing I've been experiencing lately is that if you have a very weird set of pixels you need to compute, like in some warped HMD, then ray tracing is a very natural way to go; if you have some very structured way you query visibility, the z-buffer is good. That will probably remain true. It's not that ray tracing will replace z-buffers; it's another tool in your toolbox.

Okay, this is the part of the talk that I think is most important, whether you have written a ray tracer and kind of know your way around (so you're checking your phone during this part) or you're brand new. A ray is just a 3D line, and you can represent it any way you can represent a 3D line. The thing is, 3D lines are not really taught except to pretty hardcore engineers; computer scientists are rarely taught all the different ways to represent a 3D line. In high school, if you're from the US, you probably saw 2D lines, and that was y = mx + b. Can I just use that for a ray? The answer is no. For implicit equations like that, the dimensionality of the object needs to be one off from the dimensionality of the space. A ray is a 1D object (you walk along it and ask how far along the ray you are, which takes one number) and the space is 3D, so that doesn't work. You could do it geometrically: take two planes, which you can write as implicit equations, and intersect them; the intersection of two planes is a line in 3D. Some ray tracers do do this for intersecting certain types of patches, where the mathematics works out well, but it's pretty uncommon.

Much more common is to blend points in 3D, and to blend them in a way that constrains you to a line. First, points in 3D: we usually represent a point as a vector from the origin, so you have three numbers; you declare a vec3, or float3, or whatever, depending on your infrastructure. If I have points A and B defined, I can do a linear blend of them, like 10 times this point plus 32 times that point, any two numbers you want, and that gives me a point in the plane of A and B; any point in that plane is some linear combination of A and B. If you constrain that combination to be a weighted average, so that the two weights on A and B sum to one, then you're constrained to points on the line through A and B. This is one way to map out a line segment in 3D, which you often have to do in graphics programs, and many of you have probably done it in various contexts. A key thing is that nothing says both weights have to be positive: you can use a weight of, say, 2 for B and -1 for A. Since the weights add to one, you really only have a one-dimensional degree of freedom, which I've written as t. As you vary t from negative infinity to infinity, the point walks along the ray: plug in 0 and you're at point A; plug in 1 and you're at point B; between 0 and 1 you're on the segment between A and B; greater than 1 you're past B; less than 0 you're behind A. So there's a certain implicit directionality to this line if you think of positive t as progressing, though really it's just a parametric equation for a line in 3D.
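To make that parametric form concrete, here's a minimal sketch in HLSL; the function name is mine, not something from the talk:

    // P(t) = A + t * (B - A), or equivalently origin + t * direction.
    // t = 0 gives the origin (point A), t = 1 gives point B, values between
    // 0 and 1 lie on the segment, larger t walks past B, negative t is behind A.
    float3 pointOnRay(float3 origin, float3 direction, float t)
    {
        return origin + t * direction;
    }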
Now, remember I said that when you say "ray tracing" I'm not sure what you mean unless you qualify it; sadly, the same is true of the word "ray." When someone says "I have a ray," it might mean several different things. In this case, a ray is a half line: stuff behind A, in other words negative t, isn't part of the ray. That's the default, and probably what people mean. And then there are plenty of variations in various graphics codes. I could store A and a direction vector from A to B, which is B minus A, so you have one point and one vector. All of these are valid ways to do things; the key is to have a really solid understanding of which one you're using, and to comment it, because this stuff gets me into trouble to this day. Especially that last bit: is the direction vector a unit vector in your ray tracer? If you can assume it is, certain functions that take it can be more efficient, but then you have to do some bookkeeping to make sure you maintain those unit vectors.

Here are some examples of regions on the ray based on the t values. In the first one it's most of the half line, but notice that it starts at 0.05 rather than zero: often rays originate on surfaces, and when one of the endpoints is on a surface and you do a ray intersection, you hit that surface and get into trouble, so people often put in a fudge factor. But sometimes they don't; it depends on the ray, so again, context is really important for managing these little details. In the middle, red case, the points are only between A and B, so it's essentially a line segment. And the green case is just a weird thing you might do for some special purpose; these do come up, it isn't something that almost never happens. Often you take advantage of your knowledge of the geometric meaning of those intervals of t.

So what do we do with rays? A lot of standard queries. I'll just go through the first one and you can read down the rest. The first is: does this ray hit anything at all if I just send it out forever? That's usually a shadow kind of query; you don't care where you hit or what you hit, you just want to know whether something is visible, or visible within a meter, maybe, if you're doing ambient occlusion. Those are often called shadow rays or occlusion rays. But sometimes you want to know what you hit: tell me what I hit, and where. And then there's a peculiar query that isn't done very often but is sometimes critical for what you need to do: you want a list of everything the ray hits, and you might have to build that yourself depending on your API, or it might give it to you. If you ever want to write a CSG program, this is an amazingly elegant way to do it, though it can be slow.
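As a small illustration of the "one point, one vector, plus a t interval" representation Pete describes, a ray structure might look like the sketch below; the names and the epsilon comment are mine, though DXR's own RayDesc (covered later in the course) carries the same four ingredients:

    struct Ray
    {
        float3 origin;     // point A, where the ray starts
        float3 direction;  // B - A; may or may not be unit length, so document which!
        float  tMin;       // e.g. a small epsilon when the origin sits on a surface
        float  tMax;       // e.g. 1.0 for a segment, a huge value for "forever",
                           // or an ambient occlusion radius for an occlusion query
    };

    // A hit at parameter t only counts if it lies in the ray's valid region.
    bool insideInterval(Ray r, float t)
    {
        return t >= r.tMin && t <= r.tMax;
    }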
So how do you make rays faster? The bottom line is that a ray is a search: a ray intersection with the model is a search. I have a million triangles and one ray; which of those million triangles do I hit? Computer scientists, to a first approximation, have one invention, a tree, and we turn O(n) algorithms into O(log n) algorithms. This is no exception: in practice it is log n, but it's not guaranteed to be, so if you ever see yourself with amazing slowdowns, your tree is probably the culprit. There are lots of different acceleration trees used in ray tracing, and the bounding volume hierarchy has kind of won in practice. It's just like quicksort being the sorting algorithm you go to, with some variant probably implemented in your libraries; same thing for BVHs. If you're reading a bunch of ray tracing literature and trying to get up to speed (we all have a finite amount of time), just go read the BVH stuff. The interesting thing is that, theoretically, people don't fully understand why they work as well as they do, or why we build them the way we do, so if you're a theory type this might be worth digging into. I've tried, so I can tell you it's not low-hanging fruit, but I'm not a theorist.

Okay, I want to close and take a little more time on this slide; this is important for both the beginners and the veterans here. Let me take a quick poll: who has never written a ray tracer? Maybe 20 percent of you. And who has never written a complex z-buffer-based program, like in OpenGL or DirectX; who's a novice there? Okay, not as many. So most people in here have written at least a rudimentary ray tracer and have some z-buffer experience. There's a cultural thing about ray tracing that I think is evolutionary. Back when I was in school, if you personally had not much interest in speed but really liked pretty pictures, you became a ray tracer; if you liked stuff to move, rather than going home and coming back in the morning, you became an interactive programmer, and the interactive programs didn't look very good. Then along came shading languages, and people said, "huh, I can actually use this z-buffer for all sorts of things it was not intended for," and did amazing hacks, and the pictures looked frigging awesome. Every one of those little hacks is quite clever; when they came along you thought, "wow, that was clever." No such evolution ever happened in the ray tracing literature. We were like, "well, it's already slow, I'll just implement one of those, a Whitted ray tracer or a Cook ray tracer; I could probably make it 30% faster, maybe I'll get a grad student to do that." Then along came various ray tracing libraries that did that for you. But using the rays themselves to do something very clever, in the spirit of the z-buffer community, really hasn't happened. So we're in this very surprising state in graphics, which is a very mature industry, where there's a new tool in your toolbox, and the army of clever people who specialize in doing cool shaders (half the stuff on Shadertoy I'll stare at for an hour and not understand why it works, yet it produces an amazing picture) hasn't really turned to ray tracing yet. Based on past history, I'd guess there are a couple of years' worth of opportunities for clever people to make this stuff work a lot better, in ways it wasn't designed for. Just like shading languages: the people who did those originally did not foresee the crazy stuff that was done with them, and my hope is that the same will be true of ray tracing. It's not an algorithm; it's a hammer. A big hammer. [Applause]

Thanks, Pete, that was great. All right, so now I'm going to give you an overview of the HLSL shaders and walk through the tutorials in a little bit. Again, thanks Pete for a nice overview of the basics; now you might be asking, how do I do this in DirectX? Of course, instead of using DirectX you could start from scratch: you could write a CPU ray tracer, and there are plenty of really good resources out there; or you could write your own GPU ray tracer, which can be tricky and a bit ugly, as those of you who have tried know. You could also use vendor-specific APIs; they hide a lot of the ugly low-level details on the GPU, but they don't scale across vendors, and they're sometimes hard to interact with from raster.
Of course, you're all here to hear about DirectX Ray Tracing, so that's the goal today. As I said before, ray tracing is now part of a widely used API, namely DirectX, which allows you to share resources between raster and ray tracing with no copying, no overhead of different APIs, nothing like that. And it works across vendors, either via vendor-supported software and hardware or via a standardized compatibility layer that runs on DX12-class GPUs.

Here's a quick overview of modern graphics APIs for those who may not be familiar. At a very high level, a 10,000-foot view, you can split them into two main parts. There's code that runs on the GPU, usually called shaders, which runs in parallel on the GPU but is programmed in a very simple, serial-looking way. And there's the CPU host code, the API itself, things like the DirectX API, which controls everything else in the system besides what's going on on the GPU. The talk I'm going to give now focuses on the GPU side, the shaders; after our break, Shawn will talk about the host side.

For those of you who haven't been programming shaders all your life, what are they? They're developer-controlled pieces of software that control the graphics pipeline: the parts that aren't automatically managed by the graphics API, the driver, the hardware, and so on. This is where you get to write your GPU code. They're usually written in a C-like high-level language; in DirectX this language is the High Level Shading Language, HLSL. Shaders can either represent complete GPU tasks, for instance with a compute shader, or represent some subset of a more complex pipeline, for instance the vertex shaders in today's graphics pipeline.

Before I talk about the ray tracing pipeline, here's a very brief overview of what DX raster looks like today. In rasterization, you send your geometry, your vertex data, down from the CPU and run it through a vertex shader, which you can think of as transforming it into the right location relative to the camera. Then you run the tessellation stages, which combine those vertices into triangles and dice them up into smaller triangles, after which you can run a geometry shader that computes things on a per-primitive basis; a very simple example would be computing a normal for those newly tessellated triangles. After that you go to the rasterizer, shown here in green because it's done by special-purpose hardware (or perhaps software behind your back); you don't get to control everything, just some parameters. The rasterizer takes these triangles and converts them into fragments, or pixels, on the screen, after which you run a fragment shader, or pixel shader, to do your shading and compute a final color, and then you output that to the frame buffer, again using special-purpose hardware that hides things like compression and different image formats.

If you squint a little, this pipeline looks like this: you input a set of triangles, some code runs to transform them into a set of displayable triangles, some special-purpose stuff converts those triangles to pixels, another controllable shader generates colors, and you output to a final image.
If you think about it that way, you could phrase ray tracing in a similar way. Again, this is a very simplified representation, but if you squint, you might want something like this: you take a set of pixels as input; you run some code to generate rays, figuring out which rays correspond to which pixels; you run some specialized stuff that intersects those rays with the scene; then you get to shade the hit points; and then you output that to your frame buffer. In some sense this looks really similar to that 10,000-foot view of the raster pipeline, except that algorithmically ray tracing gives you the advantage that you can recursively spawn rays from your shading and go back through your acceleration structure to get multi-bounce effects.

All right, now let's get into the meat: what sorts of shaders do we have in DirectX? It turns out that this very simplified pipeline I sketched up there (I'm pretending that's what it looks like) is actually split into multiple shader stages. There's a ray generation shader: it runs once per algorithm, or per pass, and it's what starts your ray tracing; it's where you take your pixel information and turn it into rays. Then there's a new shader stage called an intersection shader, and this defines how rays intersect geometry: how do you intersect a ray with a triangle, with a sphere, with a Bézier patch? You do that by writing different intersection shaders. Then, when you're tracing your rays, what happens if you miss? You control that with your miss shader. What happens when you hit? That's controlled by your closest hit shader; if you want to think of it at a very high level, it's similar to your fragment shader, and it's usually where you do your shading computations. And then there's an any hit shader, which is run once per hit. I'd just like to make a note here: as you move to more advanced usages of this, please read the spec; the definition of "any" may not match your expectations. These collections of miss shader, closest hit shader, and any hit shader define the behavior of rays, and you can have different rays in your program. As Pete was talking about, you might have a shadow ray, a color ray, a primary ray, or other sorts of rays, so you can have multiple of these shaders depending on how many ray types you have. Something I'm not going to talk about today: there's actually a sixth shader, a callable shader, which allows you to call one shader stage from another, essentially branching between shading stages. I'm not going to talk about it further; you can read about it in the spec if you're interested.

So let's take this very simplified pipeline I invented and turn it into the real DirectX pipeline. As I said, we start with a ray generation shader: this is where you launch new rays, and in HLSL you do that by calling the new TraceRay intrinsic. The ray then goes into a sort of abstract "ray tracing happens" blob, and when you come back you get a color, which you can accumulate into your final image.
Now let's zoom in on what happens inside this nebulous ray tracing stage. A good mental model is: first we traverse the scene to find what our ray hits. We're now thinking about an individual ray tracing through the scene, asking what geometry it hits, so we go through the bounding volume hierarchy, or whatever your acceleration structure is. If you find a hit point, you shade it with the closest hit shader; if you don't find any hit, you shade with the miss shader. So the closest hit and miss (and any hit) shaders can be thought of as your shading stage, like in raster.

Now let's look at the traversal loop. The first thing that happens is your ray goes into this green box, some opaque acceleration process by which the GPU and the driver find out what sort of objects might be potential hits for your ray. If you don't find any, you immediately leave this loop and go run your miss shader. If you do hit something in your bounding volume traversal, you get to run the intersection shader, which lets you control whether you intersect this particular piece of geometry; maybe it's a sphere, maybe it's a triangle, maybe it's a Bézier curve. In DirectX there is a built-in triangle intersector, so you don't have to write a triangle intersection shader, but for anything else you're on your own: you get to write custom software. If your intersection code says you didn't hit anything, you go back into the acceleration structure and keep going. If you do hit, but it's not the closest hit so far, you also go back into the acceleration structure and keep looking for potential hits.

If you leave the intersection shader and you did find the closest hit so far, what happens? You go over here and check whether the geometry is opaque or not. If it's not opaque, you run the any hit shader (again, I'll put my little symbol here; I didn't really name this). So to run the any hit shader, the hit has to be the closest one you've seen so far, and it has to be flagged as not opaque. In the any hit shader you can ignore the hit: for instance, if you're doing alpha testing and you find the texture is transparent there, you say, "ah, this isn't a real hit, keep going," and you cycle back into the acceleration structure loop. Otherwise you update your closest hit and keep going, because there might be another hit you haven't found yet that's closer. You keep cycling through this until you no longer hit anything, at which point you go over here and ask: at any time through this loop, did I hit anything? If the answer is no, you run your miss shader; if the answer is yes, you run your closest hit shader. After that you leave the traversal loop and control returns to wherever you launched your TraceRay.

So, in summary, the high-level view of DirectX ray tracing shaders: if you want to control where your rays start, do that in the ray generation shader. If you want to control how your rays intersect geometry, that goes in your intersection shader, or one of your intersection shaders. If you want to control what happens when your rays miss, that's your miss shader. If you want to control how to shade your final hit points, that's probably your closest hit shader. And if you want to control how transparency behaves, you probably want your any hit shader.
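As one concrete example of that last point, an alpha-tested any hit shader might look roughly like the sketch below. The resource names and the UV lookup are placeholders, since fetching a hit's texture coordinates is renderer-specific, and IgnoreHit is one of the new intrinsics covered in a few slides:

    Texture2D<float4> gDiffuseTex;   // assumed: albedo texture with opacity in .a
    SamplerState      gSampler;

    struct SimplePayload { float3 color; };

    // Placeholder: a real renderer would fetch the triangle's vertex UVs and
    // interpolate them with the barycentrics; that lookup is framework-specific.
    float2 GetHitUV(uint primitiveIndex, float2 barycentrics)
    {
        return barycentrics;   // stand-in so the sketch compiles
    }

    [shader("anyhit")]
    void AlphaTestAnyHit(inout SimplePayload payload,
                         BuiltInTriangleIntersectionAttributes attribs)
    {
        float2 uv    = GetHitUV(PrimitiveIndex(), attribs.barycentrics);
        float  alpha = gDiffuseTex.SampleLevel(gSampler, uv, 0.0f).a;

        if (alpha < 0.5f)
            IgnoreHit();   // transparent texel: not a real hit, keep traversing

        // Otherwise do nothing: the hit stands, and traversal keeps looking
        // for anything closer.
    }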
So what goes into one of these DX ray tracing shaders? To start a DXR shader you need an entry point, just like main in C or C++, except that in DirectX shader entry points can be arbitrarily named; in this case I've named it PinholeCameraRayGen. Right above it is the HLSL attribute that specifies this is a ray generation shader, so the compiler knows and can connect it up with the host side. Other shader types look similar. For an intersection shader, you declare that it's an intersection shader and give it any name you want; in both of these cases there are no parameters required. For a miss shader, you declare it as a miss shader, and it requires information about your ray so you can pass return values back; for instance, your payload might have a color. An any hit shader is declared as an any hit shader and takes a payload plus the attributes telling you where you hit the surface. And the closest hit shader looks the same: you give it your payload and where you hit the surface.

The ray payload is user-defined, so you control what goes in it, and it depends on what you want to do with your ray: for a shadow ray you might just want to return a binary did-I-hit-or-not; for a color ray you might stash a color in there; for more complicated rays you might do more interesting things. The intersection attributes are also user-defined, but they correspond to, and transport, data from your intersection (where you hit the surface) into your closest hit or any hit shader, where you might need to know where you intersected something in order to shade it.

So what's a payload? As I said before, it's an arbitrarily user-named, user-defined structure, and it contains the intermediate data needed for your ray tracing; it's passed along with the ray, so you can think of it as piggybacking on each ray. If you're not familiar with HLSL and you're starting to look at this code: HLSL has a bunch of built-in data types (bool, int, uint, float), and you can also put them into vectors, like float3, a vector of three floats, and into matrices up to 4x4. One thing to note is that you should keep the ray payload as small as possible, because if you make it large, things happen like spilling out of GPU registers into memory, and that will slow you down quite a bit.

A simple ray might look like this: a miss shader takes a payload containing the color I defined up there in my structure, and in this really simple miss shader I return blue. Then you might have a closest hit shader like this, also very simple, returning red. That actually defines a complete ray type, which returns blue when you miss and red when you hit.

What about those intersection attributes? These allow you to communicate between your intersection shader and your shading shaders, the any hit and closest hit shaders, and they're specific to each intersection type. There's a built-in one for triangles, but if you want to define a sphere intersector, you have to define an intersection attributes structure for your spheres. The built-in one has a float2 with the barycentric coordinates of where you hit the triangle, and that's what you're given for shading.
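Written out as code, the pieces just described might look like the sketch below, a simplified version of what the tutorials do, with illustrative struct and entry point names:

    // User-defined ray payload: it piggybacks on the ray, so keep it small.
    struct SimplePayload
    {
        float3 color;
    };

    // The built-in attributes for the triangle intersector are provided by DXR:
    //     struct BuiltInTriangleIntersectionAttributes { float2 barycentrics; };

    [shader("miss")]
    void SimpleMiss(inout SimplePayload payload)
    {
        payload.color = float3(0.0f, 0.0f, 1.0f);   // missed everything: blue
    }

    [shader("closesthit")]
    void SimpleClosestHit(inout SimplePayload payload,
                          BuiltInTriangleIntersectionAttributes attribs)
    {
        payload.color = float3(1.0f, 0.0f, 0.0f);   // hit something: red
    }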
Your own custom attributes might look something like this, and this is just me inventing things for illustration: your sphere attributes might have a theta and phi, and you might have a float3 if you're doing a volumetric intersection. I'm not claiming these are right; they're just explanatory. Be aware that these attribute structures are limited in size: currently, I believe, the spec limits them to 32 bytes, so you've got eight floats to store intersection attributes, and you should keep them as small as possible as well.

Now a simple example. If you haven't written DX code and HLSL (and even if you have), what data do you need on the GPU to shoot rays? Well, you need some place to write your output, so let's define an output texture, a read-write texture, a UAV; in this case it's called gOutTex, and it's a float4 texture. Where are we looking from? In ray tracing we have to put our camera someplace, so we need some information about where our camera is, and we need to pass that down to the GPU somehow; in this case I've defined a constant buffer with a camera position and a UVW coordinate frame for the camera. And we need some information about our scene, namely our acceleration structure, so we can intersect it. That's enough to get you started with a really basic DXR shader. You also need some more complicated information about how to shade your scene, and that will be different from framework to framework, or renderer to renderer, depending on how you store your shading and material values, and even your geometric values, on the GPU. You can look at the tutorial code online to see how I do it, but in your renderer this will change.
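Declared in HLSL, those three pieces of data might look like this; the names and the exact constant buffer layout are my sketch, not necessarily the tutorial's:

    // Somewhere to write the result: a read-write texture (UAV).
    RWTexture2D<float4> gOutTex;

    // Where we are looking from: camera position plus a UVW coordinate frame.
    cbuffer CameraCB
    {
        float3 gCamPos;
        float3 gCamU;
        float3 gCamV;
        float3 gCamW;
    };

    // What we trace rays against: the scene acceleration structure.
    RaytracingAccelerationStructure gRtScene;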
Let's walk through a really, really simple DXR ray generation shader. In the upper right, in the small text, are the CPU/GPU data declarations from the last slide. Here's my ray generation shader. The first thing I do is ask "what pixel am I?" This uses the new intrinsic DispatchRaysIndex, which says where on the screen I am, and a second one, DispatchRaysDimensions, which says how many pixels in total are in this ray launch. Then I take that information and convert it into a ray: I find the pixel center for my current pixel in [0,1], convert to normalized device coordinates (somewhat similar to raster), and then, using my camera frame, turn that into a ray that goes through the center of the pixel. If you squint a little, collectively these few lines take a pixel ID and turn it into a ray direction.

Once I have that ray direction I can set up my ray. To do this I use the new DirectX built-in structure called RayDesc, which has an origin, a TMin, a direction, and a TMax; if you want to define it with curly braces, that's the order you have to use. I pass the camera position as the origin, which makes sense, and the ray direction I just computed through the pixel as the direction. TMin here is zero: I'm starting at the camera, and there's no geometry there, so I don't have to worry about self-intersection or any epsilon value. And TMax is a huge number (you could use the float max here; I was too lazy to type all those characters), so I go as far as I can into the scene to see if I intersect anything between those two values.

I need to set up a ray payload; it's going to contain the color that gets returned, so I define the simple payload I used on a prior slide, which just has a color, and initialize it to black. Then I trace my ray, and I'm going to step through this. TraceRay is a new HLSL intrinsic. The first parameter is the scene acceleration structure, that is, which acceleration structure you are tracing rays against; this needs to be passed in from your host-side code, and you can look at the tutorial code, or Shawn will probably talk about it in his section. The next parameter is a flag controlling any custom behavior the ray might need; in this case there's none, and I'm using the default ray behavior. The next parameter is a little confusing: it's an instance mask, and by default 0xFF tests all the geometry, which is probably what you want for basic things. If you have more complex needs, say you don't want to test certain classes of geometry, you can give them a mask and then easily skip that geometry with this instance mask. The next three parameters define which shaders you're going to use while traversing. The first, which I've labeled "hit group," is a collection of an intersection shader, an any hit shader, and a closest hit shader; these are defined when you initialize your code on the CPU side, and they're identified by integer IDs. The next parameter is the total number of hit groups you have defined, and the third is which miss shader you're using. Then you pass in the ray, which I defined up here, and the payload, which I defined over here, and that completes your TraceRay: you go off and trace the ray, things happen, it comes back, and you grab that payload color and write it out to the texture.

If you combine this with the really simple red/blue shaders, where you get red if you hit and blue if you miss, you now have a complete DXR HLSL ray tracing shader that gives you red where you hit and blue where you don't.
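Put together, a minimal pinhole-camera ray generation shader along the lines of that walkthrough might look like this. It assumes the resource declarations and the SimplePayload / SimpleMiss / SimpleClosestHit shaders sketched above, and the exact camera math (the NDC flip in particular) is an assumption of mine:

    [shader("raygeneration")]
    void PinholeCameraRayGen()
    {
        // Which pixel am I, and how many pixels are in this launch?
        uint2 pixel = DispatchRaysIndex().xy;
        uint2 dims  = DispatchRaysDimensions().xy;

        // Pixel center -> [0,1] -> [-1,1] normalized device coordinates.
        float2 ndc = ((float2(pixel) + 0.5f) / float2(dims)) * 2.0f - 1.0f;
        ndc.y = -ndc.y;                      // assumed convention: +y up on screen

        // Turn the pixel into a ray direction using the camera's UVW frame.
        float3 rayDir = normalize(ndc.x * gCamU + ndc.y * gCamV + gCamW);

        // RayDesc field order: Origin, TMin, Direction, TMax.
        RayDesc ray = { gCamPos, 0.0f, rayDir, 1.0e+38f };

        SimplePayload payload = { float3(0.0f, 0.0f, 0.0f) };   // start black

        TraceRay(gRtScene,
                 RAY_FLAG_NONE,   // no custom ray behavior
                 0xFF,            // instance mask: test all geometry
                 0,               // hit group #0
                 1,               // one hit group defined
                 0,               // miss shader #0
                 ray, payload);

        gOutTex[pixel] = float4(payload.color, 1.0f);
    }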
Okay, now that I've given you a simple high-level view, or really a low-level view, of what these shaders do, let's ask a little more specifically what features you have available. You have all the standard HLSL data types, texture resources, and so on; I'll refer you to the Microsoft documentation if you don't know what those are. You have the standard built-in HLSL intrinsics for math: graphics and spatial manipulation, 3D math, square root, trigonometry, vector operations like normalize and length, matrices; again, I'll refer you to the Microsoft documentation if you're not already familiar with all of this. And then we have new intrinsic functions specifically for ray tracing, and those I do want to walk through, because they're interesting.

The first set of new intrinsics relates to ray traversal. You've already seen TraceRay: this is what spawns a ray. You can call it from a ray generation shader, a closest hit shader, or a miss shader, which allows you to do recursive ray tracing. You cannot launch it from an intersection shader (you're in the middle of intersecting your existing ray, so you can't launch one there) and you can't launch it from the any hit shader. ReportHit is what you use in your intersection shader to say, "aha, I found one, go shade it in the any hit shader." IgnoreHit lets you ignore a hit you found: basically, "hey, this is transparent, I don't actually want to do final shading on it, keep going." And AcceptHitAndEndSearch says, "I found the one I want, I'm done, stop; go shade this one, don't keep traversing," which lets you speed things up for ray types where you can know that in advance.

Another class of built-in functions covers the ray launch details, and you've already seen these: DispatchRaysDimensions and DispatchRaysIndex let you query, given your current launch of rays, where you are in that grid of launches. You can use these in any shader: ray generation, intersection, any hit, closest hit, or miss. If you launched a full-screen 1920 by 1080 pass of rays, you'd get 1920 by 1080 back from the dimensions, and an index between 0 and 1920 and between 0 and 1080.

Another category of new built-in functions, with just one function in it, is hit-specific details: HitKind gives you information about your current hit. It's available in the any hit and closest hit shaders, the only places you're dealing with hits. It's actually user-specified: in your intersection shader you can specify what sort of hit kind you have, so you can reason about it when you get to your shading shaders. In the case of triangles, there are two built-in hit kinds, front facing and back facing, so if you want to shade differently based on which side of the triangle you hit, you can query that with the HitKind function.

Another class of built-in functions is for ray introspection, that is, properties of my ray. You can find out your current T value along the ray, your minimum T value (which was passed in as a parameter to TraceRay), your flags (also passed to TraceRay), and your ray origin and ray direction in world space (also passed as parameters to TraceRay). Basically, these functions query the parameters you passed to TraceRay, plus where you currently are along the ray.

And there's one more category of built-in functions for object introspection: what object am I currently interacting with? You can ask for the instance index, instance ID, and primitive index, along with transformations from object-space coordinates to world-space coordinates, and your ray origin and direction transformed into object space. These are available only in the intersection, any hit, and closest hit shaders, where you have both a ray and an object; they're not available, for example, in the miss shader.
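To show several of these intrinsics in one place, here is a sketch of what a custom sphere intersection shader might look like; the constant buffer, attribute layout, and hit kind value are all assumptions made for illustration:

    struct SphereAttributes
    {
        float2 thetaPhi;   // spherical coordinates of the hit point
    };

    cbuffer SphereCB       // assumed: one sphere described by a constant buffer
    {
        float3 gSphereCenter;
        float  gSphereRadius;
    };

    [shader("intersection")]
    void SphereIntersection()
    {
        // Work in object space, using the object introspection intrinsics.
        float3 o = ObjectRayOrigin() - gSphereCenter;
        float3 d = ObjectRayDirection();

        float a    = dot(d, d);
        float b    = dot(o, d);
        float c    = dot(o, o) - gSphereRadius * gSphereRadius;
        float disc = b * b - a * c;
        if (disc < 0.0f)
            return;                        // the ray misses the sphere entirely

        // Nearer of the two roots (a fuller version would also test the far one).
        float t = (-b - sqrt(disc)) / a;
        if (t < RayTMin() || t > RayTCurrent())
            return;                        // outside the ray's current valid interval

        float3 p = o + t * d;              // hit point relative to the sphere center
        SphereAttributes attribs;
        attribs.thetaPhi = float2(acos(clamp(p.y / gSphereRadius, -1.0f, 1.0f)),
                                  atan2(p.z, p.x));

        ReportHit(t, /*hitKind*/ 0, attribs);   // hand the hit to the any/closest hit shaders
    }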
All right, that was my high-level overview of HLSL DirectX ray tracing shaders. I thought I'd make it a bit more concrete by putting together a bunch of tutorials, which you can download from the web page shown there; some of them you saw running on this machine before the course started. I want to walk through them a little bit, step by step, to help you understand how simple it actually is to build a ray tracer once you have the host-side code encapsulated in a way where you're basically just writing HLSL.

What are the goals of these tutorials? The code I'm providing online is there to give you a simple abstraction to help you get started. I don't pretend it's well written or at all optimized; it's written for clarity, not performance. It's code you can use as a starting point, and what I hope to show is that DirectX ray tracing can actually be straightforward: if you encapsulate all the host details, you can get up and running with DXR in ten or fifteen minutes. Hopefully this lets some of you, students and others, do quick experiments with hybrid ray-raster techniques. In addition to encapsulating DirectX ray tracing, I've also encapsulated DirectX raster and let you communicate between them in a very simple, straightforward way, so hopefully this also lets you get started with raster if, for instance, you're not familiar with DX12. And I've abstracted the GPU-CPU communication in a way that makes a lot of sense to me, to let me code at the level I think at, which is perhaps more abstract than some people prefer.

The tutorial code is structured in two pieces. The first piece, which I'm not going to talk much about beyond a quick overview here, is a high-level C++ abstraction using a render-graph-style programming model. Then there are the standard HLSL DirectX ray tracing shaders; they're mostly standard, with one tiny detail that's different, which lets you access our internal scene representation for the tutorials.

For the C++ abstraction (again, not the point of this tutorial, but I'm showing it so you see how easy it is to get started), this is a render graph, and this is how you initialize one of your render graph nodes: you ask for the texture resources you might need while doing this computation. In this case, for the simple red/blue example from before, I just need an output texture, so I ask for the default output texture; that's what's going on there. Here I'm creating my DXR ray launch: here's the HLSL file I'm going to launch my shaders from, and you've seen examples of it already, and these are the names of the entry points: the ray generation shader, miss shader, closest hit shader, and any hit shader I'm going to be using during this ray tracing pass. This line specifies the miss shader, specifically miss shader number zero; with my abstraction, the first one you specify is number zero, the next one is number one, and so on. This one specifies hit group number zero; again, they're numbered in the order specified in the code, so you know which number each one is. If you don't have a closest hit shader, you can skip it by not passing one in; if you don't have an any hit shader, you can skip it the same way. You compile and build the internal data structures, and then you set your scene down there. When you render, your code looks something like this: using this abstraction, you ask for your output texture, clear it, and pass your variables down to HLSL; in this case the variable I'm passing down is a texture, it's named gOutTex in my HLSL shader, and the thing on the right is the texture I'm binding to it. Then I run my DXR launch, telling it how many rays I'm launching; in this case I want to launch a screen-sized grid of rays.
So again, that's the C++ side, and let me be clear, it's not the focus of the talk. The tutorial's goal is to get you started quickly with GPU-accelerated ray tracing; it hides the ugly DX12 stuff if you don't want to deal with that, and gives you an easy-to-use abstraction layer. It's not designed to expose advanced functionality; if you want that, you should listen to Shawn's talk after I'm done. If you want to use these abstractions, feel free: the code is online, grab it and use it; I encourage that.

The focus for the rest of these tutorials is the HLSL shaders we need to write. This is the same HLSL shader I showed you before, although I've simplified and reorganized it a little. Again, the name here is the one I specified in my C++ code for the launch; my output texture is the texture I also specified in my C++ code; and these values here, the camera position and the scene acceleration structure, are specific globals provided by our tutorial wrapper. We need a ray payload, and this function here is a utility that collapses the five lines of pixel-to-ray math I had on a previous slide into one simple call: you give it a camera position and three basis vectors, plus your position on the screen, and it gives you a ray for your current pixel. These values here are the min and max distances to search along the ray. I'm using miss shader number zero, which, again, I specified in that order in my code; I'm using hit group number zero, corresponding to my closest hit shader; and I only have one hit group, so that's this value there.

The one non-standard bit is up at the top: I add this non-standard HLSL "import raytracing." This comes from our Slang preprocessor; the Slang language is actually a paper being presented this week at SIGGRAPH, if you want to find out more about it. The high-level summary is that this line gives you access to our internal scene information. Any renderer you drop this into is going to need something to pull scene information into your ray tracing, so yes, it's non-standard, but everyone is going to need something equivalent.

With that C++ code and that shader on the right, you get the image on the top, and I've rendered the scene a bit more interestingly on the bottom just so you know what you're looking at: you see blue through the window, where you don't hit anything, and red everywhere else. Most of you probably understood what I just said, but let me run through it in English. The first couple of lines in the ray generation shader take your pixel ID and convert it into a ray through the center of your pixel. This code here takes that direction, creates a ray, and traces it from the camera out into the scene. If you hit something, you execute the closest hit shader here and return red; if you hit nothing, you execute the miss shader here and return blue. And when you're done, when either the miss or the closest hit shader has executed, you come back here, after the TraceRay call, and write the result out to your buffer. That's what's going on.
All right, let's go to a more interesting example — this is one of the tutorials you can get online — and build a ray traced G-buffer. For those of you who don't know what a G-buffer is, it's commonly used in deferred rendering: the basic idea is that you save out all the visible geometry in a first pass and then shade it in some later pass using a more complex shader, so it might look something like this. For those of you who are game developers who are about to say, "wow, you're going to store a full position buffer?" — yes, for this tutorial I'm going to dump out a position, a normal, and a diffuse material color.

So what might the shader code look like? In this case I'm going to change a few things up: I'm actually going to write out my values in the closest hit shader rather than in my ray generation shader. What's going on here? Well, I need to pass in the textures I'm going to store into — remember I have three outputs: a position, a normal, and a diffuse texture — so I pass all of those in as read-write textures (UAVs in DirectX), and I output those values there. This here is a tutorial-provided utility for accessing our scene information; it gets information about the current hit in your closest hit shader and returns it in a shading data structure. If you think about what you need in order to shade, you need to figure out your material, you need to know what triangle you're on, and you need to know where you hit that triangle, so you pass in the primitive index and the intersection attributes, which are barycentrics in DXR.

The ray generation shader looks almost identical to the one in the last tutorial, except there's no output at the end: once you return from TraceRay you just end, because again I wrote the output in the closest hit shader. In this case I'm not actually using the payload, but DXR requires the payload to be a structure, so I define a dummy structure and pass that in along with the ray. For my miss shader, I'm going to assume you clear your buffers before you run your G-buffer pass, so the miss shader doesn't do anything. And then again I add the non-standard import raytracing at the top, which gives you access to the scene acceleration structure and information about the scene; everything below that is standard HLSL. With that, those shaders give you the output buffers you see here for this scene.
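As a rough sketch of the closest hit shader just described — with the caveat that the real tutorial pulls the normal and material out of its own scene-access helper, for which I've substituted placeholder values — it might look something like this:

```hlsl
// Assumed UAV bindings for the G-buffer; the real tutorial binds these from the C++ side.
RWTexture2D<float4> gWsPos  : register(u0);   // world-space position
RWTexture2D<float4> gWsNorm : register(u1);   // world-space normal
RWTexture2D<float4> gMatDif : register(u2);   // diffuse material color

struct GBufferPayload { uint dummy; };         // DXR requires a payload struct even if unused

[shader("closesthit")]
void GBufferClosestHit(inout GBufferPayload payload,
                       BuiltInTriangleIntersectionAttributes attribs)
{
    uint2 pixel = DispatchRaysIndex().xy;

    // The hit position can be reconstructed from DXR intrinsics alone.
    float3 hitPosW = WorldRayOrigin() + RayTCurrent() * WorldRayDirection();

    // Normal and diffuse color come from the scene's vertex and material buffers, which the
    // tutorial wraps in a helper; these constants are placeholders for that lookup.
    float3 hitNormW   = float3(0.0f, 1.0f, 0.0f);    // placeholder for the interpolated normal
    float3 hitDiffuse = float3(0.5f, 0.5f, 0.5f);    // placeholder for the sampled material color

    gWsPos[pixel]  = float4(hitPosW, 1.0f);
    gWsNorm[pixel] = float4(hitNormW, 0.0f);
    gMatDif[pixel] = float4(hitDiffuse, 1.0f);
}
```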
Now that you have a G-buffer — and you could generate the same G-buffer with raster; in fact one of the tutorials shows you how to do it with raster and one shows how to do it with ray tracing, if you want to compare and contrast — you can do something more interesting with that buffer. Maybe ambient occlusion. What is ambient occlusion, for those of you who don't know? It approximates the incident light over the hemisphere and gives you this very soft shadowing, like you see here. This version uses one ray per pixel and is really simple to implement: you shoot a random ray over the hemisphere and check whether you see any occluders within a certain specified distance, your ambient occlusion radius. If you don't hit anything you return one, meaning lit; if you do hit something you return zero, for black. There are more advanced versions, but that's the simple way.

How is this code different from what I just showed you? It runs in two passes: it generates the G-buffer and then spawns rays from those hit points to compute ambient occlusion. I just showed you the first part, so the next step is the ambient occlusion pass, and we're going to focus on that.

Let's look at the ray generation shader for this ambient occlusion. It's a little different, because I'm not spawning rays from the camera; I'm looking up their origins and the surface normal from a texture. Because I'm doing some randomization, I need a pseudo-random number generator, so I create one seeded with the launch index — the current pixel on the screen; it's fairly standard, see the code for details. After that I load the position from my G-buffer — remember, this is the texture I output in the prior pass, so it stores the position where I hit geometry at that pixel — and I load the surface normal as well. Then I set my default ambient occlusion value. Why do I need to do this? Because if you hit the background, like these black pixels here, I still need some ambient occlusion value there; I'm not going to spawn rays, I'm just going to call it one, completely lit. And then at the end I output this into my color buffer, either directly or after spawning my ambient occlusion rays.

For the ambient occlusion rays, I get a random direction — see the code for the details; it's a pretty standard random selection of a ray on the hemisphere, in this case cosine-weighted, and that's also in the code — and then I shoot a ray. This shootAORay is a wrapper I've written around TraceRay, and I'll show it to you in a moment. I shoot the ray from the position I looked up in the G-buffer, in the direction of this random ray. The min distance I pass in is 1e-4; this is the fudge factor Pete talked about in his talk, so you don't get self-intersections. The max distance is something I pass down from the C++ side, so you can change the ambient occlusion radius dynamically: if you only want nearby detail you can use a small value, and if you want whole-scene ambient occlusion you can turn it up to a very large value.

So how do we shoot the AO ray? It's actually very simple, and this code lives in the same shader file as the code on the last slide. In the shootAORay function I set up my ray description — the origin, min T, direction, and max T, taken straight from the parameters to shootAORay — then I create a payload, trace the ray, and return the value from that payload. One thing you might notice is that my payload, defined up at the top of the slide, stores either zero or one: zero if I hit something and one if I miss. I initialize it to zero, which means I'm assuming my ray hits. Then I define the flags for my TraceRay; this is a little different from the rays I've traced so far, because I'm adding some special flags. The first one, skip closest hit shader, says we don't need to shade: I'm returning either zero or one, and since I'm assuming a hit I've already set the value I need, so I don't need to execute a closest hit shader. And since AO rays are basically visibility rays, as soon as I have any hit I know I'm done — I don't need to keep searching, finding one is good enough — so I also accept the first hit and end the search. Both of these improve my performance quite a bit. In my miss shader, if I don't hit anything I set the return value to one. If I do hit something, my any hit shader does an alpha test: I run a little function, alphaTestFails, which is specific to my tutorial because it has to query the texture. If I hit a transparent part of the texture I ignore the hit; if I don't, I keep going.
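Here's a minimal sketch of that visibility-only AO ray as I understand it. The payload layout, the shootAORay name, and the resource binding are assumptions standing in for the tutorial's exact code, but the two ray flags are the real DXR flags being described. The tutorial also attaches an any hit shader that does an alpha test and calls IgnoreHit() for transparent texels; that part is omitted here.

```hlsl
// Assumed binding; the tutorial framework supplies the scene acceleration structure.
RaytracingAccelerationStructure gRtScene : register(t0);

struct AORayPayload { float aoValue; };   // 0 = occluded, 1 = visible

float shootAORay(float3 origin, float3 direction, float minT, float maxT)
{
    RayDesc ray;
    ray.Origin    = origin;
    ray.Direction = direction;
    ray.TMin      = minT;      // small epsilon (e.g. 1e-4) to avoid self-intersection
    ray.TMax      = maxT;      // the ambient occlusion radius

    AORayPayload payload;
    payload.aoValue = 0.0f;    // assume occluded; only the miss shader ever writes 1

    // Skip the closest hit shader (there is nothing to shade) and stop at the first hit
    // (any hit answers the visibility question) -- both are significant speedups.
    uint flags = RAY_FLAG_SKIP_CLOSEST_HIT_SHADER | RAY_FLAG_ACCEPT_FIRST_HIT_AND_END_SEARCH;

    TraceRay(gRtScene, flags, 0xFF,
             0,   // hit group index
             1,   // geometry multiplier (one ray type in this pass)
             0,   // miss shader index
             ray, payload);

    return payload.aoValue;
}

[shader("miss")]
void AOMiss(inout AORayPayload payload)
{
    payload.aoValue = 1.0f;    // nothing hit within the AO radius: fully lit
}
```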
That's the shader-side code for ambient occlusion. On the C++ side, just to remind you, I don't have to change very much. Our HLSL variables are these: a position texture, a normal texture, an output texture, and this ambient occlusion radius value. In the C++ infrastructure that I have — and I think this is the last time I'll talk about it — you specify those variables like this: this one grabs the position texture, this one grabs the normal texture from my G-buffer, this is my output texture, and here is how I pass a float value into a constant buffer in the shader.

That gives us ambient occlusion values that look something like the image on the left. But what if you want more rays, what if you want less noise? Let's shoot some more rays — in this case 64 rather than 1. To do that is pretty straightforward: I pass in the number of rays I want to shoot, put a for loop around my shootAORay call, accumulate the values from the multiple ambient occlusion rays, and average them by dividing by the total number of rays. So it's very straightforward to go from one AO ray to multiple; the only difference is passing in a new value from your C++ code.

Next I want to add shadows and global illumination. We've been doing some really simple stuff with the G-buffer and ambient occlusion, so let's go to something more interesting. Of course, the diffuse lighting alone looks kind of flat and boring; adding shadows gives those bits of geometry a little more presence, and adding global illumination adds quite a bit more. So let's talk about how you add those.

Again, how is this code different? It shoots two types of rays this time: a shadow ray and an indirect ray. The shadow rays just test visibility, and the indirect rays return a color. The shadow rays are identical to the AO rays from the last set of slides, except I've renamed them to say "shadow" instead of "AO"; otherwise they're the same. There is one slight change, though: because I now have two types of rays, the parameters to TraceRay need to change. Rather than passing 0, 1, 0, which is what I did for ambient occlusion, I now have multiple hit groups and miss shaders and have to pass the appropriate values. In my tutorials, shadows are hit group zero and miss shader zero, and there are now two hit groups because I have both a shadow ray and a color ray, so this becomes 0, 2, 0 rather than 0, 1, 0.

The color rays are a bit more complex, so let's talk about how. First, the payload now needs to be more interesting than a single float that's either zero or one: in this case I stash a color in there, and I also carry a random seed, which lets me pick random directions and reuse the seed along all the rays within a pixel rather than reinitializing it, which would lead to interesting artifacts. The miss shader needs to return the background color; here I wrap that in a function, getBackgroundColor, that you'll have to define yourself — it could just return blue if you want. The any hit shader is actually identical to the any hit shader for shadow rays, because all I'm using it for is to test whether I've hit a transparent part of my texture, for alpha testing.
Unlike the shadow ray, though, in this case we do have a closest hit shader. So what's going on here? Both of these lines are shading-specific: the first line of the closest hit shader queries my scene representation to return the shading data structure, and the second function actually shades it with the diffuse shade function, which I'll talk about in a moment. Shooting the ray is pretty simple — actually simpler than shootAORay, because I'm not setting any special flags. The only real thing to note is that since I now have two ray types, the parameters for my hit group, the total number of hit groups, and the miss shader are 1, 2, 1, because that's how I specified them in the C++ code; I'm now using the appropriate hit group and miss shader for my color rays.

Now let's look at this diffuse shade function — what happens here? The first thing I do is pick a random light to do direct lighting from. I'm not going to shoot a shadow ray to every light; I'm just going to shoot one to a random light. So I pick one of the lights in my scene uniformly at random, and I note the probability of sampling that light, which is one over the number of lights — you'll see why in a moment. Next, for diffuse shading I have to get information about that light. Here I'm accessing a data structure specific to our tutorial framework, called gLights, which stores the information about the lights; I pull out the intensity and position, and I compute the distance and direction to the light so I can shoot a shadow ray. Then I do some shading computations. This is diffuse, so I'm using a Lambertian BRDF: I compute N dot L — I need that anyway — and I want to make sure it's in the range zero to one, so I saturate it. Then I shoot a shadow ray, using the function I defined before, and it returns one or zero depending on whether you're lit or not. My ray color is the intensity of the light times that one or zero — it's either bright or completely shadowed — and then I return that shaded color from the function.

You might say, "that looks kind of complicated, what's going on there?" As with most shading, if you want to figure out how to do shading right, you should go back to the rendering equation, and I'll walk through this very briefly so you understand what's happening. This is the rendering equation: you take your BRDF times your incoming light times this N dot L term, and you integrate over the hemisphere. Most of the time we're not going to do analytic integration; we're going to do something like Monte Carlo sampling, which looks like this. In our case we're only shooting one shadow ray, so N is 1. We have the N dot L we just computed above in the shader; we have our incident light, which is the light intensity modulated by the visibility term; we have the BRDF, which for physically based diffuse is the albedo over pi; and then we divide by the probability of our sample, which is why I computed that at the top as one over the number of lights.
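Written out explicitly — this is just the standard math being referenced, not anything specific to the tutorial code — the estimator being described is:

```latex
L_o(x,\omega_o)
  = \int_{\Omega} f_r(x,\omega_i,\omega_o)\, L_i(x,\omega_i)\,(n\cdot\omega_i)\, d\omega_i
  \;\approx\; \frac{1}{N} \sum_{k=1}^{N}
      \frac{f_r(x,\omega_k,\omega_o)\, L_i(x,\omega_k)\,(n\cdot\omega_k)}{p(\omega_k)}
```

With N = 1 shadow ray, a Lambertian BRDF of albedo/π, incident light equal to the light's intensity gated by the 0-or-1 shadow visibility, and p equal to 1/numLights from the uniform light pick, those are exactly the factors being multiplied (and divided) in the diffuse shade function.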
So that is my diffuse shade function. Our ray generation shader is pretty straightforward: we initialize our random seed, grab our hit position, our normal, and our diffuse color from the G-buffer textures, and compute the direct lighting at the surface using that same diffuse shade function I just defined — so I use it once here in the ray generation shader and once in the closest hit shader. Then I shoot my global illumination ray; again, this is only one bounce. What I do is get a random direction, query the incident light from that direction by shooting my color ray, accumulate it with my diffuse texture color, and output the result to my output texture.

Now, this accumulation is deceptively simple. For those of you who've written ray tracers before, you've probably found yourself multiplying in random factors of pi and 3 and e and all sorts of other things because you're not quite sure what's going on. Whenever you see something deceptively simple like this, you should go back and look at the rendering equation. The key thing you may or may not know is that for cosine-weighted samples the probability is p(ω) = (N·ω)/π, which lets the N dot L cancel; and our BRDF, for physically based rendering, is the diffuse color divided by pi, so the pi cancels too. That's how we get this deceptively simple result. Remember this when you're doing ray tracing and wondering what to do for shading: go back to the rendering equation.

So with that we get something that looks like this for a number of different scenes — but that's only one bounce. What if you want multiple bounces? No bounces looks like this, direct lighting only; adding one bounce improves things quite a bit; adding two bounces gives you even more. So maybe you want multiple bounces in your renderer. These extra bounces can have a big impact, but they don't require a huge code change with ray tracing. Remember our closest hit shader, which looks like this: you get the shading information from the scene and then call the diffuse shade function. We don't have to do much to add multiple bounces; we just add a global illumination step right here, which looks exactly like it did in the ray generation shader I just showed you, except for this if statement. The reason for that is that we don't want to bounce infinitely, or you'll get stuck in an infinite loop and your code will hang and die — you want to limit the number of ray bounces to something reasonable, and in fact I pass this in as a parameter so you can control it and add more bounces as you go. This also requires your payload to carry a ray depth, which is added there, so you compare your ray depth against your maximum ray depth and don't shoot a GI ray if you've gone too deep. With that, you get something that looks like this.

What if you want a different material model? That's pretty straightforward: remember this diffuse shade function — you can define a new function, perhaps one that does a GGX model. I'm not going to walk through that; it's in the tutorials online.
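For reference, the bounce-limited indirect step described a moment ago might look roughly like the sketch below. The payload layout, the helper names (shootColorRay, getCosHemisphereSample), and the gMaxRayDepth constant are my assumptions about how the surrounding tutorial code is organized, not its literal contents.

```hlsl
cbuffer RayGenCB : register(b0)
{
    uint gMaxRayDepth;   // assumed: maximum number of bounces, passed down from the C++ side
};

struct ColorRayPayload
{
    float3 color;        // radiance returned along the ray
    uint   randSeed;     // reused along the whole path within a pixel
    uint   rayDepth;     // how many bounces deep this ray already is
};

// Called from the closest hit shader after direct lighting has been accumulated.
void addIndirectBounce(inout ColorRayPayload payload,
                       float3 hitPosW, float3 hitNormW, float3 albedo)
{
    if (payload.rayDepth >= gMaxRayDepth)
        return;   // stop recursing once we have gone deep enough

    // getCosHemisphereSample() stands in for the tutorial's cosine-weighted sampler,
    // and shootColorRay() for its wrapper around TraceRay for the color ray type.
    float3 bounceDir   = getCosHemisphereSample(payload.randSeed, hitNormW);
    float3 bounceColor = shootColorRay(hitPosW, bounceDir,
                                       payload.randSeed, payload.rayDepth + 1);

    // Cosine-weighted sampling with a Lambertian BRDF: the cosine and the pi cancel,
    // so the indirect contribution is simply albedo * incoming color.
    payload.color += albedo * bounceColor;
}
```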
You also have to change your global illumination rays to use the GGX model, or whatever model you'd like — so you need to change your shading, and that's it. With that, you might get something that looks like this; and again, all of these images are rendered using the tutorial code that you can download, and most of them use free assets you can play with as well.

One last example, and it's the one I was showing as people were filing in: sphere rendering. All the examples I've used so far have been standard raster scenes we've been using on our research team for quite a while, meaning triangles only. But one advantage of ray tracing is that you can do arbitrary shapes, like spheres, so just for the purposes of the tutorial I want to walk through one of those. You can figure out other shapes in a fairly straightforward way from this.

What do we need to do for sphere rendering? We need to define how rays intersect a sphere. I'm not going to walk through the math — there's plenty of documentation online; you can google "ray-sphere intersection" and find good resources — but what you need to do is write an intersection shader; in this case I'm calling it sphereIntersect. To do this you need information about your sphere, so the first thing I do in sphereIntersect is ask which sphere I hit and pull its data from a buffer. In this case my gSphereData is an array of the spheres in my scene, storing xyz as the center of the sphere and the w component as the sphere radius, so I pull those into a couple of variables. Then I do the math for the ray-sphere intersection, using the intrinsics WorldRayDirection and WorldRayOrigin as part of the computation. It's a quadratic equation: you get a, b, and c, you plug them into the quadratic formula, and you may not hit — you may get an imaginary result, in which case you just return, you didn't hit anything. But if you have a real solution you call ReportHit: "aha, I found a hit." In the case of a sphere you actually have two hits, one on the front and one on the back, so you define your sphere attributes and call ReportHit with the distance to your hit, a user-definable hit kind — in this case I'm not doing anything interesting, I just pass zero — and the sphere attributes. In this case my sphere attributes are really dumb: I just pass through the center of the sphere. I don't claim that's a good idea — in fact it's probably a bad idea — but that's what I'm passing through here.

Then for shading it's also pretty simple: I get the information about the material of this sphere, I compute the normal — by taking the hit point on the sphere minus the center, which is why I passed the center through the attributes, and normalizing — and then I shade with one of a couple of functions: a diffuse material function, a metal material function, or a glass material function. I'm not going to walk through those; they're in the tutorial code. And that gets you this, which is Pete's Ray Tracing in One Weekend final image with a few added bonuses.
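Going back to the intersection shader for a moment, a sketch of it — assuming a gSphereData buffer of float4(center, radius) indexed by PrimitiveIndex(), and with the attribute struct mirroring the talk's pass-the-center-through approach — might look like this:

```hlsl
struct SphereAttribs { float3 center; };   // passed through to the hit shaders, as in the talk

// Assumed layout: one float4 per procedural AABB, xyz = center, w = radius.
StructuredBuffer<float4> gSphereData : register(t1);

[shader("intersection")]
void SphereIntersect()
{
    float4 sphere = gSphereData[PrimitiveIndex()];
    float3 center = sphere.xyz;
    float  radius = sphere.w;

    // Standard quadratic ray/sphere test against the ray currently being traced.
    float3 oc = WorldRayOrigin() - center;
    float3 d  = WorldRayDirection();
    float a = dot(d, d);
    float b = 2.0f * dot(oc, d);
    float c = dot(oc, oc) - radius * radius;
    float discriminant = b * b - 4.0f * a * c;
    if (discriminant < 0.0f)
        return;                                    // imaginary roots: the ray misses

    SphereAttribs attribs;
    attribs.center = center;

    // Report both candidate hits (front and back); DXR rejects any t outside the ray's
    // current [TMin, TCurrent] interval for us. Hit kind 0 is just "nothing interesting".
    float sqrtDisc = sqrt(discriminant);
    ReportHit((-b - sqrtDisc) / (2.0f * a), 0, attribs);
    ReportHit((-b + sqrtDisc) / (2.0f * a), 0, attribs);
}
```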
All right, to summarize: I've presented an overview of the DirectX ray tracing shaders. There are five new shader stages you need to pay attention to — the ray generation shader, the intersection shader, the miss shader, the closest hit shader, and the any hit shader — and I tried to walk through the new built-in HLSL functions for ray tracing at a fairly fast pace. We walked through a number of samples demonstrating their use; for the full code, additional tutorials, and runnable executables you can go to the course web page and download all of this for yourself. What I hope you take away from this morning is that the shader side of DirectX ray tracing looks very similar to coding a serial ray tracer on the CPU, and it's actually pretty straightforward to get started if you have some encapsulation of the host-side code. I hope you go out and try it: grab my code, play with it — it hopefully hides the ugly stuff from you so you can just play with experimental prototypes, for those of you who are researchers and may know nothing about DirectX.

I think with that I have time for a few questions, and then we're due for a break, after which we'll come back at 10:40 and Sean will talk. I'm happy to take questions during the break too; otherwise get up and stretch.

Question: just to make sure, as far as your G-buffer is concerned, the positions and normals are in world space as opposed to camera space? Yes, they're in world space. On the ambient occlusion — did you incorporate that occlusion in your final sample? No, I don't. So the distance you specified for the hemispherical integration was arbitrary, just a fixed constant? Yep. And finally, when you showed those multiple bounces, were you attenuating the light based on the distance squared? No. Is there a reason why not, since obviously the dark scenes should look darker? The math of the rendering equation handles that for you. Okay, thank you.

Question: your ray generation shader seems deceptively similar to the standard compute shaders we have today. Is there any reason the choice wasn't made to extend the existing compute architecture to support ray generation? And to follow up, it seems like ray generation shaders have the capability to dispatch other shader stages — is that something we might see in the future in the existing rasterization or compute shader stages? Well, I can't really speak to the design, because I wasn't involved in designing this API; you'll have to ask that of Sean — maybe he can answer it. On the launching side, part of the reason there is a callable shader type is that people looked at this API and realized you could launch new shaders, so it was explicitly called out as a callable shader type to make that clear. I didn't talk about it here, but if you want to call shaders, you can do that. Thank you. Let's go to the middle there.

Question: does the payload structure need to be the same across all shader stages, or can it be customized per stage so you can optimize VGPR usage? It needs to be the same across a ray: throughout a ray type you need to use the same payload, but different ray types can use different payload types — in the tutorial the shadow and color rays had different payload types. Okay, so it's per ray, so you can't optimize per stage; the any hit shader and the closest hit shader need the same payload.
Follow-up: I was also wondering about the geometry mask — is there any difference between using the geometry mask versus moving that into the payload and doing the test in your shader directly? Well, doing it with the instance mask presumably allows the driver to do some optimization behind your back, rather than doing it programmatically; I'm not entirely sure how that works. Thank you.

Question: do current GPU developer tools work, or should we expect updates or new tools for ray tracing? I haven't used them extensively yet, but other folks on my team have used a lot of the same developer tools, and as far as I know they're working pretty well — but I don't know much about if or when they're available. Okay, thank you.

Question, similar to the compute one: if I currently have a compute shader that I'd like to do ray tracing from, but only sometimes, would I just move it to a ray generation shader and then sometimes trace a ray and sometimes not? Yep, you can do that. Great, thank you.

Question: you seem to have a lot of if branches, and with multiple shaders and multiple hit groups it sounds like it could be hard on shader occupancy — so it's more of an architectural question, but are we still concerned about wavefront performance, shader locality, and that sort of thing? I don't want to talk too much about architectural details. You do still have to worry somewhat about coherency, and at this point I'm not sure I know enough about the performance characteristics to really answer you. Thank you. Let's go over here.

Question: you've abstracted away a lot of the details of getting material properties; pixel shaders make it really easy to do things like mip selection — what's the general approach for that here? It's an excellent question. There are no quads like there are in raster, so you don't get derivatives automatically. You can do it explicitly with ray differentials, which are pretty heavyweight, or there's an approximation that I believe Tomas Akenine-Möller is publishing in Ray Tracing Gems — I think that chapter is online; they announced it at HPG, though I don't have the link handy — which you can use instead of ray differentials. There's probably more work to be done here on how to do this quickly, in perhaps an approximate way that's faster than ray differentials, which do add a fair amount of overhead. Thank you.

Question: is there any concept of LOD in this pipeline — say I hit an object very far away and want to switch from a very detailed version to a lower-detail one? As far as I know there's no built-in concept of LOD; you're going to have to do that yourself. Could you do it with the acceleration structure, switching when a ray hits a bounding box or something? You can use multiple acceleration structures; I have not done this, so I can't give you much more detail. You should probably ask Sean or Colin, who may have more experience with it.

Question: I want to know how you provide the scene geometry to the ray tracing API — there's no part of the code shown that sends all the geometry so the API can build the BVH the rays trace against. Is that implicit, or do you set it explicitly? Well, in these tutorials it's all extremely abstracted, and I haven't done a lot of that directly, so you're probably better off asking one of the folks who has done that sort of work.

Question: in your tutorial code you call this function nextRand — is random number generation something that's built in, or something you provide?
No — the random number generation utilities are provided in my code. It's not a very good random number generator, just one I picked; you're probably better off using something other than plain random numbers. Mine were all uniformly random, and you probably want to jitter if you do use random numbers. This is just to get you started; for real work use something better. Thank you.

Question: I have a question about the noisiness — is it possible to expose denoising through DXR, and are there any plans? I can't speak to plans; I don't think I can answer that right now.

All right — for those of you at the back, let's stop talking, sit back down after the break, and get started. Next up in the course is Sean Hargreaves from Microsoft, who's going to give an overview of the host side — the API — of DirectX ray tracing.

Good morning. I'm Sean Hargreaves; I work at Microsoft on the Direct3D team. At this point you're hopefully all interested in starting to use ray tracing and convinced that there are interesting things to be had here, interesting research to be done. I'm going to talk about the next step: when you're ready to apply ray tracing outside of a sample framework — not just working in the shaders — when you want to integrate this into an existing game engine or write your own framework, what does the API look like that you need to use? The other way to put it is that this section of the talk covers the ugly stuff that we've skirted over so far — and there is some ugly stuff, I should forewarn you.

There are basically three things you need to do on the host side using the DirectX API: you build geometry into acceleration structures that can be traced against, you configure a pipeline state that defines all of the shaders that are going to be used for ray tracing, and then finally you call DispatchRays. I'm going to spend the next hour walking through each of these steps. Before I do that, though, I need to back up a little and talk about D3D12 itself. Can I get a quick show of hands — who here has written DirectX 12 code previously? Okay, great, that's about what I thought. Apologies to everyone who put their hand up: I'm going to zoom through DirectX 12 in about ten minutes and you'll be thinking I've skipped over so much essential stuff — how could I possibly simplify to this ridiculous extent? I can't teach you DirectX 12 in ten minutes, but I can hopefully give enough concepts for those of you who didn't put your hands up that you can then follow what applies to ray tracing.

The reason I need to do this is that ray tracing is part of Direct3D; it's not a separate API off to the side. We built it that way for the specific reason that we think mixing rasterization and ray tracing is very interesting and important. We expect there are going to be a lot of valuable hybrid techniques — perhaps you rasterize to build up a G-buffer in a deferred shader, but then ray trace to do reflections or shadows — so having all of this inside a single model, where there's one way of scheduling work, one way of describing resources and managing memory, is really important for letting you mix and match. But it does mean that step one in getting to ray tracing is getting to Direct3D itself first.

So, the really high-level view of Direct3D 12: it is a low-level API.
That means that as an application developer using it, you are in control of memory, of the lifespans of objects, and of synchronization between the CPU and the GPU — things you generally didn't have to worry about with earlier graphics APIs like Direct3D 11 or OpenGL.

The general pattern of any programming with D3D12: you start by creating some resources in GPU memory — texture maps, vertices, constant data — and you put data into those resources. Then you record a set of instructions that the GPU is going to execute into an entity called a command list. The sequence of API calls for that starts by resetting a command list; you configure some state on it (I've listed the set-root-signature call here, but there's lots more state to configure); then you issue calls instructing the GPU to do work. Historically that would typically be DrawInstanced to draw triangles, or Dispatch to launch compute work; with ray tracing we add DispatchRays to that set. You can record any number of different operations into a command list — change some state, draw some things, change some more state, dispatch some rays, change data again, dispatch some compute work — but so far none of this has actually executed; you're just recording instructions that the GPU will run later, and it's all still running on the CPU. When you're done building a command list, you submit it to a command queue by calling ExecuteCommandLists, and that actually kicks the GPU off to start processing those instructions. Finally, you have to synchronize on the CPU side to know when that GPU work has finished. This is really important, because these processors are parallel — they run completely asynchronously — so you can't change any data you've told the GPU to use until it's finished using it; you'll have all kinds of race conditions if you don't get synchronization right, and as a developer it's your responsibility to do it correctly. That's done through the Signal API and a fence object.

Let me walk through an example of how this parallel model works with something simple but concrete that every application will do: creating a resource and putting some data into it. You have two timelines here, with time progressing down the display: CPU on the left, GPU on the right. We start by creating a buffer in GPU memory that we want to put some data into. Then we create another buffer, called an upload heap. The reason for this is that the CPU can't write directly into GPU memory, but it can write straight into the upload heap, because that's in CPU-accessible memory, so we can scribble some bits into it — we could generate them, or load them from the file system. Then we record a CopyResource call, which tells the GPU to copy from that upload heap into the real GPU-memory buffer we want to use. This is still just on the CPU timeline: we've recorded that call but haven't run it yet. Now we call ExecuteCommandLists — this is what kicks the GPU off — so it communicates with the other processor, and the GPU starts actually processing that CopyResource, which might take a while if the resource is big. Meanwhile, on the CPU, we'd like to know when that has finished, so we call Signal on our command queue, which issues an instruction to let us know when the GPU reaches that point in its execution. Then we go to a fence object and call SetEventOnCompletion, which gives us back a Win32 event handle, and we call WaitForSingleObject, which blocks on that event waiting for it to be signaled. Now the CPU is just waiting, because the GPU is still processing the CopyResource and the event hasn't been signaled. When the GPU finishes its copy, it reaches the Signal call and signals that event handle back to the CPU to indicate it's done. That unblocks the WaitForSingleObject call, CPU execution resumes, and now we can destroy the upload heap, because we know the GPU is finished with it — it's safe to delete that object — and our data is in the actual buffer we wanted in GPU memory.
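In code, that whole create / upload / copy / fence pattern might look roughly like the sketch below. This is a generic D3D12 sketch with error handling omitted, not anything lifted from the course's framework; the d3dx12.h helpers are the ones that ship with the DirectX samples.

```cpp
#include <windows.h>
#include <d3d12.h>
#include <d3dx12.h>   // CD3DX12_* helper structs from the D3D12 samples
#include <cstring>

// Creates a default-heap buffer, fills it via an upload heap, and blocks until the GPU copy
// is done so the upload heap can be released. Assumes cmdList is open and in its initial state.
void UploadBuffer(ID3D12Device* device, ID3D12GraphicsCommandList* cmdList,
                  ID3D12CommandQueue* cmdQueue, const void* data, UINT64 size,
                  ID3D12Resource** outDefaultBuffer)
{
    // 1. The real buffer, in GPU-local (default heap) memory.
    CD3DX12_HEAP_PROPERTIES defaultHeap(D3D12_HEAP_TYPE_DEFAULT);
    CD3DX12_RESOURCE_DESC   bufDesc = CD3DX12_RESOURCE_DESC::Buffer(size);
    device->CreateCommittedResource(&defaultHeap, D3D12_HEAP_FLAG_NONE, &bufDesc,
        D3D12_RESOURCE_STATE_COPY_DEST, nullptr, IID_PPV_ARGS(outDefaultBuffer));

    // 2. The upload heap the CPU can write into directly.
    ID3D12Resource* upload = nullptr;
    CD3DX12_HEAP_PROPERTIES uploadHeap(D3D12_HEAP_TYPE_UPLOAD);
    device->CreateCommittedResource(&uploadHeap, D3D12_HEAP_FLAG_NONE, &bufDesc,
        D3D12_RESOURCE_STATE_GENERIC_READ, nullptr, IID_PPV_ARGS(&upload));

    void* mapped = nullptr;
    upload->Map(0, nullptr, &mapped);
    std::memcpy(mapped, data, size);
    upload->Unmap(0, nullptr);

    // 3. Record the GPU-side copy and submit it.
    cmdList->CopyResource(*outDefaultBuffer, upload);
    cmdList->Close();
    ID3D12CommandList* lists[] = { cmdList };
    cmdQueue->ExecuteCommandLists(1, lists);

    // 4. Fence: block the CPU until the GPU is finished, then the upload heap can be freed.
    ID3D12Fence* fence = nullptr;
    device->CreateFence(0, D3D12_FENCE_FLAG_NONE, IID_PPV_ARGS(&fence));
    cmdQueue->Signal(fence, 1);
    HANDLE evt = CreateEvent(nullptr, FALSE, FALSE, nullptr);
    fence->SetEventOnCompletion(1, evt);
    WaitForSingleObject(evt, INFINITE);
    CloseHandle(evt);
    fence->Release();
    upload->Release();   // safe now: the GPU has finished reading from it
}
```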
That walkthrough is a greatly simplified example compared to what you'd see in a large, complex game engine, where there are probably many different fences and signals flying around synchronizing all kinds of things, but it's the model you need to have in your head when you think about D3D12: there are two timelines, you have to be very aware that you're submitting work from one to the other, and then you have to track when that work is finished.

The other big piece of D3D12 to cover is how we bind resources to the pipeline. The sample framework you saw earlier handles this all for you, but there are some interesting options for how you make textures and constants available for a shader to process, and there are basically multiple levels of indirection. We start with a thing called a descriptor — a term you'll come across a lot in Direct3D programming. Conceptually this is a pointer to a resource on the GPU. It's a bit more than a pointer in some cases — a descriptor to a texture also contains information about the texture's size and format — so it's a richer object than a raw pointer, but conceptually it's a pointer. Then we have a thing called a descriptor table, which is just an array of descriptors that you can index into: you could have a set of different textures used for animation, or for different parts of an object, and use an index to pick which one out of that table is used. The third entity is a descriptor heap, which is an area of GPU memory, and within that heap there can be multiple descriptor tables at different offsets. It's entirely up to you how you manage this: some engines create one giant descriptor heap, use the whole thing as one huge table, put all the descriptors into it, and just index to select them; others put many smaller descriptor tables at different offsets within the heap and update these, pointing different shaders at different pieces of the heap.

The final object involved in resource binding is a thing called a root signature, and this is basically a convention: when you're writing a shader, you say it takes, for example, a texture as an input and a constant buffer, and you give those names; in the HLSL code you refer to those objects by name. The root signature maps how those names from HLSL get back to the CPU side and where the GPU actually goes to find that data. There are three types of things you can put in a root signature. You can inline some constants directly into it — any 32-bit value, an integer or a float, can go straight into the root signature; there's relatively limited space for these, so you don't want to put too many values there, but they're efficient to update if they do go there. You can also put a small number of descriptors directly into the root signature — something like a constant buffer that you change very frequently from one operation to the next, where you might want to put the pointer to it right there. More commonly, though, the root signature will contain pointers to descriptor tables, and the binding model there is that the root signature indicates how to find the table, and then within the shader you index into that table to find the actual descriptor you're going to fetch data from.
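As a concrete illustration of those three kinds of root parameters — this is a generic sketch using the CD3DX12 helper structs from the DirectX samples, and the shader register assignments are placeholders, not anything from the course framework:

```cpp
#include <d3d12.h>
#include <d3dx12.h>

void BuildExampleRootSignature(ID3D12Device* device, ID3D12RootSignature** outRootSig)
{
    // A descriptor range: two SRVs (t0, t1) that a descriptor table will point at.
    CD3DX12_DESCRIPTOR_RANGE srvRange;
    srvRange.Init(D3D12_DESCRIPTOR_RANGE_TYPE_SRV, 2, 0);

    CD3DX12_ROOT_PARAMETER params[3];
    params[0].InitAsConstants(4, 0);               // four 32-bit constants inlined at b0
    params[1].InitAsConstantBufferView(1);         // a root descriptor: pointer to a CBV at b1
    params[2].InitAsDescriptorTable(1, &srvRange); // a pointer to a descriptor table

    CD3DX12_ROOT_SIGNATURE_DESC desc(3, params);

    ID3DBlob* blob = nullptr;
    ID3DBlob* error = nullptr;
    D3D12SerializeRootSignature(&desc, D3D_ROOT_SIGNATURE_VERSION_1, &blob, &error);
    device->CreateRootSignature(0, blob->GetBufferPointer(), blob->GetBufferSize(),
                                IID_PPV_ARGS(outRootSig));
    blob->Release();
}
```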
So that was a whistle-stop tour of D3D12; hopefully you've all mastered it, or at least know the right search terms to look up in the documentation later. When we add ray tracing, a couple of new things get added to this picture. The first new complexity comes from the format of the geometry: ray tracing traces against acceleration structures, not just vertex buffers. These need to be highly optimized, and there isn't a single canonical format for them, because the traversal will often happen in hardware or will need to be tuned to fit the hardware. Unlike vertex data — where we agreed many years ago what an index buffer is, all GPUs agree it's a set of integers, a vertex buffer is just a set of values, and you can put data into memory in the format the GPU will rasterize from directly — for ray tracing that's not the case; the layout needs to be different for every implementation.

The second new thing is that rays can go anywhere. The more levels of indirection you use the more extreme this gets, but even in a very simple case any ray could hit any object, and that means all the geometry and all the shaders need to be accessible at the same time. We no longer live in the rasterization world where you serially set this shader, draw some triangles for just this one object, change everything, set different shaders, and draw another object; all of the geometry and all of the shaders have to be accessible the very moment you call DispatchRays. And the third thing, which follows from that, is that different shaders might want different resource bindings. It's very unlikely that every shader you use in a complex scene has the same set of textures and the same constants, and it would be excessively limiting as a programming model if you had to come up with a single unified set of all the parameters that might be needed by any shader anywhere, and then go update all of your shaders every time you added a new one with an extra parameter. So all of this basically means we need more indirection: we need the ability to select shaders at very fine granularity within ray tracing, and we need the ability to find different sets of parameters, with different binding conventions, depending on which shader got selected to execute.

So let's talk about what you actually do once all of these pieces are in place. Step one: build acceleration structures, which hold the geometry we trace rays against. An acceleration structure is an opaque format — an area of memory whose layout is defined by the implementation and is highly optimized for ray traversal. Typically this is going to be some kind of bounding volume hierarchy, but the design of Direct3D ray tracing doesn't require that; an implementation could innovate and come up with different types of acceleration structure. These are built at runtime, on the GPU.
They have to be, because depending on what GPU you're running on, it might want a different format. And they're more or less immutable — I'll talk later about some options for animating them, but conceptually you should think of them as immutable. The model is that you come in with a set of triangles or other geometry, hand it to the driver, it does some work and builds a BVH, and then you trace rays many times against that BVH; you're not typically going back and changing its contents. This is quite different from rasterization, where you can change geometry very late in the pipeline — in the raster world you're throwing triangles at the GPU one at a time, and the vertex shader can come in and move a triangle somewhere else, which is how you do animation. Ray tracing is a really different world: you can't move things late when you need an optimized structure that depends on the position of everything, so if you want to change the geometry, that means building a new, differently optimized BVH.

There are two types of acceleration structure: top level and bottom level. A bottom-level acceleration structure is the entity that holds most of the geometry. Typically these contain triangles, but they can also contain procedural primitives, which are defined by an axis-aligned bounding box plus a custom intersection shader — that's how the sphere demo you saw earlier works, and it's also where you would do things like curved-surface intersection or any kind of creative, non-triangular geometry. Bottom-level structures can take some time to build, because they contain a lot of triangles and building the BVH involves real work. Then there's the top-level acceleration structure, which is basically just a set of pointers to some number of bottom-level acceleration structures, plus a transform for each. The way you do instancing is to have one top-level structure with multiple references to the same bottom-level structure, each with a different transform; you can also simply group a set of objects into one top-level structure. Top-level structures are much faster to build than bottom-level structures because they don't contain large amounts of geometry — they just aggregate bottom-level structures that have already been built — so rebuilding them is significantly faster.

So there's a balance to figure out when deciding how to structure the geometry for a scene. If you want to optimize for raw ray intersection performance, you should prefer fewer, larger bottom-level structures, which let the implementation create very efficient BVHs. If you want more flexibility, lean towards more, smaller bottom-level structures and more references from the top-level structure, because then you can move things around just by changing the top-level structure.

When you come to build these acceleration structures, we need to talk about memory management. The complexity is that because the format is implementation-defined, you don't know how big one of these things will be until after you build it. The build does a lot of optimization, and depending on what kind of tree it settles on it might need more space or less, or might be able to merge some things. That's obviously a challenge: how do we allocate memory for something when we don't know how big it's going to be?
The way we do this: there's an API called GetRaytracingAccelerationStructurePrebuildInfo, which gives you the cautious estimate. It runs on the CPU and it's conservative — it returns the maximum size that an acceleration structure with the number of triangles or objects you specify could ever be, the worst case — and that's how much room you need to allocate to build it. Once you've allocated that much space, you call BuildRaytracingAccelerationStructure, and that actually does the work of the BVH optimization. It runs on the GPU, and when it's done it can return the actual size — how big the thing really turned out to be. That value lives in GPU memory and isn't available until after the build command has finished executing. There are actually two sizes that come out of this: the current size, which is how big the structure really is today, and a compacted size. Compaction is an operation that can shrink a bounding volume hierarchy after it's been built, so after the build you can get an estimate of how much smaller it would get if you compacted it.

When you build an acceleration structure, there are a few flags you can specify to control how the build operates. The first couple are performance hints: you can indicate that you'd prefer it to optimize for fast tracing at the cost of spending longer building, or that you'd prefer a fast build at the cost of perhaps slower tracing. This is really a trade-off of how many times you expect to trace rays against the structure: if you're building it once at load time and reusing it across many frames, you should probably go for fast tracing; if this is an animating object that you'll trace against once, throw away, and rebuild next frame, you probably want the fast build. You can also indicate that you'd like to spend a bit of extra build time to minimize memory usage. You can specify a flag saying you'd like to allow compaction, which means you plan to use the compaction functionality later and want the build to preserve the data necessary to do that. If you plan on animating the acceleration structure, you have to say so by specifying that you want to allow updates, and then the perform-update flag is the special case — that's how you do animation: you do a second build of the same acceleration structure, pointing at the same destination location, with the perform-update flag, which means "just make incremental changes to what was already built."

If we want to be efficient about memory usage, we should think about compaction, and this can really matter: for a complex scene the geometry could easily be hundreds of megabytes, so wasting space here is a big deal. The basic idea behind compaction is to sub-allocate multiple acceleration structures out of larger buffers. When we're initially building them we use the conservative sizes; then, after the builds are finished and we have real size data, we do some kind of compaction pass to shrink the memory usage down. A caution: don't compact things that animate, because in that case, even if the current size is smaller, it may need more space again as it deforms. And be aware of stalls. This matters because the builds execute on the GPU: what you don't want to do is submit a build and then immediately block the CPU waiting for it to finish just to learn how big the thing ended up being, with the CPU sitting around idle. For efficiency, you want to find a way to get all of this running in parallel.
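Pulling those calls together, a sketch of sizing and recording a bottom-level build might look like this. The surrounding resource allocation and barriers are assumed to happen elsewhere; the structure and function names are the real DXR ones, but this is a generic illustration, not code from the course.

```cpp
#include <d3d12.h>

// Queries conservative sizes on the CPU, then records the GPU-side build of a bottom-level
// acceleration structure into an already-allocated destination and scratch buffer.
void BuildBlas(ID3D12Device5* device, ID3D12GraphicsCommandList4* cmdList,
               const D3D12_RAYTRACING_GEOMETRY_DESC* geometry, UINT geometryCount,
               ID3D12Resource* scratchBuffer, ID3D12Resource* blasBuffer)
{
    D3D12_BUILD_RAYTRACING_ACCELERATION_STRUCTURE_INPUTS inputs = {};
    inputs.Type           = D3D12_RAYTRACING_ACCELERATION_STRUCTURE_TYPE_BOTTOM_LEVEL;
    inputs.Flags          = D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAG_PREFER_FAST_TRACE
                          | D3D12_RAYTRACING_ACCELERATION_STRUCTURE_BUILD_FLAG_ALLOW_COMPACTION;
    inputs.DescsLayout    = D3D12_ELEMENTS_LAYOUT_ARRAY;
    inputs.NumDescs       = geometryCount;
    inputs.pGeometryDescs = geometry;

    // Conservative (worst-case) sizes, computed on the CPU before anything is built.
    D3D12_RAYTRACING_ACCELERATION_STRUCTURE_PREBUILD_INFO prebuild = {};
    device->GetRaytracingAccelerationStructurePrebuildInfo(&inputs, &prebuild);
    // prebuild.ResultDataMaxSizeInBytes and prebuild.ScratchDataSizeInBytes tell you how big
    // blasBuffer and scratchBuffer need to be; allocating them is assumed to happen elsewhere.

    // Record the actual build; it runs on the GPU when the command list executes.
    D3D12_BUILD_RAYTRACING_ACCELERATION_STRUCTURE_DESC build = {};
    build.Inputs                           = inputs;
    build.ScratchAccelerationStructureData = scratchBuffer->GetGPUVirtualAddress();
    build.DestAccelerationStructureData    = blasBuffer->GetGPUVirtualAddress();
    cmdList->BuildRaytracingAccelerationStructure(&build, 0, nullptr);
}
```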
The simplest allocation approach is just not to bother with compaction: call GetRaytracingAccelerationStructurePrebuildInfo, allocate the maximum size it said you need, call Build, and you're done. That's straightforward, but you waste the difference between the estimated result-data max size and the actual compacted size after the build has finished. If you're okay wasting that space, that's totally fine and reasonable.

If you want to do better, there are a couple of approaches. In-place compaction can be done fairly straightforwardly: the idea is to have one larger buffer with an offset, and for every acceleration structure we want to build, we build at the current offset within that buffer, wait for that build to finish, and — now that we know the size — add the current size to the offset and move on to the next one. This still wastes some space (the difference between the current size and the compacted size), but not as much as doing no compaction at all. The big issue here is stalls: because we have to wait for each build to finish executing to know how far to advance the offset, we're likely to end up with the CPU blocked a lot.

A better solution is to do this in two passes, with a full compaction implementation. The idea is that we loop over all of the acceleration structures we want to build and build them into a temporary buffer, advancing a temporary offset by the conservative size estimates. Then we kick off all of those builds in one go and wait for them to finish executing — and we can actually go start submitting rendering work while this is happening, and come back to do the compaction later, maybe on a subsequent frame, after all of our builds are complete. Once all the builds have finished and we have the real sizes, we do a second pass: loop over all the acceleration structures and call CopyRaytracingAccelerationStructure from the temporary buffer location into the real buffer, specifying the mode that says we want compaction, and offset into the real buffer by the compacted size, which we now know because the build already ran. This approach wastes no space at all. You can imagine making it more flexible — you could do more dynamic allocation if things are animating — but it's a good starting point for thinking about how to manage memory for geometry in ray tracing.
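A sketch of the two DXR calls that carry that two-pass scheme — requesting the compacted size at build time and later doing the compacting copy — might look like this; buffer allocation, the readback of the size, and the fencing in between are assumed to happen around these calls:

```cpp
#include <d3d12.h>

// Pass 1: when recording the build, also request the compacted-size postbuild info, which the
// GPU writes into a small buffer you can read back later.
void RecordBuildWithCompactionQuery(ID3D12GraphicsCommandList4* cmdList,
                                    const D3D12_BUILD_RAYTRACING_ACCELERATION_STRUCTURE_DESC* build,
                                    D3D12_GPU_VIRTUAL_ADDRESS compactedSizeBuffer)
{
    D3D12_RAYTRACING_ACCELERATION_STRUCTURE_POSTBUILD_INFO_DESC postbuild = {};
    postbuild.InfoType   = D3D12_RAYTRACING_ACCELERATION_STRUCTURE_POSTBUILD_INFO_COMPACTED_SIZE;
    postbuild.DestBuffer = compactedSizeBuffer;   // GPU buffer the compacted size lands in
    cmdList->BuildRaytracingAccelerationStructure(build, 1, &postbuild);
}

// Pass 2 (a later frame, once the size has been read back on the CPU and a destination of
// exactly that size has been sub-allocated): copy-compact out of the conservatively sized build.
void RecordCompactionCopy(ID3D12GraphicsCommandList4* cmdList,
                          D3D12_GPU_VIRTUAL_ADDRESS compactedDest,
                          D3D12_GPU_VIRTUAL_ADDRESS conservativeSource)
{
    cmdList->CopyRaytracingAccelerationStructure(
        compactedDest, conservativeSource,
        D3D12_RAYTRACING_ACCELERATION_STRUCTURE_COPY_MODE_COMPACT);
}
```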
We should also talk about animation. Ray tracing static scenes is all well and good, but if we want to do something interesting we probably want things to move, and the way you're used to thinking about animation in rasterization doesn't translate directly to ray tracing. For rigid-body animation, the basic idea is that we just update the top-level structures: the bottom-level structures we've already built are left alone, and we build new top-level structures with different transform matrices, or add or remove references to bottom-level structures as objects join or leave the scene. That's very straightforward. If we want to do skinning, we need to touch the bottom level, because we want to move actual vertex positions. The basic idea for skinning is to do one pass that transforms the vertices into a temporary buffer — using a compute shader to do the same kind of calculation you would do in a vertex shader if you were skinning for rasterization — and then take that new set of transformed positions and apply it as an incremental update to the existing bottom-level acceleration structure. An update is much faster than a full bounding volume rebuild because it typically doesn't do the full optimization passes: it just overwrites the triangle positions with the new positions and then refits the bounding boxes, which grow and shrink to match as things move. The challenge is that the more things have moved, the less good the resulting BVH will be, because that BVH was built for the original geometry, not the modified geometry; if we just update bounding boxes while keeping the previous tree structure, we can end up with a pretty inefficient tree. So the more things have moved, the less efficient this becomes. What you might want to do is run incremental updates for some period of time and then, once your geometry has deformed past some threshold, periodically do a full tree rebuild. A lot of the time this is fine, though. For character animation, what you should typically do is build your BVH with the character in a typical rest pose and then just do incremental updates forever. If they move their arm up, the triangles in that arm won't fit the BVH as well until they move it back down again, so the ray tracing gets a bit less efficient, but on balance that's probably good enough. The obvious caveat is that this only holds if you do your skinning in object space: if I build my BVH for a character in the rest pose and then translate them around the world and write new vertex positions where they're a mile away and facing the opposite direction, that BVH is not going to be a good fit anymore. Don't skin in world space if you want to do incremental BVH updates.

A few other options come into play as we're building acceleration structures. There's a feature called instance masking. The way it works is that every entry in the top-level acceleration structure has an 8-bit mask, which you specify, and every time you call TraceRay you also pass an 8-bit mask. During ray traversal these two masks get ANDed together, and that piece of geometry is only considered for hits if at least one bit survives. This lets you include or exclude pieces of geometry very easily without building entirely separate acceleration structures. It's a great way to do things like flagging specific objects to say "I want this to participate in my diffuse rendering but not to cast shadows": you have a bit that means "this object is a shadow caster," and when you cast your shadow rays you set that bit and off you go. It's potentially also useful for things like LOD: you could have high- and low-detail versions of objects that are selected depending on which type of ray is being cast — a fully detailed version for the primary rays, while secondary rays use lower-detail versions. There's a lot of potential for interesting uses of these flags to avoid things that would otherwise require rebuilding acceleration structures or keeping multiple acceleration structures around.
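On the host side the mask is just a field on each top-level instance. Here's a small sketch — the "shadow caster" bit convention is an example of mine, not something defined by the API:

```cpp
#include <d3d12.h>
#include <cstring>

// Fills in one entry of the top-level acceleration structure's instance array, with bit 0 of
// the 8-bit instance mask standing for "participates in shadow rays" (an example convention).
D3D12_RAYTRACING_INSTANCE_DESC MakeInstance(D3D12_GPU_VIRTUAL_ADDRESS blasAddress,
                                            bool castsShadows)
{
    D3D12_RAYTRACING_INSTANCE_DESC desc = {};
    float identity[3][4] = { {1,0,0,0}, {0,1,0,0}, {0,0,1,0} };
    std::memcpy(desc.Transform, identity, sizeof(identity));
    desc.InstanceID                          = 0;
    desc.InstanceMask                        = castsShadows ? 0xFF : 0xFE; // clear bit 0 otherwise
    desc.InstanceContributionToHitGroupIndex = 0;
    desc.Flags                               = D3D12_RAYTRACING_INSTANCE_FLAG_NONE;
    desc.AccelerationStructure               = blasAddress;
    return desc;
}
// On the shader side, a shadow ray would then pass 0x01 as TraceRay's InstanceInclusionMask,
// so instances with that bit cleared are skipped entirely during traversal.
```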
So great — now we have geometry, and we're ready to configure some pipeline states. This is the second big piece of setting up ray tracing, and the concept you need to understand here is a thing called a shader table. The reason we have shader tables for ray tracing is that in rasterization in DirectX we have just one pipeline state active at a time, and you change it by calling set-pipeline-state. That doesn't work in ray tracing, because rays could go anywhere — they could hit anything — and if we have a complex scene we want different objects in it to have different materials, which means we need to run different shaders depending on which object a ray hit. If we didn't have this feature, you'd have to write a single giant shader containing a huge switch statement on "what did I hit," with cases for every possible material. That's not a very entertaining programming model, so we chose not to inflict it upon you and instead built support for shader indexing into DirectX ray tracing. The basic idea is that we set up an array of pointers to shaders, called a shader table, and every time we need to run a shader as part of the ray tracing pass, we index into that array — based on which object was hit — pick the shader that corresponds to that object, and execute it. It's effectively an array of function pointers on the GPU.

So what does that look like in detail? I'm going to introduce some new terminology. We start with a concept called a shader identifier. A shader identifier is basically a pointer to a shader; it's actually 32 bytes, which, if you're used to pointers, you'll think is rather fat — obviously it contains more information than just a pointer — but conceptually it's a pointer. The next concept is a hit group, which is a triple of shaders: an intersection shader, if you're using one, the any hit shader, and the closest hit shader. These are all optional: typically you'll have a lot of objects that have a closest hit shader but maybe not an intersection shader or an any hit shader, and you could have all three if you need them. You gather these up and combine them into a hit group, and that hit group is itself treated as a shader — the hit group has its own shader identifier that points to it, and we can say that this hit group is used to shade this object. The reason those three shaders are grouped is that they're the shaders that depend on which object was hit; miss shaders obviously don't depend on the object that was hit — by definition, if the miss shader is running, nothing got hit — so they're not part of the hit group.

The next concept is a shader record. A shader record is an area of memory laid out with a shader identifier, indicating which shader to run, followed by a set of root arguments — exactly the same things I talked about earlier that go into a root signature when you're programming rasterization: immediate constant values, root descriptors that point to resources, and descriptor tables that refer to indexable arrays of descriptors. We gather up all the parameters this shader will need to execute and put them in memory right after the pointer to the shader; a shader record tells us what code to run and what parameters to pass to that code. And finally we have a shader table, which is just an array of shader records — however many shaders you'd like to use for this ray tracing scene, you put their records one after another in memory.
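For context, a hit group is declared on the host as a subobject of the ray tracing state object, and its exported name is what you later look up to get the 32-byte identifier. A generic sketch — the exported shader names are placeholders for whatever your HLSL library exports:

```cpp
#include <d3d12.h>

// Kept at file scope so the subobject's pDesc pointer stays valid until the state object is created.
D3D12_HIT_GROUP_DESC  gHitGroup  = {};
D3D12_STATE_SUBOBJECT gSubobject = {};

void DescribeHitGroup()
{
    gHitGroup.HitGroupExport           = L"MyHitGroup";      // name the shader table refers to
    gHitGroup.Type                     = D3D12_HIT_GROUP_TYPE_TRIANGLES;
    gHitGroup.ClosestHitShaderImport   = L"MyClosestHit";
    gHitGroup.AnyHitShaderImport       = L"MyAnyHit";        // optional; may be null
    gHitGroup.IntersectionShaderImport = nullptr;            // only for procedural geometry

    gSubobject.Type  = D3D12_STATE_SUBOBJECT_TYPE_HIT_GROUP;
    gSubobject.pDesc = &gHitGroup;
    // This subobject goes into the D3D12_STATE_OBJECT_DESC passed to CreateStateObject;
    // ID3D12StateObjectProperties::GetShaderIdentifier(L"MyHitGroup") then returns the
    // 32-byte identifier that you write at the front of a shader record.
}
```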
And then finally we have a shader table, which is just an array of shader records: however many shaders you would like to use for this ray tracing scene, you just put those records one after another in memory. There's not actually any dedicated API for creating shader tables — it's just memory with a canonical, well-defined layout. You can create this however you like: you might create it on the CPU and copy the data into a buffer, but you could also create these on the GPU, so it's entirely possible to have a compute pass that runs early in a frame, generates a shader table, puts a load of parameters wherever you need them, and then later you go ray tracing using that shader table. Let's look at an example, because I've probably confused you with that number of levels of indirection. This is an example memory layout of a very, very simple shader table. It has three shader records; each record starts with a shader identifier, which is a hit group, and then it has a set of root arguments, and you can see that the root arguments are completely different for every shader in my table. I start with a shader that I'm going to use for my terrain, so that has no intersection shader and no any hit shader, just a closest hit shader, and it takes a couple of descriptor handles: one of them contains some terrain textures, the other contains some light maps — I don't know why we're using light maps when we have ray tracing, but we haven't quite got to the full integration yet, we're not doing indirect light bounces. Then we have a root descriptor that points to a constant buffer that's going to provide some parameters for the materials, and for some reason we put the fog distance directly in the root signature layout — that's just to demonstrate that you can put immediate float values as well as descriptors that point to other resources. We also have a water shader, which uses an any hit and a closest hit shader and has a couple of different parameters, and then we have some kind of volumetric cloud intersection thing which uses a custom intersection shader and has a couple of root descriptors pointing to some parameters in a constant buffer and to a volume texture. These layouts are totally up to you — it depends on which shaders you write and what parameters they want to take in; you just put those in memory after the pointer to the shader. It's important to note that I then have to go and pad each of these rows in the table. The reason is that this is a fixed-stride table: the stride is up to you, but it has to be constant for all of the entries, so the size is determined by whichever of my shaders needs the most parameters, and you have to pad all of the other records so that they're all the same size. It also has to be 32-byte aligned. In this case my largest set of root arguments is the first shader; I added 4 bytes of padding to round it up to a multiple of 32 bytes, then I had to put 20 bytes of padding after my second record and 16 bytes of padding after the third, and that creates a nice little shader table which I can point the ray tracing pipeline at.
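A minimal sketch of how that stride and padding might be computed on the CPU. The record contents here are placeholders; the alignment constant is the real D3D12 requirement (the table start itself must additionally be 64-byte aligned when uploaded).

```cpp
// Sketch: pad every record out to one fixed, 32-byte-aligned stride.
#include <d3d12.h>
#include <algorithm>
#include <cstdint>
#include <cstring>
#include <utility>
#include <vector>

inline UINT AlignUp(UINT value, UINT alignment)
{
    return (value + alignment - 1) & ~(alignment - 1);
}

// Each entry is {shader identifier pointer, packed root arguments}.
std::vector<uint8_t> BuildShaderTable(
    const std::vector<std::pair<const void*, std::vector<uint8_t>>>& records)
{
    // Stride = largest record, rounded up to the required record alignment (32 bytes).
    UINT maxRecordSize = 0;
    for (const auto& r : records)
        maxRecordSize = std::max<UINT>(maxRecordSize,
            D3D12_SHADER_IDENTIFIER_SIZE_IN_BYTES + static_cast<UINT>(r.second.size()));
    const UINT stride = AlignUp(maxRecordSize, D3D12_RAYTRACING_SHADER_RECORD_BYTE_ALIGNMENT);

    std::vector<uint8_t> table(stride * records.size(), 0); // padding bytes stay zero
    for (size_t i = 0; i < records.size(); ++i)
    {
        uint8_t* dst = table.data() + i * stride;
        std::memcpy(dst, records[i].first, D3D12_SHADER_IDENTIFIER_SIZE_IN_BYTES);
        std::memcpy(dst + D3D12_SHADER_IDENTIFIER_SIZE_IN_BYTES,
                    records[i].second.data(), records[i].second.size());
    }
    return table; // upload this to a GPU buffer and point DispatchRays at it
}
```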
So now I've got a set of records arranged into a table — how do I control which one actually gets executed when I call DispatchRays? So far it's just some pointers to things, and there's a fairly large number of arguments and options that go into this. When I call DispatchRays I provide three tables, and for each of these tables I just provide a pointer to the start of the table and a stride value, which is determined by how many root arguments I had — how big my records are going to be. I have one table for all my miss shaders, if there are multiple, another table for my hit groups, and a third table for my callable shaders — I'm skipping over callable shaders today; they follow very much the same pattern as everything else. So right at the start of ray tracing I provide my tables. Then when I call TraceRay I get a couple more parameters: I can specify a ray contribution to the hit group index; I can provide probably the longest-named parameter that exists anywhere in DirectX, the multiplier for geometry contribution to hit group index — it's probably not even obvious from the name what it does, but my next slide will clarify — and we also have a miss shader index. So every time I trace a ray I get to specify those three values. We also have a source of data coming from the acceleration structure itself: the top-level acceleration structure provides, for every single instance within it, an instance contribution to hit group index. So how do these things work together? I'm going to start with miss shaders, because they're by far the simplest. When we miss and we want to decide which shader to run, we find the address of the miss shader by taking the start and stride that were specified when I called DispatchRays and multiplying the stride by the index that was passed to my call to TraceRay, and that's the one we use. It's pretty straightforward: when you call TraceRay you just specify the index of which miss shader you want to use, and that index is looked up in the miss shader table. There are no other contributions to it, and there don't need to be, because there's only one kind of miss, right? The reason you would specify different miss shaders is maybe the miss behavior for your shadow rays is different to the miss behavior for your primary rays or your indirect rays, so at the point when you trace, you know which shader you want to run out of the different options. Hit group indexing is a little bit more interesting, because when we hit something we're going to want to shade it, and the way we shade it is going to be different depending on what it was that we hit. The selection of which hit shader to run is a combination of the ray contribution, which was a parameter to TraceRay, plus the geometry multiplier times the geometry contribution — the geometry multiplier was a parameter to TraceRay; the geometry contribution is just an index into the bottom-level acceleration structure, a system-generated value for every mesh within that structure that goes 0, 1, 2, monotonically increasing — and then we also add in the instance contribution, which came from the instance in the top-level structure. That's kind of a complicated formula, and it's getting to the point where it almost seems like, why don't we just make that something your shader itself could do in code? The reason, which is kind of obvious, that we can't is because this is the calculation that chooses which shader to run, so we don't have any code running until we've chosen which code to run — this has to be more of a fixed-function thing. A lot of the time some of these terms will collapse to zero: the very simple use that you'll see a lot is that you have your ray contribution be zero, your instance contribution be zero, and your geometry multiplier be one, which just means use the index of the geometry to pick the shader in the table, and then you just put entries in your shader table matching the layout of the meshes in your structure, and off you go.
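Written out as plain arithmetic, the hit group record selection looks roughly like this. On the GPU this selection is fixed-function; the helper below is just to make the formula explicit, and the parameter names mirror the TraceRay arguments and instance fields described above.

```cpp
// Sketch: hit group shader record address = table start + stride * record index.
#include <cstdint>

uint64_t HitGroupRecordAddress(
    uint64_t hitGroupTableStart,                  // from the DispatchRays desc
    uint64_t hitGroupTableStride,                 // from the DispatchRays desc
    uint32_t rayContributionToHitGroupIndex,      // TraceRay parameter
    uint32_t multiplierForGeometryContribution,   // TraceRay parameter
    uint32_t geometryContributionToHitGroupIndex, // per-geometry index in the BLAS (0, 1, 2, ...)
    uint32_t instanceContributionToHitGroupIndex) // from the TLAS instance desc
{
    const uint64_t recordIndex =
        rayContributionToHitGroupIndex +
        multiplierForGeometryContribution * geometryContributionToHitGroupIndex +
        instanceContributionToHitGroupIndex;
    return hitGroupTableStart + hitGroupTableStride * recordIndex;
}
// Common "simple" configuration: rayContribution = 0, instanceContribution = 0,
// multiplier = 1, so the geometry index alone picks the record. For shadow rays
// you might instead pass multiplier = 0 and let rayContribution select one
// dedicated shadow hit group for everything.
```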
You could also, if you have different meshes that you want at different offsets, have a set of shaders that are indexed by the geometry at one offset and use that for all your primary rays; but maybe for shadow rays you just don't care and don't want to run specific things, so for shadow rays you would set the geometry multiplier to 0 — which means ignore which geometry got hit — and just have the ray contribution say "this is the index of the shadow shader I want to run". So there's quite a lot of flexibility in this equation by choosing different values for the coefficients. This is a crucially important thing to understand, and it's somewhere we expect interesting innovation as people come up with the most efficient ways of managing shader tables: you have this choice of what shaders you put at what indices within the table, and then how you configure this equation to pick which one of them gets run, and you can use that in a whole bunch of different ways. OK, so now we've set up our shaders. The piece I skipped over when I was talking about populating these shader tables is that every shader record starts with a shader identifier — that 32-byte blob. Where does it come from? We should talk about that. The general model for compiling shaders: you start with HLSL shader code, which you write — awesome algorithms, make it really cool — then you feed that into the shader compiler, dxc.exe, which gives you back a thing called a DXIL library. This is a bit different from compiling shaders for rasterization, because in that world you typically compile them all the way down and produce a compiled vertex shader or pixel shader that you can pass straight into Direct3D. For ray tracing, because we have this requirement that all the shaders are available up front — they all need to be given to the implementation at the same time, when you're initializing — there's a little bit more management involved. To do this efficiently you might want to do things like give the driver the option to compile sets of shaders on different threads; there's configurability over how this works. So offline compilation basically only produces a library, which contains a large number — whatever number is right for your application — of the different fragments of HLSL code, not yet linked together into a single thing that can run on the hardware. Then at runtime you load in that compiled DXIL library, create sub-objects, and insert associations between them, which is how you describe which pieces of code map to which parts of the ray tracing pipeline. Once you've configured everything, you call CreateStateObject to create a state object of type ray tracing pipeline, and that is the complete set of shaders you're going to use when you trace rays — you just give it that state object as the thing to use. After you've done that, you can look up shader identifiers: you can ask that state object, "give me the shader identifier for my ray generation shader, or my cloud intersection shader", and it will give you back a 32-byte blob for each of the shaders you asked for, and that's what you're going to put in your shader records to populate your shader tables.
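As a small sketch of that lookup, assuming the state object has already been created and exports shaders under the placeholder names used here:

```cpp
// Sketch: pulling the 32-byte identifiers out of a created state object so they
// can be written at the front of each shader record.
#include <d3d12.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

void QueryIdentifiers(ID3D12StateObject* rtPipeline)
{
    ComPtr<ID3D12StateObjectProperties> props;
    rtPipeline->QueryInterface(IID_PPV_ARGS(&props));

    const void* raygenId   = props->GetShaderIdentifier(L"RayGen");      // placeholder export name
    const void* hitGroupId = props->GetShaderIdentifier(L"MyHitGroup");  // hit groups use the group name
    const void* missId     = props->GetShaderIdentifier(L"Miss");        // placeholder export name
    // Each of these points at D3D12_SHADER_IDENTIFIER_SIZE_IN_BYTES (32) bytes
    // that you memcpy into the start of the corresponding shader record.
    (void)raygenId; (void)hitGroupId; (void)missId;
}
```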
I'm going to walk through this with a diagram to hopefully clarify some of what I said about setting up associations. This is a very, very simplified example. We start with a DXIL library which we compiled offline and which contains three shader functions: a ray generation shader, a closest hit shader, and a miss shader. Obviously a real application would have significantly more types of shader than this. Then I call some APIs to create a sub-object which is going to be a hit group, and in this hit group I say that this is for intersection type triangles — I'm not doing custom intersection — which means my intersection shader can be null, my any hit shader is null, but my closest hit shader is the closest hit shader that came from my DXIL library. I also have a local root signature, which I created separately using the standard Direct3D APIs, the same way I would for rasterization, and this specifies the binding convention for how parameters are going to get into my shaders. I now create an association object which links that root signature to my ray generation shader, and that says this is how these things are going to combine. I can also create some configuration objects; these are important for getting maximum efficiency out of the shader compilation. In the shader config I specify that the payload size for this thing is going to be 16 — that's the payload structure that you heard about earlier that passes information between the different steps in the ray tracing pipeline. I can also specify the max recursion depth, and in this case I'm saying 1, which means I'm promising up front that I'm not going to do any recursive rays. Configuring this is really important for the implementation to know, because it lets the hardware and driver understand how much stack space needs to be reserved for running shaders. Theoretically ray tracing could recurse an awful lot of times — hypothetically infinitely, if you just kept going; obviously you would run out of resources and fail miserably if you tried that, but you really want to stop significantly short of infinite. If the implementation doesn't know the maximum recursion, how do you allocate a stack? When you have a lot of different shader threads running in parallel on the GPU, even fairly small stacks get pretty big when you multiply them by the number of shader cores in a modern piece of silicon, so telling the implementation whether you plan to recurse is important. In this example I'm not recursing at all; in the example you saw earlier the recursion count would be maybe 2 or 3 maximum. Then I take all of these sub-objects I've created and combine them all in a call to CreateStateObject, and that's the point where this whole configuration gets handed to the implementation and the driver compiles this stuff down into what's really going to run on the GPU. Now that I've done that, I can call GetShaderIdentifier to look up my ray generation shader, my hit group, and finally my miss shader, which gives me the 32-byte blobs that I'm going to go put in my shader records. That diagram is fairly confusing and I didn't even get as far as the actual code, so if you're feeling overwhelmed at this point, here is a link to some sample code that shows you how to do it. I would recommend looking it up — a lot of times you can just start with the sample, copy and paste; it's not as complicated as it initially seems.
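For reference, here is a rough C++ sketch of that same configuration using the raw subobject structs (the CD3DX12 helper classes make this terser). The export names, the DXIL bytecode, and the root signatures are assumed to exist already.

```cpp
// Sketch: ray tracing pipeline state object with one hit group, a shader config,
// a pipeline config, and a local root signature associated with the ray gen shader.
#include <d3d12.h>
#include <vector>

ID3D12StateObject* CreateRtPipeline(ID3D12Device5* device,
                                    D3D12_SHADER_BYTECODE dxilLibrary,
                                    ID3D12RootSignature* localRootSig,
                                    ID3D12RootSignature* globalRootSig)
{
    std::vector<D3D12_STATE_SUBOBJECT> subobjects;
    subobjects.reserve(8); // no reallocation, so pointers into the vector stay valid

    // 1. The DXIL library containing RayGen / ClosestHit / Miss (exports everything by default).
    D3D12_DXIL_LIBRARY_DESC lib = {};
    lib.DXILLibrary = dxilLibrary;
    subobjects.push_back({ D3D12_STATE_SUBOBJECT_TYPE_DXIL_LIBRARY, &lib });

    // 2. A triangle hit group: no intersection or any-hit, just closest-hit.
    D3D12_HIT_GROUP_DESC hitGroup = {};
    hitGroup.HitGroupExport         = L"MyHitGroup";
    hitGroup.Type                   = D3D12_HIT_GROUP_TYPE_TRIANGLES;
    hitGroup.ClosestHitShaderImport = L"ClosestHit";
    subobjects.push_back({ D3D12_STATE_SUBOBJECT_TYPE_HIT_GROUP, &hitGroup });

    // 3. Local root signature, associated with the ray generation shader.
    D3D12_LOCAL_ROOT_SIGNATURE localSig = { localRootSig };
    subobjects.push_back({ D3D12_STATE_SUBOBJECT_TYPE_LOCAL_ROOT_SIGNATURE, &localSig });
    LPCWSTR rayGenExports[] = { L"RayGen" };
    D3D12_SUBOBJECT_TO_EXPORTS_ASSOCIATION assoc = {};
    assoc.pSubobjectToAssociate = &subobjects.back();
    assoc.NumExports            = 1;
    assoc.pExports              = rayGenExports;
    subobjects.push_back({ D3D12_STATE_SUBOBJECT_TYPE_SUBOBJECT_TO_EXPORTS_ASSOCIATION, &assoc });

    // 4. Shader config: 16-byte payload, 8-byte hit attributes (two barycentrics).
    D3D12_RAYTRACING_SHADER_CONFIG shaderConfig = { 16, 8 };
    subobjects.push_back({ D3D12_STATE_SUBOBJECT_TYPE_RAYTRACING_SHADER_CONFIG, &shaderConfig });

    // 5. Pipeline config: promise up front that we will not recurse.
    D3D12_RAYTRACING_PIPELINE_CONFIG pipelineConfig = { 1 };
    subobjects.push_back({ D3D12_STATE_SUBOBJECT_TYPE_RAYTRACING_PIPELINE_CONFIG, &pipelineConfig });

    // 6. Global root signature shared by all shaders in the pipeline.
    D3D12_GLOBAL_ROOT_SIGNATURE globalSig = { globalRootSig };
    subobjects.push_back({ D3D12_STATE_SUBOBJECT_TYPE_GLOBAL_ROOT_SIGNATURE, &globalSig });

    D3D12_STATE_OBJECT_DESC desc = {};
    desc.Type          = D3D12_STATE_OBJECT_TYPE_RAYTRACING_PIPELINE;
    desc.NumSubobjects = static_cast<UINT>(subobjects.size());
    desc.pSubobjects   = subobjects.data();

    ID3D12StateObject* pipeline = nullptr;
    device->CreateStateObject(&desc, IID_PPV_ARGS(&pipeline));
    return pipeline;
}
```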
So now we have acceleration structures with geometry in them, and we have a set of shaders which have been compiled, linked, and populated into shader records that we can index into, so we're ready to call DispatchRays. This is what kicks everything off and actually makes some ray tracing happen, and it's the shortest part of my talk, because DispatchRays is a really simple API: it takes one parameter, which is a D3D12_DISPATCH_RAYS_DESC. That's it, we're done, ray tracing is completed. We should talk about what goes into that desc structure; most of it should be fairly obvious based on what we've covered so far. We have the shader record for the ray generation shader that we're going to invoke, we have a table of miss shaders, a table of hit groups, a table of callable shaders, and then we have width, height, and depth, which are how many invocations of this ray generation shader to launch. Most often, if you're doing ray tracing for a camera or something, you would have width be the width of the screen, height be the height of the screen, and depth be one. It doesn't have to be — you can have a one-dimensional dispatch if you want: have height and depth both be one and just use width; that's totally up to you. Filling in the structures this refers to: where we say a GPU virtual address range, that's just a start address and size, and where we say a range and stride, that's a start address, size, and a stride, which is the offset from one shader record to the next. You can see from this structure that we have just one ray generation shader, because that's where everything starts, but then we have an array of multiple miss shaders and multiple hit groups. And that's it — we have succeeded in doing ray tracing.
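A minimal sketch of that call, with the table addresses and sizes assumed to come from wherever the shader tables were uploaded:

```cpp
// Sketch: filling in D3D12_DISPATCH_RAYS_DESC for a full-screen dispatch.
// The ray tracing state object is bound beforehand with cmdList->SetPipelineState1(...).
#include <d3d12.h>

void Dispatch(ID3D12GraphicsCommandList4* cmdList,
              D3D12_GPU_VIRTUAL_ADDRESS rayGenRecord, UINT64 rayGenSize,
              D3D12_GPU_VIRTUAL_ADDRESS missTable, UINT64 missSize, UINT64 missStride,
              D3D12_GPU_VIRTUAL_ADDRESS hitGroupTable, UINT64 hitSize, UINT64 hitStride,
              UINT width, UINT height)
{
    D3D12_DISPATCH_RAYS_DESC desc = {};
    desc.RayGenerationShaderRecord = { rayGenRecord, rayGenSize };        // exactly one record
    desc.MissShaderTable           = { missTable, missSize, missStride }; // array of records
    desc.HitGroupTable             = { hitGroupTable, hitSize, hitStride };
    // CallableShaderTable left zeroed: no callable shaders in this example.
    desc.Width  = width;   // one ray generation invocation per pixel
    desc.Height = height;
    desc.Depth  = 1;
    cmdList->DispatchRays(&desc);
}
```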
Before I wrap up, I want to cover a few other small but fairly important details. First off, where do you get this stuff? If you're all excited and you want to rush home and start programming ray tracing, what do you download, where do you begin? This hasn't shipped for real in Windows yet. What is available today is called the experimental preview — this is what we released at GDC earlier this spring. The experimental preview requires the Windows 10 April 2018 update; it has to be exactly that build of Windows, it won't run on any other variant of Windows 10, and you obviously need suitable hardware and drivers to support it. The preview is not quite the same thing as the final API — that's why we released a preview; we've been working with partners, responding to feedback, and made some changes in response to that feedback — but it's extremely close: moving content from the preview to the final API is typically a few hours, if not just minutes, of work. I have a link here to where you can get it, on the forums.directxtech.com site, so this is something you can download today and start using. The official release of DirectX Raytracing is coming in the next version of Windows, which is codenamed Redstone 5. That's actually already available through the Insider Preview program, so if you're taking Insider Preview flights you will have access to it, and the Insider Preview SDKs have all the final headers and libs you need to start using this stuff. The third option is the fallback layer, which I'll talk about more in a minute; that's an open source project on GitHub. And finally we have some sample code, also on GitHub, in the DirectX-Graphics-Samples repository. That's a great place to look if you want to see real code that does this stuff for real — samples ranging from a kind of hello triangle up to some interesting intersection shaders and dynamic geometry, all kinds of stuff to look at. The other aspect is checking at runtime whether you actually have support for this or not. Please don't ship applications that just assume every GPU can do ray tracing, because we're not quite there yet. This is done through the CheckFeatureSupport API: you check for feature D3D12 options 5, which gives you back a structure containing a ray tracing tier, and you check for the ray tracing tier being greater than or equal to ray tracing tier 1.0. Obviously we're reserving room for future expansion here as we take the first step onto a bold new path of ray tracing — you can imagine that there will be more tiers in the future with more capabilities.
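The capability check itself is only a few lines; a minimal sketch:

```cpp
// Sketch: the runtime capability check described above.
#include <d3d12.h>

bool SupportsRaytracing(ID3D12Device* device)
{
    D3D12_FEATURE_DATA_D3D12_OPTIONS5 options5 = {};
    if (FAILED(device->CheckFeatureSupport(D3D12_FEATURE_D3D12_OPTIONS5,
                                           &options5, sizeof(options5))))
        return false;
    // Tier 1.0 is the first ray tracing tier; higher tiers are reserved for the future.
    return options5.RaytracingTier >= D3D12_RAYTRACING_TIER_1_0;
}
```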
Let's talk about the fallback layer. If you don't have a piece of hardware that can do ray tracing yet, the fallback layer is an option to basically emulate the DirectX Raytracing API via compute shaders. It's mostly complete — it doesn't support a few corner cases like callable shaders — but it can do all of the basics: you can build acceleration structures, ray trace against them, run hit shaders, miss shaders, and custom intersection shaders. Because it's just a compute shader — admittedly a very, very complicated compute shader — it will run on existing GPUs that don't have a new driver or dedicated ray tracing hardware support. It does need a somewhat recent version of Windows 10 — the Fall 2017 update is sufficient — and it does need a GPU that supports DXIL, which is shader model 6. The fallback layer is a standalone static lib; it's open source on GitHub, so you can compile it yourself, and the output is a lib that you link into your application. It's not quite the same API as DirectX — you redirect at initialization time: instead of creating a D3D12 device you create a D3D12 fallback device, which comes from this lib — but other than that, once you're going, it's very, very close to being the same API. The nice thing is that it will redirect automatically to the real native DXR implementation if the driver has support, so if you want to bootstrap something using the fallback layer but then have it run on hardware when you get the hardware, you don't need to scatter your code with "if fallback then call fallback API else call real API" — the fallback layer will do that for you; just call the fallback layer and it will pass through. The big thing to be aware of is that it obviously won't be as fast as a true silicon implementation. It's fast enough that you can certainly start experimenting, bring things up, and learn how the shader model works — it's a great thing to start experimenting with, something you can go home today and start programming with — but don't expect the kind of performance you're seeing from the demos running on dedicated hardware. The biggest difference in the programming model when you're using the fallback layer is how GPU pointers work. Compute shaders in D3D12 don't support arbitrary memory access on the GPU from just a raw memory address, but when you call BuildRaytracingAccelerationStructure, it's full of pointers — everything is specified that way — so this was the one piece of ray tracing we were not able to emulate automatically. The way the fallback layer handles it: everywhere a GPU virtual address occurs, you instead pass an index into a descriptor heap plus a byte offset from that index, so it can do indexing in the standard DirectX way. That means that to use the fallback layer you have to go and create a view around any memory you want to use for ray tracing and pass that in. It's a small code change, and the documentation for the fallback layer gives some good examples of how to do it. The code here shows typically how you would do that: depending on whether you're on the native ray tracing driver or the fallback, you go create a descriptor handle and create an unordered access view over that memory. The final thing I want to talk about today is debugging. The PIX tool is our tool for debugging and profiling all of DirectX 12, and it fully supports DirectX Raytracing; it's available on the blogs.msdn.com/pix site. This is a screenshot of PIX's shader table viewer — it's very similar to the table I showed you earlier when I was explaining what goes into a shader table. It can show you this from a capture of any DXR application: it will pull apart what you've put in your shader tables, show you the layouts, what root signatures were bound, and what arguments were bound. I don't know if you can see it here, but there are some little yellow warning signs on a couple of the entries where things are not filled in properly. This is invaluable if you're trying to fill in shader tables and having a hard time getting the right data in the right place — it's a great way to see what really happened. PIX can also visualize acceleration structures: in this example I've selected a DispatchRays call in the event list and PIX is showing me the geometry that the dispatch was going to hit-test against, so I can make sure I've actually built things correctly and my objects are all in the places I expected them to be. So hopefully that is enough for you to have a sense of how to start using ray tracing, and I have eight minutes left to take questions. Hi, I have two quick questions. One: I'm currently using ray tracing within a rasterizer to apply reflections onto geometry that has arbitrarily changing convexity and concavity — is there any way to trace the rays based on rasterization? Yes, deferred rendering is typically the way to do that: you render into a G-buffer and then do a pass over the G-buffer that traces rays from G-buffer pixels. You can't currently trace rays from inside a pixel shader, so you need to write the pixel shader results out to somewhere and then do a second pass to trace rays. OK, and the second question: if I have geometry that is deformable and arbitrarily changing, is that doable with the BVH, or would I have to handle it similar to what you do with skinning and animation? That's a fantastic question. Deformable geometry is kind of the hardest case for ray tracing, because the whole thing that makes ray tracing work is based on taking the geometry up front and optimizing it. You basically have two options. You can do the deforming as a pre-pass, very much like skinning, and then build a new BVH — that's an up-front cost, but it gives you a structure that's very efficient to ray trace against. The other option is to use a custom intersection shader, and once you do that, it's just shader code: you can look up any data structure you like and do any math you like — this can be arbitrarily complex. We tend to think and talk about intersection shaders as being for the classic ray-primitive intersection, but they can do so much more; in the extreme case, it's entirely possible to ignore all of the built-in BVH intersection, have a custom intersection shader with a bounding box that is the entire world, and just go do arbitrarily complex ray intersection math against arbitrary data structures inside that shader.
That's kind of pointless and you would not get any of the perf wins of the optimized implementation, but you could do it. So yeah, it's really a trade-off: it'll be more expensive to trace against things if you do all of that deformation in an intersection shader, but you don't pay the up-front BVH build cost. OK, thank you. Two very quick questions: for the acceleration structure, what sort of memory footprint are we looking at, based on geometric complexity — are we looking at big O log n, I mean big O n, or n squared, or something? And the other question is the compression scheme — what sort of compression ratio do we get, or are these both hardware dependent? They're extremely hardware dependent. You're talking about memory usage, or build time? Memory usage, okay. Memory usage is pretty much linear with the amount of geometry, and typically I would expect to be looking at memory usage in the tens, up to maybe hundreds, but probably tens of megabytes. A huge amount of this we don't really know yet, and this is one of the things I'm so excited about at this stage of ray tracing: we have some content, some interesting demos — you're going to see one of them in a minute — but right now I can count the people who've built high-end content using ray tracing on GPUs on the fingers of one hand, and a lot of that is prototype content. So I'm looking to the people in this room to start using this for real and give us those answers, and my guess is that at SIGGRAPH next year we're going to have a ton of talks of the form "here's a year's worth of learnings about ray tracing and how to use this stuff effectively". We just don't know yet. As far as the compression format, that's completely implementation defined — that's why this stuff is opaque; there's interesting R&D going on at the hardware and implementation level as well as at the API and how-to-use-it level. What information about the primitives that were intersected is passed into the callable shaders and the hit shaders — do you get the mesh and instance index, or the primitive index within the instance? If you're using the standard triangle intersection, you get the index and barycentric coordinates, and if you're doing a custom intersection shader it's entirely up to you — you write whatever information you want out of the intersection. And the instance index? Both, both. OK, and then is PIX showing you what your GPU's setup generated for the BVH, or is that just a standard DirectX implementation? The visualizer that PIX shows for acceleration structures is read back from the driver, so it's kind of up to the implementation how much it shows details of the BVH. It's not actually broken down into BVH structure — it just shows the triangles — but in some cases the triangles that come back don't necessarily exactly match the ones that went in: you'll find implementations do things like folding bounding boxes together and potentially merging planar triangles; there are a lot of interesting tricks in BVH building where you don't necessarily do the most obvious thing, and some of that you'll see in PIX. Right, thank you. [Applause] While he's getting set up: the next speaker is Colin, a senior software engineer at SEED at Electronic Arts, which is a cross-disciplinary group working on cutting-edge future technologies and creative experiences, and he's going to talk about some of his experiences integrating ray tracing into some of the work they've done at SEED.
All right, so hi everyone, and thanks Chris for having me as part of this course. I only have 20 minutes, so my goal is to give you some insights on how we at SEED have handled this transition from raster and compute to real-time ray tracing with DXR. I'll try to cram in as much as I can and talk really fast, but if you have anything, you can come and see me after and ask questions, no problem. If you don't know who SEED is: we're basically a technical and creative research division inside Electronic Arts — a cross-disciplinary team with a mission to explore the future of interactive entertainment, and at the end of the day the goal is to enable people to create their own games and experiences. We have offices in Stockholm, Los Angeles, and Montreal. One of our recent projects was this experiment with hybrid real-time rendering, deep learning agents, and procedural level generation, which we presented at GDC this year, five months ago. In case you haven't seen it, here's a quick trailer. [Music] All right, so PICA PICA is a minigame that we built for our AI agents — it's not really for humans to play. We use reinforcement learning, and with that the agents basically learn how to navigate the environment you see there on the right, and they try to fix the conveyor belts by repairing the machines that break randomly. If you're interested in the AI part, we have a paper on arXiv, so make sure to check it out — it talks about how we train the agents, what they see, and how they communicate with the game logic. We built this minigame from the ground up with our in-house R&D framework, which is called Halcyon — a flexible framework that's still pretty young but capable of doing some pretty cool graphics. We had the opportunity to be involved quite early with Microsoft and NVIDIA on DirectX Raytracing, and we wanted to explore some of the possibilities of this piece of technology. From those visuals, obviously this is not the typical thing you expect from a triple-A company like EA, but we wanted to explore something a bit different and unusual: we wanted cute visuals that would be clean and stylized but still grounded in physically based rendering, and we wanted to showcase the strengths of ray tracing while also taking into account the fact that we only had two artists. So we used procedural level generation, and our algorithm basically drives the placement of assets — if you want more information, check out Anastasia's talks, where she explains how the world was procedurally generated. PICA PICA runs at 60 fps on a Titan V at 1080p, and to achieve this we built a hybrid of classic rendering techniques with rasterization and compute — you can see the combination of the various techniques there — and we sprinkled some ray tracing on top, I guess you could say; we take advantage of rasterization, compute, and ray tracing. To be able to do this we have a pretty standard deferred renderer with compute-based lighting, and we also have a pretty standard, full-fledged post-processing stack. So, for example, we render shadows with ray tracing, but we also support cascaded shadow maps; the reflections can be ray traced or screen-space ray marched, and the same thing with AO.
It's only GI, translucency, and transparency in our demo that absolutely require ray tracing. The goal of this presentation is basically to provide you with some tips and tricks on how to evolve your engine from being raster and compute only to supporting ray tracing with DXR. I'm hoping to give you some really practical insight on bringing these "easy to do", in quotes, core ray tracing techniques to production, but also to warn you about some of the challenges you might face as you go through this really awesome transition. One question you might have is: well, I'm a developer and I want to take advantage of DXR, so what should I do? There are many things you can do. First, the transition will not be automatic. Some effects and techniques are easy to add, as you're going to see — hard shadows, mirror reflections, AO — but it starts to get a bit tricky when things start to get blurry. The thing is, most ray tracing techniques from books assume offline ray tracing with perfect, converged integration, so there's still some work to do if you want to adapt those techniques for real time. This means you need to embrace filtering up front and adapt techniques to the constraints of real time. That said, as you saw from the previous talks in this course, the API is pretty intuitive, and out of the box it's really easy to get something going, so it should be easy for you to get it up and running, implement stuff, and iterate quickly. If your engine already has a good DX12 implementation, then your job should be pretty easy, but one thing to note is that you really need a good grasp on your resources, like Sean said, because you need to have those shader tables and acceleration structures ready up front for ray tracing — so some warm-up might be required, some stretching. So let's warm up a bit. The first thing you might think of doing is brushing up a bit on DirectX Raytracing: this course is pretty good, but there are also various open source frameworks out there — Falcor, for example, or the Microsoft samples — and you could also check our GDC and Digital Dragons talks. You should also check out yesterday's presentation from the Advances course on adopting lessons from offline ray tracing to real-time ray tracing; it was really good, and he talked about managing variance and noise, which is really important. So let's get practical. First, you'll need to break down passes so that code can be easily swapped in and out, and by that I mean you can reuse code between the various stages. With HLSL it's easy because the code is shareable between raster, ray tracing, and compute, so you'll want to start building shared functions that you can call from any stage — material evaluation, lighting code, and various helpers like that. One thing that also really helps is making sure your engine can toggle quickly between techniques; this is really essential for comparing, for cutting corners, for optimizing between proper ray tracing and the real-time ray tracing you're going to get in the end, and for balancing ray counts. And preparing passes to be able to swap inputs and outputs is also going to be really important.
For example, you should break your shadows out into a full-screen mask so that you can easily swap, say, cascaded shadow maps and ray traced shadows; the same idea applies to other techniques like screen-space reflections and ray traced reflections. Obviously there are a few techniques to implement first, the ones that should give you the most bang for the buck, and in difficulty order — this could change, but for us it was — shadows, then ambient occlusion, then ray traced reflections. Ray traced shadows are great because they perfectly ground objects in the scene, and they're not too complicated to implement, as you saw from the previous talks: you just launch a ray towards the light, and if the ray misses, well, you're not in shadow. While hard shadows are great, soft shadows are definitely better for conveying scale, and they're more representative of the real world. Those can be implemented by sampling random directions in a cone towards the light: the wider the cone angle, the softer the shadows get, but the more noise you also get, so you'll have to filter it. Obviously you can launch one ray or more than one ray; either way they'll require some filtering. In our case we basically reproject the results in a TAA-style fashion and accumulate shadows and variance, and then we apply a slightly modified SVGF filter, which we modified to be a bit more responsive for shadows by coupling it with a bounding box clamp similar to TAA — this is based on Marco Salvi's variance-based bounding box method, which uses a 5x5 kernel estimate of the standard deviation in the frame. The results are not perfect — maybe you can see it on the video here, or on the slides afterwards — but the artifacts aren't really that noticeable with full shading. Like I said, another technique that maps well is of course ambient occlusion, which we apply to indirect lighting. Being the integral of the visibility function over the hemisphere, we get some really nice, grounded results, because of the random directions that are used during sampling and the fact that the rays actually end up in the scene — unlike screen-space techniques, where the rays can go outside the screen or rely on hit points that are just not visible. Just like in the literature, this is done with your good old cosine hemispherical sampling around the normal: in our case we launch rays from the G-buffer normal and we use the miss shader to figure out if we hit something, just like the shadows. And just like the shadows, you can launch more than one ray per frame, but even with one ray per frame you should get some pretty nice gradients after a few frames; we apply a similar filter to the one for the shadows.
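For reference, the sampling math behind those shadow and AO rays looks roughly like the sketch below. In the actual renderer this would live in HLSL; it is shown here as plain C++ math, with u1 and u2 being uniform random numbers in [0,1).

```cpp
// Sketch: uniform cone sampling (soft shadows) and cosine hemisphere sampling (AO).
#include <cmath>

struct Vec3 { float x, y, z; };

// Build an orthonormal basis around a unit vector n (one common branchless construction).
static void BuildBasis(const Vec3& n, Vec3& t, Vec3& b)
{
    const float sign = n.z >= 0.0f ? 1.0f : -1.0f;
    const float a = -1.0f / (sign + n.z);
    t = { 1.0f + sign * n.x * n.x * a, sign * n.x * n.y * a, -sign * n.x };
    b = { n.x * n.y * a, sign + n.y * n.y * a, -n.y };
}

static Vec3 LocalToWorld(const Vec3& n, const Vec3& t, const Vec3& b,
                         float lx, float ly, float lz)
{
    return { lx * t.x + ly * b.x + lz * n.x,
             lx * t.y + ly * b.y + lz * n.y,
             lx * t.z + ly * b.z + lz * n.z };
}

// Soft shadows: uniform direction inside a cone of half-angle thetaMax around the
// (unit) direction to the light. Wider cone = softer shadow, but more noise.
Vec3 SampleCone(const Vec3& toLight, float thetaMax, float u1, float u2)
{
    const float cosTheta = 1.0f - u1 * (1.0f - std::cos(thetaMax));
    const float sinTheta = std::sqrt(1.0f - cosTheta * cosTheta);
    const float phi = 6.2831853f * u2;
    Vec3 t, b; BuildBasis(toLight, t, b);
    return LocalToWorld(toLight, t, b, sinTheta * std::cos(phi), sinTheta * std::sin(phi), cosTheta);
}

// AO: cosine-weighted direction on the hemisphere around the G-buffer normal.
Vec3 SampleCosineHemisphere(const Vec3& normal, float u1, float u2)
{
    const float r = std::sqrt(u1);
    const float phi = 6.2831853f * u2;
    const float lz = std::sqrt(1.0f - u1); // cos(theta)
    Vec3 t, b; BuildBasis(normal, t, b);
    return LocalToWorld(normal, t, b, r * std::cos(phi), r * std::sin(phi), lz);
}
```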
Just like AO, the ability to launch rays from the G-buffer allows us to trace reflections, from smooth to rough. We trace them at half resolution, so for every four pixels you get one reflection ray — about a quarter of a ray per pixel — and then at the hit point the shadow will typically be sampled with another ray, so that totals about half a ray per pixel. This works for hard reflections, but you need to support arbitrary normals and varying roughness. Our approach first combined this with screen-space reflections for performance, but in the end we just ended up ray tracing everything for simplicity and uniformity. We've talked about this extensively at GDC and Digital Dragons, so make sure to check out those talks; if you haven't seen them, I'll give you a high level — not going into too many details — of how the reflection pipeline works. First we generate rays with BRDF importance sampling, which basically means the rays follow the properties of the material. Then scene intersection can be done either via screen-space reflections or ray tracing — like I said, in the end we only ray trace, but we support both. Once the intersections have been found, we proceed with reconstructing the reflected image: our kernel reuses ray hit information across pixels, we upsample to full resolution, and we also calculate some useful information there that we use during the temporal accumulation pass. Finally, to clean up the last remaining noise, we have a last-chance cross-bilateral filter. If you look just at the reflections, this is the raw result you get from one ray every four pixels, so it's pretty noisy as you can see; this is after spatial reconstruction — it starts to look like something — then followed by temporal accumulation, and then followed by our bilateral cleanup that removes some of the remaining noise. It over-blurs a bit, but it's kind of needed for the rough reflections. If you compare this to SSR — for those of you who have implemented SSR — things are a bit trickier, because you can't really cheat and use that blurred version of the screen as pre-filtered radiance information, so we definitely get more noise with this than with SSR, and our filters need to be a bit more aggressive. To prevent over-blurring, we take the variance estimate from the spatial reconstruction pass and use it to scale down the bilateral kernel and the sample count. In the end we apply temporal AA, because temporal AA is magical, and we get a pretty clean image. Considering this comes from a quarter of a ray per pixel per frame, and it works with dynamic camera and object movement, I think it's quite awesome what you can do just by reusing spatio-temporal data — if we go back to the raw ray traced output, yeah, quite a bit of a difference. In the case of reflections, or any other technique you want to implement with hybrid rendering, it's also key to compare against ground truth: it's fine to cut corners, but at some point you still need to validate that what you're doing is right. Internally we've built a path tracer that acts as our ground truth, and it's in-engine, so we can toggle it whenever we're working on a feature — we used this constantly when working on PICA PICA. Because of interop, there's minimal to almost no maintenance required, since it's the same shared shader code — all those shared shader functions I was talking about — so having a ground truth comparison tool in your engine shouldn't add too much work. This is what it looks like if we toggle the path tracer and wait 15 seconds: the image is nice, but there's some noise.
You can see the sparkles there — some fireflies from the difficult light paths and the caustics. Compared to our hybrid approach, which runs, like I said, at 60 FPS on a Titan V: it's not quite the same as the path traced version — there are some missing caustics and some small-scale inter-reflections — but overall it looks pretty decent to me. So, continuing with practical things: make sure to take advantage of DirectX interop, where you can easily share code, data, and results between the stages. For example, you can evaluate material shaders from the ray tracing stage, which is quite convenient — there's no conversion required, as some of you might have done in the past with an external lighting tool or whatnot. And you should be taking advantage of each stage's strengths so that you can build these really awesome hybrid pipelines: for example, you can prepare rays to be launched and trace them on another GPU — this is what we do for multi-GPU support — or you can write to a UAV and then read it in another stage, like the ray tracing stage. In the end, interop basically allows you to solve these new sparse problems with ray tracing as a new tool that's available to you; it makes even more sense these days, and interop will become even more your new best friend — or your best friend again. There are some additional practical tips and tricks when working with DXR. Since you'll be launching many rays — and the importance of this was mentioned by Chris and Sean — minimizing what each ray carries can definitely influence performance. Depending on the kind of rays you're launching, you can have different kinds of payloads, some lighter than others: on the right, I think, is our standard lighting payload, with all the variables there and some getter and setter functions, and then you can see the shadow payload, which is much, much simpler — so being lean there is important. Another thing is embracing lean recursiveness: make your IHVs happy by not having infinite recursiveness; often you don't need it, so try to be nice to the hardware. You can also use shader spaces to organize your data. I haven't seen many people use shader spaces, but they're useful in case you want to keep things organized, especially if you want to update things at different frequencies. For us, we use two spaces: space zero is kind of the default space where the static data lives, and space one is more the ray tracing space where the instancing data resides. Speaking of rays and performance, handling coherency is really key for real-time ray tracing performance: you're going to get some adjacent rays that perform similar operations and memory accesses, and they're going to perform well, while you're going to get some other ones that might trash the cache and affect performance. Depending on which techniques you decide to implement, you'll have to keep this in mind — use rays sparingly and trace only when necessary; tuning the ray count to importance, adaptively, is really key.
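To illustrate the payload point from a moment ago, here is a made-up example of what a "standard" lighting payload versus a shadow payload might carry; the fields are hypothetical and not SEED's actual layout — only the relative sizes matter.

```cpp
// Sketch: a lean shadow payload versus a fatter lighting payload.
#include <cstdint>

struct LightingPayload          // illustrative "standard" payload: radiance + bookkeeping
{
    float    radiance[3];
    float    hitDistance;
    uint32_t recursionDepth;
    uint32_t rngSeed;
};                              // 24 bytes carried by every lighting ray

struct ShadowPayload            // shadow rays only need a visibility answer
{
    float visibility;
};                              // 4 bytes

static_assert(sizeof(ShadowPayload) < sizeof(LightingPayload),
              "shadow rays should carry much less state than lighting rays");
```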
Also make sure to set the max recursion on the state object — the shared state object — that's one thing that's really important. Right now in HLSL there's really no way to specify up front whether a ray you're going to launch is going to be coherent or incoherent, so providing hints to an IHV could be good — who knows, maybe we'll be able to do this in the future as the API keeps evolving; it's not there yet, but we'll see. As you saw from our work, reconstruction and denoising really allow you to get some great results, so taking advantage of the spatial and temporal domains will really make a difference for you; there's lots of great work in that field already, and there's still a lot to do. Another question you might have is: well, what about texture level of detail? In the raster world, mip mapping is the standard method to avoid texture aliasing, and it's supported by all GPUs via shading quads and derivatives. The thing is, with ray tracing, as many of you are aware, we don't have shading quads, so traditionally what people have done offline — and many of you here probably know this — is use ray differentials. But if you're going to compute ray differentials, you have to store them somewhere, in your payload, and a heavier payload — in the case of ray differentials that could mean up to 12 floats that you need to carry along and update — can affect performance, on top of the other stuff I showed that we have in our payloads. So that's the kind of thing you could potentially optimize. Another alternative is to just sample mip 0, but if you've shipped any triple-A game you know that you can't just sample mip 0 all the time — it just doesn't work; mip 0 leads to aliasing, which is why we have mip mapping in the first place, and there's an additional performance cost. So together with NVIDIA Research we developed a texture LOD technique for ray tracing. It's not perfect — we've barely scratched the surface — but it's based on the properties of the triangle, a curvature estimate, distance, and incident angle, and we get similar quality to ray differentials with a single trilinear lookup while storing only a single value in the payload, which is pretty cool. The paper is going to be in Ray Tracing Gems and there's a preprint available, so check it out and send us feedback — it's also kind of an example article, so if you wanted to submit to Ray Tracing Gems, that's the kind of format you could submit. But like I said, we've barely scratched the surface: in our case we still need to look at anisotropy and dependent texture lookups, and when you have these massive shader graphs it gets a bit more complicated; also, our technique assumes static UVs, so what if you need to recompute those coordinates on the fly? We used this for PICA PICA, but the difference was kind of minimal because of the super clean visuals — even with detail normals on all objects, with a ton of TAA and all that filtering, it was fine. So make sure to check out the paper. This is just a glimpse of some of the stuff you may have to tackle.
Obviously I think we've done some pretty cool stuff, but we're far from having solved everything and we still have a lot of work to do — there are a bunch of open problems that we need to tackle together in the years to come, which I think is really awesome. Check out my HPG keynote if you can, where I go into detail about some of these topics. Hopefully, with ray tracing now being so accessible to everybody, we're going to see some great collaborations between offline ray tracing experts and the game industry. So, to wrap things up: ray tracing is nice because it makes it possible to replace those hacks that we had, to unify things a bit more, and to phase out some of these screen-space techniques that are kind of artifact-prone. Out of the box, ray tracing is great because it enables higher quality visuals without painful artist time, but the thing is, there's no free lunch, and you'll need to put considerable effort into reconstruction and filtering, like I said. With the ray tracing hardware and the possibilities it enables, this is even more awesome — tomorrow I'm going to give some performance numbers for PICA PICA running on the new hardware compared to a Titan V, during the real-time ray tracing session in the morning, so make sure to attend if that's of interest. But don't forget that ray tracing is just another tool in the toolbox, and you should choose to use it wisely where it makes the most sense. I think it's very encouraging that we can approach the quality of path tracing with almost two rays per pixel in the context of PICA PICA, and I'm looking forward to seeing what you do with ray tracing, taking advantage of hybrid rendering with raster, compute, and ray tracing. So see you at the NVIDIA real-time ray tracing session tomorrow — like I said, I'm going to talk about PICA PICA performance on the Turing architecture, as well as some additional experiments we did such as ray traced soft transparent shadows. And before I'm done: academia always asks for content from the games people, and often we don't really give it out, so for PICA PICA we have decided — I think for the first time in the history of EA — to release all the assets, so you can download them on Sketchfab or on our website and use them for research, for free. Build your hybrid ray tracing pipeline and compare it to ours — challenge accepted. Please use them as much as you want. I would like to thank Chris for inviting me to this course, and I'd like to thank the PICA PICA team back in Stockholm and Los Angeles, and the following NVIDIA people, for all of this. I will be taking questions. Thank you. [Applause] Hello Colin — you talked about ghosting: how do you avoid ghosting from temporal accumulation for reflection and shadow rays? I remember Frostbite has a paper about how to avoid ghosting in SSR, and I wonder if you do similar things. It's the similar variance calculation that Tomasz presented at SIGGRAPH three years ago, and there are some additional tricks — the Marco Salvi color clamp bounding box really helps with that. The thing with denoising is that the more you specialize it around a specific term whose variance you want to monitor, the better the results you get.
Right now there is no single solution where you just throw a denoiser at it and everything is magically going to be perfect — I don't know if that answers your question? OK. Hello, thanks for the talk. You mentioned at some point that coherent rays are more efficient — did you investigate anything related to ray sorting, like for multiple passes to do indirect bounces? No, we didn't look at that, but that's definitely something we want to check out in the future, for sure. Thanks. Hey, since you're launching rays from the G-buffer, did you actually have any issues with self-intersection? We did, and what you need to do is, in your ray description you can define TMin and TMax, and that TMin value becomes a bit of a challenge. There are some things you can do, like using the view distance to objects — you can fiddle with it; we don't have it perfect, and it's specified per technique, but usually it's not a problem. It does happen, though, so for shadows we tweaked that number based on how close you are to the surface. Within the lifespan of a ray, where have you found the biggest bottleneck? It's hard to say — I think it really comes down to coherency. If you're going to trace primary rays, just primary rays, it's pretty efficient; it's when you start launching rays that go in all sorts of directions and hit all sorts of materials that it becomes a bit challenging. Obviously our scene was super simple, and in the future we want to explore different kinds of scenes with more content, so if you ask me that question in a few months I'll give you a better answer. Cool. Hey, have you tried ray tracing with animated meshes? Right, so the robots there: it's basically just the top-level acceleration structure that updates — we kind of cheated a bit, they're not skinned — and in the future this is definitely something we want to check out a bit more, bigger worlds with more animated stuff, so I can't really give you numbers there. Anyone who has tried? Well, there was that Epic demo with the Star Wars characters, so you might want to ask those guys how they did it — I know they were pretty high poly, though it looked a bit like rigid meshes. I mean, there's still a lot of stuff to do, and some of the open problems — did I write it there? — do include animated meshes; it's interesting. Cool, thanks. That's it — thank you very much. [Applause]