I made it FASTER // Code Review

Captions
Hey, what's up guys, my name is The Cherno, and welcome back to my Code Review series. Yes, it is that time of month yet again where you guys send me your code and I take a look at it. How I usually pick these is just based on whether I think there will be some value for you guys. A couple of episodes ago we actually took a look at a ray tracing example; I'll have that video linked up there if you want to check it out, because today we are yet again going to be taking a look at ray tracing, specifically an implementation of something called Ray Tracing in a Weekend. I mentioned this last time in that ray tracing video. For those of you who don't know, it's basically a little website-book thing that teaches you ray tracing in a weekend; that's kind of the idea. I did mention that I would probably take a stab at some point at doing my own kind of Ray Tracing in a Weekend, that is, implement it, try to improve it, and make it into an educational video series. I am still planning to do that.

But when I opened up my emails today, the very first thing I saw (this is literally the most recent email) was "Ray tracing in a weekend with ImGui, multi-threading, in C++17", and I thought you guys would enjoy taking a look at this as well. So let's take a look at this from Valerio Formato. Formato: I'm expecting his code to be well formatted. Yeah, that's pretty cool, nice joke.

"Hi Cherno, I'm a longtime fan of your channel, while cursed time zones prevent me from watching most of your Twitch streams from Italy." He's from Italy; I don't think we've had an Italian on the Code Review series just yet. "Last weekend I had some time on my hands and I decided to try to follow the Ray Tracing in a Weekend book miniseries." Yeah, I will have that linked in the description below, by the way, the Ray Tracing in a Weekend thing; I highly recommend you check it out if you are interested in graphics. "I have to say I had a blast." See, Valerio
certainly had a blast. So yeah: "I'm a 35-year-old physicist, and while I'm not a professional coder, I've been working daily in C++ for over 10 years; maybe this technically means I'm a coder nevertheless. But only recently have I become interested in learning advanced and more modern C++, so I take every chance I get to practice. I tried implementing the book following my own style wherever I could, and as a bonus I've also thrown in some multithreading just for fun. I know your feelings on projects that don't build out of the box" (these guys watch the series) "so I've taken extra care to take care of the dependencies, either via submodules or via CMake." That's good, that's good. I'm expecting a 35-year-old physicist who's been working in C++ daily for over 10 years to get this right, so fingers crossed. "It is a CMake project, but it was created by Visual Studio" (sure, I mean, I don't mind CMake projects) "so it should work out of the box. If you want to take a look at it I would be immensely honored. You can find the code at this..." and I will have this repository linked in the description below as well. "Keep up the good work. Valerio bacco Formato. The only thing I found puzzling during this build is that the linker was able to find the D3D11 libraries out of the box. I guess they're already in some default include path." Yeah, if you install Visual Studio with the DirectX SDK, which is usually included as part of some C++ for games distribution or whatever, I think it's just included in the standard Visual Studio MSVC libraries, so that would explain why that is there.

There are a few things that I like about this, by the way, and why I've selected it for this video. This is a person who is 35 years old, not 13, and I think the age of the person can have a lot to do with what the code is like. Not necessarily; I don't want to be ageist here and say that all 35-year-olds are better than 13-year-olds no matter how
much experience they've had. That's not true at all. But I think that maybe we'll see some differences in the code compared to someone who might be a little bit younger, just because life experience, I think, actually teaches you to be a better programmer, believe it or not. Working daily in C++ for over 10 years, interested in learning advanced and more modern C++, "I tried to implement the book following my own style", and as a bonus he's also thrown in some multi-threading: these are all things that I think will make the code interesting to look at. Hopefully we can take a look at a more modern C++ example by someone who hopefully knows what they're doing, and I think we're going to have a really good time with this one.

But first I want to mention that this video is sponsored by Skillshare. Skillshare has pretty much sponsored the entire Code Review series this year, so let's just start off by giving them a huge thank you. For those of you who don't know what Skillshare is, Skillshare is an amazing online learning community where millions of creative and curious people from all around the world can come together in one place to learn a new skill, or skills, because there's a lot of stuff on that platform: literally any topic you're interested in, such as illustration (how to draw better), photography, videography (how to make better YouTube videos, how to start on YouTube), marketing, productivity, all of that stuff. Skillshare literally has thousands of classes to choose from, and they have an absolutely amazingly large and high-quality library available for you to check out. And these videos are very efficient. They're not long, they're not YouTube-style "don't forget to hit subscribe" and "I need my watch time". No, they're concise, they're to the point, they're efficient, and they're going to teach you those new skills well. Personally, that's one of my favorite things about the platform: the fact that it's just so efficient at teaching. Skillshare is
offering the first 1000 people who click the link in the description below a one-month free trial of Skillshare, where you'll be able to check out their entire range of classes for free for a month. So definitely give it a go and sign up using that link; I'm sure you guys will find something there that you are interested in. It's just a great platform, really. Thanks to Skillshare for sponsoring this video.

All right, Valerio the physicist, let's see what you've got. We mentioned that he would have a good build process, and what do you know, it works. Color me impressed. Okay, so we just have this... it kind of weirdly fades out. I've clicked somewhere and now this is dead. Okay, maybe things aren't going as smoothly as I hoped. Let's hit start and see what happens. My, that is slow. Why is it so slow? It's rendering nothing. We're going to see something... oh, some spheres, maybe. Why is this so slow? And it's using 100% of my CPU cores: all 32 threads are occupied here, and it's running this slowly, rendering a few spheres. Okay, something's very wrong here. Well, he certainly wasn't lying about the multi-threading, that we can definitely see. By the way, this is running in release mode, just in case you guys think I'm a fool. Okay, we're going to come back to the performance. This might be a little performance episode, which would be in line with Optimization October, even though it's, like, December now. Let's take a look at the code, and we will definitely circle back to that performance, because I am almost 100% sure that I can make it faster.

Okay, so we initialize this app, pretty standard stuff here. Let's see what this does. He is using DirectX 11 in some way, possibly just for the ImGui rendering and the actual presentation. I assume all the ray tracing is on the CPU side, certainly with that performance and that core usage that we looked at. Let's take a look at this stuff. He's using nodiscard, which is a cool little attribute; it basically tries to prevent us from
discarding the result of this function, so that we don't call it as if it was a void function, which is fine; it throws a compiler warning. Overall I like the neatness of the code. I like the m underscores for all the private members here; looks good to me. Obviously very Win32 dependent. You've got some DirectX backend stuff, all pretty much boilerplate. I probably won't spend too much time looking at all of the DirectX 11 stuff, because this is all just boilerplate code to create the device and the swap chain and all the context kind of stuff. Update texture 2D: that is probably what is actually rendering onto the screen, which again is pretty standard. We're just mapping the memory there and then doing a memcpy from the image buffer, which is where I assume the ray tracing is writing into.

So we have a CPU-side image buffer over here, which is wrapped inside a unique pointer. Interesting. I probably wouldn't have done it this way. I'm not a huge fan of wrapping it like this, because what this actually is, if we take a look at where it was created (it looks like it happens over here, in create image buffer), is really just an allocation of raw memory. I would probably make this a vector, so that you basically have an array that is the size that you need, but on the heap. I don't know, I just don't like the idea of having a raw array wrapped in a unique pointer. You could of course just... yeah, clang-format, that's cool, using clang-format, pretty nice. You could obviously just have your image buffer as a raw buffer like this and then allocate it; I may have done it that way. In Hazel we have a class called Buffer which basically wraps exactly this and has a lot of convenience functions and all of that. It doesn't own the buffer of memory, so it won't delete it if that stack
object goes out of scope or anything, but it is a wrapper over that raw pointer to make things a little bit safer and easier. But I don't know, a unique pointer for a raw allocated array: what do you guys think about that? Leave a comment below. I probably wouldn't do it. As I mentioned, I would make this a vector of uint8_t or something like that and then resize it to the appropriate size, or maybe use a raw pointer like this.

Okay, so we've figured out where the image buffer is. The renderer: I'm assuming this is our big boy who'll do all of the work, so let's take a look at the renderer class. We have a bunch of render states: ready, running, finished, stopped. I'm assuming some of these are necessary for multi-threading, and maybe also for figuring out when to update the image, whether it's still running or not. Set samples per pixel, and 500. Okay, well, maybe that's why it's slow. Let's try something a bit more reasonable, like 32. Max ray depth 50: 50 bounces, my gosh. Let's make this, like, eight or something. Let's try it again with these settings; we'll see how much this speeds up, and then maybe we'll actually see an image.

Okay, so let's hit start now. Okay, that's faster, but when it gets to the spheres it's still quite slow. Why is this taking so long? A few notes about ray tracing: I haven't actually looked at the Ray Tracing in a Weekend code, but I don't see why it would be this slow. When it gets to all the bounces and there are multiple intersections (you can see the reflection of the spheres here) I would expect it to be slower, but this is insanely slow, and for some reason I doubt that it's the Ray Tracing in a Weekend code. So maybe you've done something weird. I'm this close to just dropping this code review and turning this video into an optimization video. Should we wait for this to complete? There's a fair few spheres around, which is nice. So the image obviously
looks quite noisy because we made the samples so low. One thing I'd recommend, though, is maybe adding a denoiser to this; I'm assuming there's not a denoiser that runs once this finishes. I just don't have the patience to wait for this... You know what, let's do it. I'm going to wait for this to finish rendering, we'll see how long it takes to render, and then we'll see if we can do better by trying to optimize it.

One thing I might do really quickly is stop this from taking over all the threads, because I need some cores available to record this video. So let me just quickly go into the renderer. How does this render stuff? We might as well look at this. Render... render, that was a good bet. Start rendering: render state is set to running. What do we have here? We have a lambda: write pixel to buffer, pixel coordinate. So it looks like we're writing pixels one by one, but I assume this is spread across all the cores. Yep, we have a thread pool. We add literally every single pixel, from image size y minus one, which I guess is the top, so we go from the top to the bottom instead of bottom to top, which is why we've got this reverse loop. Then we go through every horizontal pixel, as you saw there, and we add this to the thread pool. This, I guess, is the single-threaded version of this, versus adding a task per pixel. I'm not sure if I would separate every pixel into a task, but it is a thread pool, which obviously makes me assume you haven't made a hundred thousand threads.

So how do you make the thread pool? The thread pool just starts off... there's the pool constructor, okay, thread count. So we're either using this or we're using whatever's available; I'm not sure which. Well, I guess you were using the default constructor. Let's just halve this so that we use maybe half of the cores. This will obviously slow it down, but it will mean that I can hopefully record the video as well.
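As a rough illustration of the two ideas just mentioned (coarser tasks than one per pixel, and using only half of the available threads), here is a minimal sketch. It is not the reviewed project's thread pool: `half_worker_count`, `shade_pixel` and `render_rows` are hypothetical names made up for this example, `shade_pixel` is a trivial stand-in for the real ray tracing work, and handing out whole rows is a swapped-in alternative to the per-pixel tasks described above.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical helper: use half of the reported hardware threads, but at least one.
// (std::thread::hardware_concurrency() may report 0, hence the clamp.)
inline unsigned half_worker_count(unsigned hardware_threads) {
    return std::max(1u, hardware_threads / 2);
}

// Hypothetical stand-in for the per-pixel ray tracing work.
inline std::uint8_t shade_pixel(int x, int y) {
    return static_cast<std::uint8_t>((x + y) & 0xFF);
}

// Renders a width*height single-channel image. Each worker claims whole rows,
// so there are `height` units of work instead of `width * height` tiny tasks.
inline std::vector<std::uint8_t> render_rows(int width, int height, unsigned workers) {
    assert(width > 0 && height > 0 && workers > 0);
    std::vector<std::uint8_t> image(static_cast<std::size_t>(width) * height);
    std::atomic<int> next_row{0};

    auto work = [&] {
        // Atomically claim the next unrendered row until none are left.
        for (int y = next_row.fetch_add(1); y < height; y = next_row.fetch_add(1)) {
            for (int x = 0; x < width; ++x) {
                image[static_cast<std::size_t>(y) * width + x] = shade_pixel(x, y);
            }
        }
    };

    std::vector<std::thread> pool;
    pool.reserve(workers);
    for (unsigned i = 0; i < workers; ++i) pool.emplace_back(work);
    for (auto& t : pool) t.join();
    return image;
}
```

A caller would typically pass `half_worker_count(std::thread::hardware_concurrency())` as the worker count. Rows are just one possible granularity; tiles would work the same way.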
Now that I think about it, I hope that those sections when I was running it with all the cores actually recorded. Okay, to be honest, it doesn't even look like it's slowed it down that much, but we should now be utilizing half of this stuff. You can see our usage is like 78, 77%; it's not a hundred percent, so we're obviously not occupying everything with this ray tracing app. Cool. All right, well, let's just run this for a while and see how long it takes, and I hope you print out the time that it took. Let me just quickly check that. Okay: logger debug "rendering took", and then we've got stop time minus start time. Good, so in other words we should actually see how long this takes when it's done, whenever that is.

Okay, it's done. 455 seconds. That's seven and a half minutes to render this frame at 32 samples, with all this noise, at this resolution, which is what, like 1280 by 720? I think I could have written a ray tracer in that time. All right guys, throw this code review out the window, we're going to fix this. Back at it again with Cherno Fixes Your Code.

Okay, so again, I don't know what is going on here, but I just know from experience that that is way too long, so there has to be something wrong here. Let's talk about how we might approach optimizing this. What I'm going to do first of all is just a really quick scan over the code. There is not that much stuff going on. A ray tracer is mostly just mathematical functions, like normalize, square root, reflect; that's really all there is to a ray tracer. There should not be anything else really going on apart from the mathematics, and I think it's really, really important that you don't try to over-engineer the language side of things, because that's really easy to do. You could get lost with all of C++'s features. He did mention he was using C++17, and that he gave his own personal style to this; we're about to find out what that means. But my initial fear is that
something here is slightly over-engineered, because I don't think it's possible for the actual mathematics that needs to be done to take that long. I would expect that frame to take... well, definitely not over a minute. That is wild. So let's take a look at what's going on here.

Okay, so: dielectric scatter. What is this? A material, dielectric. Okay, so we have dielectric materials: public material. So the first thing that I'm potentially not liking here is the whole virtual function class hierarchy situation. Why is that an issue for performance? Because I'm sure that he's got a bunch of different materials (let's see what else is here: metal, Lambertian as well) and they've all got scatter functions. So per pixel, when we look up a material, it's not as simple as just deciding what it is we need to do; we also need to dynamically dispatch that to the right function using a vtable. Now again, fundamentally that's not the slowest thing in the world. However, when you're writing a ray tracer, you have to realize that you are processing each and every pixel one by one. Whether that's across different cores, I don't care: you're still going through each and every pixel on the CPU and running some kind of code over it. To make matters worse, what happens when the ray bounces? Well, you have to cast another ray somewhere else, so there are potentially multiple bounces per pixel; I think we're doing, like, eight. There could be other stuff going on, such as lights and shadows. There is a lot of stuff that happens. These functions are very hot, in the sense that they get called so much, and because they get called so much we need to be careful with things like memory layout and what it is we actually do per pixel, because that's really, really important. So immediately I don't like the whole dynamic dispatch, the whole
inheritance vtable situation here, because that is definitely going to be slowing things down at this scale. You've also got an optional here, which isn't the worst thing in the world but will probably require some additional processing at some point; I could imagine some cycles being wasted there. The other big problem, by the way, with having virtual classes like this and the whole vtable thing is that you can't really have them be tightly packed in memory anymore. We'll see in a minute, and I might be wrong, but I'm pretty sure this means that you're also heap allocating each and every one of these materials, which again, I think, for a material is fine. But let's see what the sphere situation is, because that looks like your object here. You've got a hittable object list, which has a hit function; a hittable object; and a sphere. A sphere is a hittable object, and then you do hit on that. So where does that get called from? Hit object, hit... hittable object list hit, and this is probably all of the objects in the scene. So we have a scene filled with objects and then we run hit on it. Shoot ray: we shoot a ray out, per bounce as well, and then we call that hit function, which will go through all the objects, which are all spheres, all on the heap or somewhere else in memory, and try to figure out if it hits them or not. So from a memory fragmentation, CPU cache situation, this is far from ideal, and I think, again, at this scale that matters so much. I'm probably about to prove it to you, to be honest, but we'll see; let's not get ahead of ourselves.

So I think I kind of understand how this works. I just want to look at the renderer one more time. What we do is we go through and we use std::accumulate, which goes through begin and end of samples, of which we have samples per pixel, so we have basically 32 of these that we've made, initialized to zero.
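To make the memory layout concern about the scene traversal concrete, here is a minimal sketch of the opposite extreme: spheres stored by value in one contiguous vector and tested with a plain loop, with no virtual calls and no per-object heap allocation. This is not the reviewed project's code. `Vec3`, `Ray`, `Sphere`, `hit_sphere` and `closest_hit` are hypothetical names for this example, the scene is assumed to contain only spheres, and a miss is signalled with a sentinel distance of -1 rather than an optional.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec3 { double x, y, z; };
inline Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

struct Ray { Vec3 origin; Vec3 direction; };

// Plain value type: a std::vector<Sphere> keeps all spheres next to each other in memory.
struct Sphere { Vec3 center; double radius; };

// Returns the nearest hit distance t with t_min < t < t_max, or -1.0 on a miss.
inline double hit_sphere(const Sphere& s, const Ray& r, double t_min, double t_max) {
    const Vec3 oc = r.origin - s.center;
    const double a = dot(r.direction, r.direction);
    const double half_b = dot(oc, r.direction);
    const double c = dot(oc, oc) - s.radius * s.radius;
    const double discriminant = half_b * half_b - a * c;
    if (discriminant < 0.0) return -1.0;

    const double root = std::sqrt(discriminant);
    double t = (-half_b - root) / a;      // nearer intersection first
    if (t <= t_min || t >= t_max) {
        t = (-half_b + root) / a;         // then the farther one
        if (t <= t_min || t >= t_max) return -1.0;
    }
    return t;
}

// Returns the index of the closest sphere hit by the ray (writing its distance
// to t_out), or -1 if nothing is hit (t_out is left untouched).
inline int closest_hit(const std::vector<Sphere>& spheres, const Ray& r, double& t_out) {
    int closest_index = -1;
    double closest_t = 1e30;              // shrinks as closer hits are found
    for (std::size_t i = 0; i < spheres.size(); ++i) {
        const double t = hit_sphere(spheres[i], r, 0.001, closest_t);
        if (t > 0.0) {
            closest_t = t;
            closest_index = static_cast<int>(i);
        }
    }
    if (closest_index >= 0) t_out = closest_t;
    return closest_index;
}
```

A real scene with several object types would need something more than this (one vector per type, for example), so treat it as the data-oriented end of the spectrum rather than a drop-in replacement.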
That's not too bad; we're running over 32 of these. But then we go through and we shoot rays, and if a ray hits something we use the material: we look up the material to see how we should scatter that ray, how we should proceed with that ray, and then we run shoot ray again. And we use scene.hit. What's scene? Scene is a hittable object, which is probably that hittable object list. So how do we add stuff? Scenes.cpp, that's useful. How do we add stuff to the scene here? If we're a default scene, then we take scene, and m_scene is a hittable object list. Okay, so we have that scene and then we add, and yeah, you can see all of these spheres are make_shared. You're using a lot of heap allocated memory here; every sphere is somewhere else in memory. I would not do this. What I would do is prioritize, as much as possible, keeping this scene as cache coherent as possible. I want it to be in contiguous memory, right next to each other. I want these to be stack allocated objects. This hittable object list should not contain a vector of shared pointers; it should contain a vector of the actual hittable objects, which again might be difficult because they're all virtual and very object oriented. You have to remember, with ray tracing, with these mathematical algorithms, you need to prioritize the math and the data. That is all that needs to be calculated. You are basically getting your CPU to do a bunch of mathematical calculations; they should be as streamlined as physically possible, and the data should not be spread across memory like this. I'm not sure how much of a performance impact that's having, but my bet is it's definitely going to be noticeable. We'll keep going and we'll see what we can do.

Okay, so scenes are composed of all of these spheres. I guess this is just a bunch of random sphere materials, and the spheres are randomly placed as well. You have some
glass stuff. You have three materials here. I don't know which scene we're actually running; let's see. m_scene type is default scene, so it is in fact this first one. Okay, sure. So again, you're creating the scene, and then the renderer will eventually go through that scene and shoot all these rays. It will go to scene hit; we know scene is that hittable object list, so that will then once again use... I'm not exactly sure why you're using std::accumulate here. I am not very familiar with this function, but this to me at the moment is kind of screaming out over-engineering. What I assume accumulate is doing is: you're passing in this lambda, and it will call it for each object in this range and then accumulate, or sum up, the values. That's what I think is happening. I would write this as a for loop, because I think that would both make more sense and also probably avoid a lot of the overhead, but I'm not exactly sure how that's all going to work. But I do see what you mean by saying you've updated it, using all of these library features: optional, accumulate. I probably wouldn't use any of these, to be honest, because they add a little bit too much abstraction for my liking on top of the actual mathematics and the actual data, which is exactly what we should be prioritizing in this entire code base.

Okay, so now that I'm more or less familiar with the code, let's go ahead and launch the performance profiler. We're going to target CPU usage. Again, we're in release mode. Let's hit start, and I'm going to let it run for a little bit, especially over these spheres. I'm really interested in seeing exactly where that CPU time is going. So if we go back here and hit stop collection, let's see if we can select a range that was after we actually hit start, which is this section over here. We've got a decent amount of sample data here, like 10 seconds. Let's take a look at what's going on. And it looks like there is
no source information available. Why are you not compiling with debug symbols or something in release? Let's take a look at linker, debugging, generate debug info: no. Maybe this is for some performance optimization, but let's just enable that. And then let's also double check here... I have no idea where this is... debug information format, yeah, let's just set that to C7, and I'll make sure that's done for the renderer too. This is a static library, so it's not going to have those linker settings, but debug information format: C7 compatible. And I think app as well; I'm not sure if we're going through app or if that's a different program, but just in case I'll set it all to C7 compatible. Let's go ahead and close this session, and then we can just go debug and relaunch the performance profiler. Okay, so I've hit start; hopefully we'll actually get some debug symbols going here. All right, that's enough, let's hit stop collection. I'm not sure if you can hear my computer struggling with this, but my goodness, networking 98, why?

Okay, so let's take a look at this. This is the hot path, 86.51 percent. Render, shoot ray is taking 97%, which is what I would expect from a ray tracer. Let's take a look, though, at what exactly is going on. So there's all of our source information. Okay, attenuation... let's just expand that hot path and see what's going on. std::accumulate: 35 percent is spent here. And yet this is sphere hit, this is the mathematics, and these are all below 0.2 percent of total time. In fact, this is 0.76 percent, but this is 35. What is going on? Reduce op: that's the templated function we're passing in, so that's this lambda. What does that lambda do? Object hit, which is sphere hit, and this is an if statement, and that's the return statement that's getting that optional value. Where is all the performance going? I don't understand this at all, I'll be completely honest. I'm not sure how 35% of our
time can be here. What's first? So we're dereferencing first; we're passing in a value. First is... all right, okay, so that must be (let's go back to here) the actual object that we're running over, so that must be the sphere. Yeah, there it is, because it's that parameter in the lambda as well. Okay, I don't know why accumulate is so slow, but here's what we're going to do. The problem that I don't like is that this is redirecting to some other code that we obviously didn't write, and furthermore it's using this lambda, and it's a little bit weird: it's not showing me the actual details of the lambda for some reason. In other words, it's not showing me what part here is slow, because I don't see why any of this would be so slow. So let's try to rewrite this as a for loop, because I want the profiler to tell me what part of the for loop is slow, rather than just saying "yeah, it's this whole accumulate lambda thing", because I honestly do not know why this is so slow.

So where was that slow thing? std::accumulate, what file is this? Hittable object list, probably. Yeah, hittable object list. So let's go to hittable object list and rewrite this. How would we rewrite this? Well, first of all, I want to do this in such a way that we can easily go back to the other implementation, so we can test the difference between them. So let's go ahead and copy this. Okay, so how would we do this? I'm going to write this in as simple a format as possible. I'm not going to use a range-based for loop; again, I'm trying to make this as simple as possible. I've got that clang-format thing... I'm trying to make this as simple as humanly possible, because this should all be line by line, so we can see if
retrieving some kind of memory is slow. What is taking up all this time? That's what I'm interested in. So we'll go through all this. We need to get the object, so what is the type? The type is an std::shared_ptr. Again, I'm going to write all of this stuff out, so this is going to be a const reference to this hittable object; we'll call it object, because it's this thing. So const auto& is what we're taking in; we've basically duplicated this here. I could have made this auto, but I'm trying to be explicit. So, objects[i]. We have the object, and now we just need to do this: if temp hit, then object, object hit, and then return... so return temp hit. So closest so far equals the value; return temp hit. Oh, that's what we're accumulating, I guess. What are we accumulating? std::optional here, so this is probably the initial value. So let's start off with the initial value, record, and then let's do hit record... return temp value. And what's a hit record? All of this stuff, face normal, okay. So if we've hit something closer, I guess we can just set record equal to temp hit; otherwise it's going to be temp value, which was the first parameter that we took in, which would have just been whatever we set it to last time. Okay, so I think we don't even need that.

So, if I'm not mistaken, if we hit something closer than the other thing, then we overwrite the value with this, and we keep track of what is the closest so far by keeping track of the distance from that ray to the thing that it hit. Again, if it's closer then we overwrite it, otherwise we just keep it. And then at the end we return this record over here. So let's return that, and then let's get rid of all of this. Okay, that looks pretty reasonable to me. Let's just run it to make sure that it works, I guess, and then we'll see... whoa, look at how fast it's running. What? Does this look the same as the
other thing? Have I made a mistake? Look at how fast that is: 23 seconds. What happened? How is that 23 seconds? Should I have taken a screenshot of it before? It looks the same, doesn't it? What on earth. Let's quickly run a profile over that. I mean, I have never used std::accumulate before, like, ever. I don't think I've come across it, and I've never actually written it before. Is it really that slow? Is there a bug in this MSVC library or something? What's going on? Why is it so fast now? And that's not even... this is still running on half my cores.

But look at this. So where are we spending the time now? Hittable list hit, still. Okay, so object hit is taking up 32 percent, as you would expect, and then let's just expand that hot path: sphere hit. Okay, this looks better. Some of this optional stuff might be slowing us down a little bit; you can see it's taking, like, three percent, so maybe I wouldn't use optional either. But you can see that we're more or less inside this function now, because we can see that we're calling hit, and (again, this is a sampling profiler) most of those samples, 30%, are ending up within this function. And the rest, by the way, are probably also ending up within that function but in a different stack frame, so really they're all ending up within that sphere hit function, which is again what you would expect.

Now, to take this a little bit further, I wouldn't use optionals here at all. What you're doing is saying that in certain cases, let's just return an optional with no value, and that way, back here, we can use this if statement to determine that it was not hit, and therefore not update the record, because there was something that was closer, I guess, or it just didn't work out. So we return optional
here we return optional here as well find the nearest root that lies in the acceptable range sure i probably wouldn't do that and then also you're creating a hit record here and then you're returning it packaged inside an optional what i would probably do is basically bring out like a hit record like this maybe set some kind of like default value for t like minus one or something um this returns an optional that's fine because obviously this isn't per pixel or whatever i assume well this is per object maybe this would be maybe we don't want to return an optional here as well but if we did we probably could just check t if it's negative one let's return you know a blank optional however you even do that which i guess is well you did that before so it's just this otherwise we'll return that record or something like that and then we'll pass this record as the kind of uh you know target of this hit so that way we don't have to create a hit record copy it do all of that stuff it's already within this stack frame and then also it means that we uh you know we don't need to return optionals or do anything like that which seems to be taking up some some of the uh profiling cycles that we can see over here right we don't have to waste all that time doing this stuff um because we can obviously just uh you know we can just well return i guess we can make this a void function and it would write everything into that hit record that we actually pass in and in this case just not write anything right because that's effectively what's happening anyway so we could definitely and then we wouldn't need any if statements because this function would be the one doing the work because i i guess from another kind of code like looking through this code and this algorithm another thing i don't like is you've got a bunch of if statements for stuff here but then you're doing it again here so then you're checking this if right these ifs should almost in my mind be enough to say okay don't write 
anything in here so therefore we don't need branches within this function and other branches outside of the function at this level because that's just like that's just a bit unnecessary so there's still a lot we could do um that's not even to mention the whole shared pointer situation um because again i wouldn't use shared pointers okay so this code review is already dragging on but let me know what you guys think of all of these proposals if you want to see me implement this and take this project as an example and try and make it as fast as possible remove the shared pointers remove all this optional stuff maybe see how much performance i can squeeze out let me know in the comment section below i will make an additional video where i go through and try and optimize this specifically maybe this code review series is going to become an optimization series but let me know what you think of that because otherwise i know that maybe not everyone's interested in the optimization part of this and also i think we've got some pretty good pretty decent gains if you know what i'm talking about i mean we went from like uh seven and a half minutes to 20 seconds so i don't even know how like i don't even know how that's possible let's maybe restore this to like what it was before so this was an std::optional let's hit f5 but yeah that's like i mean this still i think could probably be optimized a bit more it is still in my mind taking a considerable amount of time probably for the reasons that i've mentioned there's just a lot of like shared pointers and memory all over the place but 22 seconds man from seven and a half minutes while using only half the threads because i've still got like the divided by two there i think that's decent okay so let's let's quickly wrap up this code review we'll see what else we missed and what we have to look at um again there's not even that much stuff going on here i'd just like to summarize i don't like the object-oriented nature of this i 
think it's a little bit also over engineered in some areas i think that um i mean you know you've got like 10 years of experience as you mentioned with this the code itself is very well organized you you definitely like you know you've made an effort to make it good but i think again and you know there's exceptions and all of this stuff which is fine like in the uh compared to some of the other stuff exceptions are fine i'll forgive you for that because that's obviously more of a personal opinion anyway um everything looks really clean uh maybe some more comments regarding some things would be good um but to be honest like this is pretty standard this is like you're making a projection matrix basically and setting up the camera um this is totally fine okay so you've got you're using some randomly kind of generated uh rays as well uh which is again pretty typical to be honest most of this is in fact math not sure why you're using std instead of glm here i don't know maybe it's faster or something in some cases but i would expect you to use glm probably everywhere um yeah i don't really have too much to say i mean all of this stuff is is pretty decent like again i don't have any um negative feedback about the actual code style or the way you've written this i think it's nice but i would kind of reduce it a little bit more to be like you know kind of closer to the metal so to speak i would kind of reduce some of the abstraction and the fluff and the kind of hierarchies of object-oriented objects and uh you know heap allocations with like smart pointers and optionals and like i would reduce all of that stuff i would kind of focus it more on the math and focus it more on the memory and the way that the cpu is processing this i mean honestly like i like this code i would i would be happy to take this maybe i'll do it myself with the ray tracing in a weekend series that i was talking about but um i think this is a great case study for how we can take some over engineered 
code that is running slowly as we saw and make it as quick as possible the other thing that would be cool to do as always is to use a gpu to do all of this computation because that's obviously going to be much much faster and you won't have to manage some kind of thread pool with only like even on my computer 32 kind of threads running you could have like thousands on the gpu running in like fragment shaders or compute shaders right that could be that could 100 percent be done and that would be way way way faster so that could be another way that you could obviously extend this this is i'm getting motivated hearing myself like explain all of this now i want to write like some kind of gpu based very very fast ray tracer ah but anyway hope you guys enjoyed another ray tracing code review let me know what you guys thought about all of that but of course i think the way that we looked at this compared to the other ray tracer that we looked at was very different so i hope that this was helpful for you guys hope you guys enjoyed this video if you did please don't forget to hit the like button below as always you can send in your code to cherno review there will be some instructions in the description below whatever it is you're working on if it is interesting if you need some help if you think it'll be a good learning example or you want some guidance as well feel free to send it in i do get a lot of emails as i mentioned but who knows you might just be the one thanks for watching guys don't forget to check out skillshare using the link in description below and i will see you next time goodbye
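The two changes discussed in the review (a plain loop in place of std::accumulate, and a hit function that writes into a caller-owned record instead of returning an std::optional) can be sketched roughly as below. This is a minimal illustration and not the reviewed project's code: `Vec3`, `Ray`, `HitRecord`, `Sphere` and `closest_hit` are hypothetical names, objects are stored by value rather than behind shared pointers, and `t = -1` is used as the "nothing hit yet" default mentioned in the video.

```cpp
#include <cmath>
#include <vector>

// Hypothetical minimal math types, assumed for this sketch only.
struct Vec3 { double x, y, z; };
inline Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
inline Vec3 operator*(double s, Vec3 v) { return {s * v.x, s * v.y, s * v.z}; }
inline double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

struct Ray { Vec3 origin, direction; };

struct HitRecord {
    double t = -1.0;  // negative t means "nothing hit yet"
    Vec3 point{};
    Vec3 normal{};
};

struct Sphere {
    Vec3 center;
    double radius;

    // Writes into rec only when this sphere is hit closer than whatever rec
    // already holds; on a miss it leaves rec untouched. No optional is
    // returned and no temporary record is copied.
    void hit(const Ray& r, double t_min, double t_max, HitRecord& rec) const {
        const double closest = rec.t < 0.0 ? t_max : rec.t;
        const Vec3 oc = r.origin - center;
        const double a = dot(r.direction, r.direction);
        const double half_b = dot(oc, r.direction);
        const double c = dot(oc, oc) - radius * radius;
        const double discriminant = half_b * half_b - a * c;
        if (discriminant < 0.0) return;

        // Find the nearest root that lies in the acceptable range.
        const double sqrtd = std::sqrt(discriminant);
        double root = (-half_b - sqrtd) / a;
        if (root <= t_min || root >= closest) {
            root = (-half_b + sqrtd) / a;
            if (root <= t_min || root >= closest) return;
        }

        rec.t = root;
        rec.point = r.origin + root * r.direction;
        rec.normal = (1.0 / radius) * (rec.point - center);
    }
};

// Plain loop over the objects: the record itself tracks the closest hit so
// far, so the caller needs no per-object branch and no accumulator.
inline HitRecord closest_hit(const std::vector<Sphere>& spheres, const Ray& r,
                             double t_min, double t_max) {
    HitRecord rec;
    for (const Sphere& s : spheres) {
        s.hit(r, t_min, t_max, rec);
    }
    return rec;
}
```

A caller would check `rec.t >= 0.0` once per ray to decide whether anything was hit. Whether this beats the optional-based version in the reviewed project would have to be measured with a profiler, as in the video.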
Info
Channel: The Cherno
Views: 113,891
Keywords: thecherno, thechernoproject, cherno, c++, programming, gamedev, game development, learn c++, c++ tutorial, ray tracing, ray tracing in one weekend, ray tracing in a weekend, optimization, performance, making code faster, raytracer, ray tracer, faster c++, c++17, code review
Id: mOSirVeP5lo
Length: 38min 45sec (2325 seconds)
Published: Fri Dec 03 2021