Vulkan API Tutorial - 12 - Swapping and Clearing

Video Statistics and Information

Captions
Hi everyone, welcome back to my Vulkan API tutorial series. This tutorial will be a bit longer, but we are finally going to get something visible on the screen. It's still relatively basic stuff, but necessary nonetheless. What we are going to do in this tutorial is a three-step process: acquire an image, clear the image, and finally present that image on the window surface, also known as swapping images.

This is also the first time we really need to start thinking about parallel processing, as all the stuff we send to the Vulkan queue will execute alongside our program. I'm going to try to keep the design of this code relatively simple, and I'll synchronize the GPU and CPU once every frame. This design will allow the CPU to calculate logic while the GPU is rendering, as well as automatically present the image when the GPU is done with it. This should be a good starting point for any project. Vulkan allows you to design your timing from start to finish, so care must be taken that we don't accidentally crash our program or the GPU.

We'll start off by dividing our program into two sections: logic updates and rendering. Everything between begin render and end render is our rendering routine, including those commands. We begin our rendering routine by acquiring an image from the swapchain. Inside our rendering routine we're going to write a command buffer and submit it to a Vulkan queue, and when we end our render we're going to present that image, which is also done through a Vulkan queue.

Now let's see about the dependencies. The image that we want to present depends on the previous operation on our queue. Normally this wouldn't be a problem, but commands in the queue are not guaranteed to execute in the same sequential fashion as on the CPU. The queue might start presenting the image before it's completely rendered, or before its memory is made available for presenting, so we need a barrier between the two. Secondly, our command buffer write depends on the command buffer not pending execution. That means that the command
buffer that we are using in our Vulkan queue must not be altered until its execution has finished. You could double-buffer your command buffers to increase the rendering speed, but this would increase the complexity a lot, so we're just going to stick with one command buffer.

Presenting an image depends on having acquired an image previously. In our specific case, our entire render routine depends on having an image from the swapchain to render to, because we're rendering directly to the swapchain image. You could have an image that you render to first and then copy its contents to the swapchain image right before presenting it, but I'll keep this a bit simpler here. It's also possible that acquire image didn't return us an image, but that usually means that we messed something up.

Finally, acquire image may depend on the presenting of an image having completed. Acquire image may return an image that is still being used by the presentation engine. If this is the case, we need to wait until the presentation engine is done with it.

This isn't the most efficient synchronization, and you can choose to do a lot more overlapping, which should increase the performance. The difficulty of keeping track of the program execution will shoot up really quickly though, and the possible speed benefits might not be that great. At later stages you might want to try something different, but for now we'll just keep it simple. Finally, in theory our entire rendering loop should work something like this, rendering three frames: the first is the initial frame, the second is the in-between frames, and the last is our ending frame.

So the first thing I want to do is structure our program like this, and we'll start off by creating these begin and end functions in the window. This is because our window has the handles for the depth stencil image, render pass and framebuffers, so basically all of our rendering is going to happen inside this window. If you don't want to render to a window directly, in that
case you have to make your framebuffers, color image, depth stencil image and render pass somewhere else. This way you don't have to use multiple images; you can use just one framebuffer, one color image and one depth stencil image, and right before presenting you can just copy the contents of that image to the swapchain image. But in our case we're just going to use our window to do the rendering of everything.

So let's go to the window.h file, and in this file we're going to add a couple more functions: begin render and end render. I'm going to create these functions in the window.cpp file like so.

In the function begin render we are going to acquire an image, and we can do that by calling vkAcquireNextImageKHR. This one takes in the device, the swapchain, and a timeout value, after which this function will return regardless of whether it acquired an image or not. This value is in nanoseconds. If you give this a value of zero, the function will return immediately, and the VkResult it returns will tell whether we acquired an image or not. Another value you can give this is UINT64_MAX, and this will wait forever if that's what it takes. Anything in between zero and this max value is in nanoseconds. Next are the semaphore and the fence; let's just put a 0 in here for now, I'm going to explain those later. Finally there is our return value, the index of our new image, and for this we need a new member variable, which I'm going to call active swapchain image ID. We're going to go back to the window.h file and add this variable in here under swapchain image count. The type is uint32_t, and the name will be active swapchain image ID. By default this is going to be UINT32_MAX.

Next let's go back to our window.cpp file. Then finally we've got the semaphore and the fence. When this function returns our image, the available image might still be in use by the presentation engine, and the semaphore and fence will tell us when the presentation engine is done
with this image. So two things in here: acquire next image gets us the next image if there's one available, and that available image might still be in use by the presentation engine. In this case I'm going to use a fence, so let's go and create one. In the window.h file I'm going to create a new function and call it init synchronizations, and also a new function de-init synchronizations, and let's create these functions in the window.cpp file. We need a fence, so let's create a handle for a fence. I think I'm just going to create this handle right under the active swapchain image ID; the name will be swapchain image available, and by default it's a null handle.

OK, let's create this fence then. Let's go to our new function init synchronizations and create a fence by calling vkCreateFence. The first parameter will be the Vulkan device, the second will be the create info, the third will be the allocation callbacks, and the last returns our fence handle. Our fence create info will be a small one and relatively pointless at this stage: pNext is not going to be used, and flags only allows us to put this fence into an already signaled state, which is something we don't want. So that creates our fence. Let's also destroy our fence like that; that should be relatively self-explanatory.

Let's go back to our begin render function and provide that fence in here. We're also going to change the semaphore to VK_NULL_HANDLE. Now that we've got our fence, we need to wait for it, so vkWaitForFences. The first parameter will be the device. The second will be the fence count: how many fences do we want to wait for. Next is the list of fences; we only have one, so I'll provide it as a pointer. Then there's a VkBool32, waitAll. This doesn't matter because we only have one fence, but it basically means: if we have more than one fence, should we wait for all of them to be signaled, or is it enough if one
of them is signaled? For correctness I'm going to say VK_TRUE. The timeout is how long we are going to wait for this fence to be signaled until we return from this function. This is provided in nanoseconds, and again UINT64_MAX is the value that holds the execution forever. Now that that's done, we also need to reset our fence, so vkResetFences. Again the first parameter will be our device, the second will be the fence count, and the last will be a list of fences, which is again going to be only this one, and I'll provide it as a pointer.

Now that's done, the one last thing in this function is that I want to synchronize our GPU, or our Vulkan queue, with our program, so I'm going to use a command called vkQueueWaitIdle. We could also wait for our device to become idle, but if we have more than one queue, that would stall the execution of our main thread while we wait for the other queues as well. For example, there might be a queue that's pushing some data to the GPU, and we don't need to wait for that. Unless we really need to wait for the entire device to become idle, we should use the function vkQueueWaitIdle. This one only takes one parameter, and that parameter will be the rendering queue. That's it for this function, but I'm going to tidy this up a little bit.

Let's go ahead and do our end render function. At this point we want to present our image, so we're going to call the function vkQueuePresentKHR. The first parameter will be our queue and the last parameter will be a pointer to our present info. Then I'll create the info structure in here. OK, inside this structure sType is our structure type, pNext is not going to be used by us, and waitSemaphoreCount and pWaitSemaphores we are not going to use at all for now, so waitSemaphoreCount will be 0 and pWaitSemaphores will be a null pointer. We are going to provide these later. swapchainCount tells how many swapchains we are going to update at the same time; we only have one, so we'll provide one. Then we provide a list of swapchains, and the list size
will be swapchainCount. In our case we provide a pointer to our swapchain. Next is pImageIndices. This is a list of image indices that we want to present inside these swapchains. We only have one swapchain and one active swapchain image, so we provide this as a pointer to the active swapchain image ID, and the size of this list, or so-called list, is again swapchainCount. Finally, pResults is something we can provide. It's not necessary for us, but we should always check for errors, so pResults in this case will be a pointer to our present result. This would be a list of return results, one for every one of these swapchains.

So let's go through that again. swapchainCount tells how many swapchains we want to present to at the same time. pSwapchains is a list of the swapchains we want to present to. pImageIndices are the active image indices we want to present on each one of these swapchains. And pResults holds the return value for each of these swapchains; it basically just tells us how it went. I check for errors in here, and I should also check for errors from the present call itself. OK, that should make our lives a little bit easier if there's an error. Let's also check the error from our present result like so, and this is all we need for our end render function.

Let's go back to the main.cpp file, and we can start using these functions already. It'll create an error, but still. Open window actually returns our window handle, so we can use that like so, and w now represents our window. Let's also include the window header in here. Begin render will be w begin render, and end render will be w end render. Now let's try and run this. I can tell you that it does not work right off the bat, but we'll see what happens. OK, let's start running this line by line. In the begin render function, let's go and run that. Well: vkAcquireNextImageKHR, semaphore and fence cannot both be VK_NULL_HANDLE,
there would be no way to determine the completion of this operation. OK, well, small mistake: I actually forgot to call this function in the constructor, so let's add init synchronizations as the last function in the constructor, and the de-init will be the first function we call in the destructor. OK, let's try again. Now that should be working a little bit better. Our active swapchain image ID will be nonsensical at first. Let's run that, and now it's zero, so we've got our first swapchain image ID. We now have full access to the swapchain image at ID zero. We wait for the fence, we reset the fence, then we wait for the queue to be idle; that works fine. In end render we submit all this information, and here we get an error: images passed to present must be in layout VK_IMAGE_LAYOUT_PRESENT_SRC_KHR, but this one is in VK_IMAGE_LAYOUT_UNDEFINED. Well, this is because we didn't write to those images. When we originally get these images from the acquire next image function, by default the image layout is undefined, and then we go and present those images as undefined. That raises an error, and in fact it does nothing for the images if we just acquire an image and then present it.

So the next step will be to actually write to these images, and this will be a little bit involved. The image we get when we call vkAcquireNextImageKHR can be easily cleared when we begin the render pass, and then we just exit the render pass; that is enough to clear the image. So what we need to do is record a command buffer that begins the render pass and then ends the render pass. After that we can submit that command buffer to the Vulkan queue, and that altogether clears the image.

So first of all we need to create our command buffer, but before we do that we need to create the command pool, so vkCreateCommandPool. The first parameter is the Vulkan device, the second parameter is the create info, the third is the allocation callbacks, and the last is our return handle for the command pool. I'm
going to create the command pool handle and the create info next. OK, inside this command pool create info structure, sType is the structure type and pNext is not going to be used. flags has a couple of flags that we need to look at, and these two flags are VK_COMMAND_POOL_CREATE_TRANSIENT_BIT and VK_COMMAND_POOL_CREATE_RESET_COMMAND_BUFFER_BIT. Transient means that the command buffers allocated from this pool will be relatively short-lived. That just tells the implementation that these command buffers might be reset or freed often, so it's extra information for the implementation, and we should use it because we're going to reset these command buffers every single frame. The next one is the reset command buffer bit. This one tells whether we are going to reset these command buffers individually or not. If this is not set, we cannot reset the command buffers allocated from this pool individually. I want to do that, so I'm going to set it here. Last is the queue family index that this command pool is compatible with. We can get that from the renderer like so, and that's that. Now when we create something we should also remember to destroy it, and the destroy function is pretty self-explanatory.

Next we should allocate some command buffers from this pool, so vkAllocateCommandBuffers. The first parameter is the device, then the allocate info, and the last parameter returns our handle to the command buffer. I'm going to create the command buffer handle and the command buffer allocate info structure now. OK, in this command buffer allocate info, sType is the structure type and pNext is not going to be used. commandPool is the handle of the pool we allocate this command buffer from, and commandBufferCount is the number of command buffers we are going to allocate. Now the level, let's see about this. OK, there are two options: primary and secondary. Primary can be submitted to a Vulkan queue, and secondary can only be called from a primary, so a secondary cannot even be directly submitted to
a queue, so we're going to choose the primary here. Because this command buffer is allocated from a pool, we don't need to specifically destroy or deallocate it; it's going to get destroyed with the vkDestroyCommandPool function.

OK, let's start recording our command buffer. We're going to call the function vkBeginCommandBuffer. The first parameter is the command buffer and the second parameter is going to be the begin info. In this command buffer begin info, sType is the structure type, pNext is not going to be used, and flags tells a few things. VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT means that we're only going to submit this once before we reset it, so we can set that, and it just gives more information to the implementation. The render pass continue bit is only considered for secondary command buffers, so we're not going to worry about that one. The simultaneous use bit means that this command buffer can be submitted again before the previous execution of this command buffer has finished, and we don't need this, so we end up using only the one-time submit bit. pInheritanceInfo is used for secondary command buffers, and for primary ones it's ignored. This begins our command buffer. Next we're going to end it with vkEndCommandBuffer; the only parameter is the command buffer.

Inside this command buffer we're going to begin our render pass. The next function we're going to call actually records a command, so the syntax will be a little bit different: vkCmd, like command, so vkCmdBeginRenderPass. For all these commands the first parameter will be the command buffer. The second parameter will be a pointer to the render pass begin info, and the third parameter will be a contents value. This basically just tells whether the rendering commands are going to be submitted inline in here or as secondary command buffers. For now we don't really care about this, so by default we're going to use inline, and the last parameter will be VK_SUBPASS_CONTENTS_INLINE. Next I'm going to end the render pass with vkCmdEndRenderPass, and the parameter will be the command buffer.

After that we should provide the render pass begin info structure. In this render pass begin info, sType is the structure type, pNext is not going to be used, and renderPass is our render pass handle. Now we need to write a couple of getter functions in the window class, so let's go to the window.h file, and after end render I'm going to add a new function that returns the render pass. I'm going to name this function get Vulkan render pass, and I'm just going to return the render pass in this function. Here I'm going to provide the render pass: w get Vulkan render pass. Now the same thing for the framebuffer. If we had our render pass and framebuffer somewhere else we could just return that one framebuffer, but in this case, because one swapchain image is bound to one framebuffer, we need to return the proper framebuffer. OK, and in the main file again: w get active framebuffer.

Next is the render area, and this will be of type VkRect2D, so let's create that one. This contains two fields, offset and extent. Offset has x and y in it, so let's set these, and they will be 0. Extent will be the size of the surface we are rendering to, and the type of this one is VkExtent2D, so let's make a new getter function. I'm going to call this function get Vulkan surface size, and it just returns the surface size. Back in the main.cpp file we can now use w get Vulkan surface size, and then we can just give the render area in here.

Next are the clear values, and these correspond to the render pass's attachments. We've got two attachments in our render pass: if you remember from a couple of tutorials ago, our first attachment is the depth stencil attachment and our second attachment is the color image. So how are we going to do this? Well, I think the easiest way is to use an array. The type of this array is VkClearValue, and it's going to be
two slots in this array. Let's also include <array>. Clear values at index 0 is going to be our depth stencil attachment, so we'll just use the depthStencil field here. These fields determine the default values that we are going to clear this image to. For depth we want 0 and for stencil we want 0. Depth is of type float and stencil is of type uint32_t. That's it for the first attachment.

The second attachment, which is the color image, is at index 1, and we're going to use the color field for this. Now if you go and take a look at this, it's a VkClearColorValue, and this is actually a union. So which member should we use? Well, that was figured out when we created our surface. If we go to the window.cpp file and to the function init surface, here we are figuring out the format that we want to use for this output image. If we take a look at this format, my operating system's surface supports this one; it's the first option it provides. If you take a look at it, this is a UNORM format: U stands for unsigned and NORM stands for normalized. What this means is that the bits in these four color values, instead of integers, should be considered floats with a range from zero to one. The stored values themselves do go from 0 to 255, so in short this is a type of packed float, and we should use the float member. If the format is UINT then we should use the uint32 member, and if it's SINT then we should use the int32 member. So depending on the format that your swapchain images are in, you should use the appropriate field here. My system says that it's UNORM, so it's normalized, and we use float32 here like this. Then we just set it four times, once for every color channel, and these go R, G, B, A. To be more appropriate here, I guess I should write it like this. Now I do expect most operating systems to support the UNORM type of surface and swapchain images, but this is only guesstimating, so when you actually get into production, making a game engine, you
should not guesstimate these values at all. You should actually go and check the format and use the appropriate field here. OK, that takes care of that. Let's go and input the clear values, and this should clear our color image and our depth stencil image as well.

Begin render pass automatically clears the image because, if you remember from a couple of tutorials ago, in the window.cpp file in the function init render pass, where we define our attachments, what you set as the load operation determines whether the clear values are going to be used or not. These clear values are only going to be used if VK_ATTACHMENT_LOAD_OP_CLEAR is defined for the load operation. From here we can see that the load operation for the depth aspect of this image is clear, the stencil load op for the time being is don't care, and for the color image it's clear. So the stencil value is actually ignored completely for now, but I'm just going to leave it in there anyway.

OK, the next thing is that we need to submit our command buffer now that we've recorded it, and we do so by calling the function vkQueueSubmit. The first parameter is the queue. The second is how many submit infos we are going to process; this is one. Then there's a list of submit infos, or in our case I'm just going to provide it as a pointer, and last is a fence, which is going to be VK_NULL_HANDLE. After this I'm going to create the submit info structure. OK, the first field sType is the structure type and pNext is not going to be used. waitSemaphoreCount and pWaitSemaphores define the semaphores that we want to wait on before we start executing this command buffer. We are already synchronizing our GPU to the CPU when we begin rendering, and so far this is the only command buffer, so we actually don't need these for the time being. I'm just going to leave these fields in, but I'm going to put a 0 and null pointers in here. pWaitDstStageMask tells us when we can start
executing this command buffer, and this is again a list of stage masks: what needs to be done in the previous commands before we can start executing this one. We don't have any, so this is going to be a null pointer. Next, commandBufferCount is how many command buffers we're going to submit. This is going to be one; we only have one. Then the list of command buffers, which I'm going to provide as a pointer. Then the signal semaphores: we actually do want to signal some semaphores, but for the time being let's put a zero and a null pointer in here and let's try running this. This might crash, might not. We'll see. That actually works. Well, it's pure luck that it actually works, because we can't be sure that the rendering is done by the time we are presenting this image. In our case it's done before we present it, and yeah, this is a black image.

Let's change the color, but before that, actually, let's go and make our program a little bit safer, because now if I close this there are going to be errors: attempt to destroy command pool with command buffer which is in use. So basically we're trying to destroy a command pool that has a command buffer which is still pending execution in a queue. This keeps on giving more and more errors. Let's do the obvious first. In the window.cpp file, in the destructor, before we start closing anything, we're going to wait for our queue to finish, so vkQueueWaitIdle, and let's provide it with the queue. In the main.cpp file, right after the main loop when we exit, we also want to wait for this queue to finish up, so the same function, vkQueueWaitIdle, and then the queue.

Now let's go to the window.cpp file again, to the end render function. We've got two options here. We could wait here on the CPU side until the rendering is done and then present the image, but that would be inefficient: it would stop the CPU here until the GPU is done rendering, while we could still be going forward and calculating the logic for the next frame. So what we can do in
here is we can introduce a semaphore. Let's go to the window.h file, and here at the end render function we are going to pass a list, or vector, of semaphores that we want to wait on before we actually start presenting the image, like so: a vector of type VkSemaphore, wait semaphores. Now let's go back to the window.cpp file and modify this function a little bit. Here where it says waitSemaphoreCount we're going to provide the size of this list, and pWaitSemaphores is going to be the data of this list. That's everything in here.

Let's go back to the main.cpp file. After we allocate our command buffer, let's create our semaphore with vkCreateSemaphore. The first parameter is the device, the second is the create info, the third is the allocation callbacks, and the last is our return handle for the semaphore. I'll create this handle and the semaphore create info structure now. sType is the structure type, pNext is not going to be used, and flags is reserved for future use, so it's a relatively pointless create info structure, but anyway. And when we create something, remember to destroy it as well: before we destroy our command pool we are going to destroy our semaphore like so. Again, relatively self-explanatory. Now we are going to provide this render complete semaphore to the signal semaphores in here, so signalSemaphoreCount will be 1 and pSignalSemaphores will be the pointer to the render complete semaphore. And to our end render we are going to provide the semaphore like so.

Now this should work a lot better. It runs, it should now also be safe, and when we close it, it should close just fine. OK, let's have some fun with this. Let's go and draw a red color. That works. Yellow, that works. Yep, I think we can conclude that this works, but I'm actually going to try something. OK, let's try rotating these colors in a sine wave pattern. This code should be easy enough: just a simple sine function and a few defines at the top of the file. Let's see what happens. Well, a lot of colors happened, so I guess this
is more or less a success. One more thing I want to do: I want to see the FPS. It's going to be like 2000 or something. OK, a relatively simple change: I just added a timer and a frame counter. Since we are synchronizing our GPU and CPU once a frame, this FPS will be true. So let's see. Yep, I get about 2000 frames per second, which was to be expected; this is a really, really simple operation for the GPU to do.

OK, let's close this and go through the program a little bit to see what exactly is happening here. We create our pool, we allocate a command buffer, we create a semaphore, all that good stuff. We do our CPU logic calculations and we begin our rendering. This begin render only acquires the image that we now want to use. We start recording the command buffer and give some information to begin render pass; begin render pass is a command that we record into this command buffer. Begin render pass also automatically starts the first subpass, so our image will be converted to the appropriate layout. If we go to the window.cpp file and to the function init render pass, we define our image layouts here: the initial and final layouts, and the layout during the subpasses. Now this final layout is the layout the image must be in when it's presented: it must be VK_IMAGE_LAYOUT_PRESENT_SRC_KHR, and this is what does it.

Let's go back to the main.cpp file to continue. The next command that we record is end render pass, and immediately after that we just end the whole command buffer, which also compiles the command buffer. After this we submit the command buffer and define a semaphore to signal whenever this operation is complete, and when we end the render we also tell it that we want to wait for this semaphore before we present the image. The CPU will not wait in here; this is going to return immediately, and the CPU will happily continue doing our CPU-side logic calculations until we begin our render again. This is where we sync our CPU to
our GPU. This is all for this tutorial, and I'll see you next time.
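
The captions mention rotating the clear color "in a sine wave pattern" with "a simple sine function", but the code itself is not part of the transcript. The following is only a minimal sketch of how that could look, assuming a UNORM swapchain format so that each channel is a float in the range 0 to 1. The function name RotatedClearColor, the rotator parameter and the 120 degree phase shift between channels are hypothetical choices for illustration, not the video's actual code.

```cpp
#include <array>
#include <cmath>

// Hypothetical helper (not taken from the video): maps a steadily increasing
// "rotator" value, in radians, to an RGBA color whose channels each follow a
// sine wave. Every channel is remapped from [-1, 1] to [0, 1], which is the
// range a UNORM color attachment expects for float clear values.
std::array<float, 4> RotatedClearColor(float rotator)
{
    constexpr float kPi    = 3.14159265358979f;
    constexpr float kThird = 2.0f * kPi / 3.0f;  // assumed 120 degree offset per channel

    return {
        std::sin(rotator + kThird * 0.0f) * 0.5f + 0.5f,  // red
        std::sin(rotator + kThird * 1.0f) * 0.5f + 0.5f,  // green
        std::sin(rotator + kThird * 2.0f) * 0.5f + 0.5f,  // blue
        1.0f                                              // alpha, kept opaque
    };
}
```

In the main loop, one would presumably advance the rotator a little every frame and copy the four values into the float32 member of the color clear value at index 1 before recording the render pass. Keeping the math in a plain function like this makes it easy to check without a Vulkan device.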
Info
Channel: Niko Kauppi
Views: 15,937
Keywords: vulkan, api, tutorial, specification, graphics, programming, gpu, graphics card, c++, cpp, code, coding, advanced, close to metal, glnext, swapping, images, clearing, command buffer, submit, render, change color
Id: IP_xKQL2TcU
Length: 41min 22sec (2482 seconds)
Published: Mon Dec 19 2016