restrict: the only C keyword with no C++ equivalent

Reddit Comments

Thank you so much for sharing this! Still a little confused as to how to properly use this, but I'm sure I will learn.

u/ForLackOfABetterNam3, Aug 16 2021 (1 upvote)
Captions
Hello and welcome, I'm James Murphy. In this video we're going to be talking about the restrict keyword in C. It's the only keyword in C that has no analog in C++, and it's actually an important one.

C++ was originally started as essentially an extension of C to support classes, and as you can see, over the years C++ has gained a number of features that C just doesn't have. C is a much smaller and simpler language than C++. There are a few keywords on the left-hand side here for C that don't appear on the right-hand side for C++, like _Generic, but most of the things like _Generic have some analog in C++. In this case, _Generic is kind of C's way of doing what C++ uses templates for. However, restrict is a different case: there is nothing in C++ that does what restrict does in C.

restrict is what's called a type qualifier, like const or volatile, except restrict can only be applied to pointer types, so it doesn't make sense to have a restrict int. restrict is a promise that lets the compiler know that the programmer guarantees that no object accessed through a restrict pointer will be accessed through any other means besides the restrict pointer. Making this promise allows the compiler to do more optimizations that it might not have been able to do before for correctness reasons.

Look what happens when I get rid of the restrict qualifier: you can see that the assembly has six instructions, but when I have the restrict in there it's only five instructions. If we take a look at the assembly output here on the right, we see that the instructions without the restrict qualifier do something that's seemingly redundant. First off, it starts by making a copy of the amount variable and then adding that into the *x variable, basically. Then it makes another copy of the amount variable, the one that we just read two lines before, and then it does the subtraction and then returns. So why did the compiler choose to read again and make another
copy of the amount variable when it just did the exact same thing two lines before? Well, the reason is the compiler doesn't know what x, y, and amount are pointing to. It's possible that someone passed in the same thing for x as they did for amount, and in that case this line where we're changing the value that x is pointing to is also changing the value that amount is pointing to. So when I go to do the subtraction in the next line, the value for amount has changed. Since it's theoretically possible that the value that amount is pointing to has changed, the compiler has no choice but to read it again.

Suppose, though, that as the programmer we know that it doesn't make sense to pass in the same thing for amount and x. Then we can tell the compiler: I guarantee you, compiler, that this variable is not pointed to or accessed by anything else. The way that we do that is with the restrict keyword. As you can see, once I add in the restrict keyword on x, the compiler deletes the redundant read. Logically speaking, it probably makes sense to mark all three of these pointers restrict, but in this case it doesn't allow any extra optimizations.

In this example we're implementing a simple addition of two vectors of length n. So we have source one and source two, which are of length n, and we're supposed to store the answer to the addition in the destination variable. Looking at the assembly output, we see that there are about 46 instructions, but when I add the restrict keyword it goes down to only 29 instructions. That was a pretty big difference. By adding the restrict keyword onto the destination, which is essentially saying that I'm guaranteeing that neither of the source pointers overlaps with the destination, I've cut the code size quite significantly.

What exactly got cut out in all that mess? Glancing at the assembly, we see that there are a lot of these XMMWORD kind of things going on, and what these registers are for are vector instructions. So what this
code is trying to do is do the vector addition using the vector instructions on the machine, meaning it's trying to add multiple elements together at once. As you can probably imagine, though, if the destination overlaps with either of the sources, it might not be correct to do multiple of these additions at once.

Let's compare to the assembly that's without the restrict keyword. In this case we still see that there are a bunch of vector operations happening: we have this label L4, then a bunch of vector operations, and then a jump back to L4, so that's a vectorized loop. But we also see that there is an unvectorized loop here: we have L3 and then a jump back to L3, just using the regular registers. So basically what's happening here is there's a whole bunch of extra instructions that the compiler emits to check whether it would be okay to use the vectorized instructions, and then it does the vector thing if that's allowed. Otherwise, if the source and destination are overlapping, it's forced to just do it one at a time.

In the normal case, when your destination and source pointers are actually pointing to different things and there's no overlapping going on, you're not going to see much of a performance improvement. The reason for this is that it does just check to see if they're overlapping, and if not, it does the vector operations. So if you have a really long array, the vector operations are still going to be able to work on your really long array; there was just that one if branch at the very beginning that had to check whether or not it can do those operations. But the code for this is quite a bit bigger, so that could cause an instruction cache miss, and there could also be a branch misprediction penalty if the processor weren't able to predict the branch that goes through all these things and decides that the vector operations are the correct ones.

So should you be going around slapping restrict onto all of your pointers? Probably not. Remember, restrict is a promise:
you're promising the compiler that you're not going to be aliasing this pointer. Nothing that the restrict pointer is going to access is going to be accessed by a different pointer. If that makes sense in the context of your function, then go ahead, throw it in there; you might get a speedup. But if it doesn't make sense, then don't use it. You can run into some really, really hard-to-find bugs if you use restrict inappropriately.

Consider, for example, this Fibonacci function. It takes in a destination, which is pointing to enough memory for n elements, and then it populates those elements with the first n Fibonacci numbers. So suppose the implementer says: oh, you know what, the formula for Fibonacci is just adding, right? Fibonacci of n plus 2 is just Fibonacci of n plus Fibonacci of n plus 1. So why don't I just use the vector addition and call it with destination plus two, destination, and destination plus one? This is now violating the restrict promise, since the destination is just one or two elements ahead of the source pointers. So reading these pointers here is going to be reading from the destination pointer, essentially a few iterations later. But we promised that we wouldn't do that, that nothing else was going to be pointing to it.

Maybe I didn't know what restrict meant, and so I just went ahead and did it. And, well, it seems like it's giving the correct output, all my tests are passing, so let's go ahead and push it to production. I'll go ahead and compile it for my release build, and then all of a sudden I start getting the wrong answers. Remember, the purpose of restrict was to enable more optimizations. If I was in a debug build with a low level of optimization, those optimizations might not have happened. Then when I go to compile for the release build, I get a different answer; I get the wrong answer now. And it doesn't matter how many times I test it: if my testing build is not -O3, then I'm going to get the right answer in all of my tests. This is an extremely tricky bug to
find. This is one of those cases where, because the n plus two term depends on the n term and the n plus one term, I need to do things in order. I can't do two of these operations at the same time like I would in a vectorized situation. That means that for correctness in this case, I really shouldn't have destination as a restrict pointer. Now I get rid of that and I go back to the correct answers, even at -O3.

A good practice to follow when you have one of these situations, where you think you want to add restrict but there might be some situation where the pointers may be overlapping, is to just have two versions of the function: one that explicitly allows overlapping, and one with the restrict keyword. Then just make sure that you use the correct one in your function. If I just call the vector add that allows overlapping, then there are no more issues with the restrict pointer, and I can still use the restrict pointer version for the majority of cases where my pointers are not overlapping.

If you've ever heard of memcpy or memmove, this is exactly the strategy that these two functions use. They essentially do the same thing: you have a source pointer and a destination pointer, and you're going to copy n bytes from the source into the destination. The only difference between memcpy and memmove is that memcpy has the restrict pointers, so when you do a memcpy you're not allowed to pass overlapping source and destination regions, but for memmove you are. You can see how this might force the implementations to be different. In memmove you might have to copy into a temporary buffer and then copy the temporary buffer into the destination, but in memcpy, since there's no overlapping, you could potentially copy directly from the source to the destination.

So what's the deal with C++? How come restrict is allowed in C but it's not allowed in C++? There was a proposal around 2014 trying to pave the way to add restrict into C++, but there was a lot of pushback,
namely because restrict is really hard to make work with classes. How would restrict work with the this pointer in member functions? And what would it mean to mark a member variable restrict? There were just too many questions that didn't have really good answers, and it would have been a ton of work, so it just never really made it in.

However, I have deceived you just a little bit. If you're willing to move away from standard C++, meaning the C++ that's actually defined in the actual standards document, then there is a way for you to use restrict in C++. Currently every major compiler, including Microsoft's, Clang, and GCC, supports a use of the restrict keyword which is not part of standard C++. They support a language extension that allows you to use it anyway, and it seems like __restrict is the way that it's spelled. So if you change all of your restricts to __restrict, or for some compilers it's __restrict__, then you can actually use a version of restrict in your C++ code.

However, the version of restrict that you get from your compiler may be different from the version of restrict that you get from a different compiler. They may work in different situations: some may support references and others not, some may support certain optimizations and others not. And for the actual semantics of what it means to use restrict, you now have to dig into your compiler manual and figure out exactly what the compiler is guaranteeing and what you're promising to the compiler when you use the restrict keyword. It's not as simple as in C. But if you're really trying to squeeze that last ounce of performance out of your compiler, it might be worth it to make code that's not portable to all of C++ and is only supported by that one compiler. That's something that's done a lot in real-world applications if you really, really need speed. But for most cases I'm guessing you're probably not going
to need it.

Hey everyone, thanks for watching. I know that was a really technical one, and I'm not even sure how much my audience knows C and C++, but I think there are enough of you in there that it was worth making the video. So I hope you appreciated this little technical bit of C. As always, thank you to my patrons and to my donors for supporting me and allowing me to make more of these videos; I really appreciate your support. Lastly, if you enjoyed the video, don't forget to like and comment, and if you especially liked the video, please consider subscribing or becoming one of my patrons. Thanks, and see you next time.
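The captions describe the first on-screen example (three pointers x, y, and amount, an add followed by a subtract) without reproducing its source. The following is a minimal sketch of what such a function might look like; the function names, the int element type, and the exact body are assumptions, not the code from the video. It also shows why the compiler must reload *amount when restrict is absent: aliased arguments legitimately change the result.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reconstruction; names and body are assumptions.
   Without restrict, x and amount may point to the same int, so the
   store through x can change *amount and it must be read again. */
void update(int *x, int *y, int *amount) {
    assert(x != NULL && y != NULL && amount != NULL);
    *x += *amount;
    *y -= *amount; /* second read of *amount is required here */
}

/* Same body, but the caller promises that the object modified through x
   is not accessed through any other pointer, so a compiler is allowed
   to reuse the value of *amount it already loaded. Calling this with
   x == amount would be undefined behavior. */
void update_restrict(int *restrict x, int *y, int *amount) {
    assert(x != NULL && y != NULL && amount != NULL);
    *x += *amount;
    *y -= *amount;
}
```

With distinct arguments both versions behave identically. Calling update(&a, &b, &a) is legal and doubles a before subtracting the doubled value from b, which is exactly the case the non-restrict assembly has to handle.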
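The vector addition, the Fibonacci bug, and the "two versions of the function" advice can be sketched together. This is an illustration under stated assumptions rather than the code shown in the video: the function names, the int element type, and the starting values 0 and 1 are all hypothetical. Only the destination is marked restrict, as described in the captions.

```c
#include <assert.h>
#include <stddef.h>

/* Fast path: the caller promises dest does not overlap src1 or src2,
   which lets the compiler vectorize without a runtime overlap check. */
void vector_add(int *restrict dest, const int *src1, const int *src2,
                size_t n) {
    assert(n == 0 || (dest != NULL && src1 != NULL && src2 != NULL));
    for (size_t i = 0; i < n; ++i) {
        dest[i] = src1[i] + src2[i];
    }
}

/* Overlap-tolerant path: no restrict, so the result must match
   executing the loop strictly in order, one element at a time. */
void vector_add_overlap(int *dest, const int *src1, const int *src2,
                        size_t n) {
    assert(n == 0 || (dest != NULL && src1 != NULL && src2 != NULL));
    for (size_t i = 0; i < n; ++i) {
        dest[i] = src1[i] + src2[i];
    }
}

/* Fills dest with the first n Fibonacci numbers, assuming the sequence
   starts 0, 1. The destination overlaps both sources here, so only the
   overlap-tolerant version is correct; calling vector_add instead would
   break the restrict promise and be undefined behavior. */
void fibonacci(int *dest, size_t n) {
    assert(n == 0 || dest != NULL);
    if (n > 0) dest[0] = 0;
    if (n > 1) dest[1] = 1;
    if (n > 2) vector_add_overlap(dest + 2, dest, dest + 1, n - 2);
}
```

Keeping both entry points means ordinary callers with disjoint arrays still get the restrict version, while the one caller that needs in-order, overlapping semantics states that need explicitly by name.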
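The memcpy and memmove pairing mentioned in the captions is visible in their standard C declarations: memcpy takes restrict-qualified pointers (void *memcpy(void *restrict dest, const void *restrict src, size_t n)), while memmove does not. A small sketch of choosing between them follows; the helper names shift_right and copy_disjoint are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: moves the first len - k bytes of buf right by k
   positions within the same buffer. Source and destination overlap,
   so memmove is the correct call; memcpy here would be undefined. */
void shift_right(char *buf, size_t len, size_t k) {
    assert(buf != NULL && k <= len);
    memmove(buf + k, buf, len - k);
}

/* Hypothetical helper: copies n bytes between buffers that the caller
   guarantees are disjoint, so memcpy's restrict contract is satisfied. */
void copy_disjoint(char *restrict dest, const char *restrict src, size_t n) {
    assert(n == 0 || (dest != NULL && src != NULL));
    memcpy(dest, src, n);
}
```

For example, applying shift_right with len 6 and k 2 to a buffer holding "abcdef" copies "abcd" over positions 2 through 5 and leaves "ababcd".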
Info
Channel: mCoding
Views: 78,979
Rating: 4.9832335 out of 5
Keywords: C++, C programming, restrict keyword, restrict pointer
Id: TBGu3NNpF1Q
Length: 13min 16sec (796 seconds)
Published: Sat Aug 14 2021