Stupid C++ Tricks: Most Dangerous C Functions (E02)

Video Statistics and Information

Captions
[Music] Are you ready? Because I'm about to show you the top 10 most dangerous functions in the C runtime library and what you should be calling instead so that you don't get pwned, complete with working code examples that we'll walk live in the debugger. [Music]

Hey, I'm Dave. Welcome to my shop. I'm Dave Plummer, a retired operating systems engineer from Microsoft, going back to the MS-DOS and Windows 95 days, and I've been coding in C and C++ for more than 30 years. Now that I'm finally getting good at it, I thought I'd take some time to show you a few of the cool and essential things I've learned along the way. Sometimes they'll be about performance, sometimes about style, and sometimes just ways to make your life a lot easier or stay up to date. Either way, I'm confident that you'll discover something new that you can integrate into your own C and C++ development. All that, plus the story of how and why one day at Microsoft everybody on the Windows team suddenly just stopped what they were doing and worked exclusively on fixing stuff in Windows XP Service Pack 2, which turned out awesome, as you might recall.

There were two times in my career at Microsoft where I witnessed a book tear through the collective hive mind of the developers in my area. I worked in a group that included the graphics and GDI teams, where the developers tended towards being both high-horsepower and a little esoteric. The first case I remember was a book called Snow Crash by Neal Stephenson in the mid-90s. Now, I'm too scared to read it myself, but I'm told it covers history, linguistics, anthropology, archaeology, religion, computer science, politics, cryptography, memetics, and philosophy. Big list. Some, like Xbox project co-founder J Allard, have cited the book as being highly influential, even making it mandatory reading for their Xbox developers. But it seemed like the book led several others to simply pull up tent stakes and leave Microsoft to pursue new dreams, dreams that Snow Crash had apparently instilled in
them, that either inspired them or haunted them. One person, Michael Abrash, was so moved by the networked 3D world and the Metaverse in Snow Crash that he left to try his hand at creating something in that direction, working with Carmack. The result was Quake. The game Second Life was also inspired by Snow Crash's Metaverse.

The other book was, perhaps surprisingly, a work of non-fiction. In the early 2000s many were discovering a favorite book of my own called Writing Secure Code by David LeBlanc and Michael Howard. Contained within was all the philosophical and practical information needed to take a developer from the days of Kernighan and Ritchie to the modern realities of Code Red and Slammer. Chapter 1 is literally the buffer overflow. The book focuses on some basic principles, such as never trusting user input, not trying to achieve security through obscurity, never mixing code and data, how to operate with the least possible privilege, and so on. It also looked at the day-to-day practicalities of buffer overruns and raw sockets. I was not alone. By the time they were done reading it, I assumed most any developer would be converted. Apparently it actually became required reading around the entire company at some point, and probably for good reason.

Once enough of us had the religion, it was time for a little "physician, heal thyself" as we turned our attention inwards towards Windows. Two things were now really clear. First, that security must be designed in from the beginning, much as it had been in the system kernel, not added on later in response to exploits being discovered. And second, that Windows had some catching up to do in that regard, because it was an older product intended for a different time. That's when we as a team collectively put the brakes on. No new features, no futuristic Longhorn work. For now, Windows XP SP2 would be everybody's mission, and its mission was security. That was achieved through a parallel approach of fixes to existing code and new security measures. Many had
been planned for Longhorn, but some things, like Data Execution Prevention, simply couldn't wait that long. We had to do them right then and make them available in the service pack. SP2 also included Windows Firewall, the removal of those raw sockets, the disabling of the ever-abused net send message system, and the addition of the Security Center applet.

For those of us on the digital front lines as developers, it meant going through the code to find and fix the most common security bugs, like potential buffer overflows. The vast majority of the time, the C runtime string functions were somehow to blame. Earlier on I referred to the days of Kernighan and Ritchie, by which I just meant a largely simpler time, where a connected system might have an attack surface amounting to a single dial-up modem. Today a laptop connecting to a hotspot in a coffee shop might have a hundred ports exposed to a billion IP addresses. It's not that the classic C runtime functions can't be used to write secure code, but it's very, very hard to do properly. They were conceived in a very different time with very different needs, and that's why we didn't even try to just patch things up or find potential exploits. Rather, we aimed to entirely remove and replace any use of the hazardous runtime functions. For XP SP2 I spent a good deal of my time reworking existing code to remove the use of the C runtime functions and replace them with something much better.

Now, if you think I'm going to say some fancy C string class is going to solve all your problems for you, well, one step at a time. We'll get to the string classes one day, but you've got to walk before you can run. So first let's make sure we know how to safely measure, copy, concatenate, input, and output a string in a manner that doesn't open up your code to the most basic buffer overruns. Today we're going to replace your C runtime functions with better C runtime functions, by which I really mean we're going to stop using a bunch of 1970s functions and replace them with
their 1990s alternatives. If a picture is worth a thousand words, then a line of code has got to be worth a few dozen at least, so let's bust out Visual Studio, where I've written an application that takes us through the top 10 most dangerous string functions in the C runtime and how to properly code their replacements, complete with working examples that we'll walk in the debugger.

Here's the app I've written to show which functions to replace and how to do it. Now, there may be other functions in the runtime that are also hazardous, but these are my personal top ten, and most of the time it'll come down to being able to give the compiler and the runtime functions the information they need to do things safely, like how long the buffer is that you're copying into. Think about that for a moment. For 40 years at least, people have been calling old functions like strcat. As you likely know, strcat concatenates one string onto the end of another. There's absolutely no mechanism to prevent you from running off the end of the buffer and into application memory. The onus is entirely on you as the programmer to, at a minimum, pre-test the lengths of the components to make sure they're all going to fit into the output buffer.

That's not usually what happens, though. Usually, it seems, programmers will apply computer science to the problem, and in this case computer science really amounts to picking what seems like a reasonable length and then doubling it for a safety margin. That might work when you're writing a recipe tracker for a relative, but if you're writing code that lives in some network-facing API, it's a recipe for disaster. As a foundational principle, you should assume that all user input data is malicious, that there's a clever hacker on the other end trying intentionally to compose strings that your code will choke on. Once you shift your mindset from "maybe there'll be a weird edge case and a recipe with a long name" to "maybe somebody will intentionally pipe the Oxford dictionary directly
into my zip code field," then you know you're going in the right direction.

The downfall of most of the runtime functions is this trust problem, in that they trust that the output buffer is already known to be large enough. Given that that's rarely known to be true, the C runtime has added new variants of the popular string functions that do accept length fields. You can generally identify the revised functions by an underscore-s at the end of their name, which perhaps you can remember as just meaning "safe." And so strlen becomes strnlen_s. Now maybe you're thinking: strlen, really? How can strlen be attacked? Well, at a minimum, the string itself might just be missing its null terminator. Without an upper bound on how long the string can be, the runtime would have no choice but to veer off into other memory as it ran off the end of the non-null-terminated string, a string which could be a memory-mapped file on my hundred-terabyte NAS, I suppose, so there are theoretical risks as well. The new version can be called with a null pointer and can be told the max possible buffer length, so it doesn't run off into space.

As for headers, the only one that you'll likely need to add is string.h, which contains most of the new string functions. I've also added the stdarg.h header, because I have a function that uses a variable number of arguments, and then two headers that allow me to modify and control the asserts that happen when a string fails validation. Next I've got the forward declarations of two functions to be found later in the file. TestVarArgs is just a helper demo function to test a string function, and TurnOffAsserts, as you can see, is the first thing that main calls. If we drop into it briefly, we can see that it does two things of note. First, it disables the Abort/Retry/Fail dialog that would normally pop up if a string failed validation, and then it sets a custom handler to be called whenever that does happen. That gives us the notification and control over how to
respond when a string operation just can't be completed. Let's say you're trying to concatenate two strings but they don't fit into the buffer. You could return zero or minus one, maybe, to indicate failure and agree on what that means, but it will also in this case fire the string validation failure handler. Normally, as I noted, that gives the user the old Retry/Fail dialog and then just terminates the program, and that's why we're going to change the default behavior.

Back in main, the program itself is pretty straightforward. First it declares an output buffer, szBuffer, that's pretty small: just 16 bytes including the terminator. Next I define two string constants, a short one that will fit into the buffer and then a long one that I know won't. We can use the long string to trigger validation failures to see how they work.

One of the benefits of the new strnlen_s function is that you can call it with a null pointer. That removes the need for the classic test where you use a ternary operator to return the length of the string, or zero if the pointer were null. It cleans up your code by pushing those tests down into the function. The compiler these days is smart enough to tell when you're doing something obviously boneheaded, like passing a null to a function like strnlen_s, so it would warn here, but I disable that warning, 6387, to prevent that.

And today's random tangential C++ trick is how to disable a warning temporarily the proper way. I'm absolutely not a fan of disabling warnings. In fact, I was the guy that took the entire Windows shell and made it compile at /W4. I guess I should tell you what that means: warning level four, and /WX, which means any warnings are treated as errors. So any warning, even a little one, even at warning level four, would break the build. Once we got it working that way, people had to be very careful, because if they made any boneheaded little errors that triggered warnings, it would actually stop the system from building. But of course it held us
to a higher standard, and I think it was a good thing to have done. I never disable a warning without a very legitimate reason, just like this specific case, and you'll notice that before I disable it, I save the old state of the warning by pushing its state onto the compiler's internal stack of states. As soon as my code is done, I then restore the previous state of the warning. I figure a lot of you may never have seen this done before, but this way I leave the warning in whatever state it was in before I got there: enabled if it was already enabled, and off if it wasn't. It's sort of a courtesy to other developers on a team, and even if you're working alone, you may not wish to have a random header file force the state of a warning from then on forward.

Our next function of interest is strcpy. Obviously the big problem with strcpy is that you could be copying a string to a buffer that's too small to contain it, and the standard C function will just happily trash whatever happens to be in memory after the buffer. In my example, you can see I pass the buffer, then the size of that buffer, and then the string to be copied. If the string is too long, the validation handler will fire. If all goes well, the function will return 0, or it can return constants for invalid and out of range. What's important to know is that if you pass a valid non-null pointer, it will never copy part of your string. If it fits, it will be copied, but if not, the output buffer is just terminated at character 0.
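
As a rough illustration of the all-or-nothing copy behavior just described, here is a minimal portable sketch. The names copy_all_or_nothing and demo_copy are hypothetical and not from the video. The sketch imitates the described return values (0 on success, an "invalid" code, an "out of range" code) using only standard C++, and it is an assumption-laden stand-in: unlike strcpy_s on the Microsoft runtime, it never calls a validation handler.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical stand-in for the behavior described for strcpy_s:
// copy the whole string or nothing at all, and report why via the return value.
int copy_all_or_nothing(char* dest, std::size_t destSize, const char* src)
{
    if (dest == nullptr || destSize == 0)
        return EINVAL;                      // nowhere to write
    if (src == nullptr) {
        dest[0] = '\0';
        return EINVAL;                      // nothing to read
    }

    // Bounded length scan: never looks past destSize characters of src.
    std::size_t len = 0;
    while (len < destSize && src[len] != '\0')
        ++len;

    if (len == destSize) {
        dest[0] = '\0';                     // too long: empty string, no partial copy
        return ERANGE;
    }

    std::memcpy(dest, src, len + 1);        // len characters plus the terminator
    return 0;
}

// Demo with a 16-byte buffer, the size used in the video.
void demo_copy()
{
    char szBuffer[16];

    int rc = copy_all_or_nothing(szBuffer, sizeof szBuffer, "Short");
    assert(rc == 0);
    std::printf("rc=%d buffer=\"%s\"\n", rc, szBuffer);

    rc = copy_all_or_nothing(szBuffer, sizeof szBuffer,
                             "This is a long string that will not fit");
    assert(rc == ERANGE);
    std::printf("rc=%d buffer=\"%s\"\n", rc, szBuffer);
}
```

Calling demo_copy() from main exercises both the fitting and the non-fitting case. The point of the design is that the caller only ever sees a complete string or an empty one, never a truncated fragment.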
In other words, it does not copy and truncate your string for you at the end of the buffer when it runs out of room. The beauty of the validation failure handler system is that the handlers are fired before any damage is done. Your system is still in a good state when they trigger. Unlike a stack canary, which might report damage after the fact, a validation handler fires because it has prevented a problem.

Why didn't they use exceptions? Well, some people think that using exceptions for validation is bad mojo, but either way that's up to you, because you can certainly throw an exception from the handler, which I tried here just for fun. With the handler simply throwing a negative one as a simple exception, my code in the try block can execute safe runtime string calls on questionable strings with impunity, and if anything goes wrong it will simply wind up in the catch handler, where you'd then request new input, or display an error, or do whatever is appropriate for the situation.

strcat is much like strcpy, except it starts writing at the end of the existing string in the buffer. strcat_s is the same as your regular strcat except for the addition of the buffer size parameter and some handy runtime checks. The runtime handler will be called if the source or destination pointers are null or if the output buffer is zero bytes long, and the handler will also be called if the string currently in the buffer isn't properly terminated. So as you can see, these functions go well beyond just accepting a length parameter: they do a fair bit of common-sense validation at runtime as well.

Our next function is a more complicated one: sprintf, for "string print formatted." It's a very hard function to wield correctly, because you don't usually have a quick calculation for how long the expanded output string could be when you're done. If it has substrings and format specifiers for numbers that could be of variable length, then the whole thing is really of variable length. So how do you know
what size of buffer to allocate, or whether it will fit in some buffer? snprintf_s doesn't really roll off the tongue, does it? Perhaps not, but it does provide a number of handy runtime checks. If you use any substrings with the %s specifier, it validates that those strings are non-null. It validates that the format string and the buffer pointer are non-null, and that the buffer size is neither 0 nor greater than RSIZE_MAX, which is the upper limit for these things. And by the way, RSIZE_MAX is basically max int, but with the largest unsigned value, so it's not really a useful bound you can rely on. It will also fire for any encoding errors during formatting, and finally, of course, it will fire the handler and stop the operation if the total space needed winds up being more than you provided. I think it's worth reiterating here that the safe string functions don't leave partial results behind in the event that the buffer isn't big enough. The old strncpy would leave behind unterminated results, which is about the worst case.

So now we move on to some less common functions that some of you may not even have used before, like makepath, which is Windows-specific because it includes a drive letter. You provide the drive letter, folder, file name, and extension, and it properly combines those to make a valid full path. The _s version allows you to specify the maximum output buffer size, and the validation handler will fire if it needs more space. I've seen a lot of code for paths that simply sets the buffer to MAX_PATH and then goes to town, but remember, any user-provided data could still be too long.

Next we have splitpath, which is the reverse. It takes a path like the one we just conveniently created and parses it back out into a drive letter, path, file spec, and extension. With the _s version you also provide the maximum buffer size of each component, so you wind up with nine arguments to this thing. It's a little verbose, but such is the price of safety.
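
The sizing question raised for sprintf (how long will the expanded output be, and will it fit?) can be sketched with the standard snprintf, which reports the length the full output would need. This is a swapped-in technique, not the snprintf_s call from the video. The helper names summary_length, format_summary, and demo_format are hypothetical, and the "discard on truncation" step is added here only to mimic the "no partial results" behavior described above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: how many characters (excluding the terminator) would
// "<name> has <count> items" need? A null buffer with size 0 asks snprintf
// to measure the output without writing anything.
int summary_length(const char* name, int count)
{
    if (name == nullptr)
        return -1;
    return std::snprintf(nullptr, 0, "%s has %d items", name, count);
}

// Hypothetical helper: format into buf, all or nothing.
bool format_summary(char* buf, std::size_t size, const char* name, int count)
{
    if (buf == nullptr || size == 0 || name == nullptr)
        return false;

    int needed = std::snprintf(buf, size, "%s has %d items", name, count);
    if (needed < 0 || static_cast<std::size_t>(needed) >= size) {
        buf[0] = '\0';      // snprintf truncated or failed: discard the partial result
        return false;
    }
    return true;
}

// Demo with a 16-byte buffer, the size used in the video.
void demo_format()
{
    char szBuffer[16];

    bool ok = format_summary(szBuffer, sizeof szBuffer, "Bob", 3);
    assert(ok);
    std::printf("fits: \"%s\"\n", szBuffer);

    ok = format_summary(szBuffer, sizeof szBuffer, "Alexander", 12345);
    assert(!ok);
    std::printf("too long, buffer now: \"%s\"\n", szBuffer);
}
```

Measuring first with summary_length lets a caller allocate exactly length + 1 bytes, while format_summary suits fixed buffers where a failure should be reported rather than papered over with a truncated string.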
sscanf, in case you've never stopped to think about it, stands for "string scan formatted." You provide a format string much like the printf function, but backwards, and it uses that to parse the input back out. We'll jump ahead right to the n version that's also safe, known as snscanf_s. We give it our input buffer, the maximum input length, and then a format string. For each argument that we hope to parse out of the input, we need to provide the appropriate variable address and size.

And finally, the most complicated of them all: vsnprintf_s. It stands for "variable string print formatted, safe." If you've never written a C function that takes a variable number of arguments, well, you're in for a bit of a wild ride here for a moment. I could likely do a whole episode on this topic, and probably should, but here's a rough summary. You include the stdarg.h header, and you indicate that your function takes a variable number of arguments by specifying the ellipsis as the last argument. As the author of the function, you access the incoming arguments by declaring a va_list object and calling the va_start helper on it with the name of the last fixed argument in your function signature, which in our case is the format argument. We want to pass the set of variable arguments off to a function like sprintf, but we need a version of that function that can accept a va_list of arguments, and that's where vsnprintf_s comes in. We provide the buffer, its size, the maximum number of characters that can be written into it, the format string, and then the variable set of arguments that accompany the format string. When vsnprintf_s is done with the arguments, we call va_end to clean up, and we're done.

But why would you ever do such a thing? Well, believe it or not, it's one of the first and handiest things I did when I started Windows development on my own side projects. I wrote a couple of functions, dprintf and printf, and they did printf-style output to a debug window or to a
message box with no extra work. Both were implemented using va_list args, and yeah, you could do it with a macro, I guess, but what fun would that be? If they're still a little fuzzy, however, don't feel bad. I'll revisit them in a future episode, so make sure that you're subscribed to the channel. But for today's purposes, all I really care about is that if you do call vsnprintf, you call the safe string version.

And finally, back in main, a simple example of how to call a safe version of gets, which adds a buffer size parameter and runtime validation. The validator will fire if the size is zero, if the string is a null pointer, or if it's read all the way to n minus one without hitting the end of the line or file yet.

Let's fire up the debugger and take a quick walk through the code to see it in action. All right, we're now recording in OBS Studio, so I want to see if I can make this look like a fairly seamless transition. Normally I record with an Atomos Ninja V recorder right off the camera, but now I'm going through OBS Studio in order to get the coding. So if I do this, it should drop me into the code editor, and from here I can go back up to the top of the file.

The first thing that main does is it goes in and turns off asserts by calling _CrtSetReportMode, with the debug mode set to be 0 for the assert. Next we set the invalid parameter handler to be our parameter validation failure handler. Here we see the buffer and the two constants. Then our first call, the strlen, worked: we came back with the proper 4D hex. Now we're going to call strnlen_s with a null pointer to make sure that works, which it does. It did not fire. Oh, we should put a breakpoint in the handler. Let's do that. Oh, there already was one, so we know it didn't hit. Now this one should hit, however, because what we're going to do is copy the long string into the buffer. The buffer is too small, so the moment that happens, boom, here we are. We've printed out the information somewhere. Presumably it's small, but there it is. When we throw the exception,
we're back out in the catch handler, where we print that we got an exception. Let's check that that came out: "caught an exception." There we go. By throwing an exception within the parameter validation handler, we're able to handle it really just as a standard try/except block.

szBuffer is currently empty, and we're going to concatenate "abc" into it, which should give us "abc" as the whole buffer, which it does. Now we're going to print the short string into szBuffer, which gives us the short string in szBuffer. Now we're going to make a "C, foo, bar, txt" path. It's going to be on drive C, in folder foo, with the name bar and the extension txt. We're giving it the size of szBuffer, and if you notice, I stopped putting parentheses around things that I'm taking the sizeof, at least for simple variables and non-complicated things. It kind of just cleans it up. I like it this way. I've been putting parentheses around my sizeof parameters for 30 years, I'm sure, but today I changed. There we go, szBuffer looks correct. Now we've got a valid path in there: C, foo, bar, txt. We're going to split that. These are all empty. We're going to split it into those by calling splitpath. When we do so, we get drive C, folder foo, file bar, extension txt, just as we wanted.

Now we're going to sscanf and try to just pull a simple string from szBuffer into szFolderOut, and we have to tell it the size as well. When we do that, szFolderOut now contains the string we scanned out of it. A slightly more complicated case: we have the string "This is a long string," et cetera, et cetera. We're going to parse four words out of that by doing %s %s %s %s with spaces in between, and what we get is "This," "is," "a," "long," just as we would expect.

Now here's the trickier one. We're going to call our TestVarArgs with szBuffer and the size of the buffer. Here's the format, which is the last fixed parameter, and then here is our variable list, which is only a single parameter in this case to keep it simple. We're just passing in the string variable to be used
by the format string. If we walk into this code, we can see we begin with args. It's a system type we don't get to see inside, but we'll now print whatever came in as arguments into the buffer. We don't even know what came in; we're relying on the caller to have passed things in. But the buffer could be any size, so we tell it what size it actually is, and it's able to not blow up. When we're done with the arguments, we call va_end and we're done. Oh, put in a string: "input string," press Enter, and we're done.

I judge the response to an episode like this in terms of the likes, comments, and, most important, new subscribers. In fact, I'm really just in this for the subs and likes, so please be sure to leave me one of each before you go. Thanks for joining me out here in the shop today for Stupid C++ Tricks, volume two. I'd be especially grateful if you'd also give me a couple of C++ topic suggestions. What is it you've always been genuinely curious about and wanted to know and understand better in C++, that you thought maybe your knowledge was a little shaky about? Let me know in the comments. In the meantime, in between time, I hope to see you next time, right here in Dave's Garage.

The other book was, perhaps surprisingly, a work of nonfiction. In the early 2000s many were then discovering a favorite book of my own. Yes, I knew of it first. No, I didn't. It's called Writing Secure Code by David LeBlanc and Michael Howard. Gotta grab a microphone, almost forgot. Today in Dave's Garage: the top 10 most dangerous functions in the C runtime and how to properly code their replacements, complete with working examples that we'll walk in the debugger. The new version can be called with a null pointer and it can be told the max possible bus... for a buster... buster length. The handler will also be called if the buffer still is never... For each argument we hope to parse out of the input, we need to provide the appropriate variable size or... Normally, as I noted, that gives the user the old Retry/Fail dialog and then
just terminates the program. Back in main...
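
The variable-argument pattern from the walkthrough (declare a va_list, call va_start with the last fixed parameter, hand the list to a v-prefixed formatting function, then call va_end) can be sketched as follows. This is a minimal sketch under stated assumptions: it uses the standard vsnprintf rather than the vsnprintf_s call shown in the video, and format_into and demo_varargs are hypothetical names, not the video's TestVarArgs. Emptying the buffer on truncation is added here to mimic the "no partial results" behavior the video describes.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>

// Hypothetical printf-style wrapper that formats into a caller-supplied buffer.
// Returns the number of characters written, or -1 if the output did not fit.
int format_into(char* buf, std::size_t size, const char* format, ...)
{
    if (buf == nullptr || size == 0 || format == nullptr)
        return -1;

    va_list args;
    va_start(args, format);                 // format is the last fixed parameter
    int needed = std::vsnprintf(buf, size, format, args);
    va_end(args);                           // always pair va_start with va_end

    if (needed < 0 || static_cast<std::size_t>(needed) >= size) {
        buf[0] = '\0';                      // did not fit: leave an empty string
        return -1;
    }
    return needed;
}

// Demo with a 16-byte buffer, the size used in the video.
void demo_varargs()
{
    char szBuffer[16];

    int written = format_into(szBuffer, sizeof szBuffer, "%s-%d", "abc", 42);
    assert(written == 6);
    std::printf("%d chars: \"%s\"\n", written, szBuffer);

    written = format_into(szBuffer, sizeof szBuffer, "%s",
                          "This is a long string that will not fit");
    assert(written == -1);
    std::printf("result %d, buffer now: \"%s\"\n", written, szBuffer);
}
```

The same shape works for a debug-output helper like the ones mentioned in the video: format into a local buffer with the wrapper, then pass that buffer to whatever output call is appropriate.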
Info
Channel: Dave's Garage
Views: 88,997
Keywords: c++, tips and tricks, learning c, c programing, c++ programming, coding style, safe string functions, strlen, variable number of arguments, learn c++, computer science, learn c++ programming, learn c++ fast, learn c++ visual studio, learn c++ coding
Id: B9DouAlkZlc
Length: 23min 33sec (1413 seconds)
Published: Mon Aug 30 2021