CODE RED: The Billion Dollar Buffer Overflow & How it Worked

Captions
[Music] The infection was released into the wild just two months before the 9/11 attacks, on a Friday the 13th. Initially just a handful were infected, but the virus proved so contagious that within 10 minutes more than 75,000 were infected. Within hours the police would stop responding to 9-1-1 calls in the Seattle area, and shortly after that ATMs scattered around the nation, and then around the globe, would stop dispensing cash. Airline reservation systems were so overwhelmed that in some quarters panic began to erupt. Once infected, each new machine would join a zombie army, silently striving to infect as many others as possible. Once the worm had successfully spawned itself, the host would become useless, effectively rolling over and going to sleep. The only symptom of infection was a cryptic message claiming Chinese responsibility, but the silent countdown had already begun. The virus had been programmed to enter a cycle of infection, attack, and waiting. When the clock expired, the attack would begin. It would be a massive attack initiated by hundreds of thousands of the world's most powerful servers, hijacked from all corners of the globe. Each bot in the army would use all of its available network computing resources in a coordinated distributed denial of service attack aimed squarely at one family's home, a home located at 1600 Pennsylvania Avenue: the American White House. In the meantime the virus would continue to mutate, and a week later a new variant appeared that spread much more rapidly.

The White House took cyber security more seriously than most, and an astute administrator noticed suspicious activity early on. In the web server logs he found thousands upon thousands of requests from random servers all around the world, all looking for a file that simply did not exist. More about the ghost file later. He quickly brought it to the attention of two high-level security researchers at a company known as eEye Digital Security, who knew immediately that something serious was happening, 
but what, they could not yet tell. The research went late into the night, fueled by a caffeinated energy drink known as Mountain Dew Code Red, and that is why one of the most destructive computer viruses of all time, responsible for more than a billion dollars in damages, will be forever known as Code Red. It's but a single example of perhaps the most common security exploit in computer software: the buffer overrun. Today in Dave's Garage I'll not only tell you the story of Code Red, I'll explain in detail how a buffer overflow works, down to actually writing one and walking the code with you in the debugger so that you can watch the hijack in action and see it with your own eyes. [Music]

Hey, I'm Dave, welcome to my shop. I'm Dave Plummer, a retired operating systems engineer from Microsoft, going back to the MS-DOS and Windows 95 days. I've been coding in C and C++ for more than 30 years, and the buffer overrun bug that made Code Red possible is a classic problem for code written in either of those languages. Coming up, we'll drop into the code editor and I'll even show you how to write a buffer exploit in C, hack the stack, inject your own code, and watch it run live in the debugger. Just promise to use your powers for good. [Laughter]

To appreciate it, though, we first have to revisit some of the details of the buffer overrun attack itself, and so back to the story of Code Red. Within a week, on the 20th of July, those initial 75,000 infected servers had in fact grown to an estimated total of 360,000 servers. Because Code Red targeted an exploit in the server product known as Internet Information Services, or IIS, its victims were almost exclusively the web servers in large data centers or corporate offices running a company's public presence on the internet. That meant two things. It meant that bit by bit some of the internet itself started to disappear, as hundreds of thousands of the world's web servers simply stopped responding. But it also meant that the machines infected were often the biggest and 
most powerful systems of the day, the big iron of the web, if you will. In fact, client machines like Windows 95 were generally immune, as they simply lacked the IIS server software that contained the vulnerability being exploited. As a Windows NT guy, however, I was personally running and working on NT and Win2K systems, all of which contained IIS out of the box. But if there's anybody who's up to date with the latest fixes and patches, it's usually the developers working on the actual product, where you often define up-to-date as "have you installed this morning's build yet?" Among the updates that I already had, for example, was an update named MS01-033. It was boldly labeled by Microsoft as critical. The fix had been made available publicly by Microsoft about a month before Code Red hit the streets. The title of the update was sobering: "Unchecked Buffer in Index Server ISAPI Extension Could Enable Web Server Compromise."

This was in the days before automatic system updates, however, and the onus of performing regular updates therefore fell to the system administrators. Combined with a fairly casual approach to security in the halcyon days before Code Red and Slammer and so on, that meant very few systems were properly updated on a rigorous schedule. But how many machines would that leave without updates, unpatched and vulnerable to infection? There's a federal government agency known as the National Infrastructure Protection Center whose job it is to worry about such things. The NIPC, as it's known, was formerly a division of the FBI, and by the end of the summer they estimated that more than 650,000 unique server IPs had been infected.

Code Red was interesting in that each server kept track of any others that it had already infected and would go back periodically to refresh or re-infect those same machines. That meant that even if Code Red were removed from a machine, but the underlying exploit were not promptly patched and fixed, the server was likely to be reinfected again in short order. Code Red was 
very aggressive, spawning 300 threads per server to go about its nasty business of infecting others. That's unless the machine were set to Chinese, in which case it would use 600 threads. Even though Code Red randomly searched for IP addresses to infect, it turns out not to have been very random at all, as it used the same seed for the random number generator each time. That means the pseudo-random sequence was in fact always the same, and while this had the side effect of dutifully reinfecting some servers the second or third time, as I mentioned, it also wasted a lot of time and resources attacking servers that were already infected. But soon after the initial Code Red hit, it was followed by a new variant that chose a different seed each time. In so doing, the random IPs it probed were different each time, and all the resources could be invested towards new infections. As a result the new variant spread even faster. It was given the identifier CRv2 to distinguish it from the original CRv1.

Regardless of the variant, though, the Code Red worm operated in three distinct phases based on what calendar day of the month it was. On the 1st through the 19th of the month it would spend its time randomly searching for new hosts to infect. On the 20th through the 27th it would launch its denial of service attacks. On days 28 and higher it would rest. While trying to find new machines to infect, it did not discriminate, and it did not search for IIS in order to try to confirm the vulnerability was even present, or check a version number or anything. That's because it simply looked for a file called default.ida, and it didn't even matter that the file didn't exist. The mere act of asking the server about the file was enough to trigger the worm's attack vector on a vulnerable system by exploiting a buffer overrun in the IIS web server extension code.

All of this begs the question of just what a buffer overrun is and how it can be used to inject malicious code. Well, put most simply, imagine you have a web 
page that accepts a username and a password. As a visitor you type your answers into text boxes, and the server reads them and copies them into buffers in its own memory. Now let's say those buffers are each some reasonable length, like 64 characters. That seems plenty long for a typical username or password, but what if, due to a bug in the webpage, it allows much longer ones to be entered? What if some joker comes along and types a hundred letters into the username box? What happens to the extra characters beyond the 64th? Where do they go? That is the key. Once you run out of buffer, those extra characters, the payload if you will, run straight into whatever is in memory next. In many cases that memory will be on the stack, and a carefully constructed attack can even include code that is then executed. Heap attacks are also possible, but they're more complicated to execute. It's important to note that when I say the malicious payload code that gets stomped into memory will be executed, it is executed by the server, in the context of the server process. Your code has made the leap from web browser straight into the executing memory of the web server you're visiting, just like a true viral infection might leap between species.

But how can that even happen? To understand it, we must have an idea of how memory is laid out for a process. With the lowest addresses at the bottom and the highest at the top, let's look at a map of how a process is laid out in memory. Typically it starts out with the program code down at the bottom. In a secure system this area can't be modified once running. Above that in memory we have pre-initialized data, like built-in graphics, images, cursors, global tables you might have defined, that kind of thing. Above that we have uninitialized data, an area reserved for variables that will not be created until runtime. Above that is the program heap, where your mallocs and callocs and new objects come from. Everything else lower in memory is fixed in size, but the heap expands upward 
as you need more of it. Now, up at the top of memory is the stack. It's an area of memory that goes down as you add things to it. The fact that it goes downward in memory is a little counter-intuitive, but it's done that way because the heap is growing up from the other end. It simply puts them the furthest apart. The address spaces are big enough with the x86, and especially with the IA-64, that you'd likely run out of memory long before the heap actually reaches the stack.

The traditional metaphor for the stack is the good old stack of food trays in a cafeteria. It's a last in, first out affair, where when you add a new tray to the stack it becomes the first one that will be popped off next. With a computer stack, data can be pushed onto the stack, and much like the trays, items get popped off in reverse order. As a really simplified example, imagine the only thing the stack were used for were the return addresses of functions. Let's say that A called B: we push the return address in A onto the stack and we jump to B. Now B calls C: we push the return address for B onto the stack and jump to C. Perhaps similarly C calls D. Now when D returns, it pops the first thing off the stack that it finds, and that's the return address back in C. When C returns, it pops off the address in B. When B is complete, it pops off the original point back in A. Given how a stack works, it lends itself very naturally to nested function calls.

In reality it's slightly more complicated than that, for a couple of reasons. First, the function arguments are also pushed onto the stack by the caller before the return address is pushed on. Second, following the return address, a pointer to the previous frame is pushed onto the stack. And third, this is followed by space that's reserved for local variables. Finally, the function itself can make use of the stack as long as it cleans up anything it pushes on by popping it off before returning. Note that to create local variables the system merely reserves space on the stack by adjusting the 
stack pointer down. It doesn't really push them onto the stack; instead it just subtracts however many bytes it needs from the system stack pointer. The little patch of stack belonging to a function, containing its return address and local variables, is known as its stack frame. Now, you can access any part of your own frame, but no random access to stack memory outside of your own frame is typically allowed. The hardware doesn't prevent it, it's just kind of a convention. We, however, are going to break that rule. We're actually going to break a lot of rules, but that's what real hackers do: they're fast-talking rebels that play by nobody's rules, not even their own.

But first we'll look at what happens when you call a function. First, any arguments that need to be passed to the function are pushed onto the stack in right-to-left order. Finally, the instruction pointer, which indicates where execution should pick up when the function returns, is pushed onto the stack. And that's key, because we're going to cause our own code to execute by replacing the real return address of the current function with our own address. That way, when the function returns, the CPU will do what it always does: pop the address we came from off the stack and jump back to it. Little will it know that, like Folgers Crystals, we've secretly replaced the real address with our own, causing our malicious payload to be executed as soon as the function even tries to return.

To see this in action, let's write a simple C program that simply calls a function, that's all it does, and then I'll use a string buffer overrun inside that function to show you how to mess with the stack and inject our code. But first let's get the basic demo program running. If you'll give me a moment, I'll crank out a little program to do exactly what it is that we want. Here we have a basic C program with three functions. The entry point, main, is down at the bottom. It simply prints a message to let us know that it's running, and then it calls function 
one. Function one just copies a string to a buffer, prints a message confirming that it ran, and then returns. The program exits, and that's it, that's all it does. But notice there's another function, called function two. It's defined in the file as well, and it will represent our malicious payload. But nobody calls function two, and hence it should never run or execute. Our exploit will change all that. For now, when things are working properly, unmodified, if we run the program we get a console window and then the messages in the expected order: first the main function runs, then the message from function one is displayed, and then it exits.

Now comes the exploit. I'm going to overrun the buffer by giving strncpy too much source data to work with. Because of the way local variables are laid out on the stack, I know that right after the foo buffer of eight characters I'll be able to find the frame pointer, or EBP, because pushing EBP is the first thing that the function does. Above that is the return address that we came from, which is what we're really after. Now, as soon as we overrun the foo buffer by even four bytes, we will overwrite the copy of EBP on the stack. The next four bytes will overrun the return address on the stack. But what do I provide as the new hijacked value for the return address? I need to know the address of function two, because that's where I want it to go. To figure that out, let's run the program under the debugger and see where function two lives. When we run it under the debugger, I can pause execution right before the string copy call and inspect the address of function two, which turns out to be 0x100115E0. Thus, in little-endian reverse order, the bytes we want to stuff into memory are E0 15 01 10. 
Let's update the code to pass those into the string copy. We now have 16 bytes to work with. Let's run the program and stop it again at the breakpoint on the string copy call. I can use the debugger to see that the foo buffer is now located at 0x19FE88. With the memory window open in the debugger, I'll select those bytes with the cursor so you can see where the string is going to get copied to. The first eight bytes' worth, the ones I've selected, are completely legal; it's okay to write there. But the next four bytes comprise the 0 0 1 9 f e4 value of the EBP frame pointer that was pushed at function entry. The next four bytes, the 1001161C, form the return address. We're going to use our string copy to fill out the eight legal bytes, then overrun into EBP with the number 1 2 3 4, and then the magic part: the address of function two will be poked into the next four bytes, which we know are the return address. We can sanity check that by seeing what now lives at 1001161C, the address that is currently on the stack, and sure enough you can see that it's immediately after the call to function one, just as we'd expect. If we didn't change anything, it would return and then happily execute that xor instruction.

Let's see what happens when we do change things. As soon as the code calls string copy, note how the bytes in the memory window turn red to indicate that they've changed. Two more bytes than are colored red were actually written out, but because the values 01 and 10 stay the same before and after the call, they don't get highlighted by the debugger. But they all changed. Let's see what happens when we continue on to return out of the function. We normally would have been going back to main, but if our hackery has worked, execution will wind up in the payload code. We proceed step by step, and as soon as the ret instruction is executed, boom, there it is: we're at the top of function two. 
If I run the code, sure enough, the output confirms that the sinister payload code that should never run just ran. To keep things simpler, you'll notice I didn't preserve EBP, but that only matters if I return back out of function two. When I try that, it winds up in a loop of main, function one, and function two's sinister payload code executing in order forever.

Hopefully you now have a sense for how these can work, and you're no doubt surprised at how little code it took, particularly with the sample exploit being a single string copy line. It can't be that easy, can it? There must be some defense, and indeed with a modern system there are several layers of defense. The first is for programmers to follow the practice of banning these unsafe string functions, like I just used, from their coding lives: the ones that don't accept the buffer length, or that have undesirable or unsafe behavior otherwise. Microsoft has a list of these, and depending on the project settings, the compiler will even warn you or generate an error if you try to use any of the old unsafe functions. It's not just some weird paternalistic or idealistic notion. You really should upgrade to the modern alternatives just to avoid common frustrating bugs, even aside from the security implications of the old functions. I'll be looking at that whole issue in detail in a future episode of my Stupid C++ Tricks series, so please make sure that you're subscribed to the channel in order to see that, as well as other cool and interesting technical and historical material.

The next level of protection comes from Microsoft's DEP, for Data Execution Prevention, on Windows, which simply prevents running as code anything that's actually in a data segment. Let's say you managed to get your payload into a heap buffer and wanted to modify the return value of some function using a string exploit, just as we did earlier, to then jump to your code. With DEP on you'd be protected, because your CPU would not be allowed to run code from a page on the heap like 
that. Similarly, and perhaps more important, you can prohibit the execution of code on the stack.

Back in the olden days of coal mining, which is to say up until about 1986, the workers would be accompanied by a bird such as a canary, carried in a little cage and exposed to the same gases and vapors as the workers in the mine tunnels. If the air became toxic with carbon monoxide or other gases, the bird would become ill first and act as an advance signal that the air had become unsafe. This would allow the workers to be safely evacuated. That's the origin of the phrase "canary in the coal mine," and that's the reason why, when the compiler places guard blocks of bytes before and after your buffers, those blocks are known as stack canaries. If their value becomes disturbed by anyone or anything, the system knows that something has corrupted them and the application has become unstable. Stack canaries are great, except that the corruption isn't usually detected until the function tries to return, when they're inspected. The health of the canary is then verified, but the damage could have been done long ago, somewhere earlier in the same function.

In order to catch buffer overruns live, as soon as they happen, the compiler can enlist the processor's memory management unit, or MMU, in the fight. For each buffer, an entire page of memory is allocated before and after it. The hardware MMU is then instructed to deny writes to those pages, and any attempts to do so are caught right as the offending instruction is executing. It's nearly foolproof, but it consumes extra memory pages for each buffer. I'm no CPU talking smart guy, but if I were, I'd assume that there are limits to the number of entries you can have in the MMU's page table. Even if there were no practical limits, I've got to imagine that there's a performance impact of some kind involved with massively growing the MMU's translation lookaside buffer. Now, I've never used guard pages in a shipping product, but I have turned them on briefly for the heap 
rather than the stack in order to chase an elusive heap corruption bug once, long ago. It was very slow in operation, but it found the problem on, I think, like the first run that I tried it with.

The two questions we'd like to know the answer to next, though, are: how specifically did Code Red exploit a buffer overrun, and how did it get its payload onto the machine? Apparently Code Red exploited a bug in an IIS extension, where a component that wound up inspecting GET requests had a buffer overrun bug. Specifically, there was a completely unchecked buffer in idq.dll, which is an ISAPI extension related to the indexing portion of the IIS web server. When presented with a really long string for GET requests, such as the one shown on the screen here, the buffer is big enough to accommodate everything through about the last capital N. After that it's all carefully constructed binary payload. Much as I demonstrated in the debugger, the program counter was hijacked to cause the payload to execute as instructions that would serve to begin the infection and the propagation of the worm.

Meanwhile, back at the White House, the IT staff had to contend with a new onslaught of denial of service attacks that would begin like clockwork on the 20th of every month. They would last a week before they'd go dormant again, and each month's attack was larger than the month before it. The infection would continue to spread unchecked, like fire in a dry field, until a fellow named Kenneth Eichman discovered that, just like the dinosaurs of Jurassic Park, the virus's creators had engineered a lysine deficiency of sorts into the worm, except this one was sort of backwards. In the movie, the giant tyrant lizards would die out if not periodically fed a special nutrient, thereby hopefully preventing them from running amok without their creators' consent. Without the lysine supplement being intentionally fed to them, they would die out. Eichman discovered that Code Red worked in a similar fashion, but with a twist. One of the 
first things it did upon starting up was to look for a file on the C drive called, wait for it, notworm. If a file named notworm was present, and it didn't matter what it contained as long as the file was just there at all, the infection would go dormant. Once good old Ken discovered this backdoor method of shutting down the attacks, it became widely known that protecting a machine was as simple as creating a file by that name. That gave system administrators the breathing room needed to remove the malware and update their servers with the latest security patches. This discovery of how to shut down Code Red earned Eichman a trip to the White House.

Of course, they would still need to contend with the Slammer worm just two years later, and with any number of would-be exploits over the decades, but the important thing is that security efforts became more proactive. Rather than responding to attacks after they happened, administrators began keeping machines up to date as part of regular maintenance. Modern programming practices, combined with updated operating systems and server components written with security in mind from the beginning rather than tacked on as an afterthought, have combined to make the buffer overrun much less frequent than in years past. Zero-day exploits are becoming increasingly rare, and perhaps further and further apart, but they can never be eradicated entirely, and it only takes one. As the WannaCry ransomware attack of 2017 demonstrated, there really are no lengths to which a malicious actor will not go, up to and including encrypting all the data present on the machine, holding it for ransom, and even completely deleting it.

If you found this story entertaining, or if you learned anything from the buffer overrun explanation, please drop me a like on the video to let me know it was worthwhile. As you can see, my channel is still fairly small, so if you can share with a friend, please shoot them a link. If you're interested in this kind of content yourself, please make sure to 
subscribe to the channel. These days a subscription doesn't do much on YouTube in terms of content, but it does serve as a vote of confidence in the channel, and it makes me personally happy, since I'm really only in this for the subs and likes anyway, so please consider it. Thanks for joining me out here in the shop today. In the meantime, and in between time, I hope to see you next time right here in Dave's Garage.

[Outtakes] Let's go back and redo this part. "A new variant appeared that would spreed..." Spreed. It would spreed. Damn it. "They simply lacked the ISS..." "That meant that even if Code Red were removed from..." Just read the words that you wrote when you were thinking at the speed that you can type and think. "As a result the new variant spread even faster. It was given the identifier..." Identifier. "The first eight bytes' worth, the ones I've selected, are completely legal. We're allowed to write there. The next four break..." Ah, dan. "Boom, there it is." [Music] That's good enough. "Boom, there it is." "...two sinister payload pages in main, function one, and function two's..." Says payload. Uh, payload. No payload for me. Black door. I see a black door and I want to paint it red. "That's the origin of the phrase canary in the coal mine, and that's the reason when the compiler places..." I can't win, man. Too many leaf blowers. "The server was likely to be re-affected again in..." Re-effected. It's gonna be a re-affected. Don't know what that... [Music] Gotta get up in the means because I have to do it anyway.
Info
Channel: Dave's Garage
Views: 71,805
Rating: 4.981143 out of 5
Keywords: code red worm, buffer overflow, buffer overflow attack, ethical hacker, buffer overflows, ethical hacking, buffer overflow exploit, buffer overflow tutorial, c++, programmer, learn, virus, worm, attack, penetration testing, penetration tester, learn hacking, cyber security, buffer overrun, 2001, worm 2001
Id: 7YRyFMv-tY8
Length: 25min 49sec (1549 seconds)
Published: Fri Aug 20 2021