Python 2 vs 3 for Binary Exploitation Scripts

Captions
So the other day somebody wrote me on Twitter and told me that they were working on a buffer overflow challenge but had run into some issues. Luckily the person sent me a screenshot, so I was able to see what was wrong. It's again one of these annoying issues where you might not think that that's the cause, but that's why providing the technical information and sharing the screenshots and details matters: it shows the problem immediately. I thought that's perfect to show you a typical pitfall that you might run into, which can be very, very confusing when you are just starting out, and we can combine this with some additional tips and tricks for working with Python when you develop these kinds of exploits.

The person was telling me that some values appeared on the stack but some others didn't. Then I looked at this; the person highlighted the issue here. To the left you can see a binary string. It looks like a typical stack address, I guess, at least without ASLR. It's hex encoded, so you would hope you get the raw bytes out. To the right you see output from gdb, including the stack, and you can somewhat see the values. Of course you have to look at the proper endianness here: this is interpreted as an integer, and here are the raw bytes. So here's the newline, which is fine, since print adds a newline at the end. Then comes the 00, the 7f, but then you would expect ff, and it's not there. You see here bf c3, bf c3.

When I saw this, it immediately clicked for me. I once made a video about recognizing patterns, saying that you should train yourself to identify and recognize patterns, and this is actually one of those patterns: when I look at it, I immediately know what the problem is. We have a Unicode encoding problem here.

So here I'm trying to print the raw byte ff. Obviously this is not a printable character, and we see a question mark. Let's pipe this output into hexdump -C to see what the raw bytes are, and we can see we get
the ff as well as the newline from the print. But this was Python 2. Let's see what happens when we try Python 3: c3 bf. And this is exactly what we saw here, bf c3 or c3 bf. Let's look at this without the hexdump. Look, it's a special Unicode character, like a y with dots on top of it, I don't know.

Let's actually turn this string into a byte string by prefixing the quotes with a b. Maybe then we can use raw bytes? And yes, that's kind of true, but see what happens when we print this. It might look right at first, but if you pipe it into hexdump -C you see that it literally printed that text and not the raw byte. This is probably the main reason why so many people still use Python 2 for exploit development, because something like this is just so annoying. We don't want to deal with all this encoding stuff, we just want to deal with raw bytes. Of course you can do this in Python 3, it's just a little bit easier and more quick-and-dirty with Python 2.

But let me also show you how to do that in Python 3. Of course I have no clue how to do this, so I just googled it, and it was one of the first results. Why do you even ask me? Just google it yourself. You can see here that you need to use the sys module and then access stdout directly and write to it. That's actually still pretty good advice.

Let me show you what sys.stdout is useful for even in Python 2. Here I'm using Python 2. I imported sys, then I access the stdout of this Python program with sys.stdout, and then I directly write the raw byte string ff to it, and we get that. When you compare this to print, you can also see that we now don't get the newline, because print always adds a newline at the end. If you access stdout and write to it directly, it will only write what you want. So be a bit careful with print, and when in doubt use sys.stdout.write directly. Let's see if this also works with Python 3. We try
to write the string ff, but again it got converted. And if you try to write the byte string, then you actually get an error, because write expects a string, and then all this automatic conversion happens. So what you actually have to do is take the byte string but write to the buffer of stdout (sys.stdout.buffer). I don't know why, it's just how it is. And here it works: now you are able to write the raw byte string ff. I mean, if you write an exploit you can just write yourself a quick function so that you don't have to type the whole thing, but you can see how annoying Python 3 is in that regard.

But oftentimes you don't directly output to stdout anyway. In a typical CTF challenge you interact with a socket, so you send raw bytes back and forth over a socket connection, and there you can work with bytes anyway. You can see here in the Python 3 socket documentation that socket.send expects bytes as a parameter, so there you can just send a raw byte string. All good.

A very typical programming issue when you work with Python 3 is mixing regular strings with byte strings. So here's a regular string, ABCF (I don't know why I went directly to F instead of D, it is what it is), and then we have a byte string here, prefixed with the b. That doesn't work. So how can you deal with this? First of all, you can convert a regular string to a byte string by calling encode on that string, and you can say that the string is a UTF-8 string, so you want to encode it into the raw bytes that represent this string in UTF-8, and then you get the byte string. The inverse is decode: you can call decode on the byte string and say, please interpret these raw bytes as ASCII characters and give me a string that represents them. And of course if you have a byte string that, for example, contains hex ff and you try to decode this as ASCII, you get an error, because ASCII doesn't go up to ff; ASCII only goes to hex 7f. So you would, for example, have to decode it as UTF-8 in
that case. OK, that also doesn't work. You basically want to parse this byte string as if it is a UTF-8 sequence, but it's not a valid UTF-8 sequence: the utf-8 codec can't decode the byte ff because it's an invalid start byte. This is so damn annoying with Python 3, especially if you do this low-level stuff. Generally, don't work with strings, OK? Always work with byte strings. Especially when you work with these raw bytes it makes no sense to work with strings. Always work with raw byte strings.

Let's cover some other Python basics that you should know. Generally, when working with Python it really makes sense to have a virtual environment. Just a quick reminder, I mentioned this in another video of mine, I linked it up there: this is not creating a virtual machine or anything. It just creates isolated Python environments, which means the dependencies of different projects don't interfere with each other. You can run your project in a specific folder and know that whatever you install for that project won't affect the whole system or the other projects. To set this up, simply google how to set up virtualenv plus whatever you have, Mac, Windows, Linux or whatever.

So let's quickly do this. First I am installing python3-pip. Why is this so slow? What the heck, what's happening? Whoa, this whole machine is kind of frozen. What is it doing? God damn, I hate computers. Let's kill this. There we go. So pip is the package manager for Python modules, so let's install virtualenv via this package manager. You just look up how to set this up on your system, OK? I'm just quickly making this work here. Anyway, now we have virtualenv.

So let's make a project and create the virtual environment. I called it venv, then it's always clear. You can see that it created this folder venv, and in here you can find a typical Python environment structure with bin, include and lib. And now
let's activate this virtual environment, which maybe looks a bit weird: just a dot, and then you call the activate script in venv/bin. Now you can even see that the particular shell I'm using indicates that you are inside a virtual environment. The path is now set up to use the modules that are in here rather than the modules that are globally installed on this machine. When you now execute python it defaults to Python 3, because you are now using the Python that is included in that bin folder. And now you can do pip install, for example requests, because that is a very useful Python module, and you can import it in your programs.

But when I open a new tab here, even in the same folder, and execute python, see, I still use the regular system Python, and of course if I try to import requests it doesn't work: there's no module requests. So I can enter that environment again, which basically just sets up all the paths correctly, and when I now go into python I'm in the Python 3 environment and I can import requests. This is how you can separate your projects. To get out of a virtual environment you can just type deactivate.

Let's create a second project to show you something else. Let me create a different Python environment. If I check which versions I have installed by typing python and pressing tab, I see, for example, that I have Python 2.7 installed. We can use -p to specify whatever Python runtime you want and then create the virtual environment, and it sets everything up to use that Python interpreter. It's now using Python 2.7. We go into that environment, which sets up all the paths, and when we now use python we use Python 2.7. Now we have a 2.7 environment and you can install all the Python 2.7 modules that you need. Here I also installed requests for Python 2.7. So here I've just opened that project 2 folder in Visual Studio, just to show you
quickly the hierarchy in here. I installed the requests module with pip install requests, and like I said it's a typical Python project setup, so we can go into lib, then python2.7, and then site-packages. This is where all the installed modules get placed. You can find the requests module in here, so you can find the code. I guess it relies on get and post, so get and post are in separate packages here. So here is get, and it relies on query string. OK, this is a little bit of an annoying module, and then this relies on, whatever. You can see all the source code of the modules that you have installed here. This is also very useful if you run into, for example, an error message that you don't understand and Google is maybe not very helpful. You can just go in here, and you can even modify stuff: you can, for example, add debugging printfs, and you can look at what a function expects as parameters and all that stuff. Don't be scared to look into the source code of these modules. Sometimes, if you can't figure out how to use one, reading the source code and debugging it a little bit with some additional prints can be very, very helpful.

Now, I know virtualenv is not the cool thing nowadays anymore. I guess a lot of people are nowadays using pyenv, Simple Python Version Management. It's probably nicer, I don't know. I got used to virtualenv. I'm just letting you know this also exists; maybe go straight to this instead of whatever I use. Use whatever works for you, I don't really care. It works fine for me.

A lot of Python programs also include a requirements.txt. You might notice that when you pull something from GitHub, and it makes running Python projects you find online very easy: you clone the GitHub repository, you create your local virtual environment like I showed you before, with virtualenv or pyenv, and then you can simply call pip install -r and pass in the requirements.txt file, and it will install everything
you want. And let's say you created your own project and installed all the dependencies: you can write out all your dependencies with pip freeze and write that into the requirements.txt. So that's basically how you work with Python.

Then I should also mention pwntools, which is a Python module, specifically a library for exploit development. It has a lot of awesome features. I typically don't use it in my videos, mainly because I don't want to have this problem of making sure everybody sets up all the dependencies and all that stuff. It is extremely awesome and extremely helpful, and you are super fast with it. A lot of stuff that I code by hand is already done, and better implemented, in pwntools, but it's easier to follow along if I don't use a dependency like this. It's really useful for you to set it up and learn pwntools, though. There are a lot of different examples in here, and a lot of the solution scripts that you can find in write-ups use pwntools. If you ever see this line here, from pwn import *, then you know it uses pwntools. It's also why you shouldn't name your script pwn.py, because then, if you are relying on pwntools, you import your own local file instead of the module, and it creates issues.

Probably one of the most important Python modules that you will be using a lot, especially in binary exploitation, is the struct module. struct can unpack raw byte strings into integers, or pack integers back into raw byte strings. It can do a bit more than that, but this is the basic functionality that we are using. We can also specify the endianness. If you, for example, want to pack the integer hex 1234 into a 64-bit string, we can do it like that, and here you can see the raw byte output. You can control the endianness with greater-than or less-than signs, so here I flipped the endianness: you can see the byte order is reversed with a greater-than sign in front. Q is 64-bit, or 8
bytes; capital I, for example, is an unsigned integer; capital H is 2 bytes and capital B is a single byte. Obviously this causes an error, because you can't pack this two-byte value into a single byte. For this, please always consult the documentation, it's all in there. You can define the byte order here, the less-than and greater-than signs that I showed you, for little- or big-endian; that's the most important part. And then here you can see the different data types that it supports. The most important for us, I guess, are unsigned char, which is just a raw byte, 1 byte, capital B; unsigned short, 2 bytes, capital H; unsigned int, 4 bytes, 32-bit, capital I; and nowadays, because you do a lot of 64-bit binary stuff, you might often find yourself using capital Q, which is an 8-byte unsigned integer. Of course there are a couple of other things, you can even work with floats and doubles and char strings, but often you only deal with this integer conversion.

The reverse works similarly. We need to specify the type that we want to unpack, and we say it's a 64-bit, or 8-byte, string. Let's just use this from earlier and try it out. You can see it returns a tuple; to access the first element of the tuple we can just index it with 0. But this is a super large number, which is not what we expected. Let's look at this in hex, and you can see 1234 here, so the endianness is a bit screwed up. Of course you can again specify it: you say, these raw bytes are big-endian, so please interpret them as a big-endian integer, and now it's correctly interpreted as 1234 in hex.

It's probably also worth mentioning, and shouting out, Jupyter notebooks. A lot of people work with them, mostly in academia, for example, when you do data science or whatever, but they can also be very useful for CTFs in general when you are being very explorative with your scripts. You don't know exactly what your script has to do yet, and then
you can use a Jupyter notebook to slowly explore. Here I'm just running an online test version, but you can also install this locally. I don't want to go into how exactly this works, but basically you have a Python environment here. I can, for example, assign a here, and here I have b is a plus one. You can write your Python code and execute it with Shift+Enter, and now it's executed. Now, for example, I could access b here, or I could access a in here, and I could go back and change it, oh, I wanted this to be a 3 instead, execute that again, execute that again, execute that again. So this is a very fun, explorative way to work with scripts. It has a lot of amazing features, with inline images and data visualization stuff. Jupyter notebooks, or IPython notebooks, are extremely powerful and are used heavily, especially in academic circles, but obviously they can also be very useful for CTFs, because it's a very cool environment to develop your small scripts and programs.

That's all I got for you today. I hope this kick-starts your Python stuff. Always remember: read the Python documentation and get comfortable with it. It's super important to learn how to read the Python documentation. Always pay attention to Python 2 versus Python 3. Obviously Python 2 is slowly getting deprecated, so you should be switching to Python 3; get comfortable with Python 3. But I fully understand that for a lot of these exploit scripts it's still just easier and nicer to work with Python 2, so be aware of that. If I see a single comment screaming at me that I still advocate for Python 2, shut up. Just kidding. But you know, I get it, I know it's getting deprecated and we should be moving to Python 3, but it's a bit annoying, as I have shown you with a few of the examples. It just makes it a little bit harder for beginners, and that's a little bit unfortunate. I hope this quick Python introduction helped you. You really don't need, like, a
book or a course or anything on this. Just practice, just use it, and I hope I gave you enough pointers to find the stuff that you need. [Music]
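The print versus sys.stdout behaviour described in the captions can be reproduced with a short Python 3 sketch. The helper name write_raw and the in-memory sink are illustrative assumptions, not code from the video, and the two-byte result for the text form assumes UTF-8 output.

```python
import io
import sys


def write_raw(data: bytes, stream=None) -> None:
    """Write bytes unchanged to a binary stream.

    Hypothetical helper: defaults to sys.stdout.buffer, the binary
    layer underneath the text-mode sys.stdout in Python 3.
    """
    if stream is None:
        stream = sys.stdout.buffer
    stream.write(data)
    stream.flush()


# Bytes that print("\xff") produces on a UTF-8 stdout (before the newline):
# the character U+00FF is encoded as two bytes.
text_form = "\xff".encode("utf-8")

# Bytes that print(b"\xff") produces (before the newline):
# the textual repr of the bytes object, not the raw byte.
repr_form = repr(b"\xff").encode("ascii")

# Writing through a binary stream keeps the single raw byte.
sink = io.BytesIO()
write_raw(b"\xff", sink)
```

In an actual exploit script you would call write_raw(payload) without a stream argument, so the bytes go to sys.stdout.buffer and can be piped into the target or into hexdump -C.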
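The string versus byte string pitfalls from the captions can be sketched as follows. The variable names and the can_decode helper are hypothetical; only encode, decode and the two exception types come from Python itself.

```python
text = "ABCF"       # regular (Unicode) string
raw = b"\xff\x7f"   # byte string

# Mixing str and bytes raises a TypeError in Python 3.
try:
    text + raw
    mixing_failed = False
except TypeError:
    mixing_failed = True

# encode(): str -> bytes, so both operands are byte strings.
payload = text.encode("utf-8") + raw

# decode(): bytes -> str, here interpreting the bytes as ASCII.
decoded = b"ABCF".decode("ascii")


def can_decode(data: bytes, codec: str) -> bool:
    """Return True if data is valid in the given codec (hypothetical helper)."""
    try:
        data.decode(codec)
        return True
    except UnicodeDecodeError:
        return False


# 0xff is outside ASCII (which ends at 0x7f) and is also
# not a valid start byte of a UTF-8 sequence.
ascii_ok = can_decode(b"\xff", "ascii")
utf8_ok = can_decode(b"\xff", "utf-8")
```

Both decode attempts fail, which is the practical argument for keeping exploit payloads as byte strings from start to finish.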
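A minimal sketch of the struct packing and unpacking walked through in the captions. It uses explicit "<" and ">" prefixes throughout, which is an assumption made for reproducibility; the video may have relied on the native byte order for the first call.

```python
import struct

value = 0x1234

# Pack as an 8-byte unsigned integer (Q) in both byte orders.
little = struct.pack("<Q", value)  # least significant byte first
big = struct.pack(">Q", value)     # most significant byte first

# Sizes of the format characters mentioned in the captions.
sizes = {fmt: struct.calcsize("<" + fmt) for fmt in "BHIQ"}

# A two-byte value does not fit into a single unsigned byte (B).
try:
    struct.pack("<B", value)
    fits_in_byte = True
except struct.error:
    fits_in_byte = False

# unpack() always returns a tuple, so index [0] for the integer.
# Reading big-endian bytes as little-endian gives a huge number;
# reading them with the matching byte order recovers the value.
wrong = struct.unpack("<Q", big)[0]
right = struct.unpack(">Q", big)[0]
```

For 64-bit targets on x86-64 the little-endian form ("<Q") is usually the one that ends up in a payload, since addresses are stored least significant byte first.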
Info
Channel: LiveOverflow
Views: 85,283
Keywords: Live Overflow, liveoverflow, hacking tutorial, how to hack, exploit tutorial
Id: FxNS-zSS7MQ
Length: 18min 42sec (1122 seconds)
Published: Thu Dec 19 2019