Manually Unpacking the UPX Packer [Malware Analysis Techniques]

Video Statistics and Information

Reddit Comments

Nice video! Thanks for sharing it

u/bf_jeje, Jan 08 2020 (2 upvotes)
Captions
How's it going, everyone? Kindred here, and in this video we're going to be doing some manual unpacking of the UPX packer. The reason I'm doing this video isn't that it's really needed in the community, because UPX already has a built-in unpacker within its binary. UPX has been around since the 90s, I think, so this thing has been analyzed and reversed hundreds of times.

The reason I'm doing this video is, number one, there's an associated blog post that goes into more detail about packing in general and UPX specifically. But the reason I chose UPX is that I think it's a great introductory packer: it's very simple, and it's representative of the basics of what packing and unpacking are at a generic level, and of how packers tend to be built. Obviously the more complex ones come with more caveats and more tricks, but at its core UPX represents what packing really is, so I think it's a great starting point.

Like I said, there's a blog post that goes along with this video; it'll be in the description. I'd highly recommend you read it before you watch the video, because it goes much deeper into what packing is at a conceptual level. I'm not going to dive into that too much here. I'll just give the generics of what packing is for those of you who aren't familiar.

Packing is basically a way to distribute an executable in either a compressed or an encrypted state. Technically, I think a tool that encrypts would be a crypter and not a packer, but I tend to treat them the same because the idea behind them is the same; only the functionality is different. There's a multitude of reasons why packers exist, for both legitimate and illegitimate purposes, but the fact of the matter
is that when you use a packer, you're taking your actual payload, the malicious data, and putting it in a compressed or encrypted state. That makes it much harder for defenders and automated software to get into the payload and detect malicious activity, because the only code that's sitting there readily readable is what's called the stub code, which isn't fundamentally malicious; it's just doing some manipulation of its own memory. So it makes things a little more difficult to reverse engineer.

With that, let's go ahead and start working on this. The sample I'm using is just a super small C program that prints out a string, sleeps for 10 seconds, and returns. I've already used the UPX binary to pack it, and I have the packed sample loaded into OllyDbg and also into Ghidra, so we have a debugger and a disassembler ready to help us. When it comes to analyzing packed binaries, the debugger tends to be a bit more useful, because the payload is going to be written somewhere into memory dynamically, and if you debug it properly you can get to a point in time at which that payload is fully readable and extractable by us.

So let's get started. I'm going to walk through the metadata around this particular binary and some of the indicators that let us know this is a UPX-packed binary. What I did was throw our binary into a tool called PEStudio, which does some very granular analysis of the PE headers to get more detailed information about the binary itself. In the blog post I go over some of the common indicators for packing. I'm not going to walk through every single one here; I'm just going to point out the ones that stand out the most to me. The first thing I'm going to check out is the imports. This is always something you want to check.
Notice that we only have, it looks like, four or five imports. That's very strange. For a Windows executable, even the simplest programs are going to import way more than this, because you need more than this to get real functionality; this is not enough to run an executable. LoadLibraryA and GetProcAddress are probably going to be doing some dynamic resolution of APIs; VirtualProtect and ExitProcess, again, are probably doing something with memory and things of that nature. But this is not nearly enough to run a real program, which tells me the binary we're dealing with is probably a packed sample: these imports are the only things needed by the stub code, and the rest of the libraries are going to be loaded in by the stub code dynamically. So that's something you always want to check: is the number of imports suspiciously low?

The other thing that definitely stands out is the sections. UPX renames the sections to UPX0, UPX1, and UPX2. That's a dead giveaway that you're dealing with UPX, but packers in general tend to rename sections, so it's a great indicator that you're dealing with something strange.

In terms of more detail, notice that our sections have writable and executable privileges. That's very strange. You rarely need a section to be both writable and executable, because a section being executable indicates that's where the code is and you're going to execute it, and it's very rare (I don't even know if it ever happens in legitimate software) that you're dynamically changing the code in the binary at runtime, which is what writable would indicate. So for some reason this binary is going to be changing the code within these sections at runtime, which is a little strange. So that's always a decent indicator as well. Another
thing to notice is that the entropy for UPX1 here is very high: 7.827. That means we're probably dealing with some sort of compressed data, and we'll look at this in a second. Entropy describes the patterns or predictability of data. Any sort of language, any means of communication, whether that's assembly code, the English language, whatever, always has some sort of pattern to it. But when you're dealing with compressed or encrypted data, there's rarely any visible pattern; it looks completely random. So the higher the entropy, the more likely you're dealing with compressed or encrypted data. So we probably have our compressed payload within this UPX1 section.

The last thing I want to mention in terms of sections: notice that the raw size of our UPX0 section is zero bytes, meaning that on disk right now there is absolutely nothing in this section. However, at runtime, when we load this into memory, notice that we're now allocating 77,000 bytes of memory, which is kind of strange. Why do I need to allocate this much memory to this section if there's nothing in it on disk? This is usually an indicator that UPX0 is where the payload is going to get decompressed and written to. So that's another decent indicator that you're dealing with a packed sample: sections that have very small raw sizes and, compared to the raw size, very large virtual sizes.

The last thing that can point you to this being a packed sample: if we look at the defined strings in Ghidra, there's very little. There's almost nothing besides our imports. This doesn't make sense, because if we look at the source code of the executable itself, we have a hard-coded string, "hello world this is a native C program", right? This should be
viewable in the strings output here in Ghidra, but it's not, because this string is part of the final payload, which is currently in a compressed state. So these are just some indicators that you're dealing with a potentially packed sample.

One thing I do want to mention, because I didn't explain this very well at the beginning (again, because there's a blog post, I'm not going into the generics of this stuff; I'm dealing more with the technical "how do we unpack it" aspect): UPX deals exclusively with compression. If we look at where I ran UPX to pack the sample, notice that our file size was reduced from 53,000 bytes to 30,000 bytes. This is what UPX is designed for: it's exclusively for compression. It doesn't do any sort of encryption or anything like that.

Like I said, the blog post has some decent graphics that might explain this better, but essentially what we're doing is taking our original payload, compressing it, and then sticking something called stub code on top of it. That stub code is responsible for decompressing the payload, writing it to a new space in memory, and then passing execution to the decompressed payload, which is your original executable, which will then run as intended.

With all that out of the way, we know we're dealing with UPX based on all the indicators I just discussed. There are a few more you could probably pull out of here, but that's enough for our purposes. What I want to do now is verify that these sections are doing what I expect them to do. What I mean is, I want to make sure that we have the compressed data in UPX1, and we also want to double-check that UPX0 is completely unallocated. Notice that the entry point of this executable is right here, which is in UPX1. That tells me that not only is the compressed data in
this UPX1 section, the stub code itself is also in UPX1. We can see that in Ghidra. I'm at the entry point right now; the entry point's right here, and we're in the UPX1 memory region. I think we can confirm that in here: the virtual address is 0x414000. We can go into Ghidra and see that we're at 0x417-something, which is still in the region. It's likely at the very bottom of the region, because notice that UPX2, the next section, starts at 0x418000. So we're still within this UPX1 region. This is the actual stub code; this is what's responsible for decompressing the payload and writing it to another area of memory.

Remember, we also said that because of the high entropy, we're expecting the compressed data to be in here as well. We can see that if we scroll up: notice that we have all of these weird opcodes that can't be deciphered as assembly code. So all this stuff is probably the compressed data we're dealing with. Notice that it's in the 0x414000 memory region, which means we're still in the UPX1 section. So all this is our compressed data, and at the very end of this section we have our stub code, which is going to decompress that data.

Another thing I'm going to look at real quick is UPX0. That's the memory region that doesn't have anything on disk but is allocated a bunch of memory. We can see that here: there's nothing in this region, just a bunch of literally nothingness. That memory space starts at 0x00401000, so let's definitely take note of this address, because it'll come into play later on.

So this confirms all of our hypotheses. We have the UPX1 section, which contains the entry point into the stub code. That stub code is going to decompress compressed data that's also stored in this section, and that decompressed data is going to be written into UPX0. Eventually, at runtime, execution is going to be passed to the decompressed payload stored in this UPX0 section.

All right, with all that in mind, let's jump into OllyDbg, because this is where we can actually start getting stuff done. Debugging and dynamically running this payload is the best way to go about unpacking it. What we're looking for is any point in the stub code at which execution gets passed into what we believe to be the decompressed data. Remember, we just went over this: we have this huge unallocated memory region in the 0x401000 space, and that's where we think the payload is going to be written.

Notice that in the stub code, all of our jumps are going into the 0x00417-something region, because this is the stub code. You'd expect all the jumps in this code to stay within this region, because that's the only area with actual code. Going back into Ghidra, we jump to the entry point, and there we go: that's the region we're dealing with, 0x417xxx. So all the jumps should be into this region. What we're looking for is any call or jump to a memory address contained within the UPX0 region, because that's where we think the decompressed payload is going to be written.

You can just eyeball this. There are probably easier ways to do it, but I like to eyeball it, to be honest. So we can take a look around, and right away I notice something strange: we have a jump to 0x004012D0. This doesn't make any sense, right? I'm going to put a breakpoint on this jump real quick with F2. This memory address makes absolutely no sense. We can follow it with the Enter key, and notice that it's completely unallocated. It's absolutely empty, because that's
this area right here. We could probably find it in here with a search or something, but I think you get the point: we're dealing with completely unallocated memory. So what's the point of jumping into it? It doesn't make any sense. What's going to happen is that all this stub code is eventually going to populate this memory with valid code, and we can actually watch that happen.

Since I just set a breakpoint on this jump, execution is going to run all the way up to this jump. All the stub code, or I think most of it (I don't know what all of this is doing; some of it might just be error handling or something), everything up here is going to run, and after it does, we should see the memory region we're eventually going to jump to get populated. So I'm going to hit Run. We hit the breakpoint; this is where we currently are. We can see EIP is 0x00417F0B; that's our memory address right here. And now if I hit Enter and follow this address, notice that it's now populated with code. This wasn't here before. 0x004012D0 is the memory address we just looked at, which was completely unallocated. So this is the payload getting decompressed and written to memory.

MSVCRT is part of the original payload. I don't actually know what MSVCRT is; I should probably research it. Also notice I'm seeing references to libgcc; that's because I used MinGW GCC to compile my executable. We can scroll down even more and see an ASCII string that says "hello world this is a native C program", which is what we saw in the printf statement. So it looks like we've successfully gotten to a point in execution at which the original payload is written into memory, and 0x004012D0 is the entry point into the actual payload. So we're at a point in execution where we're very happy, right? We've
gotten to where we need to be. So now what we have to do is dump this memory region out into its own independent executable, so we have the original payload by itself. To do that, OllyDbg has a plugin called OllyDump; we can just go ahead and hit "Dump debugged process". What this is going to do is create a new executable and set the entry point of that executable to our current instruction, which is 0x004012D0. That's going to work for us, because that's where the stub code jumps into the payload, so we can assume it's the entry point into the payload itself. It's also going to fix some offsets, rebuild the import table, and stuff like that; we'll let OllyDbg deal with that for us. So let's hit Dump, and I'm going to name the executable "sample dumped". With that, we should have our original payload here.

We can throw the dumped sample into Ghidra and see if it looks a little more realistic. Let's open it up and analyze it. (I don't know why my font randomly got reset; now it's super small, but that's okay.) The first thing I'm going to look at is the imports. If you look at the left side here, notice that we have way more imports. This is a lot more than we had before. Going back into PEStudio, this is the original packed sample: we had five imports, and now we have way more than five. So that's a good sign.

We can look at the defined strings: a lot more defined strings now. We have references to libgcc and MinGW, because again, that's what I used to compile this. We also see the hello world string right here, which is what we'd expect. We could also quickly jump into the code itself and see if it looks valid, but there's really no reason to, because we've already confirmed that this is the case. So I think we're good in that regard. It looks like we do have our original executable. I can go ahead
and try to run it, and we get our pop-up that says hello world. It sleeps for 10 seconds; it's working completely correctly. We can also run the original packed sample, and it operates exactly the same. So it's working properly, and now we have the original payload in its own independent executable.

So that's the process for manually unpacking UPX. Like I said, in practice you're just going to use the built-in unpacker that comes with UPX, but the reason I did this video is that I think it's a great introduction to what packing is and to the process and mindset you can use to go about unpacking. Hopefully that was interesting, hopefully you learned a little something, and have a good one.
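The section checks described in the captions (UPX-style names, zero raw size with a large virtual size, writable plus executable flags, high entropy, and a jump that leaves the stub's section) can be sketched in a few lines of Python. This is only an illustration, not the tooling used in the video: the `Section` class, the function names, and the 7.0 entropy threshold are assumptions made up for the example, and the addresses and sizes are the approximate values read out in the captions.

```python
import math
from collections import Counter
from dataclasses import dataclass

# PE section characteristic flags (values from the PE/COFF format).
IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_WRITE = 0x80000000


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: 0.0 (constant) up to 8.0 (uniform)."""
    if not data:
        return 0.0
    total = len(data)
    return sum(-(n / total) * math.log2(n / total) for n in Counter(data).values())


@dataclass
class Section:
    """Hypothetical container for one section's header fields and raw bytes."""
    name: str
    virtual_address: int
    raw_size: int
    virtual_size: int
    characteristics: int
    data: bytes = b""

    def contains(self, address: int) -> bool:
        """True if the address falls inside this section once mapped in memory."""
        return self.virtual_address <= address < self.virtual_address + self.virtual_size


def packing_indicators(section: Section, entropy_threshold: float = 7.0) -> list:
    """Return the names of the heuristics this section trips (threshold is assumed)."""
    hits = []
    if section.name.upper().startswith("UPX"):
        hits.append("upx_section_name")
    if section.raw_size == 0 and section.virtual_size > 0:
        hits.append("empty_on_disk")
    writable = section.characteristics & IMAGE_SCN_MEM_WRITE
    executable = section.characteristics & IMAGE_SCN_MEM_EXECUTE
    if writable and executable:
        hits.append("writable_and_executable")
    if shannon_entropy(section.data) >= entropy_threshold:
        hits.append("high_entropy")
    return hits


def leaves_stub(target: int, stub: Section, others: list) -> bool:
    """True if a jump target lies outside the stub's section but inside another one."""
    return not stub.contains(target) and any(s.contains(target) for s in others)


# Approximate values from the captions; the byte contents are synthetic stand-ins.
RWX = IMAGE_SCN_MEM_WRITE | IMAGE_SCN_MEM_EXECUTE
upx0 = Section("UPX0", 0x00401000, 0, 77000, RWX)
upx1 = Section("UPX1", 0x00414000, 4096, 0x4000, RWX, bytes(range(256)) * 16)
```

With these stand-in values, `packing_indicators(upx0)` flags the name, the empty-on-disk layout, and the permissions, and `leaves_stub(0x004012D0, upx1, [upx0])` is true for the jump singled out in the video. In practice the fields would come from a real PE parser rather than being typed in by hand.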
Info
Channel: Kindred Security
Views: 6,631
Rating: 4.8873239 out of 5
Keywords: kindredsec, kindred, malware, analysis, reverse, engineering, malware analysis, reverse engineering, virus, rootkit, root, kit, brute force, crypto, infection, attacker, cyber, cybersecurity, engineer, hackthebox, hack, hacking, hacker, pentest, capture, the, flag, htb, computer, trojan horse, executable, exe, portable, networking, traffic, backdoor, windows, oscp, google, chrome, password, batch, cmd, security, basic, c2, python, programming, basics, upx, unpacking, packer, crypter, tutorial, universal packer, ollydbg, ghidra
Id: jfuj0b3Ao1k
Length: 20min 24sec (1224 seconds)
Published: Mon Jan 06 2020