Live Coding A Squirrelwaffle Malware Config Extractor

All right, hello, I guess we're live now. [Laughter] As you guys know, I don't do this too often, so I'll see if I got everything set up right. Maybe somebody in the comments can let me know if the audio is okay. I'm flying blind here, no ears, so I'm not really sure. We'll just wait a little bit to make sure the stragglers jump on the stream.

What we're going to be doing today is talking about Squirrelwaffle, whatever the heck that is. What I think we can do here is pull up this Malware Traffic Analysis blog from Brad. This is the first time I heard of this thing called Squirrelwaffle, and it's a crazy name. I don't know if Brad just came up with that or if it actually is what the malware devs call it. Honestly, the only reason I looked at this is because it sounds hilarious, and somebody in our Discord was asking for some pointers on it, so I was like, what the heck is that, and I figured I'd come and take a look at it live. As we wait, oh yeah, a few more people are jumping in the stream. We'll get started in a minute.

Basically, this is a loader. You can see here Brad's done a nice job of showing us the execution chain. They have a maldoc, I think it has a macro with some VBScript in it, and then that downloads this Squirrelwaffle DLL, which is itself a downloader. So it's basically just a simple loader. Greetings from Kenya? Hey Kenya, shout out. When I was looking at it, this is actually a nice piece of malware to do a tutorial or live stream on because it's super simple, and there are lots of interesting tricks that we can show and demonstrate with a sample that's pretty easy to understand. That's why I thought it'd be nice to do this live. Basically, the purpose of this malware is to download and execute a final stage payload, so it is kind of a
dropper. I'm not sure. I saw someone in the chat saying the Squirrelwaffle thing comes from some strings that are Russian, saying "squirrel donut", so if that's true that's kind of funny. I guess that's where the name comes from.

Okay, I think there's probably enough of you here. We're almost at 10 past, so that's probably enough time to jump in. Basically, we're going to do everything static today, so we're going to just do it in IDA. It's a sample that's easy to statically reverse engineer; there are not a lot of components to it. But I did want to mention that I took a look at this blog and at the PCAP before I started analyzing it, so I had an idea of what the C2 URLs look like. You can see it in the image here: they're doing these POST requests with a URL string that looks like maybe a bunch of Base64-encoded characters or something. And it does a bunch of these different POSTs, so there are probably going to be a bunch of different C2 URLs that we're going to be looking for. Specifically, what we're going to be doing on stream is looking for the config and how to extract it statically in Python. We won't be reverse engineering the whole RAT, or sorry, the whole loader. That's something we can maybe do another time, or maybe I'll make a video on it. Today we'll be focused specifically on the config.

What else do we need to know about it? Oh, it's packed. You guys get to see my OBS magic here in a second. It's packed, and to get the sample I just unpacked it with UnpacMe. You can see here this is the parent that I uploaded and this is the child that it unpacked. The unpacked child has this loader export, and I'll show you in IDA when we open this up, that's where
the entry point for the code is. That's what we're going to be looking at. I saw there's a question in the chat: will I be uploading the configs directly to GitHub? Yes. I'll show you guys how this is going to be set up. I'm going to use our lab notes, which is a Jupyter notebook, and I'll just push the notes live to our GitHub afterwards so you guys can download them and use them, no problem.

Okay, let me try this OBS thing. I really want to try it, I set it up last night. Let me see if this works. "UnpacMe, a malware unpacking service from OpenAnalysis. Expose the malware before it exposes it." [Music] I don't know if that worked or not. I get a kick out of that every time I hear it. I've heard it like a million times and it still makes me laugh. Anyway, that's where we got our sample, it was from UnpacMe, and I've already dropped it over to my desktop here. It's under Documents, malware, squirrel something, Squirrelwaffle, yeah. So we'll be looking at this here, and that's the unpacked child. I'll also throw this on MalShare and put a link in the description below the video, so if you guys don't want to open an UnpacMe account or whatever, just go over to MalShare and you can download the unpacked file and play along from home.

Okay, with that, let's open this up in IDA and see what we've got. Documents, malware, Squirrelwaffle, okay. Let it do its thing here. And of course, like I said, there are two exports here. There's the entry point, and you can see the entry point doesn't really do very much. What we want to look at is this loader entry point, and you can see there's only one call here. For this tutorial I'm going to also use the Hex-Rays decompiler. Obviously it's a paid product and it's pretty expensive. You can do the same thing in Ghidra (G-hydra, hail Hydra), but I won't be using that. I'll be
using the Hex-Rays decompiler today, but you can do pretty much the same steps in Ghidra or whatever you want to use. So we'll F5 this, yes, go ahead. I like to set up my decompiler and my disassembly window side by side, so text view, drag this over so you guys can see it a little bit more. Let me know in the chat if the font is too small. I hope it's okay to see; if it's too small I'll try and make it a little bit bigger.

Oh yeah, I saw it in the chat: can you unpack this with x64dbg? Yes. This is actually a really simple malware. You could do a memory breakpoint. There are many tutorials on our channel on how to unpack stuff manually using the debugger, the memory-watch series of tutorials we've done, where you put a breakpoint on alloc, on VirtualAlloc, and watch what's written into that memory space. That'll work for this sample if you don't want to use UnpacMe, so yeah, go ahead.

Okay, so it's too small, "too small" in the chat. I'm going to try and make this font a little bit bigger. I don't think I've ever made the font bigger, so I'm not sure how to do it. If somebody wants to let me know, I will do it. How do we make it bigger? Is it under Edit, or View, maybe? Can I just do like a Ctrl+Shift+Plus? No. Come on, IDA. All right, well, until somebody in the chat lets me know how to make this font bigger, Ctrl+Plus does not work. Oh, there we go, there we go. All right, let's make it 14.
Okay, is that good? Can you guys see this okay? Let me know. I think this might be too big. Well, it's okay, hopefully it's not too big for you guys. It's a little big for me; on my screen each character is like an inch tall. Anyway.

So we've loaded up our binary in IDA and we have our decompiled view here. If you guys are diehard OALabs fans, you'll know this trick that I'm going to do next, where we do File, Produce file, Create C file. I'll just drop it in the directory here. The reason I'm doing this is that it forces IDA to decompile every function, and that will help fix up some of the arguments and stuff like that if there are any issues with the initial easy pass that IDA does. Now that we've done that, we can F5 again in the window here. There you go. You can see that it actually fixed some of the arguments and the function prototypes here, so it's always helpful to do that when you start analyzing.

And I saw just now in the chat, yeah, people watching these tutorials on their phones: it is not going to be a good experience. I can't imagine how small the IDA window must be on your phone. To get the most out of this, you probably want to watch on a desktop, or you can make the window a little bit bigger.

Okay, anyway, now that we have done the decompile-all, so we forced IDA to decompile every function, you can see some of the arguments here have been fixed up. This is again the entry point. If we just go back to xrefs here, this is the loader entry point, the export for the DLL, and there's only one function in it, so we know we have to start there. And if we start there, we already see some interesting stuff on our screen. I see this, which kind of looks like maybe a Base64-encoded string or some sort of string, and of course we're looking for an encrypted config, so that's probably a good place to start. And if we
look at this function here, well, this is the beginning of the entry point function, and it's the same function being called each time. There's no other functionality so far at the beginning of this function, just this one function being called again and again. And what's being passed to this function? Well, strings. We can see there are a bunch of strings being passed along with the string length. My guess is this function is some sort of string struct setup. So if we take a look at it here: ooh, nasty. Not assembly, but it's the bit-banging stuff, which I don't want to reverse engineer. We can probably get away with at least cleaning up the arguments a little bit, so we can work through this and figure out what this function is doing. Let's clean up the arguments and see if that helps.

What I mean by that is, you can see you have this variable here, and see how it's the memory address of this variable plus five? That indicates to me that they're actually passing a structure by reference in this variable, and now they're dereferencing the struct. See how that's plus five? Anytime you see this in IDA, it means that there is a struct and you need to fix that variable type before you can continue reverse engineering. We've covered this in a bunch of other tutorials. If this is completely new and you don't know what I'm talking about, you can check out our C++ tutorial, where I explain exactly what's going on here. But for now let's see if we can do, oh, "Create new struct type", heck yeah, IDA, I like that. Oh, did that not work? There we go. Okay, so IDA is creating a new struct type.

Now, I don't like these gaps. What these gaps mean is that IDA doesn't know what the variables are that should go in that space. It knows that there's one because there's a reference to the first
variable in the struct, the first, what do you call it in a struct, the first entry, or element, sorry, in the struct, and it knows there are references to the last two elements, but it doesn't know what's in between, so it creates this sort of byte array. I like to clean these up and just turn them into DWORDs. It's 12 bytes, so that's going to be three DWORDs. Let's clean this up here. The reason I like to do that is that oftentimes this struct will be used in other functions, and that gap is going to be referenced as a DWORD somewhere else. It might not be; it might be referenced as, say, a string of bytes or something, but if we see that we can fix it later. It's usually better to start with DWORDs just to get a clean slate. So, three DWORDs, and I like to label them: the first one would be dword0, so these will be dword1, 2, 3, 4, and 5. And I'm going to rename this as struct_string, because we know that it's some sort of string; I don't know what kind of string yet.

You can see this cleans up the code a little bit, and we already get to label one of our structure elements. We know this is the size, and we know that's the size if we go back, because, see, that struct fixed some of our code here, we know that we're passing the string length as the final argument to this function. That's why this is a size. So let's go down here and rename this element with the N key to size. All right, let's go back, because I saw that when I added that struct it actually fixed our code a little bit. Whoa, there are a lot of calls to this function. Let's go back here, and in our code I'll just hit F5. It usually decompiles each time you pop into it, but F5 just to fix anything up. We can see that these are now automatically typed as struct_string, so IDA has helpfully retyped that variable and fixed it up in our main code. That should make
this a little bit easier to read. We can see that now these structures are being passed into this string function. I don't know exactly what it is, it's transforming a string into some sort of structure. And we can also see, which is kind of cool, that this long interesting string is being copied into this structure. So I'll just rename this variable as var_string_b64, since that looks like maybe a Base64-encoded string. Or maybe not, actually. No, that's not right, that's not a Base64-encoded string, that's something else. Interesting. I'll also name this function. We don't know exactly what it does, but we know that it does something, so it's like a string_struct_parse.

Again, you're getting an inside window into how I reverse engineer. I don't actually know exactly what this does. I'll come back and figure it out if I need to, but right now I'm really focused on trying to get that config out, so I'm just labeling it. We kind of understand what it does: we know that it's creating the structure that we've defined. We don't have all the elements of the structure defined yet, but we can figure it out as we go, and this allows you to do this a little bit quicker. It's definitely not the most thorough way, and I'm sure there are a few reverse engineers out there screaming and hitting their keyboards, but this is the fast way to do it. We're 25 minutes in; hopefully we can get a full config extractor in under an hour. That's the goal for today.

Oh, actually, look at this. We have our interesting string here, and what else do we have here? We have a chunk of data. I just saw in the chat, somebody's making me nervous because there are real reverse engineers watching me. Yeah, I think there are a few pretty solid reverse engineers; I saw some names in there I recognized. I'll look forward to your DMs later on how I could improve my struct setup and my naming conventions. Okay.
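To picture the struct cleanup described in the stream (one known first element, a 12-byte gap turned into three DWORDs, then two trailing elements), here is a minimal sketch using Python's `ctypes`. The field names follow the dword0 to dword5 labelling from the stream. Placing `size` at index 4 is an assumption made only for illustration, since the stream does not say which element was renamed, and the class name `StructString` and the sample bytes are hypothetical.

```python
import ctypes


class StructString(ctypes.LittleEndianStructure):
    """Hypothetical mirror of the struct_string type created in IDA.

    Six 32-bit fields, 24 bytes in total. Only the idea of "a struct made
    of DWORDs with one field renamed to size" comes from the stream; the
    position of `size` is assumed.
    """

    _fields_ = [
        ("dword0", ctypes.c_uint32),  # first element, referenced directly
        ("dword1", ctypes.c_uint32),  # \
        ("dword2", ctypes.c_uint32),  #  > the 12-byte gap, split into DWORDs
        ("dword3", ctypes.c_uint32),  # /
        ("size", ctypes.c_uint32),    # assumed location of the string length
        ("dword5", ctypes.c_uint32),  # last element, meaning not yet known
    ]


# Made-up 24-byte buffer: zeroed first 16 bytes, then two little-endian DWORDs.
raw = bytes(16) + (11).to_bytes(4, "little") + (15).to_bytes(4, "little")
parsed = StructString.from_buffer_copy(raw)
print(ctypes.sizeof(StructString), parsed.size, parsed.dword5)
```

Starting with uniform DWORD fields, as in the stream, keeps every offset addressable; individual fields can be retyped later if another function uses them differently.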
So anyway, this is a pointer to somewhere in memory, and it's also being passed to this string struct parse thing, so let's take a look at it. I'll slide over into our disassembly window here. This is the chunk of data. We'll just name it as a pointer to block data; actually, I like to call them blobs, just to keep my naming convention the same. So this is a blob of kind of weird string data. It's definitely not a string I understand, I can't read it, but it is some sort of data in memory here in the .rdata section, and it's also being passed to the string parsing function. And this is the string length that's been passed in, so we know this is probably the length of that data. That's interesting. I'm going to rename this as var blob. What did I call the other one? String blob? Yeah, I'll call this var_str_blob.

Okay, and that gets passed into this function here. Oh, interesting. What do we do in here? Let's fix the function type. We'll go back, and what they're passing in is again the struct that we've created, the struct_string. So let's press Y and change the argument type for the function to struct_string. If we pop in, oh, that fixed things a little bit. There we go, that looks good. The first argument is also a structure of some sort. I wonder if it's a similar structure. v4, let's see if we change this to struct_string, see if that works. Does that make sense? I think so. So this must be some sort of translation function. Again, we're not going too deep in here. I can always come back and reverse engineer these functions in detail if we need to. Right now we're going as fast as we can to try and get the config. If I was going to write a
report on this, I would obviously go back and figure out what these things are doing, but right now we're still laser focused on the config, and we can see this is probably some sort of copy of that blob.

And now this makes our next function pretty interesting. The only things we're passing to this function are that blob from memory and this interesting string. We're passing different elements from each structure in there, but basically we're passing just two pieces of data into this function. If I had to guess, there's probably going to be some decryption, because, remember, this is our big blob in memory, and it looks like an encrypted piece of data, and this looks like some sort of decryption key. That's what it looks like to me, but let's figure it out. Coming in here, there are a ton of arguments to this function, so it's going to be a little bit messy at first, but it's not a big deal, we can figure it out.

Oh, okay. Well, anyway, this is what I saw before the stream. When I first looked at this, this was the function I looked at, and I was like, oh yeah, I'm going to do this on stream, this is pretty simple. So we found that again. Basically, you have a for loop here and you have an XOR operation with the elements being incremented in the for loop, and you have a percent. I'm not explaining this right. Basically, whenever you see something like this, a loop that's iterating over something and XORing it with something percent something: percent is a modulus, so you're taking the remainder from the iteration counter of the for loop. The reason you use this is that you have a key that's shorter than the data you want to XOR it with, so you just wrap around the key as the data index increases. You have a pointer that goes across the data, and then as the pointer gets bigger than the
key, you take the modulus of the pointer and that'll just wrap it around the key. I don't expect everyone to know this, but anytime you see something that looks like this, a loop with an XOR and then a mod for the pointer, it's almost 100 percent a simple XOR with a looping key. That's pretty much what this is. We can go a little bit further, just so you guys can see. Yeah, remainder of division, exactly. I saw someone in the chat saying the modulus is the remainder of division. Yes, that's exactly what it is. Instead of incrementing over the entire range of the pointer, you're shrinking it down so that it only increments over the length of whatever you're modding it by. So I'm assuming this length is probably going to be the length of the key, the length of whatever is in here.

Let's actually take a look at that: a11, so argument 11, that's the second-last argument. Let's go back here and look at what the second-last argument is. Oh, look at that, it's the size of this interesting string. So now things are starting to come together. What's the other size? That's zero, one, two, three, four arguments in. Is that the loop bound? Yeah, that's the loop. Okay, so let's rename this as arg_data_len, and we'll rename this one arg_key_len. Then I would assume this is the pointer, so we name it var_pointer, and then this would probably be var_key, and this would be var_data. And that's pretty much what we have here. So this is very straightforward. What's the data? Well, it's the first argument. If you remember what the first argument is, it's our data blob, the dword1 from our data blob.

At this point this is pretty straightforward. What I'd like to do, instead of continuing to reverse engineer and wasting our time here, is just try and validate our
hypothesis. I think that this stuff is the data, this is the key, and it's a simple XOR. So let's pop over into CyberChef; see, I already have it open. Let's grab our key here. CyberChef, oops, this isn't right. Let's look at that key. Okay, in CyberChef we're going to do an XOR. The key is not hex, it is UTF-8. And we're going to do the blob in memory here, so we're going to highlight it and scroll down to the bottom. Of course we have the length here, but I think we probably won't even need that, because this crazy string is probably going to terminate with a nice null byte, and then we'll know we're at the end. So let's keep scrolling down. Keep going. Oh my god, it's huge. That's what she said. No, no that's-what-she-said jokes on the stream. Oh my god, dating myself. Okay, let's keep going.

Oh, did somebody say Binary Refinery? Yeah, I should do this with Binary Refinery. It's just that the visual element of CyberChef is better for explaining how things work than a bunch of crazy Perl-like syntax. But yes, if you're trying to go as fast as you can, Binary Refinery would probably already have this ripped apart.

Okay, yeah, we've got a null byte here, so let's shift-click and check this out: OALabs tool, "copy hex", heck yeah. Go over here, paste our hex in. We have to convert this from hex, so From Hex, and we'll move this up. Hey, look at that. We have now decrypted the string, and what is it? Well, it's a list of IP addresses. I think I saw on Twitter or in a blog post or something that they have a block list in it, so I think that's probably what we found here. That leads me to think that there isn't actually a config in Squirrelwaffle. It makes me think that they actually have a bunch of encrypted data sections that they just decrypt and use. So there isn't one config, there are probably just a bunch of these data blobs. Let's make our disassembly window a little bit bigger. We're looking at the memory here, we're looking at the .rdata
section, so let's look and see if there are any more of these. Oh yeah, there's another one right here. So it looks like they have encrypted data here, and they basically have an encrypted blob followed by the key in .rdata. I guess there's probably going to be another one. Let's see. [Music] Well, actually, it's probably easier just to find it using the xrefs to that function, so we'll rename this as decrypt_data. Oops, spell that right: decrypt_data. Okay, we'll do xrefs to it. Yeah, there are a few xrefs to it. So what are they doing here? [Music] That's a little key. What are they doing here? Oh, that's the same one. Okay, that's the one we just looked at. I think I probably skipped over one of these. There we go. Is that the one we just looked at? I'm not really sure. Oops. Oh yeah, there we go, there's a different one. Okay, so there's this one as well. Let's grab that and see what that is. Oh no, that's the one we were just looking at. I'm confused now. Let's go back, back. Yeah, this is the one that we haven't seen yet.

Okay, so we'll copy that decryption string here, and let's grab the bytes for that decryption string, pop back in here. Is this giant as well? Oh, it's not that giant, not too bad. So let's grab this chunk of data and see what we get. Shift-click, copy hex. Thanks, OALabs. It's on our GitHub if you want that plugin. It doesn't really do anything too crazy, it just allows you to right-click and hex copy. Obviously you can do, I think it's Edit, Export data, so you could do it this way as well, as a hex string. All that plugin does is save me one or two extra clicks. Okay, so let's paste this in here. Hey, look at that, it looks like our C2s. So there we go: we found a block list in the data and we found our C2 list, and we're at 12:39, 12:40.
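The CyberChef step above (XOR the blob with the key, wrapping the key with a modulus) can be reproduced in a few lines of Python. This is a minimal sketch of a repeating-key XOR, not the sample's actual code: the function name `xor_decrypt` is hypothetical, and the key and plaintext below are made-up stand-ins, not values from the Squirrelwaffle binary.

```python
def xor_decrypt(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the key byte at (index % key length).

    The modulus wraps the index back to the start of the key whenever the
    data is longer than the key, which is the loop + XOR + '%' pattern
    described in the stream.
    """
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


# Made-up example values (NOT taken from the sample).
example_key = b"NotTheRealKey"
example_plain = b"192.0.2.10\r\n198.51.100.7\r\n"

# XOR is its own inverse, so the same function encrypts and decrypts.
example_blob = xor_decrypt(example_plain, example_key)
print(xor_decrypt(example_blob, example_key))
```

Because the key is treated as raw bytes, an ASCII key copied out of IDA can be passed in with `.encode()`, which corresponds to choosing UTF-8 rather than hex for the key in CyberChef.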
So we're doing pretty good, but that isn't what I promised you today. This is just the first part. What I promised you today was a static config extractor, and this is where you're going to get a couple of tips on how an industry-hardened Sergei does config extraction, because there are a lot of gotchas in static config extraction. I'll start out with what I'm not going to do today, and then I'll show you what I'm going to do.

Actually, let me start out by saying there are a couple of different ways to do config extraction statically. First of all, you need an unpacked binary. We have an unpacked binary here. Obviously, if you want to automate this kind of stuff, you blast it through UnpacMe or some other unpacking service, then you run your static config extractor on it and you get the config. That's kind of how this works. Now, for the static config extractor, there are all kinds of different ways you can extract the config, but in my experience there are three different ways to approach it. Number one, if the function here, well, let's actually synchronize with our disassembly.

Sorry, I just saw somebody drop in the chat asking whether this stream will be a video once it's over. Yes, we will leave this up as a video, so you can come back and watch it anytime. No pressure to stay on the stream, but if you want to stick around you can ask questions and I can answer them. That's the only thing you can't get if you wait for the video: I won't be live to answer questions. But remember, we have a Discord link in the description of the video, so you can jump on Discord and ask questions there if you want. Oh, you guys might hear my dog. My dog's back from his walk, so you might hear his heavy breathing in the background.

Okay, so where were we? We want to look at different ways to extract this config statically. One thing you can do is look at the
assembly code here for the decryption function and see if there's something unique about it, so maybe this XOR here. Let me show you guys: Options, General, show me the opcode bytes. You can see this is actually the hex data that's going to be available in the file, and this matches up to the assembly. So you can write YARA rules and you can write regexes based on this data, and that's how you identify these different pieces of code in the function. So one way to write a config extractor is to write some sort of regex that finds, let's say, this byte pattern here, which is the XOR, and then use some sort of offset to try and find the data that's being passed to it.

Now, this is a bad example for that, because the data that's being passed to this function is being passed as an argument, so it's not anywhere near the actual decryption routine in the code. So it's not going to work for that. Also, this decryption routine is pretty generic. You're just doing an XOR, and there are lots of variables in here. All of these things can change depending on how the compiler is set up and how you're compiling the binary. If that changes, these values might change, they might use different registers, and then the bytes will change. So that's not always the best approach. It's good if you are writing a config extractor for something like, say, Dridex or TrickBot, where the code is almost never changing and they blast out a billion versions of it. For that kind of stuff it's good to do this sort of quick regex-y type thing.

The next approach you could take is to say, okay, I want the disassembly context for this file. I can see some people are thinking about this in the chat, asking whether you could use radare2 or Cutter or whatever to do this, and the
answer is yes, but in practice, in the industry, no. You could use, what's that disassembler that everybody uses, it doesn't come to mind right now, anyway, you can use Capstone, is that right? Yeah, I think so. So you can use a disassembler in your plugin, and you could disassemble this code and do things like look for some sort of marker to identify this function, and then look in the disassembly for all the xrefs to this function, just like what we've done here. You could do all this in code, and you could ask what the arguments to this function are and basically work your way back in the disassembly.

This is something that I've seen done, and I've done it myself for simple programs, but running in production, disassembling all this code when you have millions of files going through, it's slow and it's just not very elegant. And you're still stuck with things like, well, this is a struct, so the argument for this actually comes from another function that has another argument passed to it. So you have all kinds of nested crap in the disassembly that you have to walk up programmatically, and you have to build it in a way that's robust, so if they change things your config extractor doesn't just blow apart. Again, this is a method a lot of people who are starting out will try for config extraction, because it makes the most sense: it's the closest to the way you would manually do it. You can replicate the steps that I just showed here today in code. You can find the arguments, walk back to what function the arguments came from, where those arguments came from, find this offset in the .rdata section, what data is in there.
You can do that stuff programmatically, but again, with a more complex thing with structs and stuff like that, it's not always going to work the best.

The third way is also kind of frowned on, but it's the way I do things: we look for a pattern in the data and we try to use that pattern against the developer. What I mean by this, well, we are 46 minutes into the stream, so hopefully no malware devs are watching right now. Basically, when people do stuff like this, when they have these decryption things, they don't want to manually code this. This is a real pain to manually code in C. So what they do is they have little helper functions or macros. They'll basically put the data and the key in the macro, and then the macro will auto-generate the code for them. What does this mean? It means that the layout of that data in memory is usually consistent. Even if they change the code, the way that the macro or the function works for their decryption is usually the same. So we can use that against them, and this is how I write a lot of config extractors that are robust and work through multiple versions.

This only really works if the developers are, not trash, but using old methods, and this is definitely a very simple loader using old methods. New methods are a very interesting, cool topic that I actually work on professionally; I'm talking about deobfuscation of heavily obfuscated stuff. That's interesting, but this is just a very simple loader. This could have been built five years ago and nobody would have noticed the difference. It basically looks the same as something we saw five years ago. So what does that mean? It means this kind of trick is going to work. What did we notice here? We noticed that these encrypted data blobs were always followed by the key. And again, this is something that, if you've
seen enough of these, usually this means they're using some sort of macro in their C++, or some sort of helper function, to encrypt this data so that they don't have to manually hand-code it. So you can see here's another data blob, and after the data blob, this is a really long one, I think this is the block list that we did first. Wow, it's long. There we go. After that we have the key. So we have this repeating pattern in the .rdata section. That's what I want to attack. And the best way to do this, I think, in this case, is to go for size, because there's no real discernible pattern in the key and the data. So if we look at our strings here, there's nothing that really tells you this is a key versus this is a piece of data, right? Sometimes you can use regexes, like if there's some sort of special set of bytes in the key that you can look for, but in this case there isn't. It just kind of looks random. So that's a key and that's data, all right, whatever. But what we do know is that the config for the C2s and for the block list is extremely long, right? So it's probably the longest string in the .rdata section. So if we look at these, this is all the contiguous block of data. Oops, I went too far. Anyway, it doesn't really matter. It looks like it's the longest contiguous block of data in the .rdata section with no null byte in it. Now, they're using XOR, so you can have null bytes, it's possible, but it's just not as likely, right? You're usually going to have a big long contiguous string of non-null bytes, that's going to be a data section, then you're going to have some null bytes, and then you have a key. So how do we do this blind, right? How do we do it blind, without having to know anything about the binary, without having to load the binary and disassemble it in our extractor or anything like that? Well, we have 10 minutes left on the stream. I'll see how fast I can type, but I think we can probably get this done. So let's go into our documents and our git, is that right?
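A minimal sketch of the layout described above, built on a synthetic buffer rather than the real sample. The plaintext, the key, and the buffer contents are invented for illustration; only the pattern (a long null-free encrypted run, then null bytes, then the key) comes from the stream.

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Repeating-key XOR; applying it twice with the same key restores the input."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


# Hypothetical plaintext and key, invented for this illustration only.
plaintext = b"example.com/path|example.org/other"
key = b"Zq3Lw9"
blob = xor_bytes(plaintext, key)

# Synthetic stand-in for a slice of .rdata:
# null padding, encrypted blob, a null, the key, more nulls, an unrelated short string.
fake_rdata = b"\x00" * 4 + blob + b"\x00" + key + b"\x00" * 3 + b"short" + b"\x00"

# Split into runs of non-null bytes and drop the empty pieces.
runs = [r for r in fake_rdata.split(b"\x00") if r]

# The longest run is taken to be the encrypted config; the run after it is the key.
longest = max(runs, key=len)
candidate_key = runs[runs.index(longest) + 1]
print(xor_bytes(longest, candidate_key))

# Why the trick can fail: XOR yields 0x00 wherever a data byte equals its key byte,
# which would cut the blob in two when splitting on nulls.
print(xor_bytes(b"A", b"A"))
```

The last line shows the caveat mentioned on stream: an encrypted blob can contain a null byte, it is just unlikely for any given position.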
What do we have in here? Yeah, we have our lab notes. Okay, so if you guys aren't familiar with the way we have this set up, I'll just go to the OALabs GitHub. So what I've been trying to do is use a Jupyter notebook for these streams, and in each notebook we're saving the Python code that we've written, and you'll see it renders here in the notebook, and you can just look at it on GitHub, and you can copy this working code out of the notebook if you want. You can also download the git repository, pull it up, and open it in JupyterLab. It's all open source. And so if you guys want to follow along from home, what I'm going to be doing now is I'll load up a notebook. Do I have it installed? Yeah, I have it installed. Okay, great. So what I'm gonna do here is I will open up a new folder here, I'll call it squirrelwaffle. Kill this. Sure, save it. And I want a new Python 3 notebook, and I'm going to rename it as squirrelwaffle. Okay. You guys let me know if this font is big enough. import pefile. Is that big enough for you guys to see? Let me know in the chat. I'll make this a little bigger so we can see. Okay, and what I'll do is at the end of the stream I'll just push this to our GitHub, so then you guys can follow along with the code at home. So the first thing we want to do, you saw I imported pefile, really useful library, super helpful, and we're gonna have to actually parse out that .rdata section. Yeah, all good, thanks, thanks chat. So we're gonna have to parse out that .rdata section, and that is why we have pefile installed, because it's the fastest way to parse the structure of a PE file. It's not too slow, it's not too taxing, it's not the same as disassembling the whole file. So first, actually, let's do that. So we'll do data equals open. Yeah, you
guys are going to see I'm really terrible at typing. And we will open our Documents, malware, squirrelwaffle, scroll.bin. We want that path here, and we have to escape all this crap here. I'm not used to writing stuff in Windows, I'm just doing it for you guys, so I have the VM set up nice. I'm usually using OS X as a development environment, so I have to deal with this kind of crap, but it is what it is. All right, and we'll close this off. We want to read it as binary. All right, let's make sure this worked. Did that work? Okay, yeah, it worked. All right, so the data variable now contains our file. Of course, when you guys run this at home, you're gonna have to put in your own path to the file that you want to extract. I saw, 'no raw string literals in Python?' Yes, there are. I could have put an r in front of that and it would have worked fine. I am definitely not as comfortable with Python 3. I've been writing a crap ton of stuff in it, but I still have a lot of Python 2.7 tendencies, you know. Anyway, but yes, you could have put an r here in front and removed those extra backslashes. I probably should have done that. Okay, so now what we want to do is find that .rdata section. So we'll have to create the PE file first, so we'll do pefile.PE. You can look up the syntax for this, I write it so many times a day that I just know it off by heart. And then we want: for s in pe.sections. What do we want to do here? For each section, let's check if it's .rdata. So if the byte string '.rdata' is in s.Name, so if the section name contains .rdata, then rdata equals s.get_data(). All right, and then, just to make this a little bit better, we should do rdata equals None up front. And so now let's just double check here, we'll print out the length of rdata to make sure we got it. We got it. Actually, we don't even have to do that, we just do len of our
data. Does that work? Yeah, okay. So we've got the .rdata section now, all of the binary from that section in this variable here. So the next thing we want to do is split it on null bytes. And the reason why: if we look in IDA, we can see that there are no null bytes in our encrypted data here, but there is a null byte before the string and there's a null byte after the decryption key. And I think there's null bytes, I always do this too fast, scroll too fast. Yeah, there we go. And there is a bunch, if we undo this, there's a bunch of null bytes before the encrypted data as well. So if we go to our rdata here and we do blocks equals rdata.split on a binary null byte, then we can split that. And of course, if you split a string that has a bunch of null bytes, you're going to get a bunch of empty strings in that blocks data. So I want to remove those empty strings. So I'm going to do blocks equals, and we're going to use a list comprehension here, so we'll do x for x in blocks if, oops, if x is not equal to an empty string of bytes. All right, see if that works. Okay, so now we have split our .rdata section into a bunch of blocks, and each block is going to contain some data but no null bytes. I'll just go back to IDA again so that we can keep track of what we're doing. So if you look at what this actually looks like, one of the blocks should be this encrypted data, and then the next block right after it should be this key. Now all we have to do is find the blocks that have that sequential data and key. And how are we going to do that? Well, we're going to take advantage of the fact that these config blobs are pretty big, so we can sort them by size and just take the biggest ones and see if that works. So let's sort them by size. We'll do blocks_sorted, oops, equals sorted, a native Python function you guys can use for this. We'll do blocks, and we'll do key is len. Okay, all right, so let's actually make sure that this makes
sense. Let's print the size of the first thing in blocks_sorted, so the first element, and let's print the len of the last element, just to make sure we got this right. And I'll make this little tuple, using old-style Python 2.7 syntax with the new Python 3, what could go wrong, right? At zero, percent d, yeah, percent d, and len of blocks_sorted at negative one. All right. So the length of the first block is one, and the length of the last block is 6542. Let's take a look at that 6542 block. Actually, no, it'll probably print too much in here. Let's print a little bit of it. Let's print blocks_sorted, grab the last block. Running out of time, oh, we're one minute over time. We didn't quite get it within the hour, but hopefully we can do this quickly. Oops. Okay, so is that the encrypted data? Looks like the encrypted data to me. So it looks like we have our blocks set up here correctly. Now what we want to do is create a decryption function, a simple XOR decryption function. So now that we've found the blocks, we'll create a decryption function, and then we'll parse through the blocks and decrypt each section. So let's do that quickly. We'll do def decrypt, still having trouble typing, decrypt, we'll take a key and data, and we're gonna do out equals an empty string. Do for i in range len of data, so we're going to iterate a pointer through all the data. This is actually exactly what they do in the binary itself, right? So this is the same idea as if we go here, [Music] exports, loader. So what we're doing in Python is exactly this, we're just recreating this function right here, that's it. So pop over here, we'll do for i in range len of data, out equals char of the data at i XORed with the key, and remember the key has a length that we need to take the modulus of, so we'll do i mod len of key. There we go. I'll make a little space here so it's easier to see. Okay, and then, not print, return out. Okay, so that should decrypt our
data. So now what we want to do is parse through these blocks. Let's just take the two largest blocks, because I think that's the two configs that we want, right? So it's the block list and the C2 list, and we will print those out. So let's do, actually, we'll create a function for this. No. For i in range len of blocks, so we are going to be iterating through each block. If blocks at i equals blocks_sorted, the last element of blocks_sorted, minus one, then we want to decrypt it. Decrypt, the key is the next thing. Out plus equals, yes sir, thank you chat, plus equals. Thanks chat, that would have been funny. Okay, so we want to do decrypt, the key is going to be blocks at i plus one, because of course the key is sequentially the next one after the encrypted data, and the data is the block that we found. Yeah, I saw in the chat, I saw that little plus equals. Thanks guys. So that'll be i, and then maybe we'll just, oh, 'char' is not defined. Yeah, I know it's not defined. Okay, there we go. And then we can print out right at the end. We start fumbling. There we go. So we found our block list, right? That's the first thing. And let's go back here and we'll find the second largest sorted block, print that out. Boom, config. Okay, so now we know how to extract the config completely blindly, right? With just a little bit of Python. It's fast, we don't have to decompile or disassemble the binary, and because I highly suspect that the developer is using some sort of macro pattern or helper function when they encrypt this data, the encrypted data and the key will likely continue to be sequential in the .rdata section, so we can use this trick to continuously extract configs. What could break this? We always have to think, you know, how robust is this? What could break it? If we have a null byte in the encrypted data blob, this will break, right? That pattern will be broken. So that could happen, definitely could happen. Have to handle
that, not on stream, but you have to think about how to do that. Obviously the easiest way to do it would be to just check the next key, right? You know, just check the next null byte. But anyway, you have to deal with that. And what else could break? Well, maybe the developer sees the stream and they decide to split the key and the data in the .rdata section apart, so we can't use this trick. So that could maybe break it. Or they could change the encryption algorithm, right? Maybe it's not going to be XOR. But other than that, as long as the developer keeps using this kind of setup, then it should work for us. So there you go, a little bit over an hour. To be fair, I started, I think, at like 12:00 or something like that, and it's 1:08, so I'm only like two minutes over. But anyway, that's how you write a super fast config extractor, reverse engineer it, and do it all in an hour. That's basically all there is to it. So what I'll do is I will push this to our GitHub so you guys can go check it out there. Obviously, like I said, it needs a little bit more work. You need to handle null bytes in the config data if they occur, and you also probably want to make this a little more robust, add some error checking in. I'm going to leave it in the Jupyter notebook. Obviously, if you're writing a config extractor, you would pull all this out, you'd write yourself a nice standalone Python file, you'd add some error checking and stuff like that. But I think for learning this is the best way to do it, because you can mess around with the stuff in the notebook and you have all of your history here. So what I'll do now is I will take a look at the chat. I'll stick around for a few minutes, and if you guys have any questions about this or anything, let me know and I will try to answer them on stream here. And for everyone else who just joined for the config extraction, there you go. I love these, I love
these simple malwares. You get so much out of them, because you don't have to worry about all these extra layers of de-obfuscation and stuff. I always think they're such a nice learning experience. Some stuff in the chat: am I gonna do more streams? Yes, I love doing these streams. I think these are the future of our channel. I'm still gonna be doing nice edited videos. There's one on HashDB coming up soon, a tutorial, if you guys haven't checked that out as well. We have a hash service, hashdb.openanalysis.net, and it's HTTPS. So there you go. So basically we just launched this on Friday, and so you can look up malware hashes. There's an IDA plug-in for it. If we go here, where's the IDA plug-in? Here you go. Yeah, so there's an IDA plug-in for it. If you want help with this or you have any questions, you could join our Discord, the link is in the video description. It's open for you to join. Join there, say hi to us, and I'll have a video on HashDB, how it all works and why I think it's cool. It's all open source and free forever. We're never going to try and monetize this. Which also leads me to: if you like us, sub. I set up some Patreon stuff, and there's some merch, buy it, support us, so that I'm more incentivized to make these videos instead of doing my job, which makes me money. So if you guys like these, support us, let us know. Okay, so more comments here. Am I doing the Flare-On challenge? No. I did year one just to see what it was like. It's kind of cool, I have the little coin from the very first year, but I haven't done it since. I have a personal aversion to CTFs. I like to do real malware analysis on real samples that work in the wild, because, a, I think they're more interesting, and b, the work is applicable. I like that. I don't like to do work for the sake of doing work, right? I always like to be able to do something that is going to have a real impact. So I don't really do CTFs. It's a great
learning experience. I did the first one, it's kind of fun. You know, they're fun, they're good to learn from, but I don't do them. 'Would you mind explaining how XOR works? It's not clear.' Not on the stream I won't, but XOR is a simple, what would you say, sorry, it's a logic operation that has a set of rules where if two bits match the result is zero, and if they're different the result is one. That's a terrible explanation. Go on to Wikipedia and just look up the truth table for XOR, it's pretty straightforward. And if you want, reach out to me on Discord and I might make a quick video, just 'what the heck is XOR'. What else do we have here? Do you use Ghidra? No, I don't, but there are two super helpful dudes on our Discord who use it all the time. One of them we did a video with, Jesko, he did the Binary Refinery one. The other one is Lars, and they do tutorials on Ghidra. You can go check them out. I can't remember their exact, I think it's like mal.re, is that, let me just see here. mal.re, yeah, that's it. Okay, so if you want to know stuff about Ghidra, m-a-l dot r-e, go check them out. We can't do it better than they do, so we just let them do their thing. 'Good idea to learn C/C++ to reverse engineer?' I don't know how you can reverse engineer if you don't know C/C++. You might learn it by reverse engineering first, which is kind of cool, but you're going to have to know it if you want to do anything at scale. What else? 'Do you have a private class on Patreon?' No. What we do for Patreon is you get access to a Discord channel that's just us, so you can ask us questions, we can hop on a live stream and help you if you get stuck. So you have that, and if you pay us a whole bunch of money, you get access to our private git, so we have a bunch of tools in there that aren't public. But that's not really what we're trying to do. We're just trying to let you guys, you know, just grab
the bottom tier and support us, that's it. We're not trying to sell anything there, really. Those are just extra bonuses for you guys. What else do we have here? 'What's the difference between your hash and Lumina?' Okay, that's a different kind of hash. So the HashDB that we set up is for malware import hashing and string hashing. So this is where the malware author creates a hash of strings to obfuscate them, and we can use it to de-obfuscate them. Lumina from Hex-Rays is actually a really awesome idea. It takes a hash of the function, and then you can identify the function in IDA based on that hash. Different kind of hash, different kind of setup. That's the difference between them. Lenny Zeltser, hell yeah, Lenny Zeltser, man, OG. Yeah, definitely recommend him. His work, his SANS courses are awesome. He's a cool guy, really helpful, nice guy, definitely recommend. What else do we have here in the chat? I'll stick around for a few more minutes if you guys have more questions. If I didn't get something, just post it again in the chat and I will try to get to it. 'Many RATs are C++.' Yes, lots of stuff is C++, lots of stuff is C. Definitely a good idea to know it. Yeah, some people say you can be a malware analyst without knowing programming. Yeah, it's true. I've known decent analysts who don't know how to code, but they're not this kind of analyst, they're a different kind of analyst. The whole industry has grown and matured to the point where you can have different teams specializing in different things. You can have people who are just doing top-level analysis, so they're taking the reverse engineering reports and they're tracking actors and campaigns. Those people don't necessarily have to know anything about programming. You can have junior or SOC analysts who are doing more simple triage-type stuff, and maybe they know some scripting and stuff like that, but
they're not super into reverse engineering. You know, there's lots of roles now that the industry is so mature. So no, you don't have to know how to program to be an analyst, but if you want to be this kind of analyst, you want to be a reverse engineer, yeah, you have to know. What else do we have here? IDA or Ghidra, I think I already answered that. What else do we have here? 'All the cool malware authors use Go.' Oh my god, don't spread that around. Go is a friggin nightmare. It is a nightmare to reverse engineer. If you guys are interested in Go, my old colleague Alex, what is, maybe that's his Twitter handle. Just bear with me here, I have a tip for you guys. All right, what is your GitHub, Alex? Show me your GitHub. So Alex, one of the smartest guys I know, a super awesome reverse engineer, he doesn't use F5, he just likes to look at the disassembly. He has recently taken on Go, and he's been putting out some really nice, helpful stuff on his GitHub. Oh, there we go, oh, it's his name, okay, well that's easy enough. So if you're reversing Go, obviously Hex-Rays released a new version of IDA that handles some Go stuff, but if you really want to dig into the internals, check out Alex Hanel's GitHub. He has some nice stuff on it that'll help you with that. He's a good resource to go to. Also check out his Twitter. If you guys want to know about Go, hopefully I'll have him on at some point to talk about this. He also wrote the IDAPython book. Just a real wealth of reverse engineering knowledge. 'Advice on transitioning, lots of software dev and some reversing.' Yeah, that's gonna make it easy for you. So basically the best way to do it, also maybe the hardest way to do it, but you'll get good fast, is to pick pieces of malware and go through the whole analysis. That's my advice. You want to get into this
business, you want to become a reverse engineer quick, a malware reverse engineer, pick malware samples. Just go to malware-traffic-analysis.net, grab those zip files, right? Let's copy this here. So this is how you transition into this if you have a reverse engineering and programming background. If you don't, that's going to be a little bit hard, but if you do, go here, download these files and analyze them. Google, ask on our Discord, ask on Twitter for help if you get stuck. And the best way, I think, to go from zero to hero is to look at other people's blogs on the malware and see if you can replicate it yourself, and literally write out the analysis that you're doing. That's, I think, the best way to do it. Again, it's not the easiest, it's hard, but the best way to get through is to pick a piece of malware, analyze the whole thing, write a report on it, pick another one, write a report on it. I've seen a lot of people come into this industry, a lot of people I've talked to, you know, friends with now, and I saw them do that exact process, and now they're people that you probably see writing reports on Twitter and stuff like that. Is there an IDA plugin for Go? Yes, there is, definitely use it. [Music] 'I don't understand what you XOR there and how you figured it out, I'll look at it later.' Yeah, the Jupyter notebook will be up there, so you can play around with it yourself. And also you can watch this video and just do the same thing in IDA, and you can kind of figure it out. You can also use the debugger for this, it's not too hard. I like to do things statically, but if you want to use a debugger you can watch it actually XOR each byte piece by piece, might be helpful. 'Any groups for noobs? Starting out professionally, the struggle is getting certified.' Yeah, okay, again this is the same question in a different format, and basically my advice is, again, if you want to be an analyst, it's a little different,
but if you want to do what I do, I can help you, I can at least give you one path there. So if you want to be a malware reverse engineer, the best way to do it is to pick binaries, you know, grab DoppelDridex, or grab this thing Squirrelwaffle or something, reverse engineer the whole thing in IDA, document every single function, what it does, write a little report on it, and publish it on Medium. Just do that, and then pick another sample and do it, and get some feedback. Join our Discord, there's a reverse engineering Discord, join those and ask questions, share your report, ask people what they think of it. You'll get feedback, people are pretty helpful in this community, and you'll get good fast. That's the best advice I can give for this kind of stuff. Yeah, report writing takes a lot of time, yes, but it's one of the most important things. If you can't explain what you found, the findings aren't that valid. You know, not that they're not valid, they're not useful. You need to explain what you found. Yeah, so anyway, that's kind of what I would recommend. Getting certs helps you getting jobs, you know, if you want to pay for certifications it might help get you a job, but at the end of the day, if you produce a bunch of Medium articles proving that you can reverse engineer stuff, it's going to go a long way to getting you hired. Okay, so I think that's it. I'll give you guys one more minute, any last questions, and then we're gonna kill the stream. 'Is picking a sample a random thing?' I mean, pick popular samples if you want to start reverse engineering stuff. Pick stuff that's popular, because then other people will have analyzed it and written about it. So that's probably the advice I'd have. Again, I would say pick some of those ransomwares, pick like dark matter or pick REvil, you
know, those are awesome binaries. They're not too hard, they're not heavily obfuscated. There's obfuscation, but there's lots of feedback, there's lots of public discourse about it, so you get feedback on your work quickly. What else do we have here? 'Medium, blog or website?' Oh, I just say Medium because it's easy. I mean, do whatever you want, it doesn't matter. I just mean publish it, publish it publicly, that's all that matters, so you can get feedback. Because if you're just talking to yourself, you're not going to improve. You need feedback. Oh, thanks guys. Yeah, I like doing these streams, so thanks for the feedback. We'll do more, and hopefully I'll give you guys a bit more warning next time. Check out our GitHub, we have lots of nice tools on there, lots of stuff that I've used here today we have on there, go check it out. Check out our Patreon. I don't know what it is, dot com slash labs, is that right? I think so. I did it wrong, look at me fumbling around here. Did that work? Yeah, there you go, look at that. Go support us, look at those IDA stickers. And remember, if you support us we'll give you a nice role in our Discord, you can ask us questions and stuff like that. And also, if you are a diehard YouTuber, you can also join our channel. It's the exact same membership privileges as Patreon, it's almost indistinguishable except that you're on YouTube. Okay, all right, I think that is about it. Yes, we will do more. All right, so with that I'm gonna throw us into the outro. I actually updated it last night with new music so we don't get copyright strikes, so hopefully it works. [Music] [Music] [Music]
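The steps typed during the stream (read .rdata, split on null bytes, sort blocks by size, decrypt each large block with the block that follows it) can be pulled together into a short sketch. This is a reconstruction under assumptions, not the notebook that was pushed to GitHub: the function names and the `count` parameter are hypothetical, the largest blocks are located by index rather than by comparing block contents, and it does not handle a null byte inside the encrypted data. The `pefile` calls used (`pefile.PE`, `sections`, `Name`, `get_data`) are from the real library.

```python
from typing import List


def xor_decrypt(key: bytes, data: bytes) -> bytes:
    """Repeating-key XOR, the same loop as the decrypt function typed on stream."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def get_rdata(path: str) -> bytes:
    """Return the raw bytes of the .rdata section of a PE file."""
    import pefile  # third-party (pip install pefile); imported here so the rest runs without it

    pe = pefile.PE(path)
    for section in pe.sections:
        # Section names are null-padded bytes, e.g. b".rdata\x00\x00".
        if b".rdata" in section.Name:
            return section.get_data()
    raise ValueError("no .rdata section found")


def extract_configs(rdata: bytes, count: int = 2) -> List[bytes]:
    """Decrypt the `count` largest null-free blocks, using the following block as key."""
    blocks = [b for b in rdata.split(b"\x00") if b]
    # Indices of the blocks, biggest first.
    order = sorted(range(len(blocks)), key=lambda i: len(blocks[i]), reverse=True)
    configs = []
    for i in order[:count]:
        if i + 1 < len(blocks):  # a trailing block has no key after it
            configs.append(xor_decrypt(blocks[i + 1], blocks[i]))
    return configs


# Usage against a real sample (the path is a placeholder):
# for cfg in extract_configs(get_rdata(r"C:\path\to\sample.bin")):
#     print(cfg.decode("latin-1"))
```

Returning bytes instead of building a string with chr keeps the output lossless; decoding is left to the caller.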
Info
Channel: OALabs
Views: 3,711
Rating: 4.98 out of 5
Keywords: ransomware, red team, conti, malware analysis, lice
Id: 9X2P7aFKSw0
Length: 80min 55sec (4855 seconds)
Published: Mon Sep 27 2021