Analysing Obfuscated VBA - Extracting indicators from a Trickbot downloader

Video Statistics and Information

Captions
Hey guys, Colin here, hope you're well. Today we're going to take a look at a Word doc file which was used to deliver the TrickBot info stealer recently. The particular Word doc we're going to look at, you can see here, has quite considerable detection on VirusTotal: 34 out of 58 engines detect it as malicious. I wanted to show you this sample because I found it really interesting, the way in which the bad guys have firstly obfuscated their VBA code in order to evade human analysis, but also some of the techniques they've used to execute processes under the hood. So I thought it a worthwhile sample to spend a little bit of time on, and you can see my thought process behind how I analyze this kind of code, from a code analysis point of view as well as a behavioral one.

So we'll dive right in. I'm just going to use my Windows 7 virtual machine here, and, looking in my settings, out of interest, I've got the network adapter set to host-only, just because I can't be bothered, if I'm honest, infecting my machine and rolling it back. I can just deal with the network connection that I know this sample makes, and it's obviously not going to reach its destination. I've got the copy of the document here on my desktop, and from a behavioral analysis point of view we're just going to monitor the processes that get created and the processes that exit within Procmon, so we can see the flow of what happens. Let me just move that screen over here. You can also see I've got a proxy in Burp Suite, just to capture any network activity as well. So if I open the document, you'll be able to see that it encourages me to enable macros, and it says "loading content" or whatever, so that's a joke, and if I click on
enable macros and click OK, we should see it try to reach out to the Internet, which it did, the aim of which is obviously to download its additional payload, which in this particular example was TrickBot. We can see here it makes a request, or starts to at least, to onyx tools com foolish life's public dot PNG, so that's cool.

If we look in Procmon at what actually happened, let's have a look under the hood here, and I'll stop capturing just to speed things up a little, we can see that WINWORD spawns cmd.exe, and there's a command line there where it invoked bitsadmin. So it kicked off this bitsadmin request, and you can see the various parameters which are passed to bitsadmin, and the URL as well. So it uses this living-off-the-land binary, as they're known, which resides in System32, passing it a URL to download and then subsequently execute. So, pretty easy.

Additionally, you can see that it was creating a scheduled task. If I make this a bit bigger and copy all of this out into a text editor, for example, you can see the full command. It's two commands in one: you've got the bitsadmin transfer, which gets written to the temp directory, and then you've got a scheduled task which gets created, and I think it's for ten minutes after you execute it. The scheduled task will then fire in ten minutes' time to execute that downloaded file that's now stored in your temp directory. So it doesn't run it straight away; it sits there for ten minutes, which might be some kind of sandbox evasion technique that it's using. So that's pretty interesting, and that's all it does, effectively.

Let me come out of here. If we wanted to see the actual macro code, let's go into Visual Basic and macros, and we can see that the bad guys have password-protected the VBA. There are numerous ways you can get around this: there are tools you can use which will just rip all the VBA out for
you and bypass it; you could open it in LibreOffice, for example, which will bypass that protection automatically; or, a method that I like to use, if you stick it in a hex editor, HxD, you can search for the string DPB, and here is the hash of the password, effectively. You can invalidate it: just change the DPB to DPx or whatever, hit save, and that will invalidate the actual authentication. So let's go back into Developer, Visual Basic, and we get an error, because it says it's got an invalid key now, and we get an unexpected error. That's fine, no problem; that's what we expect. You can see we get in now, but we just need to take the protection off and put it back on again, maybe set a new password. I'm going to type "hello", and what I like to do is close it down, save it, and reopen it. Then hopefully what you'll see is, when we go into the macros and type in "hello", we don't get any errors this time and we can access all of the VBA.

You can see this is so noisy it's unbelievable, and it's really difficult to read as well. The bad guys have used long and very similar variable names, and there's an awful lot of comments in here which look like Latin, so that's just there to really confuse. There are thousands of lines of code here, which is just crazy, and really our task is to work out what's in this code that we might be missing from our behavioral analysis, or whether what we saw in that behavioral analysis completely corroborates what's in this code. I'm interested to know how it created the network connection. I also want to see, from my point of view, whether there are any additional indicators we can get from it, and to work out how it created the cmd.exe process. One of the first things I would do, actually, is look for the word Shell, and it says search text not found.
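The DPB edit described above can also be scripted rather than done by hand in a hex editor. The following is a minimal sketch, assuming a legacy binary Office file in which the VBA project's `DPB="..."` key is visible in the raw bytes; the function name `invalidate_vba_password` and the sample bytes are hypothetical, and this is not the workflow shown in the video.

```python
def invalidate_vba_password(data: bytes) -> bytes:
    """Rename the first DPB key to DPx so Office no longer recognises the password hash.

    The replacement has the same length as the original, so offsets inside the
    file are left untouched.
    """
    marker = b'DPB="'
    index = data.find(marker)
    if index == -1:
        raise ValueError("no DPB key found in the supplied bytes")
    return data[:index] + b'DPx="' + data[index + len(marker):]


# Tiny made-up stand-in for the relevant part of a VBA project, for illustration only.
sample = b'ID="{0}"\r\nCMG="AA"\r\nDPB="BB"\r\nGC="CC"\r\n'
patched = invalidate_vba_password(sample)
print(patched)
```

As in the walkthrough, this would be run on a copy of the document inside the analysis VM, and the protection still has to be reset from the VBA editor afterwards.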
That's interesting, so it's not executing a shell. Let's look for the string Run, and you can see here that it's in the comments or in some rather random variable names, but it's not actually using the Run command. Sometimes they use Exec, and the search text is not found again. So I'm curious at this stage as to how it's actually happening.

For ease of demo, what I've done is copy all of that VBA code out into Sublime, just because I find it easier to read and navigate around, if I'm honest with you. I'm not actually going to execute any code here, because this is on my host machine; I'm just going to start sifting my way through the code so you can get an idea of how I clean this up and the mentality I have when I'm analyzing this kind of stuff. So this is the original; I'm going to work on a copy.

Out of interest, you can see the first thing the bad guys have done here is define a few things up at the top of the code. In the namespace here they've defined a couple of function aliases, and anything with kernel32 in it will just raise alarm bells. You can see straight away that they've created this alias for CreateProcessA, which is obviously a bona fide API call from kernel32, and the same as well for GetStartupInfo. Those two are super interesting to me. If you want, you can just Ctrl+F and find them, and you can see that on line 506 here is where the CreateProcess API call is being used. If you looked it up on MSDN, the second parameter to CreateProcessA is the command line, and there are various flags here as well that get passed in, and we'll talk about them in just a second. But first, this is the command line, and you can see that the command line is actually stored in UserForm1.TextBox2.Value, so that's weird, and you
want to have a look, maybe, at UserForm1 and see where else that's mentioned in the code. We can see that it only exists in these two locations, so we can see that the value is set in this particular line here, 481, where there are obviously some strings being built up and passed into this particular method, Replace. We're not going to go into the weeds on that just yet.

What I want to do is take a top-down approach and have a look at the code. We know that it's using CreateProcessA and GetStartupInfo, which is probably fed into CreateProcessA. So let's have a look: when you open the document, you can see here we've got this Document_Open routine, and we've got what looks like quite a lot of noise. We've got these big strings being declared here, which just look really noisy. There's no science to this; I'm just going to presume at this stage that these are noise, and I want to look for where the first meaningful line of code is. I can see here there's a Call instruction, so there's a call to this particular function on line 25. Let me just double-check, right: this variable, Ctrl+F it, well, I only see it mentioned in two locations, so it's probably noise. The same for this one; it's only mentioned here and it's not referenced anywhere else. So if it's not being used, I'm just going to treat it as, oops, as potentially unwanted, and it's just completely noisy. So actually what I'm going to do is delete all of this just to make it a little bit clearer. We can see here that you have methods for tan, absolute, there are some addition and subtraction routines, exponential; it's just so noisy, just pointless code. So I'm going to delete everything that's in between. We can see there's another call to the same function, that's quite interesting actually, and we can see
there's more of the same. I'm not verifying this at this moment in time; I'm just presuming this is all noise. But you can see there's another call to another function, and then there's a load of noise, and then maybe there's some other stuff as well that, again, just looks like noise.

So let's have a look first off at this particular function that's called twice, because that's pretty interesting. I don't know why something would be called twice, but we can certainly investigate that. So, the first function, let's just make a little bit of space here. Well, it returns a Long, which is a long integer, and we have the same kind of pattern: there are these big strings being declared or defined here, and we can see they're not used anywhere but that particular location. When I'm debugging a function, I actually start from the back and work my way backwards through the code, so I look at what the function returns first and then look to see how the return value is built up. But you can see here that the return value is just literally the integer 1. So it doesn't matter what actually goes on in between all of this stuff; it's just going to return the integer 1. It's a pointless function, probably there to throw you off your analysis, and to be honest with you we can just get rid of it. So let's get rid of that, and get rid of that.

What we're left with here is this particular function call, so let's go to this particular function and see what it's designed to do. You can see this one returns a long integer as well, and again you can start to see the pattern a little bit: we've got some really big long strings being defined, and then what looks like noise in the same kind of pattern in between. But one thing that sticks out here is this Split function. There's a parameter that's passed to Split, which is the output of, well, it's this particular value
or variable, which is fed into this particular function twice, and then whatever the output of that is seems to be split by a pipe symbol. So that's interesting. Let's have a look and see what's being passed to Split. You can see, if we Ctrl+F it, it's this particular long string here, so this one is not actually noise. If you scroll right to the end of it, you can see there's an equals sign at the end, so that's a bit of a giveaway of the potential encoding that's being used. So we'll treat that one as something we want to have a look at. The other ones, we can see there's some other usage, maybe, on that one, so we'll leave that in for now, and maybe some other uses on that one as well. So that's interesting.

But for now, because that's got an equals at the end of it, let me open Chrome here and go to CyberChef, which is my favorite tool of choice for decoding and putting recipes together. Let's feed this string in. I thought it had an equals sign at the end there; I copied the wrong one. So yeah, it's got an equals at the end, and we'll say From Base64, and we can see here that we get another load of garbage, a gobbledygook kind of string. But if you remember, let's go back here, it was passed through this particular function twice. So I'm going to make a guess right now that that's a Base64 decoding routine, and it's probably double Base64 encoded. We'll test our theory: I'm going to say From Base64 again, and actually what we get is some plain text. So that string there is that, and you can see it's easier to read here, isn't it: C:\ProgramData\Malwarebytes, C:\ProgramData and the security, Trend Micro, altero gasps t, no idea what that is, Kaspersky Lab, and mint are spells, again no idea what that is. You can see that's a string which is delimited by the pipe symbol as well, so that's really interesting. So I think what we've got, let's go back to our code
here, whoops, where's it gone, let's go back to our code. This particular function, then, is a Base64 decoding routine. Let's just go and find that in our code. We can see that the function resides on line 14, and it looks quite noisy, but I've no need to debug it because I know the output is going to be Base64 decoded. So that's great; I can just put some comments around that and say, right, Base64, oops, Base64 decoding routine. Awesome.

OK, let's go back to our code of interest, wherever it's gone. It's scrolled down here somewhere, isn't it? Let me search for this Split to get back to where we were. OK, cool. So this stuff here I'm going to presume is noise, but we've said that these strings might be in use, and then again we see some squaring or whatever that just looks noisy, so I'm going to remove it for now. You can see here there's a For loop which goes from 0 to 5 in steps of 1, and something is happening in the For loop. There's a little bit of noise; I'm going to get rid of that. We can see here there's an If condition, an If block being defined. That looks noisy, but we've got a value here being set to 102, so I'm going to leave that for now in case it's relevant. These look a little bit noisy as well, so let me get rid of these, and there's an Else clause as well, so let's tidy up the syntax a little so we can read it better.

So let's have a look at this For loop. For 0 to 5, what's going to happen is it's going to check whether a directory exists or not. Basically, if a directory does not exist, so it's blank: it's If Dir, and it's going to pass in each value in the array that we've just split. So if we go back to CyberChef, it's going to pass in C:\ProgramData\Malwarebytes and ask, is that a directory? If it's not, then it's going to set this particular value to 102, and then it's also going to set this particular variable at the end
here to 102 as well. So that's, again, a little bit noisy, but basically we're going to come away with a variable having either the value 102 if the directory does not exist, or 349 if the directory does exist. So that's interesting, and we don't know where that's going to be used just yet. Actually, here's the end of the If statement as well, so it's probably like that, and End If like that; let me just get it right. That's probably your syntax right there. OK, cool.

Then there's an If block; let's get rid of this noise in the middle. If the value that's returned from that directory check is, you can see here, less than 250, for example, so it's going to be 102, then do something: it's going to set a variable here. Let's just clean up this noise: get rid of that, that looks noisy, get rid of that, and that looks a bit noisy too, and get rid of this in the Else clause. I could be deleting code which is really important; I don't know yet, if I'm honest with you. This is just guesswork as you go through it, but we shall soon see whether or not anything is relevant. Again, that looks a little bit noisy because it doesn't really do anything. So let's tidy up the indentation a little. If it's less than 250, so the directory does not exist, then we're just going to set this other variable to 137; if the directory does exist, it's going to set the variable to 331. You can see here that that variable, 137 or 331, is just passed in as a parameter to this function just down here. And actually that looks like noise, so we can get rid of it. OK, that's cool.

So the first parameter is going to be the output of, where are we, what are we talking about here? It's so weird how they mess it all up, right? So we get the value of 102 here, stored in this particular variable, and that's passed to this variable, which again is passed... I see, so 102 is always passed as
the first parameter to this function, and then the second argument is either 137 or 331, which indicates whether or not the directory exists. A little bit bizarre, but we'll go with it. So we'll say 102 equals param A, and the other parameter is whether the directory exists. So that's interesting. Again, this looks all noise down here as well. There's obviously a reason why they want to check whether those directories exist, and we can actually look for that in our behavioral analysis to see whether or not anything changes later.

So let's jump to this particular function: yet more noise to have a look at. We can see a similar kind of pattern being formed, so we've got this string being defined, but let's see if we can make it a little more meaningful and work out what's going on here. Let's just double-check that this particular variable is not being used anywhere, and it doesn't look like it is, so I'm going to skip all of that. It looks like the pattern repeats here: we've got this variable, and we can see it's referenced in a few places. Let me just get rid of this line, this line, this line, and these few here, and you can see that there's a variable name and then they've just stuck an underscore 2 on the end of it. So we've got this string which is being built up: this line, this line and this line, so 184, 5 and 6 get concatenated together, and then 187, 8 and 9 get concatenated together as well, to form two different variables. So that's interesting: that's one variable and that's another variable. Then let's see what's going on next. We have a variable which is being passed into the Str function, so the first parameter is being converted to a string. OK, that's perfect.
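The directory check described above (a double Base64 decode, a split on the pipe symbol, then a branch value of 137 or 331) can be sketched in a few lines. This is a minimal sketch, not the macro's code: the two sample paths are illustrative stand-ins for the decoded list, the function names are hypothetical, and it assumes, following the later summary in the walkthrough, that any one existing directory is enough to select the second branch.

```python
import base64
import os


def double_b64_decode(blob: str) -> str:
    """Undo two layers of Base64, as the CyberChef recipe in the walkthrough does."""
    return base64.b64decode(base64.b64decode(blob)).decode("ascii")


def select_branch(encoded_dirs: str, dir_exists=os.path.isdir) -> int:
    """Return 137 when none of the listed directories exist, otherwise 331."""
    marker = 102  # value the macro assigns when a directory is missing
    for path in double_b64_decode(encoded_dirs).split("|"):
        if dir_exists(path):
            marker = 349  # value the macro assigns when a directory is present
    return 137 if marker < 250 else 331


# Illustrative list only; the real sample carries its own pipe-delimited string.
sample_dirs = r"C:\ProgramData\Malwarebytes|C:\ProgramData\Kaspersky Lab"
encoded = base64.b64encode(base64.b64encode(sample_dirs.encode("ascii"))).decode("ascii")

print(double_b64_decode(encoded).split("|"))
print(select_branch(encoded, dir_exists=lambda path: False))
```

Passing `dir_exists` as a parameter keeps the sketch testable on any machine, without needing the antivirus folders to be present.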
Let's go back, actually, because I'm not convinced. Let me just double-check the parameter here. So this parameter equals... oh I see, I missed a step, I beg your pardon. I thought it was weird. The first parameter of this particular function is actually whatever this equates to, and I didn't check what this equates to. So apologies; let me just go back on myself a little bit. I guess it's part and parcel of doing some live malware analysis.

So let's copy and paste this and see what this variable is. Actually, what we find that variable to be is a function: it's the output of whatever this function is. Again, this function is super noisy, and to reverse engineer it I'm going to look at the return value of the function and see how the return value is built. You can see here the return value is itself, plus a variable, minus a variable, but we're in a For loop. Let's get rid of these because they look noisy to me. We can see there's 1 plus 3 and there's 4 plus 5, so simple maths, obviously. Let's get rid of these here too, because again I reckon they're noise. So let's clean this up a little; this all looks like noise. Pop in the return value here at the bottom, and so we've got a For loop which goes from zero to whatever that is, 238 million and something. Bear in mind it goes from 0 to 238740913, so that's 238740914 times, and what happens is we take our variable and add to it 4 plus 5, which is obviously 9, and minus 3 plus 1, which is obviously 4, so we're just adding 5 each time. So what that says to me is we've got 5 times 238740914, because it starts at zero, which is this crazy number. So the return value is this, and that is actually what's being passed into this function. I see, so that's the
first parameter as well. So let me go back to my calculator here and copy this crazy value. OK, cool. So let's have a look at this function again, where we started. I beg your pardon: the first parameter is that big value here from my calculator, whatever that means, and the second parameter is whether or not the directory exists. OK, we'll come to that in a second. We worked out that there are a couple of variables being built up here, and the first thing that happens below that declaration is our number, a big long number here, gets converted to a string. Let me get rid of these, because they look a little bit noisy. Then what happens to the output of that string? Well, you can see here it's used... let's get rid of everything in between; we'll presume that everything in between is not necessary. So what happens is we get converted to a string, and then we extract a substring of that string: we start at position 3 and get the next two values from it, and then we turn that back into a value. Now this threw me off a little, if I'm honest with you, when I first looked at this, and I've only looked at this, I think, twice now. I'll come back to my thought process in just a second, but let me clear up the rest of this code, because I think we can pretty much work out what this is doing in just a few clicks. So I think this is all noise. Like I said, this is not scientific, but hopefully... I can see that same routine being used again: this is the Base64 decoding routine, we remember, from earlier. So let's get rid of all this stuff, noisy as well. So it's a simple If-Else. OK, let me take a step back, then. Our big long number gets passed as a parameter and gets converted to a string.
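The loop arithmetic and the substring step can be reproduced outside of Word. The sketch below assumes the loop bound exactly as read out in the captions (its digits may be mis-transcribed, though the extracted pair does not depend on the lower digits) and uses two hypothetical helpers, `vba_str` and `vba_mid`, to mimic VBA's `Str` and `Mid`. VBA's `Str` reserves a leading space for the sign of a non-negative number, and `Mid` counts positions from 1, which together shift the extracted characters by one.

```python
def vba_str(number: int) -> str:
    """Mimic VBA Str(): a non-negative number gets a leading space where the sign would go."""
    return f" {number}" if number >= 0 else str(number)


def vba_mid(text: str, start: int, length: int) -> str:
    """Mimic VBA Mid(): start is 1-based."""
    return text[start - 1:start - 1 + length]


LOOP_UPPER_BOUND = 238740913            # as read out in the walkthrough
iterations = LOOP_UPPER_BOUND + 1       # the loop starts at zero
net_per_iteration = (4 + 5) - (3 + 1)   # +9 and -4 on every pass
seed = net_per_iteration * iterations   # closed form of the whole loop

with_str = vba_mid(vba_str(seed), 3, 2)   # conversion with the leading space
without_str = vba_mid(str(seed), 3, 2)    # conversion without it

print(seed, repr(with_str), repr(without_str))
print(ord("f"))  # the key character's decimal ASCII value
```

Collapsing the loop into a single multiplication is the same shortcut the walkthrough takes with a calculator; there is no need to run hundreds of millions of iterations to learn the return value.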
Then we take a substring, starting at position 3 and taking the next two characters. If you did that, your sense would say position 1, 2, 3: I'd start at 1, so I'd end up at the 9, so position 3, and get the next two characters, which would be, like, 3 and 7. But actually what happens, bizarrely, is that if you convert an integer to a string, I don't know why, and I might be wrong, but I think a trailing or a leading whitespace character is inserted for some reason, and so actually the number that we end up with here is 19. I'll demonstrate that to you in just a second. So we get the number 19, and then what happens is we take another substring. This line here, where we see Mid, is taking a substring of a string. This is the string we're going to take a substring from, this big long string that's declared up here. It's going to start at whatever this value is, and that start value, we just worked out, and I'll show you, is 19. So it's going to start at 19 and get the next value, which is just one position along. Let's have a look at what the value would be: you can see I've selected 18 characters, so it's going to start at 19 and take the next one character, and that is the character f, lowercase f. So that's all that is returning: lowercase f. There's some other stuff that goes on here which we'll have a look at in just a second.

It's weird, right, that starting position? So let me prove that a little bit. In anticipation, I created a document with some macros to show you what I mean, and I don't know whether it's just me. So I've replicated the code: here's the big long number, and here's the string that we're interested in. I convert the number to a string, take a substring from it, and Debug.Print it; then I'll take the substring of the
big long string that should produce the letter f, and then produce the ASCII value of it and Debug.Print that as well. So let me F5 that. I need to enable macros, sorry: Options, enable macros, that would help. Right, let's go back into that and F5 it, and you can see that it takes the string 19 and then converts it into 102, because 102 is the decimal representation of the ASCII character lowercase f. But let's say I don't convert it to a string, and I take the Mid value of 1193705470, start at position 3 and take the next two characters, and then Debug.Print the value: you can see here there's 93, which is what your intuition says it should be. So I don't know whether it's a weird VBA thing that if you convert the number to a string you get a different result, and whether that's because it has a leading whitespace character; I'm not sure. Maybe someone can answer in the comments. But anyway, that's what's happening. So we end up with the value of f, and if we go to asciitable.com, this is my favorite, lowercase f is the decimal value 102, and this will become relevant in just a minute.

So what actually happens is, we can see here that if this particular variable, which is the second parameter that we passed in, which was determining whether or not the directory existed, is less than all of these added together, which is 128 plus 56 plus 94 plus 22, I took a gamble and said that's 300 in my head when I did this, so allow me to double-check myself: plus 22 is 300. So if it's less than 300, then basically we're going to Base64 decode this value, which is this string that's concatenated together, and if it's not, if it's greater than 300, then we're going to Base64 decode this particular string, which is the second variable here. So that's interesting. So we have, let's start to
build a recipe in CyberChef. Let's get rid of all the stuff here. I've got a big string that's been concatenated with that, and concatenated with that, so we'll work from the first one. We know it's being Base64 decoded, and you can see here that we get a load of integers separated by @ symbols. That's a bit weird. So, after the If block, we can see there's a Split method which takes in whichever one of those variables we're using and splits it on the @ symbol. Let's presume all this is noise and get rid of it. Then, next to that Split, we can see we enter a For loop. Again, let me presume that this is noise; we'll come back to all of this stuff. This is really the money shot, because we can see there's a For loop which iterates over the length of that string, or the length of the array, sorry, I beg your pardon, that's now split by the @ symbol, and we can see there's an XOR routine here. So let's split this; we can do it in CyberChef: we'll change the delimiter ourselves and join it with a space. So we split it, and then we've got this For loop which iterates the array, and what happens is each entry in the array gets XORed with this particular variable, and that variable is what we defined up here as the letter f. So actually what we need to do is treat this as a charcode, and it's in base 10, so be sure to change that, and we're going to XOR each of those character code values with the letter f. If you want to, you can type in UTF8 here and just put the letter f, or, if you wanted to, f in Base64; an f in hex is 66. Whichever your preference is, you XOR with the same value, obviously. And we can see here that what we get is what looks like a Base64 encoded string. So what happens after that For loop, we can see
here, there's another call... sorry, I beg your pardon. The output is stored in a variable, and the name of the variable is reused, and what's being passed here into this Base64 decoding routine again is the output of the XOR. OK, so that's obviously super interesting. So let's Base64 decode it again, From Base64, and actually, oh, I didn't copy it over. So there, that's the output: that's the encoded command, which has now been decoded by our CyberChef recipe. We can see that's exactly what we saw in our behavioral analysis.

You'll notice, though, there were two strings, and whether or not that directory existed, those ProgramData, Kaspersky Lab, whatever those directories were, determined whether we'd use this particular variable. So let's get rid of this, and we'll start to build up this one and add in all of these, and you can see here that the command changes. So, weirdly, if any of those antivirus folders are present, we'd invoke PowerShell instead. Rather than it being CMD, you would see that BITS transfer actually executed via PowerShell, with the same C2 as well. So that's just a behavioral nuance, and I'm not exactly sure why the bad guys have done that, if I'm honest with you; again, maybe one that you can help me answer.

Then you can see the rest of the code. I'll spare you, if I'm honest, the debugging of this, but all the rest of this stuff looks really noisy. You can see here, this is where we got to earlier when we first started looking at the code: UserForm1.TextBox2.Value. The value of that decoded command is now going to be fed into UserForm1.TextBox2.Value, and then there's some other stuff here which is probably not relevant. But then here we can see the call, if I just
Ctrl+F it, is the call to CreateProcessA, and that's where UserForm1.TextBox2.Value, which is the command that we've just decoded, gets fed to CreateProcess. One thing to note as well, and I had to go to a mentor of mine, the cyber Cramer, who should definitely tweet more and do some more blogs and stuff, because it's about time: I wanted to know what the process creation flag of 32 was, in decimal, which in hexadecimal is 20, and he kindly helped me understand that it's probably NORMAL_PRIORITY_CLASS, which basically means that this thing is just going to run and execute. We're not going to create this process in a suspended state or anything; it's going to be treated as a normal process. Then there's some other stuff fed in here, which are pointers to the startup info and things like that. You should definitely look up the CreateProcessA function itself on MSDN, and all of the parameters it requires, but essentially they've gone with default parameters to get this process off and firing.

And the rest of the code, essentially that's it; this all actually looks like noise. So effectively that's all this code is doing; we can see this is all the stuff that we've looked at previously as well. During our behavioral analysis we saw those strings for the antivirus vendors, and so that's something you can test for in your environment to see what difference it would make. And maybe during your analysis going forward you would benefit from making sure you reverse engineer all of the code to pull out any of those nuances as well. So hopefully that's of use, and you understand the mindset and the thought process I go through and the analysis techniques that are used to look at malicious macros, because just because they're really scary in how they're
written and obfuscated, it doesn't mean to say that you can't actually pull out some key indicators from them. So enjoy. Anyway, thanks guys, I appreciate all of your comments, and obviously if you liked the video then please subscribe to the channel, and you can also follow me on Twitter at cybercdh. I look forward to hearing from you. Thanks, guys.
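The CyberChef recipe built up in the walkthrough (From Base64, split on the @ symbol, XOR each decimal value with the letter f, then From Base64 again) can be sketched as a small script. This is a minimal sketch under stated assumptions: `decode_command` and `encode_command` are hypothetical names, the encoder exists only so the example has something to decode, and the command string is a harmless placeholder rather than anything taken from the sample.

```python
import base64

XOR_KEY = ord("f")  # 102, the key character recovered in the walkthrough


def decode_command(blob: str, key: int = XOR_KEY) -> str:
    """Base64 decode, split on '@', XOR each decimal value with the key, Base64 decode again."""
    numbers = base64.b64decode(blob).decode("ascii").split("@")
    inner = "".join(chr(int(number) ^ key) for number in numbers if number)
    return base64.b64decode(inner).decode("utf-8")


def encode_command(command: str, key: int = XOR_KEY) -> str:
    """Inverse of decode_command, used here only to build a test input."""
    inner = base64.b64encode(command.encode("utf-8")).decode("ascii")
    numbers = "@".join(str(ord(char) ^ key) for char in inner)
    return base64.b64encode(numbers.encode("ascii")).decode("ascii")


# Harmless placeholder; the real macro decodes a download-and-schedule command.
placeholder = "cmd.exe /c echo placeholder"
blob = encode_command(placeholder)
print(blob)
print(decode_command(blob))
```

With a real sample, only `decode_command` would be needed, fed with the concatenated string copied out of the macro; both of the macro's strings can be run through it to compare the CMD and PowerShell variants.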
Info
Channel: Colin Hardy
Views: 8,857
Rating: 4.976048 out of 5
Keywords: trickbot, macros, vba, malware, malicious, code, downloader, scheduled task
Id: auB7mkwfHrk
Length: 36min 18sec (2178 seconds)
Published: Thu Aug 16 2018