Notepad.exe Will Snitch On You (full coding project)

In modern Windows 11, Microsoft added an interesting new feature: if you open up Notepad, that native, installed-by-default, built-in text editor, you actually have different tabs where you can open and work with multiple files and save them. But the interesting thing is, if you close out of Notepad and then reopen it, you still have all of those saved and available, ready for you to resume work right where you left off.

Here I have two different tabs open: first, some ScreenConnect work that I was doing on this virtual machine, and drivers\etc\hosts over in the C:\Windows\System32 directory. Again, as you saw, I could reopen that Notepad instance anytime I wanted to, and if I wanted to start a new tab, I could write anything I want. Hey, a little cheesy: "Please Subscribe". But if I just navigate around, I could close out of this and reopen Notepad once again to see that that tab is still intact.

Now we might have a lingering question: how does that happen? How does it get saved? Does Notepad cache anything that's open in a buffer, whether it's saved or not? Well, I'll answer that question for you pretty quick: yes, it is saved. And this is a really interesting artifact that could be used by blue teamers, defenders, folks doing digital forensics and incident response, to see what might have been open in that text editor, or even by red teamers, penetration testers, and ethical hackers wondering if there are any juicy details that just happened to be left in that native default text editor.

Let me hop back to my Windows 11 virtual machine and open up the Windows File Explorer. I'll navigate over into the address bar and use an environment variable here, wrapped in two percent signs. I want to go to AppData, but I actually want to prefix this and go to our LOCALAPPDATA. You might know there are a couple of different folders inside of your user account's application data: there's one for Local, LocalLow, and Roaming.
Local is really where we want to dig into, and there is one directory here called Packages. This has a couple of interesting things for native Windows applications, or other things you just might have installed. Of course, we might be able to drill down and find a Microsoft folder for Notepad. Here it is: Microsoft Windows Notepad, with an odd-looking, I don't know, suffix at the end there. If I dig into this, we can find a couple of other interesting folders, and there might be some juicy details there.

Now I want to bring us down this rabbit hole. If I move into the LocalState folder, there is one called TabState. I'm also a little bit curious what might be in WindowState; that's probably worth digging into just as an aside. Look, I feel like I just got nerd sniped while I'm recording, wanting to go explore something new. But TabState is interesting, because it looks like there are a whole lot of different files in here, all named with a GUID: you can see some random hexadecimal, that pairing of numbers and sets of digits there. These are .bin files. We see some end with a .0 or a .1 before the .bin file extension, but these are all artifacts, remnants of that cached Notepad buffer information.

Now let me try to open one of these files up. It's odd to me that some have this .0 or .1 suffix before the file extension, but I think I'll go for the 639744-whatever GUID. I'll double-click on this, and it will need a program chosen to actually open that file; Windows doesn't know what to do with the .bin file extension, naturally. So I'll open that with Sublime Text, my text editor of choice. I open this up, and it is a binary file, hence the name, so Sublime Text will just show us the hexadecimal representation of all the contents within that file. On its own, this isn't all that helpful or interesting to us, but we might be able to dig a little bit further and see some oddities. If I minimize this, I'll go ahead and open up a command prompt, and I
will use one of the tools that we can find inside of Sysinternals. The Sysinternals Suite you can find online. I have the Sysinternals Suite available from the root of my file system, C:\, and I want to try and run simple strings. If we were on Linux, that would be a natural built-in, but on Windows we need to rely on Sysinternals. So let me see if I can run strings on this file inside of that AppData directory. Let me copy the path, paste this in, and just hit Enter. Oh, sure, yeah, I'll accept the EULA, that's fine.

Ooh, this returns the exact same data that we saw just opening that file in Notepad. Looking into it, it even includes the file name, or rather the whole path to the file that was saved. There is some oddball data in here, a C, exclamation point, closing parenthesis, whatever, but we can genuinely see the plain text, the raw data, the values that were present inside of that file, and we get to know the file.

I could do this for just about any of those GUID .bin files that we'd like. What else did we see in there? There was a 0ff and a c2a. Let's see what that 0ff one is: that is our "Please Subscribe". Nice. Now naturally, if I were to go take a look at c2a, I'm sure you get the gist here. That should be our... ooh, a little bit of a happy accident there, some Bob Ross moments. Look, a .0.bin isn't going to give us anything interesting, and that was the one I was a little bit curious about: how does that differ from our .1
.bin at the end there? Nothing peculiar in that. But what if I were to remove that dot-digit value and we just had the GUID that we were looking at previously? Did that have any sweet data for us? Yeah, okay, so this is the value: this is all the text that was present in my etc\hosts file. Again, that includes the path of the file that was open, and some nonsense in the data.

But don't forget, I never even saved this "Please Subscribe" file. I could make any changes to it, and I will not even click Save on the File menu here; I won't hit Ctrl+S on my keyboard. But if I navigate back and use this command-line tool to see that 0ff value for the cached file, it has some other oddball data that I'm not too sure about. But look, what if I were to close out of Notepad again with that buffer not saved? Will that still make any changes here? Oh yeah, okay, cool, so it will track that down. But Notepad was closed there; maybe it just flushed whatever data that would have been. Again, obviously, opening up Notepad will bring that buffer right back to us.

So that means that whatever you put in Windows Notepad, again native and built-in, maybe just as a knee-jerk reaction when you want to slap in and save some data, I don't know, moving fast on your computer, will actually be cached, and that's potentially an artifact that could be used in DFIR or red teaming, pentesting, anything that you might think of. We could extract some of that data out with just simple strings like you saw, but what about that weird random data that was in the mix?

Now, I was super interested in this. Again, I'm nerd sniped, so I wanted to go do some Googling, do some research, and see what the heck this file is all about. What's it made up of? What's the structure? What's the format? Are there any other tools that might be able to get that info out in a natural, programmatic way? And I've got to be honest, I couldn't track down much of anything. Say I look for "Windows 11 Notepad tab state cache" or whatever.
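
As an aside on the strings step above: the readable text in these cache files shows up with a null byte after every character, so a rough stand-in for that part of the strings output can be sketched in a few lines of Python. This is only a sketch under assumptions: the function name `utf16le_strings` and the sample bytes are made up for illustration, and it only catches printable ASCII stored as UTF-16LE, not everything the Sysinternals tool reports.

```python
import re


def utf16le_strings(data: bytes, min_chars: int = 4) -> list[str]:
    """Return runs of printable ASCII characters stored as UTF-16LE."""
    # Each character is one printable byte followed by a null byte.
    pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_chars)
    return [m.group().decode("utf-16-le") for m in pattern.finditer(data)]


# Made-up stand-in for a cached buffer: a few header bytes, the text,
# and some trailing "oddball" bytes like the ones seen in the output.
sample = b"NP\x00\x00\x01" + "Please Subscribe".encode("utf-16-le") + b"C!)"
found = utf16le_strings(sample)
print(found)
```

Pointing the same function at the bytes of a real .bin file should give a similar view of the cached text, minus whatever the real tool does beyond this simple pattern.
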
I try to find any of those details: I Google this, and it looks like there's some chatter on Reddit. "Delete Notepad cache", "open content from previous..." thing. Are there other things that might be present here? Autosave functionality, right. Sysadmin is interesting; maybe there are some good details there as to the file structure of what's even included. Oh, this is cool. The commentary is like, look, this is maybe a minor thing to be annoyed about, but it caches anything that's in your last session. I know this might be kind of cheesy, but it's problematic, because anyone could just jot down passwords or credit card information or anything, and even if they don't save the file, it'll still be preserved. We can dig into this; maybe there are some other settings we could play with. But there's no sweet detail as to what that .bin file is made up of.

Let me take a look at some of the other Google results here. "How to delete that cache": I wonder if they mention the path. They weren't able to find it in that Reddit post, so let me see if there are any details here. They're looking for it; they say, "Where is that cache?" Well, now you know. And 11 replies. Is there anything else in here? Oh, it might just be boilerplate Microsoft stuff. Yeah, okay, they're just running in circles, troubleshooting, falling down the rabbit hole. Oh hey, someone mentioned it. Excellent, they tracked down the path. This is December, so it's kind of recent.

Here are some forum posts chatting about it, again discussing the path. Ooh, we could dig into WindowState; I'm curious about that just as well. Everyone's chatting about the location, but not how we might be able to pull data out of that file other than just strings. Are there any DFIR tools? If I add "GitHub" to the search, will it find anything interesting? I don't think so. I've got to be honest, I haven't found anything that would work with these .bin files. So what I would like to do for this video is try to make some sense of the binary contents, the real data values represented in
that .bin file, and see if we could write some code to maybe cut that up and pull out pertinent information that would be useful to us, again, as a blue teamer, a red teamer, pentester, ethical hacker, or digital forensics incident responder.

But before we dive into that, if I may say: look, I try my best with videos just like these, trying to get them out every single day, always showcasing a cool demo, sharing free education. The way that I'm able to do that as frequently as I do, and honestly the only way with free education, is with some sponsorship. I hope you're understanding of that, but please let me tell you about PlexTrac.

Here's a question for you: are you feeling the pain of pentest management and reporting? Trying to juggle data from multiple sources, collaborate on findings, or write a pentest report, all this takes time, time away from hacking and, ultimately, remediation. There has to be a better way. With PlexTrac, you can go beyond penetration testing management and reporting. PlexTrac helps you analyze your attack surface at the asset level, action all pentest and vulnerability scanner data in one place, use context-based scoring to prioritize risk, and finally conquer the last mile of continuous validation. So hey, what does that mean for you? Faster pentest reporting time, better collaboration across teams and with stakeholders, the ability to address high-impact findings faster and more effectively, and proof of risk reduction. As Evan Peña, the consulting leader at Mandiant, said, "PlexTrac enables our services team to deliver better reports in less time." Seriously, check out PlexTrac. I have great colleagues who use PlexTrac every day to manage their pentests, and they're cutting reporting time in half. You can use my link below to learn more and get a personalized demo so you can start reporting less and hacking more. Huge thanks to PlexTrac for sponsoring this video.

All right. Because, as I mentioned, I haven't been able to track down any documentation for the format and
structure of this kind of file, we're going to need to dig into it at the hex values, looking at the binary data. So we need a hex editor. I am going to be using 010 Editor, because I think it's phenomenal and I want to learn and do a whole lot more with it. I have just installed 010 Editor, so we can open that up and see if we can make a little more sense of the hexadecimal values that we saw in that LOCALAPPDATA path. Let me get to Packages, then to that Notepad folder, into LocalState and TabState, and let's grab the full one, no .0 or .1 there, and open up the .bin.

So now we have this file open, and we can look through the first couple of bytes to start this thing off. We can see what looks like, hey, probably our file path here: C, Users, JohnH, Documents, screenconnect diff dot files. Yeah, that's what I was working with as the file name, and we saw that in our strings output. But I'm curious about what makes up all these structures. What's in the bytes? Let me open up some of the other files to compare and contrast. We were looking at that 0ff example, and that one just had our "Please Subscribe" data, but I immediately see a delta in the first couple of bytes here. There's this "NP", maybe little magic bytes, the header denoting that this is that kind of .bin file, presumably for Notepad. NP might be the start of this file, then 00, a null byte, there, but a 1 that follows probably means this is a saved file, whereas a 0 probably means it's an unsaved file: a buffer just floating in the ether, not written to disk.

That's interesting to me, because we have no file name included here, and when we get up to the contents included in that buffer, there's significantly less data than what is present in the one that is saved. Of course, there we'll have the full file path, just as we saw, and then some nonsense up until the point where we get to our actual content. So at this point we could try and
experiment with this data and maybe write a tool that is able to extract and parse out all the pertinent information, and maybe do that en masse across all of the cached .bin files in that directory.

Now let me temper expectations here. This is an attempt. It's exploratory, it's discovery-based. I don't know, and I haven't finished this idea. Again, I don't know if any tools for this exist, and I don't know if there's any documentation on the file structure that we're looking at here. If you know, please let me know in the comments, because I'm fascinated by this. But maybe this is a cool exercise, a cool activity, and something that you want to try just as well.

So I'm going to get back into Sublime Text, my text editor, and open up a new file. Look, I know it would probably be really cool to write this in a compiled language, be all hipster, do something in Golang or Rust or Nim or whatever, but for exploratory's sake, to prototype some stuff out, we can play and poke in Python. So let me do, like, a notepad_parser.py. We'll import the os module so we can check out our environment variable for LOCALAPPDATA; that will be our app data directory. Then we need the path to all those files in there. Hopping back over to Windows Explorer, we can just grab this from the directory that we're in, in the address bar. We could say the directory, relative or whatever, is simply this as a raw string; we'll paste that in, and then we could combine the two with os.path.
That's an os.path.join of our app data directory and our relative directory location. Will that give us the full path here? I'll simply print this out as a sanity check, and I like to use format strings with an equals sign so we can see the variable name alongside its value. Yep, that is looking good. Oh, but we aren't getting... actually, no, that isn't looking good. It's exactly that: it's like we're starting from C:\Packages. That's a little gimmick, an idiosyncrasy when you're up against os.path.join: it'll think this is an absolute path because of the backslash prefix there. So if I remove that and run this again, I'll hit Ctrl+B, that will give me the full path.

Now I want to be able to actually glob for all those files. So let me import glob, and we'll do glob.glob on an os.path.join of our full path with just an asterisk, "*.bin", to look for any .bin files there. Now can I display that list of .bin files, which we can store as a variable, list_of_bin_files? Will that work here? Yeah, okay, so it will pull out all of those files.

Now I can do a for file name, or let's say a full file path, in that list of .bin files, and what we can do is simply check if that full file path ends with a .0.bin or a .1.bin, because remember, we didn't really care about those.
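
The path-joining gotcha and the glob-and-skip loop described above can be sketched like this. It's a minimal sketch with some assumptions called out: `ntpath` is used so the Windows join behavior shows up on any OS, the base path is fake, the helper name `list_tabstate_bins` is made up, and the folder spellings (`Microsoft.WindowsNotepad_*` with a wildcard standing in for the odd-looking suffix, `LocalState`, `TabState`) are my reading of what's shown on screen rather than something spelled out in the video.

```python
import glob
import ntpath
import os

# A leading backslash makes the second part "rooted", so join() throws away
# the folders of the first part. Fake base path, for illustration only.
base = r"C:\Users\example\AppData\Local"
rooted = ntpath.join(base, r"\Packages")
relative = ntpath.join(base, "Packages")
print(f"{rooted=}")
print(f"{relative=}")


def list_tabstate_bins(local_app_data: str) -> list[str]:
    """Glob the cached .bin files, skipping the *.0.bin / *.1.bin ones."""
    # Assumption: the wildcard covers the package folder's suffix.
    pattern = os.path.join(
        local_app_data, "Packages", "Microsoft.WindowsNotepad_*",
        "LocalState", "TabState", "*.bin",
    )
    return [
        path for path in glob.glob(pattern)
        if not path.endswith((".0.bin", ".1.bin"))
    ]


# Only meaningful on a Windows machine where LOCALAPPDATA is set.
if os.environ.get("LOCALAPPDATA"):
    for full_file_path in list_tabstate_bins(os.environ["LOCALAPPDATA"]):
        print(os.path.basename(full_file_path))
```

Passing the path pieces as separate arguments, with no leading backslash on any of them, sidesteps the absolute-path surprise entirely.
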
In that case we'll just go ahead and continue, so now I should only have the full file paths where we have the full context. Yep, just those .bin files that we were working with previously.

We can extract out the file name, just the, like, absolute path, or the base name, I think, is the right wording there. That's os.path.basename; that's the one for our full file path, and we'll print that out. Let me see, will that display just fine? Not really, no. Oh, we need our full file path variable; I passed the wrong one. There we go, now we're getting just the GUID values. Looking good.

So we can do a with open on the full file path. We'll read that as bytes, as a file pointer, or "filp", as I like to say, and the contents will just be that read in. I know this is bold, but we could honestly just print out the contents. I will break this loop, though, just temporarily for our testing, to see the value come through. Spitting this all out, okay, now we have the contents: the raw binary, with the hex values representing the data in that file format. It looks like this is the very, very first one, because remember, that fourth byte that we saw was a zero for an unsaved buffer; the others, when it had a one there, were a written file.

But we can actually extract out some of this data and make that kind of interesting. As we open these files, we can say the magic bytes are the contents up to, I guess, the third byte. That should print out, if I validate that super duper quick... yeah, okay, that's the very start that we were seeing for these files. The next one was a check for whether it is a saved file or not. That would be the fourth value, correct? is_saved_file... that should be zero. No? Okay, so I'm ahead by one; the index should be three. That is zero. And let's start to have at least some print statements for us to play with. We'll go ahead and print out the file name as we're looking through.
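
Here is roughly what that header check looks like in isolation. It's a sketch, not the exact script from the video: the function name `read_header` is made up, and the sample bytes are invented so that only their first four bytes follow the layout observed in the hex editor (N, P, a null byte, then the 0-or-1 flag).

```python
def read_header(contents: bytes) -> tuple[bytes, bool]:
    """Split off the first three bytes and the saved/unsaved flag byte."""
    magic_bytes = contents[:3]        # the slice stops before index 3
    is_saved_file = contents[3] == 1  # indexing bytes gives an int
    return magic_bytes, is_saved_file


# Invented samples: everything after the fourth byte is filler.
saved_sample = b"NP\x00\x01" + b"rest-of-file"
unsaved_sample = b"NP\x00\x00" + b"rest-of-file"

for sample in (saved_sample, unsaved_sample):
    magic_bytes, is_saved_file = read_header(sample)
    print(f"{magic_bytes=} {is_saved_file=}")
```

The off-by-one in the narration comes from exactly this: the fourth byte lives at index 3, and a slice up to 3 covers the three bytes before it.
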
For each of these, let's make that an f-string again, for the sake of our playground. Now that we can determine whether we're working with a saved file or an unsaved buffer, we can add some logic. Maybe we'll have to handle things differently, depending on how these are all structured, and that varies based on whether that byte indicates a saved file or an unsaved buffer. This is where stuff might get weird and hairy, and again, I haven't finished this all and don't exactly know, because I can't track down documentation. Maybe I'm just making a fool of myself. Let's keep going.

Let's focus on the case where it is a saved file, because in that case we can extract out the file name. If it's a saved file, we'll do something, but if it's not a saved file, then we'll do something else. So again, let's just display this so we have our debug information; we can comment it out later on if we want. In that branch we know that it is not a saved file. And we won't break in this case; we'll comment that out. So we'll see that... ooh, okay, the data that we're working through. Look, I am only going to care about these for now.

Let's go look back at our 010 hex editor, because it looks like there's some data that follows, a byte representation of something. Well, no, okay, our C value is in there for the absolute path of the file location, and then the file name that we could carve out. Now I'm curious, because this selection that I have here, that file name piece of data, goes up to a 00 B4 currently, and I don't see any other B4s in the mix. So let me experiment with that for a moment, to see if we can track down some of the pertinent information to help extract that out. Let me say that we take our contents from, what is it, we start at this, which is the fifth value, I believe. If we slice from there up to the point where we see the byte B4, and I'll add the \x denoting that, we might be able to see that full file name, correct? Just for the sake of
carving it out. The problem is that it errors, because that byte probably differs for the other cached binary data, but at least for our 63974 GUID, that will pull out the file name. The curious part here, though, is this data. If I actually take a look at it, obviously we have a bunch of null bytes in the mix, separating each and every single character, so that's likely UTF-16.

If I get back to my terminal and open up Python for a quick second, we have this data that I can paste in, all that stuff we were just looking at. The length of that data is the length of our file name multiplied by two, because of the serious amount of null bytes in between every single character. So we divide that by two, and that's 53. Okay, super duper cool. But looking back at 010 Editor, there's something we were missing: after the four bytes that we start with, and before the fifth-byte stuff starting our actual file name, there's one in between here, and that is hex 35. Now if I go back to Python, what is hex 35? It is... oh no, I did that backwards. It should be 0x35. I'm not taking the decimal value and bringing that to hex; I want to know literally 0x35 and its value. That should be 53.

So that byte, at what we would zero-base index as four, is the length of the file name. Can we track that down? We could say length_of_file_name, if it is a saved file (remember, we'll drag that down), should actually be our contents at index four. What I could do is display that super quick, again just for our knowledge. Then length_of_file_name will tell me that we need our contents from index five, as we're starting that slice, up to five plus the length of our file name times two, because that is, again, UTF-16 with null bytes in the mix. I believe that should just be our original file name, from what was really, genuinely opened up inside of Notepad for that cache file. Can I display that? Can I, now that I've smartly been able to
track down that information, print out our original file name as we've extracted it? Yeah, perfect. Okay, so now that works smartly, not just with our 639 GUID for the ScreenConnect diff data, but also for our drivers\etc\hosts. But we will need to decode that as UTF-16, so our original file name could actually equal that value decoded as UTF-16. Will that work? Fingers crossed... yeah. Now we have our original file name, which we can properly carve out.

You know what, we should actually try to break up this output. Let's multiply out a little divider here, and then let's wrap this in our f-string to get the variable name included. How does that look? Cool. Now we can properly carve through and cut the file name out of all of those files. Let's actually move that print up, because it'd be sweet to do that for every single iteration; that way, for our not-saved file, when it's just an unsaved buffer, we'll still have the dividers in the mix. Whoa, did I do that wrong? Oh, it's not accounting for the .0 or .1 .bin files, so let me put that just before we work with the file name. That looks a little bit better. Cool.

Okay, at this point we've accomplished practically nothing. We can extract the file name, but the contents are the really interesting part here, and I think this is where stuff gets weird, because that random data between the file name and the contents might be variable depending on the file that we're looking at. I've seen this act and behave really strangely, and again, if you've got any insights, please do let me know in the comments. But if I pivot back to 010 Editor, we know that we're at least up to this location: we've got the file name carved out. From there, up until the point where we get to our data, there's this weird in-between, from that B4 hex value up to where our ScreenConnect content starts. Something that might be interesting, though, is if we're again looking at a length.
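
The file name carving described above, pulled out on its own as a sketch. Assumptions are flagged in the comments: the sample file is fabricated by a hypothetical `build_sample` helper, the bytes after the name are filler, and a single length byte is assumed, which is all that's been observed at this point. The codec is spelled `utf-16-le` here to make the byte order explicit; the video just says UTF-16.

```python
def extract_file_name(contents: bytes) -> str:
    """Carve the original file path out of a saved-file cache entry."""
    length_of_file_name = contents[4]               # counted in characters
    file_name_ending = 5 + length_of_file_name * 2  # two bytes per character
    return contents[5:file_name_ending].decode("utf-16-le")


def build_sample(path: str) -> bytes:
    """Hypothetical helper: fake the first few fields of a saved entry."""
    header = b"NP\x00\x01"
    filler = b"\x11\x05\x01"  # invented bytes standing in for what follows
    return header + bytes([len(path)]) + path.encode("utf-16-le") + filler


sample = build_sample(r"C:\Windows\System32\drivers\etc\hosts")
print(f"{extract_file_name(sample)=}")
print(f"{0x35=}")  # the length byte seen in the hex editor, as decimal
```

Swapping `sample` for the bytes read from a real cache file is the only change needed to try it against actual data.
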
Like we just saw for the file name, how long is the contents of the saved file that we wanted to work with? In this case we have, okay, you can see it down below: Notepad is telling me about 143 characters. But maybe that's not a good test. Maybe we should have a clean slate to experiment with, in our own exploratory setup here. So let me create a new file. We'll just call this, I don't know, anything.txt, I don't care, and we could say "literally anything" as the contents.

Let me move to our TabState folder. It looks like that is a new file, bd32. Let's remove all the others, and... oh yeah, okay, they're open in Notepad, so if we try to remove them it'll error, because these are, again, the cache files. Let's close out Notepad and then try again on all of that, and make sure we remove everything other than that bd32 one that we wanted to play with. I'll move back to Sublime Text and try to run this now. There, we can see our anything.txt is properly being read. 010 is going to start whining, because now it doesn't know where those files are, but if I open up our bd file, we've got a smaller sample set to try to make sense of.

Remember, we were trying to read up to, like, hex B4 for a second, but that structure is gone, and now we're getting some other weird things in between the file name and the contents. It varies; it's completely different. I don't know if that's a timestamp or some file metadata, like NTFS file system properties and attributes. It's seemingly random every single time that I look, but you might have a different experience, and I hope you try out this example and this exercise too. But we can try some deductive reasoning here, as I was alluding to. This anything
.txt with the contents "literally anything" is 17 characters long right now. So, like when we were experimenting just a bit ago, what is 17 in hex, and can we see that value, hex 11, anywhere inside of our contents, the bytes and the data here? I'll move back into 010 Editor. Do we have hex 11? I see it there, again immediately following our file name. That's worthwhile and interesting.

But I'm curious what happens if I expand that. Can I paste "literally anything" over and over and over again? What happens with that? Okay, 010 wants to reload it, so good, it is now going to do that. The total length of our file now is about 215 characters, so checking again in Python, 215 in hex is hex D7. Can I find hex D7 just after our file name? Yes, I can; you can see it right there. So that works, but it now has a 01 at the end of it, and presumably a 05 would follow every single time. But I don't know what this is going to do if we move past the 255 bound of a regular byte, hex FF.

If I paste my "literally anything" sample over and over and over again, now we've got 864 characters. What is that going to look like in bytes? Let's reload in 010. At the end of our file name here, we have E0 06, up to 05. What is that? E0 is 224, which doesn't make sense; it's not filled up to hex FF, 255. Multiply that by hex 06 and that's 1344, so it's close to the full length that we're seeing, kind of. That is not super duper clear to me.

But I think, if we look back at 010, we might be able to get a little bit clever. I noticed, and maybe you're tracking this just as well, that when we get to the end of our file name, we get to presumably some length, or something that matters some way, somehow; maybe not exactly the full length of the file contents. But it is followed by a 05 and 01 delimiter, and that seems constant. If we go look for the start of our content, just before it is E0 06 and E0 06, which again is just what we saw for presumably some delimiter before our 05, and a 01 00 00 00, many things there.
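
One possible reading of those puzzling bytes, offered as an observation and not as documented fact about this format: the values read out so far (11 for 17 characters, D7 01 for 215, E0 06 for 864) all match what an unsigned LEB128-style variable-length integer would produce, where each byte carries seven bits of the number, lowest bits first, and a set high bit means another byte follows. The arithmetic below checks only that coincidence; the function name `decode_uleb128` is mine, and nothing in the video confirms that Notepad actually uses this encoding.

```python
def decode_uleb128(data: bytes, offset: int = 0) -> tuple[int, int]:
    """Decode an unsigned LEB128 integer; return (value, next_offset)."""
    value = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        value |= (byte & 0x7F) << shift  # keep the low seven bits
        if byte & 0x80 == 0:             # high bit clear: last byte
            return value, offset
        shift += 7


# The byte sequences read out in the hex editor, and what they decode to.
for raw in (b"\x11", b"\xd7\x01", b"\xe0\x06"):
    value, _ = decode_uleb128(raw)
    print(raw.hex(" "), "->", value)
```

If that reading holds, it would also explain why E0 on its own (224) looked wrong: 0xE0 with the high bit stripped is 96, and 96 plus 6 times 128 is 864.
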
Let me toggle this and try to modify our file contents again. Here I'll just delete some portions. That still has, I don't know, some data, at 420 characters. Nice, I like that a lot. We'll reload this. Looking at the structure again, we have A4 03 05, and getting back to the start of our content, we have A4 03, and then some slight variation, A3 03. Oh man, so it's still not super reliable. But the 01 00 00 00 and then that length value is something we could use to denote the start of the content, and the content will run all the way down until the end of our data, followed by some trailing data that varies every single time. I don't know if that's some CRC or checksum or something, but if I manipulate this... I don't even think I have to modify the contents. If I re-save the file, just create a space, delete the space, save the file, we'll see this change, and I'll have a new value there at the end that's always seemingly six bytes.

So I think we can get smart on this for a saved file. If we go back to our Python code, we'll take the point where the file name stops, which we can save as a value: file_name_ending can be that math we used for the end of the file name slice. Starting from there, if we take the contents from our file name ending onward, we use that as the starting point to then find the index of our hex 05. We can make that a delimiter_offset, I suppose. This is still super weird. In which case, the special delimiter could be the contents from the file name ending up to that delimiter offset, and we'll print out our special delimiter to see if we can get the value for that. Oh, great: nothing. Okay, so I must have mistaken that. Do we have a delimiter offset that is properly working? This is still dumb. This is why you should have an actual file structure and not try to script it in Python, but I'm not smart. Okay, the delimiter offset is two, so we need to add that from our
file name ending, right, because we're slicing here. So we'll add that on, so we get to the values. Now our special delimiter is hex A4 03; that's what we're looking for. That special delimiter will at least help us speedrun through the chunk of data in between the file name and the file contents, where we don't really know what it's doing, or whether there's a timestamp, or any unique special data in there.

But again, every time I save the file, look, I'll add another space, save this data, reload: has that changed? Did that change? Let me add a space... oh yeah, because we needed that for the 420. Always looking out for the 420. Reload: that value differs, with the D4 shenanigans in the mix; that is not what it used to be. Save again, reload: yeah, now it's F4 whatever. So that does vary, and I'm not quite sure how, but we can use that special value before the 05 to at least reliably find the start of the file contents.

So the delimiter offset and the special delimiter we don't really need to display, but we could use our contents from the index of... these have to be byte strings here. Oh, I want to use an f-string so I can put our special delimiter in the mix, but I can't mix bytes and an f-string together, so we'll use a format string with %s in that case. A little bit weird, right? But I think that'll work. Then comes what would be hex 01, hex 00, hex 00, and another hex 00, and then once again our special delimiter, so the percent sign formats in our special delimiter twice. Then we slice up to -6, right, because we don't care about the last couple of values that aren't relevant to our real data. Oh no, the space, that hex 20, is something that we do want, so I think it's only, how much is that, four characters? Five characters? I think four at the very end, because the 00 might denote the end of the file, but UTF-16 will need the zero following the space. So I'm getting in my own head and not making a whole lot of sense of this, but I
hope now you kind of understand just as well where this gets a little bit weird. So let me use that -4 at the end to define this as file contents, or our original file contents, right? Does that work? Can I print this out as our original file contents displayed on the screen? Ctrl+B to run this. Seemingly that worked. Getting to the very top of this, we start with... oh, hang on, our a403, 01 and all that stuff is still in the mix, so we'll have to cut that out, or make sure that we add the length of our special string. Oh gosh, what could we call that? What name should that have? Like an egg, for egg hunter stuff, when we're trying to find how that kicks off. We can cut that value and then call that, I don't know, start of file contents, and that's kind of the offset there. If we look for that, we'll get the index of that start of file contents and add that in to where we start slicing from. Start of file contents. My goodness, this is confusing to say out loud. And I think we'll go to -5 now, presumably. So can I run this? No, I broke stuff. Oh, you're right, you're right, you're right. That is literal bytes, and we need to get the length of that, so let me change this to len, adding that on. Super weird. This is messy, this is dumb. If you write this in Rust or whatever, I'm sure it's better. But now we have this, and that is our file contents in UTF-16, so I could then do original file contents equals original file contents .decode UTF-16, or just put that at the end of the line above, but this is already messy and gross as it is. Oh, and I wrote UTF-8 here. No, I'm an idiot, that should be UTF-16. Run this, and that is our data: literally "anything", literally "anything", literally.
But is that reliable? That's my next question. If I change this to "hello I am John" and now I run this script: hello I am John. Okay, cool. What if I add a new line and then run that code? Oh no, something's already broken. Why? What is different about that? Let's go back to 010 Editor and reload. Hello I am John. The error was that it couldn't find the subsection. So we cut out our ending that we don't exactly care about, but when we get to our file name as our anchor, the length that we see is 11, up to 05. Do we find another 11 just before our contents? We see a 10. Why? We still see the same 01 00 00 00 just before the actual values, but what makes that different? Why did that change? What if I add another new line? I just hit enter here. Reload, and now we have the length of 13. What? But just before our contents is 11. Is that something we can use as an indicator, like the 00 01 and then some static value, or some same value, before the 01 00? This is dumb. Added another new line just to test it. Now it's 15, and then 12, 12, 01 00 00 00, 12. Maybe that's reliable: maybe the same value twice, and then the 01 00, and then that same value. Maybe that is a good way to start. What if I just copy and paste a bunch of these to bring me past hex FF, or 255 characters in length? If I reload again, that gives me e802, and before that our 01 00 run, and then the same value set ahead of that. I so wish there were documentation. But we could, if we were getting stupid and scrappy, use that as our anchor, because I don't see any other 00 01 in that strange structure after the file name. So what we could do is change this to determine a different special delimiter, where we're not looking for what would come before 05 anymore, because that is seemingly unreliable, but instead we look for the 01 and
then the text that is duplicated after, right? So let's change our delimiter offset to not look for 05, but look for hex 01 00. Can we get that? And let's clear out all that other stuff that we don't need to see quite yet. Do we see this? Yes, we do. So from the delimiter offset we slice... actually, we could call that a delimiter start, right? And then a delimiter end, which would be our 01 00 00 00. We'll add that in and validate that we can determine the start and end, which should give us four bytes ultimately, right? No. Oh shoot, yeah, we have to move from that location, so it would have to be plus our delimiter start, right? No. Again, do we have to add one to that so we can keep moving forward? No. Oh no, we have to add the length of the delimiter start, or what we were searching for, so it has to be two or three to get ahead of it. No, dang it. Oh, wait a second, I have that backwards: it should be 00 01. That's the problem. We don't even need all that. So cutting from there, there we go: start is 45, end is 51. That gives us six bytes, which ultimately means that once we cut out the start delimiter marker we're left with four, and we'll determine what those bytes actually are. Yeah, so we'll get file marker to equal our contents from our file name ending, and we can actually add these, right? Let's do that ahead of it, so that way our delimiter start will be baselined after the file name ending. Same thing with the delimiter end. That way we'll be able to extract the file marker from our delimiter start plus two, referring to the length of what we're searching for to find that delimiter, up to our delimiter end, and that will give us the file marker that we're ultimately looking for. That EB or E0 thing, right? E8, E8, that's the one. So now we cut that in half, and we say the file marker is really going to equal the file marker up to the length of the file marker cut in half. Now our file marker... oh yeah, sorry, no, divide by two outside of len. Now our
file marker... okay, string indexes must be integers. Yeah, just make it an integer: use the integer division, two dividing signs. Now we have e802, which is the safe and sane, so far as I can tell, real-world delimiter that we could use from our contents, given delimiter end plus 4, which is the length of what we were looking for to start with. Actually no, we don't even need that. I think we can make that a variable first, so start of file contents should equal our delimiter end plus the four values, and we should define this as a variable, like, hey, what we're searching for to find that, plus... oh no, we'll just need to index that. I think that's probably the smarter way to do it. So we say the contents from that location, indexed to our file marker, will give us the start of file contents value that we're looking for. No? Unless... no, it is, it is right after it, so that is correct. So using a delimiter versus an offset versus indexing and trying to hunt it down really doesn't matter; we just add on. Is that a static length every single time? Am I just making this more and more complicated for myself? I don't think so. If we add the length of the file marker, we could just use that as file contents on its own, up to -5, which is what we used to cut the ending. Was that right? Yeah, -5. I feel like I've just overcomplicated that and solved nothing, so let me try to display the file contents now. And that works, sure. Again we'll decode that; we'll make these original file contents and steal that line back over there. But I'm not sure that solved anything. I could try to print this out now, reliably, with what the actual contents are. If I tweak this again, say "anything" to start, we'll run that. That validates it; found it. If I add more to this and run it again, that works. Let me add a new line, run it again: that works. Another new line: still works. Okay, maybe I'm not crazy. Am I just getting too far inside my own head? Let me add another character,
another, and another. I'm seeing that reliably work. I'm seeing it track down the contents, and this should work thanks to the way we wrote our script, because we're globbing through all the files for any of the tabs that are open inside of Notepad and the files that exist here. So I think we're looking good, in a scrappy, dumb way, to carve that out. But I totally acknowledge that we haven't even touched things that aren't a saved file, things that are just an unsaved buffer. The only sticking point that I have with that: if I were to go back into Notepad, let's do our "hello my name is John", I could open up a newly created tab. That buffer is completely different, and in some cases doesn't even have the content staged until I close out of the Notepad instance. Let me see if that will work. I'll close out here, reload. Now we see the contents, but because there's no file name to extract, it just immediately gets into some content somewhere. Okay, probably based off the length, which we could use to track that down. That could work. And again, subtracting five random values, we could write that, and maybe that would be reliable. But what if I open up Notepad again and made changes, adding content? How does that look in 010 Editor? Okay, none of that is present, really. So again, I'd need to close out Notepad and reload that to see it in action. But let me up the ante, because again, this is an unsaved buffer. So say I were to delete some content, "doing something new". If I reload this, it actually adds on to it, and I'll again probably close this while I reload that. Okay, that will see it in action. That's fine. But remember, this is only in the case where it is less than 255 characters in length, as that can be stored in one byte. So if I open up Notepad, add a new line, expand this to a boatload of data, close it out, what does this look like now? That's not bad. Okay, maybe we could write that. Maybe it is pretty reliable, as long as Notepad is closed. We could still
probably extract out data. And if I just delete something and cut a chunk out of what the size would have been there, reloading this, not closing Notepad yet, what did it do? Did it do anything? Close out Notepad, it reloads: significantly less data, and that probably works fine. So we could script that just as well. The start of our content should follow based on a length, again sort of like a magic or special delimiter, that will be the same length as other bytes added in between a 00 01 and a 01 00 00 00. Would that work just the way that we have it, even without a file name to start with? I think that might just behave. I'm trying to think about the edge cases. Where would that go wrong? Let me make another giant file where I just copy and paste all of this. All right, now we're at, yeah, 20,000 characters. Close this Notepad, reload. The data that we're up against is ea2, ea2, and that is what we would be looking for with the 01 up to the start of our data. Oh no, it is exactly that again. So we could check and validate that that data is all the same as what we're looking for, but ultimately that could still carve out the value. Sorry, I'm making sense of this in my mind, trying to understand this, again driving in the dark without documentation to figure out the real structure. But I almost feel like that same logic would work if there is no file name, like what we did just a moment ago with 00 01 and 01 00. I feel like it would work. So if it's not a saved file, file name ending, or where we continue off of, would just be five, right? No, just probably from the start of the file itself. So let me toggle that logic, move this down, and if it's not a saved file, then we'll just say the file name ending offset doesn't even matter and it could just be zero. Will that work? We've got our "anything anything anything anything anything" from our anything.txt file,
and then the other, the unsaved buffer, is actually being processed totally fine. Holy cow, that's so dumb. Let me open up Notepad and let me modify "scrappy notes in the moment". Let me say, oh, "bank password is super I love apple pie". Yeah, don't save it. I'm not hitting Ctrl+S on my keyboard or using the menu to save it, but if I close out of Notepad, 010 Editor should be able to see that data, and so should our scrappy, dumb tool, for both the saved file and the unsaved file. Windows 11 Notepad is literally going to snitch on you for whatever you don't save in the text editor. Now, okay, look, I know that I just fell down the rabbit hole there publicly, and you could just do all this with stupid strings, but that's, you know, not as cool and as clean as something that we could build into our own tool. Maybe we could make this a little bit more fleshed out, a little bit more polished, in Python or Rust or Golang, and yeah, you could, using real structures to actually carve and process the file. Hey, super quick, John from the future here. Sorry to interrupt, but I did want to say I've been trying to tinker and play with this a little bit more, and it is still finicky. It might not work in all situations, but I wanted to get some other folks' input on this, so I asked my editor, Nordgaren, who is much smarter than me, and he started to tackle this project in Rust. Let me make it clear: it is still early in development, and I think he's still running into the same walls that I have been in this video, finding weird idiosyncrasies and just odd behavior, especially for unsaved buffers. But if you'd like, you can find it. It's online so far under Nordgaren on GitHub; tabstate-util is the name of the project. Again, there are literally only hours of work in this so far, and they put a note: look, this is still in early development, we're still figuring it out. "Does it work?" LOL, your mileage may vary. I appreciate some of the kudos here, and you could look through the README
and the code if you would like, but ultimately, if you're really up for it, we'd love to crowdsource some of the intelligence, some of the info. Is there documentation for this? Have you seen any other tools that do this? There is still work to figure out, but look, it is now in Rust if you wanted to dig into that codebase. It's a little bit more structured, has a bit more context, and is a little bit beefier, stronger, and sturdier than my shooting-from-the-hip Python code. If you look into his code, you'll also find a couple of sweet little Easter eggs where, again, we're just trying to make sense of this. Let me be clear that the code that I "completed", in air quotes, to show in this video did not handle or work with any of those unsaved buffers while Notepad was still open. We always had to close the file to be able to reliably get data out of it. He has started to dig into some of those unsaved buffers, when Notepad isn't writing that to disk, or if it's closed out of the application. So again, more stuff that you might be interested in digging into, and please, please, please, we'd love all your input. Okay, that was just my quick interjection. I didn't want to interrupt; I'll let you get back to wrap up the video. But I thought, hey, this scrappy exploratory thing was still kind of fun, and if some way, somehow, you were able to make sense of it and follow along with me, hey, kudos, hats off to you all. I think this might be some cool tooling to seriously use for DFIR, digital forensics and incident response, or red teaming, penetration testing, and ethical hacking, and I don't know of any tools that do this out on the internet yet, so that, I hope, is a cool thing to showcase in a video. Thank you so much for watching. I hope you enjoyed this video, and if you did, please do all those YouTube algorithm things, and especially please give some love to our sponsor PlexTrac, link in the video description. And hey, for your next pen test, maybe you can use a tool just like this
and you'll use PlexTrac to crank out the report lightning fast. I'll see you in the next video.
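The carving logic worked out over the course of the video can be sketched roughly as below. This is a minimal sketch under stated assumptions, not the script from the video and not a documented format: the byte patterns (a 00 01 marker, a doubled "file marker", a 01 00 00 00 run, five trailing bytes) are only the ones observed by trial and error in the transcript, and every name here (`carve_contents`, `decode_varint`, `build_sample`, the `HDR` header) is hypothetical. Reading the two-byte marker as a little-endian base-128 varint is a guess the video never makes; it is included only because A4 03 then decodes to 420 and E8 02 to 360, which lines up with the character counts mentioned.

```python
# Hedged sketch of the carving logic described in the video. The byte
# patterns come from trial and error in the transcript, not from any
# documented format, and every name below is hypothetical.

DELIM_START = b"\x00\x01"          # marker searched for after the file name
DELIM_END = b"\x01\x00\x00\x00"    # run observed just before the content
TRAILER_LEN = 5                    # trailing bytes dropped with a -5 slice


def carve_contents(blob: bytes, search_from: int = 0) -> str:
    """Return the text carved from one tab-state blob.

    search_from is the offset where the file name ends for a saved file,
    or 0 for an unsaved buffer, mirroring the transcript.
    """
    start = blob.index(DELIM_START, search_from) + len(DELIM_START)
    end = blob.index(DELIM_END, start)
    doubled = blob[start:end]                # the marker was seen twice here
    marker = doubled[: len(doubled) // 2]    # integer division, as in the video
    content_start = end + len(DELIM_END) + len(marker)
    raw = blob[content_start:-TRAILER_LEN]
    return raw.decode("utf-16-le")           # explicit little-endian UTF-16


def decode_varint(data: bytes) -> int:
    """Assumption: interpret the marker as a little-endian base-128 varint."""
    value = 0
    for i, byte in enumerate(data):
        value |= (byte & 0x7F) << (7 * i)
    return value


def encode_varint(n: int) -> bytes:
    """Inverse of decode_varint, used only to build synthetic test data."""
    out = bytearray()
    while True:
        low = n & 0x7F
        n >>= 7
        if n:
            out.append(low | 0x80)
        else:
            out.append(low)
            return bytes(out)


def build_sample(text: str, header: bytes = b"") -> bytes:
    """Build a synthetic blob in the observed layout (not a real file)."""
    marker = encode_varint(len(text))
    trailer = b"\x00\xd4\x11\x22\x33"  # arbitrary stand-in for the varying tail
    return (header + DELIM_START + marker * 2 + DELIM_END + marker
            + text.encode("utf-16-le") + trailer)


if __name__ == "__main__":
    print(carve_contents(build_sample("hello I am John")))
```

Passing `search_from` matters whenever the bytes before the file name ending might themselves contain 00 01; for an unsaved buffer the transcript simply starts from zero. As the video itself stresses, this only held up once Notepad had been closed, and it may well break on inputs that were not tried.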
Info
Channel: John Hammond
Views: 172,663
Keywords: cybersecurity for beginners, cybersecurity, hacking, ethical hacking, dark web, john hammond, malware, malware analysis, programming, tutorial, python programming, beginners, how-to, education, learn, learn cybersecurity, become a hacker, penetration testing, career, start a career in cybersecurity, how to hack, capture the flag, ctf, zero to hero, cybersecurity for noobs, ethical hacking for noobs, networkchuck, learn to hack, how to do cybersecurity, cybersecurity careers, notepad
Id: zSSBbv2fc2s
Length: 53min 30sec (3210 seconds)
Published: Wed Feb 28 2024