C# JIT Decompilation Tips using WinDBG

Video Statistics and Information

Reddit Comments

Very cool stuff! I didn't know windbg was so powerful dealing with .net

👍︎︎ 1 👤︎︎ u/Smanshi 📅︎︎ Oct 09 2020 🗫︎ replies

Can WinDbg change the executable and materialize that executable to a file?

👍︎︎ 1 👤︎︎ u/NetBlueDefender 📅︎︎ Oct 10 2020 🗫︎ replies
Captions
Hi everyone, and welcome to another video. Today we're going to be talking about C# JIT decompilation tips in WinDbg. Normally, when I do videos about JIT optimization tricks and things like that, I use an online tool called SharpLab. SharpLab is a decompiler, which is very cool. It has a lot of features: first of all, it's online, and you can do very interesting things with SharpLab that I haven't even covered yet. But the problem is precisely that it's an online tool. The connection to the server might be slow, the server might be down sometimes, and if you have something like an online presentation or an online lecture, that can be a problem, because it's going to fail on you at the least expected moment. Plus, it has some limitations that you currently cannot get around; for example, it has very limited handling of static types. So if you want something that you can use offline, plus additional options to deal with the cases where SharpLab fails, you can do the decompilation in WinDbg. That's a really interesting debugger. It's a very old debugger, but it's basically a workhorse, so it has tons and tons of features. We're going to cover just the features that are connected to C# and JIT decompilation, so let's jump in.

All right, so we have a class with three methods. It's called WinDbgTest. We have Sum, a function that returns a plus b. We have Sum2, which is the same sum but written as an expression. And we have Sum3, which is a loop that sums over its elements, and it's very short, just two elements, just to show you that we don't need big loops to make things happen, but we're going to get to that. What we have to do in order to decompile the jitted code at all is call all of these methods at least once, so that they get jitted first. So let's compile this, let's run this program, and let's see how to decompile things in WinDbg.

I have my program already loaded, so what I have to do is just restart and run the program to completion. If you're dealing with a class, the simplest thing to do really is to check whether it's on the heap. So let's do dumpheap first. This will dump our managed heap, and as you can see there are a lot of types and it's difficult to find ours. What we can do instead is dump the heap and tell it to find a type by name, either the full type name or just a piece of the type name, a bunch of characters. We know that we have the WinDbgTest class, so let's type in "Win", and there we go: we found our object here, and we found our method table. A method table is basically the place where all of the type metadata is stored, so we have to go there in order to find what methods are declared on that type.

So let's go there. Let's dump the method table, let's add a specific flag that says "please dump the method descriptions", and let's pass this address here. As you can see, we have our method table and we have our methods here. They have all been jitted because we called them at least once. We can go to, for example, Sum, and check out what the method description really is. It's a place that contains a bunch of metadata that is cached because it's hard to compute at run time, so this is where it's all kept. We have our IL code address here, and the method has been QuickJitted, which pretty much means that the JIT has done a quick compilation without any optimizations, but the result is produced really quickly. That means the startup of the application is really fast, because that compilation doesn't take a long time.

But first let's check how to get the IL. There's a command called dumpil, and what we're going to do is take the method description address and pass it in. So let's go here, copy this address, do dumpil, and pass it in. As you can see, we have our IL address plus some IL code here, so that's good.

Now let's go back to the method description and click on the code address, which will dump the assembly code under this address, because this is the address of the first assembly instruction. Let's do it, and as you can see we have our assembly code, so we essentially have the same functionality as we have in SharpLab. Since it's QuickJitted, this code is not very optimal. You can probably tell why: it takes the registers and pushes them to the stack, then it does the sum, then it returns the result and restores the stack. That's not very efficient. You can imagine that the most efficient version would be to just do a load effective address, adding these two together using this trick, and then assign it to the return value, and that would be it. But this is not what happens. For reference, let's now check what Sum2 looks like. Sum2 pretty much looks the same, so there's no difference in assembly between these two techniques.

But let's go to Sum3. Sum3 is interesting because it's a loop, and already you can tell that it's different, because it's marked Optimized here. If we go here, you can probably tell by looking at this code that it kind of is optimized: there are no bounds checks, no other checks, and the code is pretty simple. The reason it's optimized is that the Microsoft people made a decision that methods that contain loops are optimized right away. The rationale is that otherwise they fall short in benchmarks. If you have a system in production, there's no reason to optimize loops ahead of time, because such systems run for a long time, so the code is going to jump from tier 0 to tier 1 anyway at some point. But benchmark code does not behave like that. Oftentimes it's just a loop that does a bunch of calls, and it will not reach tier 1. That has detrimental results for benchmarks, and people will assume that .NET is just slow, which isn't true, by the way, and it's going to pose problems. That's why the decision was made to optimize loops out of the box.

Now that we know how to sort of replace SharpLab in an offline world, we need to know how to deal with static types, because SharpLab doesn't handle certain types correctly all the time, but we can here. You might imagine that there's a problem if you have a static type, because it will not be on the heap. So let's change this code around a bit: let's add static to the class and static to the methods. And since we already know what QuickJitted code looks like, we can force tier 1 optimized compilation out of the box. That's not really recommended everywhere, because certain optimizations are applied when going from tier 0 to tier 1, and if we start in tier 1 right away, that's going to be a problem. But let's force the compilation anyway, because we want to check what the assembly code looks like now. So let's use aggressive optimization here: MethodImpl with AggressiveOptimization will force tier 1 optimized compilation. Let's add it to the methods, and let's change the calling code a bit, because now it's a static type. Let's compile this. It compiled fine, so let's restart.

As you might imagine, if we do dumpheap -type Win this time around, we don't have any objects. So you might ask yourself how we're going to get to the method table now. What we can do is poke at the module: if we know the module, we can find all of the types that are loaded with that module. But how are we going to do that, especially if the module has a really complicated name? We could just dump the modules, but there's a command called name2ee (name to execution engine), and that command is really robust and has a lot of features. It has a query syntax where you pass a module name and a type name, but if you don't know the module you can pass a star, and you can pass the full type name or, again, a wildcard. So let's pass something like "W", because we know that our type is called WinDbgTest. We found a bunch of modules, and one of them is ours; unfortunately all of them contain something that starts with W, but it is what it is.

Now let's copy this module address and do the following: let's dump the module, but include the method tables in it. As you can see, we have a bunch of metadata here, but what we care about is this method table here. Let's grab this method table pointer for our class, and let's again dump the method table with the method descriptions. We're at the same spot as before, but now we have static types, which are, by the way, optimized as well. Let's click here: you can see it's marked Optimized Tier 1 this time around. That means that if we go to the code address, as we discussed before, there is going to be a very optimized version of the sum. It does not push anything to the stack; it just does a load effective address that adds a and b and returns the result, and that's pretty much it. Let's see if the second example is the same. It is the same, so that's what we expected. Now let's go to the loop version. As you can see, it went from Optimized to Optimized Tier 1, but if I recall correctly that doesn't really change the logic of the code. The code is still optimized, and it didn't get any more optimized, because first of all in this example we probably cannot do much more, so it's probably the same thing and the same optimizations will run.

So that's one of the ways you can do it, but there are other ways. They may be very specific and less obvious, and they have much less utility, but they're useful in certain very narrow cases. For example, we can grab the metadata token and dump the method description based on that token as well. That token can be thought of as a unique ID, but it's exclusive to the module, so two modules can have the same token. Keep that in mind when you use these commands. So let's do token2ee (token to execution engine). If we pass a star and our token, what you're going to see is that a bunch of modules share the same token, but here's ours: this is our Sum2 method, and we have the code address here, so we can go there.

Another thing you can do, which is not really as useful as you might think, is grab this address, for example, and call ip2md (instruction pointer to method description). It will again give you the JIT code address at the start, but the problem is that you need to know a pointer to an instruction inside the method, which is complicated as it is. So it has its uses, but they're very narrow.

So that's basically what you can do. If you liked the video, leave a like, possibly subscribe, and leave a comment if you feel that there's something I missed here. I'm going to be doing more videos on WinDbg, because this was just decompilation and there's much more to talk about in the context of C# and WinDbg. See you next time. Bye.
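
The source code is never read out in the captions, so the following is only a minimal sketch of what the first demo class might look like. The class name WinDbgTest, the method names Sum, Sum2 and Sum3, the two-element array, and the Program entry point are all assumptions reconstructed from the spoken description. The trailing comments list the SOS commands that appear to correspond to the heap-based walkthrough; the angle-bracket placeholders stand for addresses copied from the previous command's output.

```csharp
using System;

// Hypothetical reconstruction of the demo class; names and bodies are assumptions.
public class WinDbgTest
{
    // Plain block-bodied sum.
    public int Sum(int a, int b)
    {
        return a + b;
    }

    // The same sum written as an expression-bodied member.
    public int Sum2(int a, int b) => a + b;

    // A very short loop over two elements (the element values are assumed).
    public int Sum3()
    {
        int[] values = { 1, 2 };
        int total = 0;
        for (int i = 0; i < values.Length; i++)
        {
            total += values[i];
        }
        return total;
    }
}

public static class Program
{
    public static void Main()
    {
        var test = new WinDbgTest();

        // Each method is called at least once so that the JIT compiles it.
        Console.WriteLine(test.Sum(1, 2));
        Console.WriteLine(test.Sum2(1, 2));
        Console.WriteLine(test.Sum3());
    }
}

// SOS commands that roughly match the steps in the video:
//   !dumpheap -type Win          find the object and its MethodTable
//   !dumpmt -md <MethodTable>    list the MethodDescs declared on the type
//   !dumpmd <MethodDesc>         show the JIT tier and the code address
//   !dumpil <MethodDesc>         show the IL of the method
//   !u <CodeAddr>                disassemble the jitted code
```

Calling each method once in Main is the part that matters here: a method that is never called has no jitted code to inspect.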
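
The static variant from the second half of the video can be sketched the same way. As before, the identifiers and method bodies are assumptions; only the use of a static class, static methods, and aggressive optimization comes from the captions. MethodImplOptions.AggressiveOptimization is available from .NET Core 3.0 onward. The trailing comments list the SOS commands that appear to match the module-based lookup.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical static version of the demo class; names and bodies are assumptions.
public static class WinDbgTest
{
    // AggressiveOptimization asks the JIT for fully optimized code on first compilation.
    [MethodImpl(MethodImplOptions.AggressiveOptimization)]
    public static int Sum(int a, int b)
    {
        return a + b;
    }

    [MethodImpl(MethodImplOptions.AggressiveOptimization)]
    public static int Sum2(int a, int b) => a + b;

    [MethodImpl(MethodImplOptions.AggressiveOptimization)]
    public static int Sum3()
    {
        int[] values = { 1, 2 };
        int total = 0;
        for (int i = 0; i < values.Length; i++)
        {
            total += values[i];
        }
        return total;
    }
}

public static class Program
{
    public static void Main()
    {
        // No instance exists, so nothing of this type will show up in !dumpheap.
        Console.WriteLine(WinDbgTest.Sum(1, 2));
        Console.WriteLine(WinDbgTest.Sum2(1, 2));
        Console.WriteLine(WinDbgTest.Sum3());
    }
}

// SOS commands that roughly match the steps in the video:
//   !name2ee * WinDbgTest          search all modules for the type
//   !dumpmodule -mt <Module>       list the MethodTables in the module
//   !dumpmt -md <MethodTable>      list the MethodDescs declared on the type
//   !dumpmd <MethodDesc>           show the JIT tier and the code address
//   !u <CodeAddr>                  disassemble the jitted code
//   !token2ee * <token>            look up a metadata token in all modules
//   !ip2md <address in method>     map an instruction address to its MethodDesc
```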
Info
Channel: LevelUp
Views: 1,043
Keywords: C#, csharp, JIT, winDBG, decompilation, C# decompilation, tutorial, debugger, Just in Time Compiler, optimization, performance
Id: BaFquQ9YZYU
Length: 13min 55sec (835 seconds)
Published: Fri Oct 09 2020