*(char*)0 = 0; - What Does the C++ Programmer Intend With This Code? - JF Bastien - C++ on Sea 2023

So, my name is JF, and I'm going to talk to you about *(char*)0 = 0; and first we're going to get rid of the lawn chairs and whatever, put in a nicer font, and I'm going to talk a bit about what this talk is about. I got a lot of requests during the week to spoil what the talk was about, and I refused to do so.

So my name is JF. I work at a company called Woven by Toyota, where I'm Chief Architect. I'm also chair of the WG21 C++ Evolution Working Group, so the evolution of the C++ programming language. Today I want to give you a talk about an interview question I've asked more than 100 times to different interviewees over time. It's a question that's fairly benign and seems simple, and what's interesting is that it really doesn't have a right answer. It's a very open-ended question, and I've found it, for systems engineers and compiler engineers primarily, to be fairly revealing of where the boundary of their knowledge is. That's useful: I'm trying to hire a candidate at a particular seniority or a particular band or whatever, so where do they end up in that series of knowledge? I end up learning stuff from asking that question as well, so it's always interesting. It's a good way to get someone talking about something they should know fairly well if they're in the compiler space or the systems space. You can either use it as an icebreaker and talk about it for five minutes, or use it for a whole interview slot, a whole hour or something like that.

So what I'll do today is pretend to be answering the question for all of you. I made the content fit inside of an hour, so there could be more content than this. If you're watching on YouTube and you're really upset that I didn't mention a thing, just put it in the comments or something. Like and subscribe, it's always good. And so, here's the question.
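The slide itself is not captured in the captions. Reconstructed from the spoken description, the whiteboard program is a main function whose body is *(char*)0 = 0; followed by return 0;. In the sketch below the function is renamed whiteboard_question (a hypothetical name, not from the talk) so the snippet can be compiled and inspected without being executed. Calling it is undefined behavior, and some compilers warn about the null dereference.

```cpp
// Sketch of the whiteboard program. On the whiteboard the function is
// `main`; the name `whiteboard_question` is an assumption made here so
// that the code can be compiled without ever being run.
int whiteboard_question() {
    *(char*)0 = 0;  // store the value 0 through a null char pointer (undefined behavior)
    return 0;       // in a real `main`, this return could be omitted
}
```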
I'll write this on the whiteboard. This isn't my handwriting, this is an actual programming font, but usually I'll go on the whiteboard and write this down, and I'll just say: int main, parens, curly brace, *(char*)0 = 0 semicolon, return 0 semicolon, curly brace. Talk to me about this code. This is the whole question. Now, I give them a bit of introduction. I'm like: there's no right or wrong answer, I want to have a discussion about what the code is, you can take it in any direction. I've asked this question a lot. I don't put people on the spot just for fun. Not just for fun. So I put this on the whiteboard and then I ask the person to just start somewhere, wherever you want, and this can go in many, many directions. There's one, two, three, four lines of code, plus a newline at the end, so there's like five lines of code. It can go in a lot of directions with all those five lines that we have.

So I'll start, and I'll be like: let's just think about what this is. What does that actually mean? What are you trying to do? Well, today I'll start by focusing not on the big text but on the other stuff. I'm on the standards committee, I know about these things, let's talk about it. So I'll start and I'll say: well, this is main, and, knowledge sharing, it can't be inline, can't be static, it can't be constexpr, can't be consteval, and it has to have C++ linkage. And then I'll say: it can't be deleted, that's also something the standard says, and it can't be a coroutine. Also, did you know that main can't be [unclear]? That's fun. And then you'll say: okay, there's main, main is the single main function, it's the main entry point, and calling main in C++ is disallowed. You can't call main from your program. You can in C, though, that's a fun fact: you can call main from C. And then you have a curly brace there
which is wonderful, and you say: this is the main entry point, and the global constructors, so your globals outside of main, happen before this. And then you say the signature of main could be different: not just empty parens, it could be something like int argc and argv (I forgot a star there, so sorry for the typo), but it could also be something implementation-defined, whatever else you have as signatures. On Apple platforms it also has envp and a char** apple in there. Or if you're in a freestanding implementation it could just be whatever: you don't have to have a main in a freestanding implementation, which is also fun. So there's interesting stuff to be had here.

And then you say: oh, this is the normal program termination, which calls exit with, in this case, zero, because I pass zero here, but it'll pass whatever this value here is to exit. And this return 0 is implicit, so you can omit it from main and that would be totally fine as well, it would return zero for you. After that you have a curly brace. Now, the curly brace just calls the global destructors after it. It's the best garbage collector ever, it's really wonderful.

So that's the code, right? I've finished the interview, it's gone five minutes, good, do I get the job? Well, we can talk about other stuff, but I'm going to guess you were asking about not the stuff around it. It's interesting to share, but I'm going to guess you're asking about other stuff there than just the main. What other parts of the program should we talk about? Maybe the stuff I've grayed out? Yeah, let's talk about that. So let's focus on this part here.

First things first: it's terrible code. I know Matt Godbolt here likes it, he's told me many times this week that he likes it, but most people have told me: why would you write a talk with this title? It's terrible, it's cringe-worthy. Okay, well, let's break it down. This is an assignment, so you're assigning the
value zero. That's nice. You're assigning it where? You're going to take this, which happens to be a character pointer, and then what are you going to do? Hmm. Well, you're going to have my clicker that doesn't work, which is annoying. There we go: an address, cast from integer 0, that you dereference. That's pretty straightforward. And you have the statement end. So what does that actually mean, this whole thing? Well, it's a null pointer dereference, that's just what it is. I think we agree on that. Any objections? No. So then it's undefined behavior, I think we agree on that. And then the next question is this one: why would you do that?

It's a good question. The whole question I asked was "talk to me about this code", and it's fine to ask a question back at me, like "why would you do that?" Then maybe you should think about why you would do that. So maybe the next question I'll have is: how does this code make you feel? The question isn't necessarily about the code. I just said talk to me about the code, so how does it make you feel? Well, for the people who asked me why I would give a talk with this title, some of you showed up. So I think maybe you were in denial at the beginning. When you asked me why I would give a talk with this title and I wouldn't answer, you were angry. Then you said, "come on, please tell me". Then stuff happened, you went to other talks, you saw Bryce's talk earlier, you've been depressed, honestly. But then you accepted the *(char*)0 = 0; and you came to the talk anyway. Certainly some of you did.

But the real question that I think you should ask yourself is: what did the author expect when they wrote this? There's a variety of reasons why you would write this, besides just trying
to have fun with the interviewee. You could want a null pointer dereference: you legitimately don't know that this is undefined behavior, and you wanted a null pointer dereference. You might want a valid write, and I'll get into that later. This might be your objective: you want to write to address zero. That's a completely valid reason to want to do this. It's invalid for the C++ spec, but you wanted to do this and you wrote it with valid intent, and I want to honor that. You might have wanted a trap. Matt Godbolt told me that was what you did on Sega. You used to do that? Yeah, okay, so you used to use that as your breakpoint. He wanted that, and that's fine. You might want to have this optimized to no code, and in fact if you feed this to Clang with optimization set at O2, it removes the code, it just gets rid of it, whereas GCC in some settings puts a trap instead. So maybe that's what you wanted. Maybe you wanted to write this and get questions, to see if the code reviewer was actually paying attention. I don't know, maybe you just wanted to have fun. You're all here seemingly having fun, so maybe that's what you wanted to do. There are plenty of other reasons you might have had to write this. So when I say "talk to me about this code", it's really: you talk to me about it. What's the purpose? What do you think?

Do you really want to have fun? Well, this is a C++ conference. It's called C++ on Sea, and the "sea" part is S-E-A, not the letter C. This is more C code, so maybe we should make this into C++ code. Ah, nice, nice. Okay, does anyone want to say something about this? This is much better, obviously. The previous code compiled. Does this compile? No. No, it doesn't compile, I was just kidding with you. What's the diagnostic? Yeah, that's not allowed. The __1 in there, it's just
funny, and not all versions of Clang do this, but the libc++ namespacing has it in __1. I just thought it was funny when it gave it to me, so I kept it there. But it's just not allowed to cast nullptr to char*.

All right, cool. What do we do about this? Because we're trying to have fun, I think we're in the last section here. What are we going to do? Well, I could do this instead. We're going to go with C++20, so I'm going to bit_cast a nullptr to char*. If you're not familiar with bit_cast, it's a function that's kind of as-if-by memcpy, so you're going to memcpy nullptr into a char*. That's what the spec of bit_cast does, effectively. It works at constexpr time, it's wonderful. It doesn't slice bread, but it does a lot of stuff for you. Much better code. Any objections to it? No? It's wonderful. Cool.

Okay, I lied again, it doesn't compile. Why doesn't it compile? "Indirection of non-volatile null pointer will be deleted, not trapped", and that makes you sad. The compiler's pretty smart. It's a warning, so it does compile, but then it says "consider using __builtin_trap() or qualifying pointer with 'volatile'". What is this? Ah, ignore this. Now, it is just a diagnostic, and you could also tell it to go away with -fno-delete-null-pointer-checks, so you tell it: no, don't delete null pointer checks, and it will stay there.

But I want to be honest with you: I lied. This doesn't work, because what I actually did to get this diagnostic is __builtin_bit_cast with char* and nullptr. I was laughing at the underscore-underscore stuff earlier and now I'm using it myself. The way bit_cast is implemented is as if by memcpy, but it has to be constexpr, and the way you make it constexpr is with a built-in. So the way bit_cast is
implemented is with a built-in. I have an implementation on GitHub from the bit_cast standardization times that does all the stuff except constexpr. You can kind of do it without constexpr, but you can't do the constexpr without the built-in. So bit_cast in the standard library is implemented with a built-in, in GCC and Clang at a minimum. And what's interesting is: is this a function? What is it? Because it's taking char* and nullptr. The char* is the return type, the thing that went in angle brackets, and the nullptr is the parameter. What's great when you're a compiler author is you can just make stuff up, and when you create built-ins you can pass types as parameters to functions. Did you know? It's really fun, you can do all this stuff. And when you do this, it gives you the diagnostic that I showed you earlier.

Now, I just said that bit_cast in the standard library is implemented with __builtin_bit_cast, so let's look at what that looks like, because I have questions. This is what it looks like in the standard library. There's stuff over it, but it's pretty straightforward. It actually uses concepts, which is beautiful: it has a requires, the types have to be the same size and trivially copyable on both sides, then constexpr bit_cast, To, From, noexcept, and it returns the thing. Fairly straightforward. It's beautiful, let's put a frame around it.

Now, I said that when you call the bit cast built-in directly you get the diagnostic that says you're going to do a null pointer dereference, but when it goes through the standard library you don't get that diagnostic. I've still passed nullptr to it and I don't get the diagnostic. Why is that? Anyone? It's a compiler bug. I haven't filed the bug, because as I was writing the talk I ran into it, and I think I know where the problem is. So whatever Clang developer gets the bug, I'm sorry. But the reason, I think, is the way the diagnostics work in
Clang: when you emit the diagnostics there are like two phases, one for the templates and one for the non-template stuff, and a diagnostic opts in to being relevant for the first phase, the templates, or not. I think this one doesn't, and so you don't get the diagnostic. That's my guess at what the problem is. I didn't look at the sources, that's just my guess. It's just changing a false to true somewhere, adding a parameter somewhere, and then it's fixed. So if anyone wants to file the bug, thank you, it'll be great. I was just too lazy to do it. But that's how stuff works under the hood.

So let's go back to our call of __builtin_bit_cast, which we found really beautiful, and it was saying this. It's still just a warning, but I want to get rid of it. How do I get rid of it? Any ideas? Turn it off? Well, without turning off the warnings. Okay, you'll love this. You ready? It's going to be amazing. Three, two, one... ah, it's awesome. So what did I do? I changed the nullptr to a zero, which obviously gives me the same diagnostic (I skipped that step), but then I made it minus zero. Now, you'll remember that in C++ signed integers are two's complement, so negative zero is just not a thing. But that removes the diagnostic.

One thing I want to point out is that I was using nullptr earlier, and technically nullptr doesn't contain a value, because it's a monostate type. So memcpying from nullptr, in my mind, just shouldn't really be copying anything. It's a placeholder for no value being there for a pointer, it's not actually a value itself, whereas 0 is totally a value, and so is minus zero. Now, why don't I get the diagnostic for minus zero when I get it for zero? I guess it's a compiler bug. Thank you, thank you. I haven't filed that one either. And I think, and this is a guess for the great Clang person who's getting the second bug, and I'm
again, I'm sorry, I'm too lazy to file the bug, but someone can do it, that'd be great. I'm guessing that the way the diagnostic is implemented is that it looks for well-known values of zero in the abstract syntax tree, and if you put something else in the abstract syntax tree, like a minus or a plus or whatever else, it just gives up. The compiler usually won't do propagation of values: abstract interpretation or whatever else is for the optimizer to do later, a lot of that value propagation and everything. But it does do constexpr evaluation in the front end. Clang refuses to do diagnostics that would require the optimizer, but constexpr happens in the front end, and that's fine. The problem is that it probably doesn't do constexpr evaluation of the parameters at that point, because it just doesn't want to. It would take time to do it. So it just looks for very obvious values of zero and says: well, this is minus zero, it's not zero, so I'm not going to go into that AST node, and I'm not going to diagnose it. It's the smallest of compiler bugs, it probably doesn't matter.

But the problem now is I get this. Who saw that bug? That's another bug. So what's the problem here? It's not a compiler bug: int is four bytes, and char* is in this case eight bytes, on every modern system (I'm sorry for anyone who doesn't have a modern system). So char* and int aren't compatible. I can't bit_cast one to the other because they're not the same size. If you remember, that's part of bit_cast being as if by memcpy: you've got to memcpy to and from the same sizes. So how do we fix it? Well, first let's go back to just using bit_cast. Let's not use the built-ins; we want clean code, we were talking about that earlier. The problem now is I get the diagnostic, but it's not as nice as when I use the built-in.
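The "as if by memcpy" description and the same-size requirement can be sketched as follows. This is a minimal illustration, not the standard library's implementation: bit_cast_sketch, null_via_bits and check_pointer_round_trip are hypothetical names, the helper is not constexpr (that is the part that needs the compiler built-in), and it assumes the destination type is default-constructible. Using std::uintptr_t as a pointer-sized integer is one possible way to satisfy the size check, not necessarily what the slide shows, and treating an all-zero bit pattern as a null pointer is an assumption that holds on mainstream platforms but is not guaranteed by the standard.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical helper: copies the object representation of `from` into a To.
template <class To, class From>
To bit_cast_sketch(const From& from) noexcept {
    static_assert(sizeof(To) == sizeof(From), "To and From must have the same size");
    static_assert(std::is_trivially_copyable<To>::value, "To must be trivially copyable");
    static_assert(std::is_trivially_copyable<From>::value, "From must be trivially copyable");
    To to;  // assumption: To is default-constructible (std::bit_cast does not require this)
    std::memcpy(&to, &from, sizeof(To));
    return to;
}

// bit_cast_sketch<char*>(0) is rejected wherever sizeof(int) != sizeof(char*).
// A pointer-sized integer passes the size check instead.
inline char* null_via_bits() {
    static_assert(sizeof(std::uintptr_t) == sizeof(char*),
                  "assumed: uintptr_t is exactly pointer-sized on this target");
    return bit_cast_sketch<char*>(std::uintptr_t{0});
}

// Round trip through the integer representation of a live object's address.
inline void check_pointer_round_trip() {
    int object = 0;
    const auto bits = bit_cast_sketch<std::uintptr_t>(&object);
    assert(bit_cast_sketch<int*>(bits) == &object);
}
```

The resulting pointer is only compared here, never dereferenced, so the sketch stays clear of the null dereference the talk is about.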
Remember, the implementation of bit_cast is with concepts, and I get this spew of stuff, and I don't have machine learning, like Bryce did in his keynote, to help fix that. So I've got to read through it and say: oh, there are these things there, and that matches this thing, and that's why it's not working. There's a bit of stuff to go through, but it's a decent diagnostic. It does tell me char* and int here, and it does say "because sizeof(char*) == sizeof(int) evaluated to false". That's a pretty good diagnostic, as things come. It's not as legible as the one-liner I got earlier, but that's the beauty of built-ins: you can just have code that gives you the diagnostic. So when you see a diagnostic in Clang or GCC that's just terrible, completely illegible, but it's generated for a built-in, you know that someone was lazy in implementing the built-in. Whereas you know that whoever implemented the built-in for bit_cast was not lazy and gave you a nice diagnostic. They went through the effort of doing that. That's nice. Give a coin to your compiler developer.

Okay, so what are we going to do next? Well, to fix it I'm going to do this. Ah, I have my smart pants on. That fixes the problem beautifully, doesn't it? Does this compile? Mumbling. It's after lunch, I don't really know. It totally compiles. So we're going to run a.out, which is the name of the executable we got out of this, and it says a.out terminated by signal SIGSEGV, address boundary error, and we party. This is great. Is that what we wanted? I mean, I don't know what I wanted out of it, but we got it. So it's pretty good, it's pretty nice.

What's this SIGSEGV? We're having fun here with an interview question as an anchor to have a discussion between us. Okay, we're done having fun with this thing, so what's that thing? What does it do? Well, let's talk about the code. Like, now
that we're running the code, let's not just talk about the code but about the effects of the code. What is that thing? Uh, you know, there's the x86 segments, something something, mumble, whatever. So we could end the interview here, and I've found the boundary of my own knowledge, which is: there are these segments in x86-32, thingy, bounds stuff, maybe something. I don't know.

Okay, well, let's go back. We had a suggestion earlier, the suggestion from the compiler, which we have to honor because we're nice people: please consider using volatile instead. Any feelings about this? Denial? Anger? No? Well, with volatile here, my intent is just "hands off, compiler". And I'm pretty well grounded in standardese to say that, generally, volatile says "I'm going outside of the C++ abstract machine". That is a legit use of volatile, and you can believe me, I'm an expert in volatile. So this does that, and the compiler told me to as well. That's pretty cool. Okay, we're going to run this. It compiles, it runs: a.out terminated by signal SIGSEGV, address boundary error. SIGSEGV, good. Nice. Are we satisfied with this? How are we feeling? Matt's satisfied, nice.

Well, does it always do that? I don't know. Yeah? Maybe? No? I see some heads nodding. Okay, this is my slide... ah, actually no. Good job, you got it. So it depends on your perspective. We talked about different perspectives without me calling it out: language, compiler, people, instructions. Other perspectives: hardware, operating system, hypervisor, something like that. There are a lot of perspectives to be had in terms of what this ought to do. The language might say something, but the compiler might think something else. People obviously have feelings about it. I'll say people's feelings don't really matter in terms of
what the code does, but whatever. The operating system might think something else, and so might the instructions and the hardware. So there could be different meanings, and really we've only focused on the language perspective, and we've sprinkled some people perspective here and there.

But volatile is just "hands off my code". So let's do some pseudo-assembly of what that actually does, because I like pseudo-assembly. This zero here: you store the value zero into register zero, you store the value zero into register one, and then you store the value of r0 to the address in r1. This is just "do what I mean". It's my intent with using volatile, and that's a valid intent, I think. You just want the compiler to do whatever you mean, and the hardware to do that, and so on. And this pseudo-assembly is what I think ends up in most programs.

Really, though, if your intent is "do what I mean", does it actually do what you mean? Well, not always. If your intent is to get a trap, it won't always give you a trap. Now why is that? Actually, let's step back a bit. What does volatile mean, besides just "hands off, compiler"? Any clues? Talk to hardware, yeah. External behavior, make no assumptions, yeah. So I have a list: signal handling is a good use for volatile, untrusted shared memory, infinite loops, setjmp/longjmp, avoiding speculation, external modification, do what I mean, control dependencies also. I wrote a whole thing about this, you can go read up. Volatile has legit uses, and this is kind of one of them: external stuff is happening.

But really, when you're a new programmer and you get given volatile, the feeling you get is this, I think. Well, new programmers won't know what I'm talking about. I'm sorry. I'm really sorry. But I'm not sorry, actually. The older programmers will know what I mean by this. You get given this
volatile sword at the start of your experience as a developer, and someone tells you "it's dangerous to go alone, take this", and then they tell you nothing else, and you've got to figure out what you have to do. That's the game you're playing when you use volatile.

So now that we've talked about Zelda (this is a Zelda reference, for the kids here; Zelda 1), we're going to talk about Zelda 2. What happens in Zelda 2 is you get told this instead. Everyone in the room knows what this is, right? People thought it was a hilarious meme when they discovered this in Zelda 2, because the translation in English was "I am Error", and people assumed that was a weird translation from Japanese to English, when in fact this is just Error. So basically, when you're handed volatile, you're just handed a tool to make errors, as Zelda has shown us. This is the history we've learned from video games, or at least that I've learned from video games. And that's my view on what it is to hand out volatile char* to people.

Now think about this, though. If you're familiar with C++, there's this concept of objects living at places. When we're casting zero to volatile char*, what's even there? Is there a thing at zero? C++ is all about objects, their lifetime, where they live, and the effects on them, and my feeling when I see this type of code is: it's not a place of honor, there's no object there. You all agree? No? You know I try to trick you by now. You know there might be something there. In the C++ abstract machine there totally isn't, but we're doing volatile, so there could be something in there. So what could there be? Well, there could be a memory-mapped register, or there could be an interrupt vector table. This is not theoretical. A memory-mapped register at a low address would be like a VGA graphics card or something like that.
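A volatile store of the kind being discussed can be sketched without touching address zero. In the sketch below an ordinary local variable stands in for a device register; volatile_store8 and write_black_twice are hypothetical names, and the values are arbitrary. On a real bare-metal target the pointer would instead come from a hardware-specific address, which is not shown here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: one byte-sized store through a volatile pointer.
// Because the access is volatile, the compiler has to treat it as an
// observable side effect rather than optimizing it away.
inline void volatile_store8(volatile std::uint8_t* location, std::uint8_t value) {
    *location = value;
}

// A local variable stands in for a memory-mapped register (assumption for
// this sketch; no real device is involved).
inline std::uint8_t write_black_twice() {
    volatile std::uint8_t fake_register = 0xFF;
    volatile_store8(&fake_register, 0x00);  // first store, e.g. "black"
    volatile_store8(&fake_register, 0x00);  // a repeated identical store is still a separate volatile access
    const std::uint8_t observed = fake_register;  // volatile read of the current value
    assert(observed == 0x00);
    return observed;
}
```

What the store means once it leaves the program (a device register, an interrupt vector entry, a trap) is decided by the hardware and operating system, not by the language.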
I'm writing black into the VGA graphics card at a certain location. Or the interrupt vector table: in x86 real mode there's a 256-entry table at address zero, and, fun fact, the interrupt pointer at that address is the divide-by-zero handler. Beautiful. So if you're in x86 real mode and you write at this address, you're dealing with the divide-by-zero handler. So is there something at that location? Well, it depends. It could be something related to graphics, it could be related to other stuff. Memory-mapped registers are great.

What else could there be? Well, there's a thing called WebAssembly on the internet. In WebAssembly, when you write to zero, there totally could be something. There's no memory protection in WebAssembly, so between zero and the user stack there's nothing, and so that write just works. Or there could be memory protection, in which case you get a trap. That's great: there's nothing there, but it traps. So even without an object there's something going on. Cool. What else is there here? Well, you know, we're really kidding, it's fine, we like it.

But I joke, and it's memory protection. "Mommy, daddy, what's memory protection?" So we're going to have to talk about memory protection now. Protection is important, and basically it's about virtual versus physical memory addresses. Physical memory, I'll just say it's DRAM. I'm hand-waving here; if you're not happy, comment on YouTube, I don't care. Virtual memory is the process's view of memory. The process has a view of what memory is, and that's virtual memory, and then that maps somehow onto physical memory, which is kind of DRAM. Each process has a different view of its address space. Now, I'm generalizing, it's not the case in everything. In DOS you don't have that concept, everyone shares an address space, and on old Macs as well. But we'll just assume that you have
processes with their own virtual address space, and they map onto a shared DRAM. So the process's view of memory is kind of a range. No, it's not. What is it? It's a span. Is it a span? Well, I don't know. Maybe my knowledge breaks down here. Maybe I'm going through this as a way to see, like, oh hey, I understand memory and stuff, but it just kind of breaks down here. Just slap it down, it's gone. So okay, what is it? It's a bit frustrating. What could it be? I'm going to say something and you tell me what you think. I think it's kind of like an unordered map. Is that right? For those in the back: what, it's too small? Ah, well, I'm going to have to say it out loud or write it bigger. Let's write it bigger, and I'm not going to read it. It's a thing with no honor for me to do that to you. But I think it kind of is like that. Memory mapping from virtual to physical is kind of like that, which is funny. It's indirection, it's great.

So can you imagine that you're running your computer and there's an unordered map under the hood just doing stuff? That's great. What are the ABI implications of that unordered map? Who would put an ABI on this and have to make it stable? Do you think it would be a good idea for your computer to have a strict ABI for an unordered map to run everything related to memory? Who thinks the computer does that? Two people. All right, I saw a third hand, three people out of a lot. A fourth. They're kind of scared to raise their hand, because we like to talk about ABIs with C++, and we know that the std::unordered_map ABI is kind of a problem: it makes unordered_map kind of slow, because you can't change the hash table for a bunch of reasons, and so it has kind of an old and slow one. Or sorry, the hash algorithm. But let's look at this a bit more. We'll write this code.
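The slide code is not part of the captions. A minimal sketch of the unordered-map view of address translation, under stated assumptions, might look like the following. Every name here (PageTableEntry, AddressSpace, PageTables, virt_to_phys, demo_virt_to_phys) is hypothetical, the 4096-byte page size and the single writable flag are simplifications chosen for illustration, and real hardware does not work this way literally.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <unordered_map>

// Hypothetical page table entry: where the page lives physically, plus one
// example of metadata. Real entries carry more bits than this.
struct PageTableEntry {
    std::uint64_t physical_page = 0;  // physical page number
    bool writable = false;            // example metadata bit
};

constexpr std::uint64_t kPageSize = 4096;  // assumed page size for this sketch

// virtual page number -> entry, one such map per process
using AddressSpace = std::unordered_map<std::uint64_t, PageTableEntry>;
// process id -> that process's address space
using PageTables = std::unordered_map<int, AddressSpace>;

// Look up (process id, virtual address). An empty optional means "no entry".
inline std::optional<PageTableEntry> virt_to_phys(const PageTables& tables, int pid,
                                                  std::uint64_t virtual_address) {
    const auto process = tables.find(pid);
    if (process == tables.end()) return std::nullopt;
    const auto entry = process->second.find(virtual_address / kPageSize);
    if (entry == process->second.end()) return std::nullopt;
    return entry->second;
}

// Usage: process 1 maps virtual page 0x10 and nothing else.
inline void demo_virt_to_phys() {
    PageTables tables;
    tables[1][0x10] = PageTableEntry{0x2a, true};
    assert(!virt_to_phys(tables, 1, 0).has_value());  // address zero: no entry
    assert(virt_to_phys(tables, 1, 0x10 * kPageSize + 5)->physical_page == 0x2a);
}
```

In this sketch a store to address zero would find no entry for the page containing it, which is the situation where a protected system would refuse the access.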
Virtual to physical kind of takes your process ID and an address: you're trying to map the address from virtual to physical. It'll return what I call the PTE, the page table entry, which will contain something about the physical address and other metadata, and I use optional to represent no entry. So if there's no entry at the address you've given for that process, it will return an empty optional. Otherwise it will return a page table entry that tells you the address in physical memory, and a bunch of metadata about it. That's sensible, that's what our unordered map does. So virt-to-phys inside itself just has an unordered map, and it does a lookup. Sensible. That's how memory works, most of the time.

Okay, cool, so let's draw it. I have this virtual memory, and in this case I decided to just draw 32 bits because I'm lazy, so you go from zero up to 7ffff for virtual memory. On the other side you have physical memory, and I don't know how much RAM you have on your machines, but I don't have that much on my 32-bit machine, so it goes up to there. So you have a bunch of virtual memory for every process; every process has one of these and maps onto that. That's what the unordered map does, that's what virt-to-phys does as a lookup. Each page in my example is four kilobytes, or sorry, kibibytes. It could have been other sizes; I'm not going to talk about huge pages and other stuff. You're going to do a lookup of this address here, for this process here, at this location here, and it's going to return a mapping to this physical address. So when I dereference zero, it goes up to the top, tells me where zero in virtual memory lives, and maps it onto physical memory. Wonderful, it makes sense. Or it might map to nowhere and return a std::optional that's empty. It might return something about memory protection. Remember, we talked about
protection. Protection is important, and I didn't explain what it is or why. And in general, physical memory usually belongs to a single process, unless you love mmap and you share entries between different processes, right? That's another fun thing that we can talk about. I won't say anything about it further. Okay, cool, that's neat, I like it. Okay, so now we have this virt_to_phys thing, right? And I'm going to hand-wave a lot more because, you know, we have limited time, and if I try to explain how everything works... every computer is beautiful and different, and they all work differently, and so I'm going to explain a subset of how they work and overly simplify it, because we don't have that much time. But so basically PTE, the thing I'm returning, kind of looks like that. And again, if you're on YouTube and you're really angry at me, I know, I put this thing called other bits, right? And I'm not going to talk about dirty, I'm not going to talk dirty at the talk, I'm not going to talk about modified, but if you're angry, in the comments just put stuff there, it's great. I'm not driving for engagement, by the way, I'm just going to delete your comments if they're angry. Okay, so what does this do? So I decided that in my page table entry, which again is super hand-wavy, this is what's there. Now you have the physical address, right, which is 38 bits. In there you have the page offset, which is not really in a page table entry, so let me have it, but it doesn't actually make sense. But there's this other metadata in there, right? Now, we talked about protection earlier. That's the protection. Now again, not every computer works like that and we're simplifying, but when you look up an address from virtual to physical, it also tells you if you have read, write, and/or execute access, right? Again, hand-wavy, it's not always like that that it works. But in the case of *(char*)0 = 0, I'm dereferencing and storing into that
location, so if I don't have write permissions to it, it's going to fail, right? That's what this metadata tells me. The computer is going to look and say, I know where you're trying to write and you're not allowed, right? Or it could return an empty optional and say, well, there's nothing there, so obviously you're not allowed to write. Or if I try to read from it and the read permission weren't set, same effect. Same thing if I tried to execute from it. Makes sense, yeah? Okay, cool. And then there's this other thing that I want to talk about, which is present, right? So this one is kind of like, if you had a page table entry, you go there, you go "finally, I found it", and you have this mushroom that's like, "thank you, page table, but our page is in another storage location", right? Again, sorry, video game references from the 80s. Sorry, not sorry. But you know, you can have fun with these page table entries and other stuff, and in this case some platforms do this, where they say, actually, I'm going to put a page table entry, but I paged it out to disk or something like that, or to the network or whatever, right? So it might say, yeah, I know about this thing, but it's not actually in memory. So, OS, before you go back, refill the thing into memory, and then do a page table walk again and tell the process, actually, your memory is here, right? So it's longer latency to refill the thing and whatever else. All right, makes sense? Cool. So what does it look like? So I want to implement this virt_to_phys thing, and I'm going to implement a hash table lookup. Now, are you ready for this? Because it's going to be beautiful. What does it look like, this unordered map lookup? You're going to love it, I guarantee you're going to love it. It's beautiful, it's amazing. Who likes magic numbers? Okay, so what's special about these four numbers here? What do they have? They're consecutive bits in the address that I'm trying to look up, and they're all nine bits, right? And so what I'm doing is I'm extracting nine
bits at a time, right? So if you remember, I said that my pages were four kibibytes, right? So the first line here, FFF, is getting four nibbles out, so that's 12 bits, right? And after that I'm segmenting the actual address into nine bits at a time to do a five-way lookup into my unordered map. This is a five-level lookup, unless you want to add a hypervisor, then we'll have another level of lookup. I'm not talking about that. But it's wonderful, right? At least on x86-64, AMD64 stuff, this is how it works for real. So this unordered map works this way, and what's wonderful is, remember the question I asked you, would you make this ABI inside of your hardware? It is ABI. Fun fact, the kernel has a contract to honor with the hardware to do exactly this, right? If you look at the manual for your hardware, some hardware has guarantees about what the layout is of that unordered map and what it looks like. Not all of them. Some of them support page table walks fully handled by the kernel, and the hardware doesn't do anything there, but some of them force you to have a certain layout. So what does it look like? I'm going to draw a picture, and you'll find that picture beautiful, I'm pretty sure. So I took all these offsets that I calculated to show you where they are, because remember, I took nine-bit offsets and then the four nibbles, and this is roughly what it looks like. Inside of every x86 machine you have a page map base register that's relevant to your process ID. This is called the register CR3, and it points here, at where the start of your page map is for that specific process, right? You have other page maps down and up from this. And then you take the offset, the nine little bits that you calculated, you know where the entry is, and it points there, and then points there, and it points there, and so every process shares this thing. Finally you find the base
of your page, and then you were trying to dereference that address, and that's the page offset, right, the four nibbles. And that's it, that's where your physical location is. Cool. So that's how our ABI works, we just have a page table walk. And so when I do *(char*)0 = 0, and if there's a mapping in there, I'll do a walk and I'll find a page at address zero that might be somewhere completely random inside of physical memory, not at zero necessarily, and I'll try to dereference it. I also get the metadata bits for read, write, execute, and I might be told this isn't readable, this isn't writable, or I might be told something else. Cool. Any questions? No? Cool, great. All right, so what else can we talk about? Well, wait, so I have a question now. Every memory instruction is a hash table lookup? How often do you do hash table lookups in your code? Not too often, because they're kind of expensive, right? And I just told you that every memory instruction that your computer does is a hash table lookup. That sound right? Sounds fast? That's how it works, it just is, right? But it's even funnier than that, it really is. It's not that funny actually, but stuff kind of works this way. I find it funny, but you know, I'll be nice. So what's interesting is there's basically three kinds of instructions, right? There's arithmetic, there's control, and there's memory. Arithmetic is like plus, minus, multiply, divide. Control is a jump or a test or whatever. And then memory is loads and stores. Loads and stores are about two-fifths of all the instructions, right? So I just told you that there's a hash table lookup for every two-thirds of your instructions. Does it sound right? Yeah, it sounds right. I see you not saying no, but it's right, it's correct-ish. And the other thing is, all those instructions are in memory, right? Thanks to von Neumann. It's a von Neumann architecture, compared to a Harvard architecture, so all the instructions themselves are in memory. So every time you execute
instructions you do a hash table lookup. Maybe the instruction is a load or a store, two-thirds, two-fifths of them are, in which case it's also a hash table lookup. All instructions are memory accesses in themselves. The page table itself is in memory, usually, and so when you walk the page table, those little steps I showed you in the drawing, that's also memory accesses. And the hash table lookup is multiple accesses, right? Because there are multiple levels, each for nine bits. And some instructions are multiple memory accesses as well. So x86 has, for example, rep movsb, also called memcpy, and there's ARM that has LDM or STM, called push/pop, and they can trap partway through the execution. You can do a memcpy of like a thousand things, and that's a thousand memory accesses, plus the memory access for the instruction itself, plus the memory accesses for the page table walk for every single thing. I just... what are we talking about again? Okay, so hash tables, they're everywhere, they're awesome. It's got to be slow. How does it actually not slow down? Or it sounds recursive, because the page table is in memory, and like, that doesn't work, right? Does the page table page-table so you can page-table while you page-table? How does it work? Oh, I heard a thing, I heard a thing. I'm not going to repeat it. Okay, cool, cool. Okay, so what's the solution? Ready? You ready for it? This is the moment, this is the moment you've always been waiting for. You ready? Okay, come with me, let's count from three. Okay, three, two, one. Oh, oh, that's great, it felt awesome. Caches, yeah. So caches make it so that all the stuff that I described works. And again, I'm hand-waving, if you're angry at me it's fine, it's fine, we only have a little bit of time. But there's a lot of caches to make this work, right? So thank your cache when you can. Yeah, so hardware has physical wires that perform some lookups, right? The caches are kind of all these cool little things that have wires, and it's kind of a fancy Rube Goldberg
device. Like, when you look at hardware, it's like these things, and you think it's digital but it's actually analog, and the signal kind of settles over time and it looks digital, and it propagates. It's really cool, really cool. There's a lot of them, yeah. Awesome, this is how everything works. Nice. Okay, so let's draw it, and again, artistic liberty here, I'm doing box drawings with ASCII on a terminal, so don't hate me if you don't like the drawings. So how do machines work? Well, they have registers, right? We know about registers by now. And we're going to try to show some of the stuff that they do, right? So you have these registers, 0 to X, which is called X, and then in the hardware fairly often you have these shadow registers, right, that allow you to have more speed. There's a register renamer that renames your registers as you execute. Did you know the registers have secret names? There's usually more shadow registers than real registers. It's great, it allows you to do a lot of speculation. And then there's this great thing called IP. It's special. It's actually the instruction pointer, PC, whatever you want to call it. It's special, it's really special, because it's an instruction pointer and it points to the I$ and the D$. Is it "dollar"? No, it's I-cache and D-cache, sorry, it's like the cool-person parlance. So I-cache and D-cache, dollar sign is "cache". Funny. If you see I-cache and D-cache, it's spelled that way pretty often. And IP points into the instruction cache to know what the instructions are to execute. That's a cache. We have two caches. That's the L1 cache, let's call it that. And again, oversimplification, sometimes between IP and I$ there's other caches as well, right? There's the decode buffer and other stuff. There's also like a huge checkerboard of instructions that are in flight, so that you can kind of have full parallelism in your instructions, instruction-level parallelism. So the shadow registers are used, and when a shadow
register is fulfilled it dispatches instructions and other stuff. You know, there's books to explain this, but I'm just going to ignore it. Around there there's more caches, there's more machinery. But then, you know, these caches are really small, right? So what do we do about caches? We have more caches, yeah. So level two cache, much bigger, much nicer. So another I$, another D$, I-cache, D-cache. So the instruction cache usually is smaller than the data cache, sometimes they're the same thing, right? But I'll just pretend that they're different, because usually they are nowadays. And we can put more cache, more, more. If you look at the die shot, there's caches everywhere, right? So they shove more caches in there. And then, you know, we're going to stop at three caches because I'm kind of running out of space, and so I'm going to represent memory here. So these caches go into memory, and memory is big, it's really big, right? The caches are like megabytes, memory is gigabytes, right? Great, awesome. And so where's this virt_to_phys thing that I had? Where does it live, right? So I said there's the page table walk and all the other stuff. Where do you realize that you've got to do a page table walk and all this stuff? Is it in the registers? In the instruction pointer? The L1 cache? L2 cache? L3 cache? Memory? No? Over there, when you get to the network, disk? TLB, oh, I heard TLB, translation unit. Okay, cool. Well, in my example world, L1 and L2 cache will be addressed virtually, right? So when you take a pointer, it's addressed virtually in my world. It's not always the case. And then the rest would be addressed physically, and so the translation has to happen here. And so you do a lookup with those magic, beautiful Rube Goldberg wires that I explained in the level one cache. If it fails, it looks in the level two cache. If it fails, it goes and it does a TLB translation, translation lookaside buffer, and if it hits, then you do a virt_to_phys, because you have a hit in the TLB, so you just do a
quick lookup. If it fails, then you do a page table lookup, the little drawings I had earlier, and that might cause a segfault to happen here, or it might return an actual address, right? So segfaults happen here. So again, *(char*)0 = 0, if it segfaults, it'll probably segfault here, right? Or it might segfault later. All right, so this is roughly correct, but more caches, right? Now there's other stuff to notice. Like, you know, here you see the executable bit. You might check it here in the L1 instruction cache and say, can I execute from here? Or you might check it when you fill the instruction cache, because, you know, the instruction cache's purpose is to execute, and so if you try to fill a thing that's not executable, you might just mark that entry as invalid and then not look at the executable bit. Here, when you try to read or write from the L1 data cache, you might say what happens, or you might have some read-through or write-through, whatever. So when you go from addressed virtually to addressed physically is when the process or the hypervisor matters, right? Because each process has to have its own view of memory, and if this side is addressed virtually, L1 and L2, then each process has to have different entries in the cache, right? Unless they share pages, and the TLB lets you do that, and whatever. So they have different views of the caches, different entries in there. Okay, cool. And again, this hardware imposes an ABI on the OS. Cool, it's awesome. Okay, so let's have some fun. Let's shrink this a bit. [unclear] I didn't put any other dollar signs in there, sorry. Okay, so let's make it more fun. So this is roughly how hardware works. Yeah, I'm really hand-waving here, but hardware doesn't actually just always execute like that. So what if I run this under GDB? That would be cool, right? What happens when I do that? Oh, we could talk about it, but GDB has to be able to debug things. If I have a reverse debugger, the reverse debugger has to
be able to reverse all these things, right? So it has to capture the architectural state, be able to unwind it and advance one instruction at a time, do a bunch of stuff. So it has to have a bunch of cool stuff that it does. But you can do more cool stuff. Like, you draw a box around this, and then, you all know Valgrind, right? So Valgrind has this cool tool that comes with it called Cachegrind, that allows you to simulate the caches. You could do that too, right? So you can run a whole fake hardware inside of Cachegrind to simulate how the caches work and see all that beautiful stuff executing. So you could, in Cachegrind, in GDB, run *(char*)0 = 0, and then, like, that's your weekend right there. I know I'm old-ish, but that's a beautiful weekend, if you see that in action. Ah, amazing. And then, you know, you can do other cool stuff, like you could just run this inside of QEMU if you wanted to. QEMU is great. You could run the whole thing inside a hypervisor, whatever. It's great, right? I like fun. You like fun? Yeah, you're still there, so I guess, yeah. Cool. Okay, so let's go back to our code here. Nice. I'm not a fast typist, I'm sorry. All right, so what does this code even do, right? Talk to me about the code, what does it do? Well, it really depends, I think, is what we've discussed so far, right? It depends on a lot of stuff. Is C a direct mapping to hardware? Not really, right? Because it can get optimized out. So people who are used to C get really angry at this code, because the code gets optimized out. C is not a direct mapping to hardware. But what do you believe it does, right? What does this code do? It makes you believe things, right? Those things might be wrong, but it makes you believe stuff. And what does the language spec say? And importantly, I know I chair the evolution of the C++ language, but does the language spec matter when reality is on the line? Not really either, right? So what does the language spec say, that's a good question to ask, but if
you disagree with it, maybe you go argue on r/cpp or something, but it might not be correct, so maybe we need to change the language spec in some cases. And what is even hardware? I've described a narrow subset of what hardware exists out there, but it's not what every hardware does, right? So do we really care? On some weirdo hardware it might do something different and completely exciting, right? But here I'm really talking about the programming side of what the stuff does. There's also another aspect to consider here, which is the human side of it, right? I asked you a question, what does it do, and in some of you it provoked revulsion, right? I know it's a bit of a one-sided discussion, but it can cause interesting discussion when you talk about this. I've had a lot of interesting discussions about this code. It tells me if you've looked under the hood, right? If I'm trying to hire a systems engineer, like a low-level kernel person, a compiler engineer, they kind of need to look under the hood, right? So as an interview question it's a nice warm-up to see, like, hey, pop the hood open, look in there, close the hood. It's a good way to see that. And it also tells me if you can explain what's going on, right? If we're interviewing you for a role that needs a lot of communication with other people, right, you're not just coding your thing and your code goes into a vacuum and nobody looks at it ever again. If you can't explain what you do, that's a bit of a problem. If you're an expert in this field, right, not everyone is, you should be able to explain at least some of this, and you should be able to tell me, like, oh, I don't know, right? At some point you hit a limit of your knowledge, and you should be able to tell me, I don't know. So what does this code do? It does a lot of those human, squishy things as well, which is kind of neat, right? All right, cool. Well, this is the code. It's fading away now. I think it's probably the end of the talk
because it's faded away, yeah. Cool. Well, the code's still there though. Okay, well, I'm JF. So as I said, I work at Woven Planet, uh sorry, Woven by Toyota. I chair the C++ Evolution Working Group. This has been my talk, *(char*)0 = 0; and the semicolon is important. So yeah, here I'll say arigato. My company is forcing me to do that, I have to say it in Japanese. So we have woven.toyota as our website. We make mobility software, we're hiring. We have places in Tokyo, Palo Alto, Seattle, Ann Arbor, Brooklyn, and London. I drew a beautiful little logo here. That's pretty much it for the talk. Thank you. [Applause] [Music] [Applause]
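The virtual-to-physical lookup described in the talk can be sketched in C++ roughly as follows. This is a minimal sketch under the talk's own simplifications (4 KiB pages, nine index bits per level, a page table entry carrying read, write, execute, and present bits), not how any real kernel or MMU is implemented. The names `PTE`, `AddressSpaces`, `level_index`, `virt_to_phys`, `can_access`, and `physical_address` are hypothetical, and the flat `std::unordered_map` stands in for the multi-level page table walk.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <unordered_map>

// Assumed layout, following the talk's simplified model.
constexpr unsigned kPageBits = 12;   // 4 KiB pages: low 12 bits are the page offset
constexpr unsigned kIndexBits = 9;   // each page-table level consumes nine bits

// Hypothetical page table entry: a physical page number plus protection bits.
struct PTE {
    std::uint64_t frame;  // physical page number
    bool read;
    bool write;
    bool execute;
    bool present;         // false would mean "paged out, ask the OS to refill"
};

enum class Access { Read, Write, Execute };

// Toy model: process ID -> (virtual page number -> PTE).
using PageTable = std::unordered_map<std::uint64_t, PTE>;
using AddressSpaces = std::unordered_map<int, PageTable>;

// Nine-bit index used at a given level (level 0 sits just above the offset).
constexpr std::uint64_t level_index(std::uint64_t vaddr, unsigned level) {
    return (vaddr >> (kPageBits + kIndexBits * level)) & 0x1FF;
}

// Offset of the address within its page.
constexpr std::uint64_t page_offset(std::uint64_t vaddr) {
    return vaddr & 0xFFF;
}

// Look up the entry for (pid, vaddr); an empty optional means "no mapping".
std::optional<PTE> virt_to_phys(const AddressSpaces& spaces, int pid,
                                std::uint64_t vaddr) {
    auto space = spaces.find(pid);
    if (space == spaces.end()) return std::nullopt;
    auto entry = space->second.find(vaddr >> kPageBits);
    if (entry == space->second.end()) return std::nullopt;
    return entry->second;
}

// Would this kind of access be permitted by the entry's protection bits?
bool can_access(const PTE& e, Access a) {
    if (!e.present) return false;
    switch (a) {
        case Access::Read:    return e.read;
        case Access::Write:   return e.write;
        case Access::Execute: return e.execute;
    }
    return false;
}

// Combine the physical page number with the offset inside the page.
std::uint64_t physical_address(const PTE& e, std::uint64_t vaddr) {
    return (e.frame << kPageBits) | page_offset(vaddr);
}
```

In this model, the store in `*(char*)0 = 0` corresponds to calling `virt_to_phys(spaces, pid, 0)` and then `can_access(entry, Access::Write)`: an empty optional or a failed permission check is where the fault would be raised.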
Info
Channel: cpponsea
Views: 251,041
Keywords: C++ Programmer, programming, in cpp, in c++, C programming, TLB hits, null pointer dereference, main entry point, reinterpret_cast, std::nullptr_t, object lifetime, compiler behavior, static analyzer, clang, compiler optimization, memory mapped register, hardware-defined registers, null address, x86, memory fault, system traps, CPU exceptions, precise exceptions, compiler reordering, JF Bastien, cpp, c++ on sea 2023, cpponsea, bit_cast, *(char*)0 = 0;, code, c++, programmer, 2023
Id: dFIqNZ8VbRY
Length: 54min 27sec (3267 seconds)
Published: Fri Oct 06 2023