OS hacking: A better main() for SerenityOS C++ programs

Captions
Well hello friends, welcome back to the program. Today we are going to explore some interesting new ideas about the main function in userspace programs, so let me show you my motivation for investigating this.

We have this new thing that we've been doing in the codebase called TRY. TRY is a macro that works much like the question mark does in Rust: it allows you to evaluate an expression, and if the expression returns an error, then we short-circuit and return from within the TRY. If it succeeds, then we move the result of the expression into a new variable. It requires that the return type of the function you're using TRY inside supports the pattern, and we have this ErrorOr template type which you can use effectively with this. So basically it's either an error or it's a templated type, right?

We have been using this for the last two months and it's kind of been expanding. It's seen a lot of use in the kernel already, and also a lot of use to handle exceptions in LibJS, but only some scattered parts of userspace are using it so far. After the interview I did with Brian just the other day, when we were talking about maybe bringing more error propagation patterns from the kernel into userspace, I was thinking about ways of doing that.

One issue that we have is that when you don't return ErrorOr, so say that this would be a function that just returns, I don't know, an OwnPtr<KString>, then you don't have access to TRY, because the TRY macro has no way to return an error through that return type. So you become kind of limited, and instead you have to start doing stuff like this, where you say "this is either a new string or an error", and then you have to check is_error(), blah blah blah, and then handle the error. We do this in lots of places and it's really cumbersome, and we really want to be able to just use TRY as much as possible. So all of that to say that I would
really like it if our main function would return ErrorOr, because that would allow us to use TRY in main functions. Then main could easily call other functions that return ErrorOr, and it would just be this natural propagation path for errors.

If we look at this one right here, for example, which is the id program that just prints out your user ID and stuff, you can see it does a whole lot of system calls on startup, and it has to check each one for errors, print out the error if something goes wrong, and then return a failure code. Wouldn't it be super neat if you could instead write something like this, and then do all these different ones, and so on down the line, and we would just get that short-circuiting behavior by default?

Of course, this would necessitate main returning, say, ErrorOr<int>, so that you can return an arbitrary integer or an error. This is obviously not the typical C++ way of having a main function, and if we tried to return something other than an int from main, we would most likely break things for ports. We don't really want to force every port to Serenity to customize its main function and stuff like that. So I think what we're going to try instead is calling it something else, and then we will have a wrapper function that calls our special main, so that we can have our main looking sort of like this. Let's call it serenity_main.

Then, of course, wherever you're doing anything that can return an ErrorOr, you would now be free to use TRY. Our wrapper function, the real main, we'll put somewhere else, and it will just look like a normal main. I guess it just has to call serenity_main (actually, let's just declare that here so we can call it) with argc and argv. Then, if the result is an error, maybe we'll print out an error like, I don't know, "Runtime error:" followed by result.error(), and
return 1 for failure; otherwise, simply return the result value. That's basically what I would like to achieve today.

I think we can also do better than passing these sort of ancient-style arguments. It would be really cool to pass a list of string views instead, so that you don't have to iterate over the arguments to find out how long they are and stuff like that. So while we're at it, let's make sure that the new main function prototype is flexible enough to support something like that. Maybe what we'll actually do is a main arguments struct, something like this, so you can still get the real ones in case you need them for something, but we can also pass in a span of StringView arguments. serenity_main will just take those, and we will have to create that list here: arguments.append(argv[i]), there we go, something like that. We can actually pre-size that vector to avoid unnecessary allocations, because we do have argc, so we can just do that. Then we can simply call this with a bunch of inputs: argc is argc, argv is argv, like that.

This is really unusual, but I feel like by doing this we can drastically improve the ergonomics of our userspace program entry point. You saw how these things looked before, right? We had these huge function calls here that had to check for errors on each one, and so on. And actually, if we scroll further down, you can see this id program already has a bunch of this manual error unwrapping, and all of that code could now use TRY instead. So this would immediately become available to us here. Actually, we don't even need that temporary, so we can just do it like that, and the same thing here. It's pretty sweet, we don't need all that extra braciness. So that's also a super nice improvement, right? It becomes much
more condensed, and really it's expressing the same thing. We have to make sure, of course, that we return a good error, so that the wrapper main function, this guy right here, can print out an interesting runtime error. But the error type is, thankfully, something that we control. It's our own Error type, so we can put whatever we want in there. If we want to improve it with more metadata, like source locations, backtraces, stuff like that, all of those are things that we can add. By switching this to a struct or class type that we control, we get a lot of flexibility, because we can just change that thing and it will apply everywhere.

All right, so here it's using argc and argv, so now we would get those through the arguments object that's passed in to us instead. There are no obvious other things here that would use it, but you can imagine creating more helpful wrappers. I just used these sort of imaginary wrappers here; these don't exist yet, they are things that we would have to add: a version of the unveil and pledge syscalls that returns ErrorOr, ErrorOr<void> I guess. So this is sort of what I would like to do, but let's actually make it happen.

First things first: we need to stash the real main function somewhere, so let's make a library for it. Let's call it LibSerenityMain... LibMain. Yeah, LibMain, that's fine. Let's copy some CMake file here, because I always forget how to write these. So what do we need in LibMain? We need a Main.cpp, and this will simply generate LibMain, which will have a single compilation unit with just the main function, and it will use LibC. The idea, of course, is that as we transition programs over to this model, we would simply need to add LibMain as a link dependency, so that they link with LibMain and get main that way instead of having their own main.

Okay, so let's bring in that header, LibMain... oh, we didn't make a header. Let's call it Main.h. touch Main.h. Okay, let's put one of those, and I guess we'll put this in the Main namespace. Outside of the Main namespace we also need to declare the serenity_main function, because I don't think we need to put serenity_main in the Main namespace; that would look a little bit weird. It would be something like this: ErrorOr<int> serenity_main(Main::Arguments), and then we have that Arguments struct like that, let's say. This is so neat, I feel like this is going to become really neat. Okay, so then we don't need to declare that, and then we can just yoink this thing and put it into Main.cpp, capital M, Main.cpp. No wait, did I not create that file? Oh, here we go.

I can't say this is something I've done before, so this is kind of interesting. But I definitely feel like we don't let C++ conventions hold us back anywhere else, so why do it with the main function?

Okay, and then we also need Format for that format string stuff. Actually, since we have ensured the capacity, we can even use unchecked_append, because we know for sure that there's enough capacity in this vector. unchecked_append is just a simplified version of append that doesn't check if we need to grow the vector. Yeah, that looks pretty good.

Then how would we link this into the id program? I guess we just add a custom dependency here: target_link_libraries, id, LibMain. Okey-dokes. And then we also have to add LibMain to the list of libraries that we build. All right, does that build? I guess I probably have some stuff here that won't build. So this one is called Main::Arguments, and then... what did you not like? Okay, so it doesn't like those System things because they don't exist, that's fine. Let me just reload the CMake project
here. I think we are good. All right, so far so good. Now let's create these things. System::unveil: that feels like a really natural thing to say. We have LibSystem, right, and currently what LibSystem is for is that it's the library that's allowed to make syscalls. It's a very, very simple library that makes syscalls. Maybe we should just let it be and put these into LibC instead. At the same time, I don't love the idea of putting more stuff into LibC that's not LibC stuff, so maybe LibSystem is the place for this.

So let's see, we have Libraries/LibSystem. It literally only has this syscall.cpp, but I guess we could add something here. What would we call it? We would call it Wrappers, maybe. Let's try Wrappers. This name might not be perfect, but we've got to start somewhere.

So how do these things work? Well, we know that we want them to return ErrorOr, so we need ErrorOr<void>. I want to look at pledge and unveil, these things right here. I want these, but I want them to be like this. And since we are being comfy, we might as well use StringView instead of const char*. There we go. Then the implementations of pledge and unveil need to go in Wrappers.cpp. Okay, so LibSystem/Wrappers. I'm wondering if we should put this in LibC instead, but I feel like no, let's not put them in LibC. Let's build a better interface to our system calls. [Music]

We need syscall.h. All right, so these parameter structures that we pass to the kernel: since we're using StringView, we don't have to do strlen here. That's really cool. It always annoyed me that we had to strlen these things, and now we don't need to, because we're taking a StringView. So that's already a nice little improvement: we no longer strlen all these things. Cool, and same thing down here for unveil. Okay. [Music]
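The entry-point design described up to this point can be sketched in portable C++ roughly as follows. This is a minimal sketch, not the SerenityOS implementation: `Error`, `ErrorOr` and `Main::Arguments` are simplified stand-ins built on the standard library instead of AK, `parse_positive` is a hypothetical helper, and the wrapper is called `run_serenity_main` here (in a LibMain-style library it would be the real `main`). The `TRY` macro relies on the GNU statement-expression extension, so it assumes GCC or Clang.

```cpp
#include <cerrno>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string_view>
#include <utility>
#include <variant>
#include <vector>

// Simplified stand-in for an error type: it only carries an errno code.
struct Error {
    int code;
};

// Simplified stand-in for ErrorOr<T>: holds either a T or an Error.
template<typename T>
class ErrorOr {
public:
    ErrorOr(T value) : m_storage(std::move(value)) {}
    ErrorOr(Error error) : m_storage(error) {}

    bool is_error() const { return std::holds_alternative<Error>(m_storage); }
    Error release_error() { return std::get<Error>(m_storage); }
    T release_value() { return std::move(std::get<T>(m_storage)); }

private:
    std::variant<T, Error> m_storage;
};

// Evaluate an expression; on error, return the error from the enclosing
// function, otherwise yield the value (GNU statement expression).
#define TRY(expression)                               \
    ({                                                \
        auto _temporary_result = (expression);        \
        if (_temporary_result.is_error())             \
            return _temporary_result.release_error(); \
        _temporary_result.release_value();            \
    })

namespace Main {
// The raw arguments stay available, next to a pre-built list of string views.
struct Arguments {
    int argc;
    char** argv;
    std::vector<std::string_view> strings;
};
}

// Each program provides this instead of main().
ErrorOr<int> serenity_main(Main::Arguments);

// What the library-provided main() would do (named differently in this sketch).
int run_serenity_main(int argc, char** argv)
{
    std::vector<std::string_view> strings;
    strings.reserve(static_cast<std::size_t>(argc)); // argc is known, so pre-size
    for (int i = 0; i < argc; ++i)
        strings.emplace_back(argv[i]);

    auto result = serenity_main({ argc, argv, std::move(strings) });
    if (result.is_error()) {
        auto error = result.release_error();
        std::fprintf(stderr, "Runtime error: %s (errno=%d)\n", std::strerror(error.code), error.code);
        return 1;
    }
    return result.release_value();
}

// Hypothetical fallible helper: parses a non-negative decimal number.
ErrorOr<int> parse_positive(std::string_view text)
{
    if (text.empty())
        return Error { EINVAL };
    int value = 0;
    for (char c : text) {
        if (c < '0' || c > '9')
            return Error { EINVAL };
        value = value * 10 + (c - '0');
    }
    return value;
}

// Example program entry point: any failing TRY short-circuits out of here.
ErrorOr<int> serenity_main(Main::Arguments arguments)
{
    if (arguments.strings.size() < 2)
        return Error { EINVAL };
    int number = TRY(parse_positive(arguments.strings[1]));
    std::printf("%d\n", number);
    return 0;
}
```

A real `main(int argc, char** argv)` would then just return `run_serenity_main(argc, argv)`, which is the part a LibMain-style library takes over from each program.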
And then, right, now we have to handle the return value from the kernel. If the kernel returns less than zero, then it's an errno error encoded as a negative number, so: Error::from_errno(-rc). Otherwise, return basically void. If you have ErrorOr<void>, this is how you return the void, because we can't literally return void, right? Void is nothing, so this is the way that you express that. Same concept down here: minus rc, and return like that.

All right, so those are our wrappers. Let's see, does that work in practice? LibSystem/Wrappers... that didn't exactly build. Oh right, because we didn't add it to LibSystem's CMakeLists. Okay. [Music] Okay, so that totally works, very nice.

Then let's see what happens if we actually do something that fails here. Let's try to unveil something that doesn't exist after we've already sealed the veil, like here. If we try to unveil anything after that, it should fail. Of course it will compile, but we should get a runtime error. Oh, we get one already here, because I think our shell script, /etc/shellrc, runs id to figure out which prompt color to use. So: error, errno=1. That's actually a really not great way to print that out. [Music] We can definitely improve that; we can show you the actual error string.

Also, what we had previously here was that it was doing perror("unveil"), so it would tell you which system call failed, and now we're kind of losing that. We're only showing the errno, not which system call it was. So we need to make the errors richer so that they include that information. One thing at a time, though. Let's just do these LibSystem wrappers first of all, and commit those, because they kind of work standalone anyway. LibSystem:
Add pledge and unveil wrappers that return ErrorOr<void>. Yes. These will be more ergonomic to use together with TRY. Okay.

And then let's add LibMain, I guess. Libraries/LibMain, and I have to make sure to add it to the libraries list. LibMain: Add a new library for more ergonomic userspace entry functions. By linking with LibMain, your program no longer needs to provide main. Instead, execution begins in this function: ErrorOr<int> serenity_main(Main::Arguments). This allows programs that link with LibMain to use TRY already in their entry function, without having to do manual ErrorOr unwrapping. This is very experimental, but it seems like a nice idea, so let's try it out. Yes, let's try it out.

Okay, and then let's improve the errors that we get. Well, let's improve the serialization of Error first of all. So, Error.h... who formats these things? Error.cpp? No wait, where did I put that? Format.h, maybe. Yes. Okay, so this guy right here provides the serialization for error codes. As you can see, if it's an errno error, then this is how we serialize it. So I think we're going to have to do a little bit better. Actually, hmm, we don't have to use this; we can also do better in main. That's probably what we want to do, just do better here, because it's nice to have a simple serialization function, and since we might want to get fancy here anyway and dump out more metadata, let's not screw around with the simple format string serializer.

So let's see: if result.is_error(), auto error, let's say, is result.release_error(). Okay, so now we have the error. Wait, what are you complaining about? Oh, it's assigned but not used, sure, that's fine. If the error is
an errno, okay, then error.code(), so we strerror that code, like that, and then we need strerror. Otherwise... so the Error class currently is either an errno code or it has a string literal; those are the two types of error that we currently support. We could do something like this, like print out the actual number. That's something that I miss sometimes, that you don't actually see the number. So for this first cut, let's actually do it and we'll see how it feels. Okay, so that's the one we're getting already there: Runtime error: Operation not permitted (errno=1). That's pretty cool, I quite like that.

The thing that's missing now is that we don't see which syscall it was. Wouldn't it be nifty if we could see that? I do feel like these wrappers could return an error that also includes that information, so let's see how we would actually do that. Error::from_errno... we could also say, let's say, from_syscall_errno, or maybe even something like that, and then we could include the name of the syscall, so "pledge" in this case, and the return code. That feels pretty expressive, right? Then we also pass in rc as it is, instead of negating it here, and we let Error::from_syscall do that.

So let's add a helper for that: static Error from_syscall, with the syscall name and rc. Yeah, something like that. And at this point we can actually... oh no, let's go all the way, let's just say syscall name and int rc. I guess we can reuse the string literal StringView, of course, for the syscall name, and the code is minus rc in that case. Those are upside down. But we also need to know that this is a syscall, so we should probably have some kind of a type thing here anyway. Let's just say for now that this is like "syscall", something like that, true. Yeah, okay. And then this will necessitate
rebuilding a whole lot of stuff, so let's just start that up. And then, what would main do? By the way, none of these things are perfect or final. We're just discovering things as we go here. I'm 100% sure that we're going to improve all of these classes; we've already been improving them iteratively, incrementally, and that's just part of figuring out new patterns like this. We have to take little steps and not sweat about perfection constantly, but just try to do one step at a time.

So in our case, I guess if error.is_syscall(), then we have... because there are also a bunch of things that are not syscalls, right, like standard C library functions. So "Runtime error"... we could say "Syscall error" or "Kernel error". It is tempting, but let's just say "Runtime error", because that's what it is. We could also just say "Error", actually; I don't know why I say "Runtime error". Something about "Runtime error" sounds kind of nice. So maybe something like that, and then we need the error string literal, that's the syscall name, and we need the strerror for the error code, and the error code itself. Oh, and then wait, did I screw that up in Wrappers here? from_errno... oh, from_syscall, of course.

Yeah, that feels really nifty, because now you can do this and these wrappers will be friendly enough to just pack the name of the syscall into the error object. And notice that what we're doing is using a StringView literal, so this will be a compile-time string. It's not an allocated string in any way, and there's no string copy that occurs. It's really just going to turn into a pointer into read-only data in the executable, so it's a string literal, basically. I think that would be pretty nice.

These are already so much nicer than the C library equivalents, because if we look at those guys right here, they return an int that you have to check for negative errors, and they take these null
terminated strings, so you have to make sure you pass null-terminated strings. Very clunky compared to this thing right here, which is very, very comfy.

We have this return-with-errno macro that basically helps you do stuff like this, and we could probably fashion a macro that does the same thing. Let's see, how does that one work? It's just like that, yeah. Well, we'll probably want a macro, but we don't need to add the macro right now. Maybe, since I have to wait for this anyway, we can just sketch out a macro. So what would it be? HANDLE_RETURN_VALUE... let's say HANDLE_SYSCALL_RETURN_VALUE. It takes rc and the syscall name, I guess. Let's do those in the opposite order. And what does this do? Well, it just says: if rc is less than zero, then return Error::from_syscall, just like that; otherwise return the doohickey. And then that would be HANDLE_SYSCALL_RETURN_VALUE, "pledge", rc, I guess. That's kind of nice.

Okay, let's just rebuild to make sure we get that part. Narrowing to int, yeah, that's fine. "execpromises"... whenever there are these strings that CLion thinks are not real words, it bothers me. I don't like seeing these typo warnings, so I've been adding a bunch of technical terms to the dictionary as I go. Sometimes I wonder if that's really the best way to approach it, but it seems reasonable.

So let's see if we get a better error now. Look at that: Runtime error: unveil: Operation not permitted. That's really cool. Now you could even colorize the system call name and stuff like that, but let's not get ahead of ourselves. I am a big fan of this, and I think it's really cool to have the code that prints out the error in one place, because that means that if we come up with other cool things to pack into error objects, we can just pack them in, and then we just have to update this one place and everybody's errors will come out much nicer and richer.
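The wrapper pattern and the richer error described above can be sketched like this. It is a minimal sketch under stated assumptions, not the LibSystem code: `Error` and `ErrorOr<void>` are simplified stand-ins, `fake_unveil_syscall` is a hypothetical function that imitates a raw system call by returning a negative errno value (with an empty path and empty permissions modelling "seal the veil"), and `describe` is a hypothetical helper that plays the role of the single error-printing place in the wrapper main.

```cpp
#include <cerrno>
#include <cstring>
#include <optional>
#include <string>
#include <string_view>

// Simplified error type: an errno code, optionally tagged with a syscall name.
class Error {
public:
    static Error from_errno(int code) { return Error(std::string_view {}, code, false); }
    // rc is the raw negative return value; the negation happens here.
    static Error from_syscall(std::string_view syscall_name, long rc)
    {
        return Error(syscall_name, static_cast<int>(-rc), true);
    }

    bool is_syscall() const { return m_syscall; }
    int code() const { return m_code; }
    std::string_view syscall_name() const { return m_syscall_name; }

private:
    Error(std::string_view syscall_name, int code, bool syscall)
        : m_syscall_name(syscall_name), m_code(code), m_syscall(syscall)
    {
    }

    std::string_view m_syscall_name; // points at a string literal, never allocated
    int m_code;
    bool m_syscall;
};

template<typename T>
class ErrorOr;

// Only the void flavour is needed here: success is expressed as `return {};`.
template<>
class ErrorOr<void> {
public:
    ErrorOr() = default;
    ErrorOr(Error error) : m_error(error) {}

    bool is_error() const { return m_error.has_value(); }
    Error release_error() { return *m_error; }

private:
    std::optional<Error> m_error;
};

// Sketch of the macro idea from the video: negative rc becomes a syscall error.
#define HANDLE_SYSCALL_RETURN_VALUE(syscall_name, rc) \
    if ((rc) < 0)                                     \
        return Error::from_syscall(syscall_name, rc); \
    return {};

// Hypothetical stand-in for the kernel: 0 on success, negative errno on failure.
static bool s_veil_sealed = false;
long fake_unveil_syscall(std::string_view path, std::string_view permissions)
{
    if (s_veil_sealed)
        return -EPERM;
    if (path.empty() && permissions.empty())
        s_veil_sealed = true;
    return 0;
}

namespace System {
// String views carry their length, so no strlen and no null terminator needed.
ErrorOr<void> unveil(std::string_view path, std::string_view permissions)
{
    long rc = fake_unveil_syscall(path, permissions);
    HANDLE_SYSCALL_RETURN_VALUE("unveil", rc);
}
}

// One place that turns an Error into text, so richer errors only need one update.
std::string describe(Error const& error)
{
    std::string text = "Runtime error: ";
    if (error.is_syscall()) {
        text += error.syscall_name();
        text += ": ";
    }
    text += std::strerror(error.code());
    text += " (errno=" + std::to_string(error.code()) + ")";
    return text;
}
```

Calling `System::unveil("/tmp", "r")` after the veil has been sealed yields an error whose description starts with "Runtime error: unveil: ", followed by the platform's text for EPERM.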
So, interesting. Okay, let's add the way to make a syscall error, and we will commit that first of all. git commit: LibSystem plus LibMain plus AK: Add Error::from_syscall for syscall failures. This creates an error that contains the name of the syscall that failed. This allows error handlers to print out the name of the call if they want to. Yes, very, very rad.

Okay, and now finally we can commit our conversion, although let's remove that bogus unveil call that we have. So look at this diff, right? We're changing to serenity_main, which immediately allows us to convert these many, many lines here into these four very nice and tidy lines. Very cool. And then, of course, we no longer get argc and argv as separate things, so we have to just pass them from arguments. By the way, I quite like this, because if we come up with more cool stuff that we want to pass to serenity_main, we can just add it to this struct instead of having to change the signature of serenity_main. So it definitely feels like a good idea to pass it as a struct. And yeah, another great diff here, right? Look at all this red, and it just kind of collapses into this. Super cool.

id: Port to LibMain. This is a first port of a simple program to LibMain. A bunch of code is immediately simplified thanks to the LibSystem syscall wrappers and the ability to use TRY. Pretty cool. I think that is pretty cool. Oh man, that's so nice.

Let's do some other program. Who's somebody who does a ton of pledge and unveil and stuff? Let's see, we have lots and lots of unveil. Let's look in the utilities directory, or actually we can look anywhere. Let's take some application, let's say. What are we looking at here, the Welcome app? Let's do Text Editor maybe, and see what
that looks like. Or Terminal, for that matter. Terminal is the thing that starts on boot every day, so we're definitely seeing that on startup. Let's see, it doesn't have a terrible number of pledges and stuff. We have pledge, pledge, not a lot of stuff. Oh, look at all this. Here's the payload: look at all these unveils. From 432... how many lines do we have selected? 33 lines. So let's convert that to serenity_main and see how it goes. ErrorOr<int> serenity_main(Main::Arguments), all right, and then we can simply TRY System::pledge like that. Or actually, we need one more, and of course we also need LibSystem/Wrappers.h.

And yeah, here are other system calls that we also need to add wrappers for. So let's add one for sigaction, I guess, since we're doing it right here. We want to be able to turn this into TRY System::sigaction(SIGCHLD, &act, nullptr). Yeah, we should do that. So sigaction, where is it? Here. Oh, that's a very simple thing. So in Wrappers, what does it return? It returns an int. Does sigaction return anything meaningful? Return value: zero on success, right. So just ErrorOr<void>. "sigaction", rc: I think that's all you need, actually. Let's just tidy up these names, because I don't know why they're using these pointless abbreviations. Okay, cool.

And then in Terminal's main, now we can simply use that. Wait, what don't you like about that? [Music] Wait, what? Oh, do you not know what that is? Oh shoot, I guess I have to include signal.h for that to work, otherwise it's going to get weird. Okay, so that allows us to collapse that, and then here we can TRY System::pledge and collapse that. Very nifty. And yeah, there are just going to be a whole bunch of different things that we can add wrappers for, not just system calls but other stuff that can fail. We don't have to solve everything right now. I guess here is an opportunity to use a little bit of multiple cursors, so let's do that. Click. All right, so
TRY System::unveil, there we go. That is super cool. Let's get the diffstat on Terminal's main here: 16 insertions, 51 deletions. That's a pretty good delta, and this is clearly much, much nicer to look at. Here are all the unveil calls after the change, and this is what it looked like before, with separate error handling for each one. Super duper nice.

So let's do commits. Let's do this sigaction wrapper first of all. Also, I just realized that I still had a silly abbreviation in the wrapper, because it says signum. We can also say signal, or signal number, but not signum, because why signum? It's called signal. No, these guys right here, yes. LibSystem: Add ErrorOr<T> wrapper for sigaction. Wait, I didn't actually test it, because I realized I didn't even link with that thing, so it doesn't even build. Hold on, I'm getting a little bit carried away here. Yep, still got these guys, and we also need to link with LibMain, of course, because otherwise we get an undefined reference to main. But I really like this idea that you don't need to add a main; you just link with LibMain and that gives you main. Quite interesting. So let's link with LibMain, there we go, and see that it works. Yeah, the terminal came up just fine. Very, very cool.

So: Port to LibMain. It looks like LibMain works nicely for GUI programs as well. There are definitely a lot more fallible things whose errors we need to get better at propagating. Okay, this kind of prose does not really belong in this commit message, actually; it's more like my commentary. Let's talk about what the commit actually does. This simplifies a bunch of error handling and makes the main function quite a bit shorter. It will become shorter yet
as we get better at propagating errors. Yes, this is really nifty.

Hmm, so I'm kind of wondering where these syscall wrappers should go. If you're calling into the kernel, like pledge and unveil, those are literally kernel calls, so it does feel kind of fair that they are in LibSystem. But if you are calling a LibC API and you want an ErrorOr, what namespace should we put those in? I don't really know, but I guess we don't have to figure everything out right now. Just looking at forkpty right here, or ptsname for that matter, there are so many different functions that can fail. In this case we check for errors; in ptsname we don't check for errors. This could fail, we just don't care. Kind of interesting.

And then, of course, there are so many things where allocation can fail, not necessarily just the object that you're specifically allocating, like here, but allocation could also fail further into a function that you're calling. It will be very interesting to try to surface those types of problems. But I think now that we have a way to propagate errors out of main, it removes a big hurdle and makes it clear how to do this going forward. I wasn't sure about this when I started this video, but I have to say that I am pretty pleased with how it's turning out, so let's go down this road and see where it takes us.

So yeah, this will be the end of today's video, I think. If you made it this far, then I thank you for watching, and I hope you saw something interesting. I definitely find this super interesting. I'm still adjusting to seeing this, but it feels good. Thanks for watching, and I will see you in the next one. Bye!
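The sigaction wrapper mentioned in the Terminal port can be sketched against the ordinary POSIX `sigaction` function, which returns -1 and sets `errno` on failure. This is a minimal sketch under stated assumptions: `Error`, `ErrorOr<void>` and `TRY` are simplified stand-ins (the macro needs GCC or Clang statement expressions), and `ignore_child_exit_signals` is a hypothetical caller added only to show the collapsed call site.

```cpp
#include <cerrno>
#include <optional>
#include <signal.h>
#include <string_view>

// Simplified error: which call failed, and with which errno code.
struct Error {
    std::string_view syscall_name;
    int code;
};

template<typename T>
class ErrorOr;

// Success carries no value and is written as `return {};`.
template<>
class ErrorOr<void> {
public:
    ErrorOr() = default;
    ErrorOr(Error error) : m_error(error) {}

    bool is_error() const { return m_error.has_value(); }
    Error release_error() { return *m_error; }
    void release_value() {}

private:
    std::optional<Error> m_error;
};

// On error, return it from the enclosing function (GNU statement expression).
#define TRY(expression)                               \
    ({                                                \
        auto _temporary_result = (expression);        \
        if (_temporary_result.is_error())             \
            return _temporary_result.release_error(); \
        _temporary_result.release_value();            \
    })

namespace System {
// Full parameter names instead of signum/act/oldact, as suggested in the video.
ErrorOr<void> sigaction(int signal, struct sigaction const* action, struct sigaction* old_action)
{
    if (::sigaction(signal, action, old_action) < 0)
        return Error { "sigaction", errno };
    return {};
}
}

// Hypothetical call site: the manual "if (rc < 0) { perror(...); return 1; }"
// block collapses into a single TRY line.
ErrorOr<void> ignore_child_exit_signals()
{
    struct sigaction action {};
    sigemptyset(&action.sa_mask);
    action.sa_handler = SIG_IGN;
    TRY(System::sigaction(SIGCHLD, &action, nullptr));
    return {};
}
```

Because the function is named after the call it wraps, the body uses `::sigaction` to reach the C library version, and `struct sigaction` to name the type.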
Info
Channel: Andreas Kling
Views: 14,638
Keywords: serenityos, c++, programming, osdev
Id: 5PciKJW1rUc
Length: 57min 49sec (3469 seconds)
Published: Mon Nov 22 2021