Implementing C++ Modules: Lessons Learned, Lessons Abandoned - Cameron DaCamara & Gabriel Dos Reis

Captions
Okay, is it time? Yeah, okay. So, good afternoon, and thanks for coming. Today we want to share the lessons we learned while implementing the C++ modules feature. This is work that was done by the entire Visual C++ team. In particular, Cameron took over and fixed all the bugs that I created, and added the features that came in during C++20 standardization; I only did the Modules TS parts. Jonathan Caves also contributed a lot, and then there is the IDE team; it's really the entire team, and I can't name all of them. I'm Gabriel Dos Reis, an architect on the Visual C++ team, and I work with the entire team. And I'm Cameron DaCamara; as Gaby just mentioned, I work on the Visual C++ (MSVC) front end, specifically on modules.

So again, without Cameron I don't think we would have gotten where we are today, and I'm glad to be presenting these lessons with him. What do I want you to take away from this talk? The first point is what modules were designed for: they were designed to support programming at large. That means, first, that if you develop one component and someone else develops another component, a third user who tries to combine those two into an application shouldn't have to worry about the internals of what either of you did. Today that's almost impossible: you always have to worry about header files leaking things, and so forth. The other point is isolation: componentization needs isolation, meaning I should be able to understand the semantics of your component in isolation. If those things are done properly, then the theory I put forward about six years ago was that we should get very impressive compile-time improvements, and that actually happens to be the case. If you follow some of the lessons we're sharing here, you'll see at least a 10x speedup, meaning ten times, not ten percent.

And the thing that has always bothered me is that in the C++ community, for some reason, we are so accustomed to suffering that we don't actually understand what we are missing. If you come from the Java community, or the C# community, or the Python community, you know that there are a bunch of tools out there that understand your program. You don't always need a C# compiler; you just need a representation of the semantics of your program, and then you can build a bunch of tooling from there. Modules are a tooling opportunity, as I said back in 2017, and we're making that a reality. With modules, I wanted to bring C++ into the 21st century, where we have all these semantics-aware developer tools.

There are various lessons to take from this talk. Some are directed at C++ toolset implementers, and others at C++ users. The first is: as a C++ toolset developer, what can you do to make good on the promises we made six years ago? We'll share some stories there. And then: as a C++ developer, what can you do to take advantage of the fact that the toolset can actually make good on those promises? Those are the things I want you to take away from this talk.

Lastly, I think the one definition rule has become something of a bogeyman, and usually when people hear about it they start shaking, because either the compiler is going to do something that mutates the intended semantics of their program, or they worry they might have violated the rule without knowing. Modules actually make it so that you do not have to worry about it: modules were designed to bring you the guarantees of ODR for free. The only thing you need to do is follow proper code hygiene. If you get a chance to do renovation in an existing code base, take the opportunity to introduce named modules, and these ODR guarantees will come for free.

So, I just mentioned ODR; what is it? When I started, I talked to Bjarne and said, hey, what is this, what were you thinking? Well, you
know, when I started in 1979, I talked to Dennis Ritchie, and what Dennis said was: it's as if there is exactly one source text for the definition of an entity. That was what they had. Then people tried to formalize that in the C++ standard's specification, and anybody who has gone through it knows that it's epic: that very simple idea has been encrypted into pages and pages of text about token-for-token comparison, how you have to do the name lookup, how template instantiation has to consider this order and that context. It's just incredible, and it is no surprise that people get it wrong so often, because the spec itself is just too opaque. I kept complaining to Bjarne: you gave me an idea that was very simple, one sentence, but this is what I'm getting from the standard. And the answer is: all that token-for-token comparison is just there to approximate the notion of a module system. If we had had a module system from the beginning, nobody would actually have had to worry about ODR; but we didn't, we only had header files.

So what do modules bring you? Why do you need to worry about ODR? If you are in a header-file world, the linker has to pick the definition; it has to relate a reference to a function or an object to where its definition is, and if you have too many of these things, you're going to get into trouble. If you look at that process closely, you realize a lot of time is spent there by the linker and the compiler. Most of the time the parser brings in a bunch of definitions and produces a lot of them, especially inline functions, which we happen to have everywhere, or template instantiations that we put in header files; they all end up in a bunch of .objs, and what the linker does is work very hard to get rid of all those things the parser put in. So someone is working very hard to produce a bunch of garbage, and someone else is working very hard to get rid of that garbage. If you have a system that gives you the ODR guarantee by construction, you get rid of all that unnecessary hard work. That is why ODR is central to the design of modules.

Modules are a new kind of feature, but we have to relate the feature to the past, and the past is this thing where we put everything in what I call the global module: this unnamed module where we have to check for ODR, as opposed to taking it as a guarantee. And as we know today, the compiler, even if it works very hard, cannot actually give you a complete check of ODR. It works hard, but the check is incomplete, and the rest remains on the shoulders of the C++ developer. And we make mistakes, because we are human; we are the very best at making mistakes.

So, to make good on the promise of modules and ODR, I introduced the notion of module ownership. What is this mysterious module ownership? The idea is that for every interface, every declared entity that's exported and made available to you, the toolset tracks its provenance: where it came from. This is important so that the internals of your modules are kept separate, and you don't have to worry about clashes. Furthermore, things that are exported still need to be tracked, because sometimes you get clashes in a way that was never intended. For example, the name swap is quite common: swap in the standard library means exchange the contents of two places, but if you write an operating system, the notion of swap is slightly different. If you take two components that were developed completely independently and then link them into your program, you really don't want them to be confused. Today they get confused, because we do not have precise enough notions to distinguish them and to express them formally in the toolset.
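Before modules, the classic hazard looked like this. The file and function names below are hypothetical, a minimal sketch of the kind of silent ODR violation the header-file world permits: two translation units each see a different definition of the "same" inline function, the program is ill-formed with no diagnostic required, and the linker silently keeps one body.

```cpp
// speed.h as seen by tu1.cpp
inline int speed() { return 100; }

// speed.h as seen by tu2.cpp (a stale or divergent copy of the "same" header)
inline int speed() { return 120; }

// tu1.cpp and tu2.cpp each call speed(). The program violates the ODR,
// no diagnostic is required, and the linker arbitrarily picks one body,
// so behavior can depend on link order. A named module that owns speed()
// removes this hazard by construction.
```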
With modules, we actually have that vocabulary. And again, modules bring these ODR guarantees, and because we now have the vocabulary to describe those things, C++ toolsets can track provenance through the entire pipeline, from the parser to the linker, and even to runtime when the final link is made.

I'm going to show you an example of a situation with weak module ownership, where the compiler is actually doing a decent job. Imagine you have two components, libm1 and libm2. In libm1, someone is providing you a library function; internally, libm1 takes a dependency on module m1 and calls the function f, which it imports from module m1. Independently, Cameron goes off and develops component libm2, which depends on a module m2 that also exposes a very useful function called f. When I link them into a program, main, without knowing that libm1 and libm2 independently take dependencies on functions called f that were developed completely separately, most of the time the linker will yell at me: we have a duplicate definition for symbol f. And that problem is dumped on the third person, who was completely unaware of the internal details of the components libm1 and libm2. Today the language specification says this is ill-formed, no diagnostic required, but that's only because we didn't work hard enough in the specification to reach agreement on something better.

A better way of actually supporting componentization, of giving compositional semantics to independently developed components, is the notion of strong module ownership: when I compose those two things together, the internal details don't matter; they are kept separate. When libm1 calls f, it gets the f from module m1, and the linker is able to trace that reference back to that module; and the same goes for libm2, whose use of f is traced back to module m2. The two are never confused.
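Here is a minimal sketch of that scenario; module and file names are hypothetical. Under strong module ownership, as MSVC implements it, each f gets a distinct linker-level identity in its owning module, so both libraries link into one program without a duplicate-symbol error.

```cpp
// m1.ixx
export module m1;
export int f() { return 1; }     // this f is owned by module m1

// m2.ixx
export module m2;
export int f() { return 2; }     // this f is owned by module m2

// libm1.ixx
export module libm1;
import m1;
export int g1() { return f(); }  // bound to m1's f

// libm2.ixx
export module libm2;
import m2;
export int g2() { return f(); }  // bound to m2's f

// main.cpp: under strong ownership each call resolves within its own
// module's universe of symbols, so main returns 3; under weak ownership
// the link may instead fail with a duplicate definition of f.
import libm1;
import libm2;
int main() { return g1() + g2(); }
```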
Okay. Traditionally, we have always thought this was impossible. It is not impossible; it is entirely possible within the realm of the linker technology we have today. We don't need futuristic research to get it done. This scenario is implemented today by the Visual C++ compiler, and the way to get there is not difficult. One of the things I hope for from the C++ toolset community is that we all make the effort to bring this ODR guarantee to the C++ community, to you who actually use these facilities. Of course the parser has to do its work, but the linker also has to do its part. The notion of module as designed in C++20 is not some kind of surface syntactic sugar that is compiled away by the parser; no, it is a semantic notion that permeates the entire toolset, from the parser to the linker. The linker has to do its part of the job to bring these increased safety guarantees to C++ users. Cameron will tell you more about what the parser has done.

All right, thank you, Gaby. While Gaby explained the targets we were aiming for when we implemented modules, I'm going to talk about what we actually did, boots on the ground, in the toolset to support the feature and all of the semantics around C++ modules. We will get to performance, but for now let's focus on this idea of ODR.

Before C++20, we had the classic ODR check. The standard says you compare names in their scopes and the signatures of their functions, and if there are two definitions of the same function, for example, you issue an ODR violation. The compiler front end gets a very narrow view of the program in this respect: it gets one translation unit, and that's the end of the list; that's all you can do before C++20. By the way, after C++20 the same is still true: this check still applies to anything that comes from the global module, which we'll talk more about later.

After C++20, things get more interesting. You can start to import other translation units into a single translation unit, which means the front end gets a much broader view of the entire program, so it can start to check ODR in ways it couldn't before. Because each of the translation units imported into the current translation unit was well-formed when it was created, when you import it, it should still be well-formed in the context of the translation unit you are creating. The MSVC compiler implements a persistent translation unit on disk in the form of the IFC format, which is an abstract semantics graph; this is our module BMI, for anybody familiar with that terminology. If you're interested in the makeup of that format and what we mean by semantics graph, definitely take a look at Gaby's talk on Tuesday, where he talks about persistent in-memory representations of C++.

So what can we do now that we have graphs stored on disk and a conceptual graph for the current translation unit? C++ is generally modeled as a graph, so the easiest way to check for ODR, since we have two graphs, is to take an intersection. If you have a set of declarations A and a set of declarations B, you take the intersection of the two; if it comes up empty, you have no ODR violations, and otherwise, every entity in the intersection conflicts on ODR in some way. The front end can do this, and you don't even really need a compiler to do it: because the IFC is a graph on disk, a third-party application could check for ODR among all your IFCs on disk. You don't even have to compile your code anymore.
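As a toy illustration of that set-intersection idea (the names and the conflict rule here are simplifying assumptions, not the real IFC algorithm), one can model each translation unit as a map from a declaration's qualified name to a fingerprint of its definition; a name present in both maps with disagreeing fingerprints is a candidate ODR conflict.

```cpp
#include <map>
#include <string>
#include <vector>

// Map each declared entity (qualified name) to a fingerprint of its definition.
using DeclSet = std::map<std::string, std::size_t>;

// Return names present in both sets whose definitions disagree:
// a simplified stand-in for intersecting two on-disk semantics graphs.
std::vector<std::string> odr_conflicts(const DeclSet& a, const DeclSet& b) {
    std::vector<std::string> conflicts;
    for (const auto& [name, fingerprint] : a) {
        auto it = b.find(name);
        if (it != b.end() && it->second != fingerprint)
            conflicts.push_back(name);
    }
    return conflicts;
}
```

With this model, two IFCs that both define `N::f` with different bodies would show up in the result, while identical re-declarations would not.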
So that's enough about ODR; let's talk about ODR again, but on the linker side this time. Before C++20, the linker had a fairly easy job. When entities were defined in the translation unit and were TU-local, which the standard defines as something like a static, essentially a name that does not escape the current translation unit, the job is very simple: if the name is not defined, the linker issues an unresolved external reference. I'm sure everybody has run into this at some point. The job gets a little harder with external-linkage symbols. What the linker does in that case is create a giant map containing the names of all the entities as keys, where the value is where each definition lives. It is trying to map references to a name to some definition, wherever it may exist, and this is what I call the great hunt: when a name is encountered, the linker has to go try to find it in this giant map.

Conceptually, the pipeline looks like this: a function f goes into the front end, then into the back end, in a very linear fashion. The back end emits just a mangled name, with an association that says: this is an external-linkage symbol. That's all it needs to know, and it goes directly to the linker. Below is an example of what the linker map might look like: just a name and where its definition is. Some definitions might not exist, as in the case of b, in which case you might see linker spew that looks like this. And who loves seeing that when building their application, right? How do you even begin to figure out what went wrong in this case? Where do you even look for where the definition might have been? The answer is that it's almost never obvious.
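The "great hunt" can be sketched as a single flat symbol table (a deliberately simplified model with made-up mangled names, not LINK.EXE's actual data structures): every reference is looked up by mangled name alone, and a miss is all the information you get.

```cpp
#include <map>
#include <string>
#include <vector>

// One flat map: mangled name -> object file holding the definition (if any).
using SymbolTable = std::map<std::string, std::string>;

// Resolve a list of referenced names; report the ones with no definition,
// which is all a classic linker can say: "unresolved external symbol".
std::vector<std::string> unresolved(const SymbolTable& defs,
                                    const std::vector<std::string>& refs) {
    std::vector<std::string> missing;
    for (const auto& name : refs)
        if (defs.find(name) == defs.end())
            missing.push_back(name);  // no clue *where* it should have come from
    return missing;
}
```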
In C++20, and under strong ownership in particular, we can do a lot better. Strong ownership in the MSVC toolset introduces a little extra bookkeeping into the linker. By this we mean that when a module is built, the linker keeps track of which names that module exports, and it associates those names with the name of the actual module. It creates its own isolated component, as Gaby mentioned, and this becomes its own universe of external-linkage symbols. Here is what that pipeline looks like. We have a similar situation, `export void f();`, an external-linkage function. It goes into the front end, but this time the front end conveys ownership information about f; in this case it says: this f belongs to module M, and it has external linkage. The back end retains that information and records it into the object file, and when the linker reads the object file back, it gets not only the mangled name but also the ownership information about that name. That is what contributes to the maps below. Say we have two other modules, M1 and M2, and then the global module, which is conceptually the old giant map we used to have. These are each their own buckets of associated definitions between names. You can even see overlap between the names: for example, a here is in a.obj, and a there in M2 is in m2.obj, and the linker is totally happy with this, because they live in their own isolated buckets of external-linkage symbols.

So now you can get diagnostics that look like this. We know the diagnostics can be improved, but they give you some indication of where the definition of this function f might have been. The first function there is `void g`, which belonged to the global module; that's the same mangled name we're all used to. Whereas f has this extra association that says: I belong to M.
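A sketch of that ownership-aware lookup (again a toy model with hypothetical names): the key becomes a (owning module, name) pair, so identical names in different modules no longer collide, and a failed lookup can name the bucket it searched.

```cpp
#include <map>
#include <string>
#include <utility>

// Symbols keyed by (owning module, name); the global module uses "".
using OwnedSymbols = std::map<std::pair<std::string, std::string>, std::string>;

// Record a definition under its owning module's bucket.
void define(OwnedSymbols& syms, const std::string& module_name,
            const std::string& name, const std::string& obj) {
    syms[{module_name, name}] = obj;
}

// Look up a reference in the context of the module it was bound to.
// Returns the defining object file, or "" if unresolved in that bucket.
std::string resolve(const OwnedSymbols& syms, const std::string& module_name,
                    const std::string& name) {
    auto it = syms.find({module_name, name});
    return it == syms.end() ? std::string{} : it->second;
}
```

With this shape, `define(s, "m1", "f", "m1.obj")` and `define(s, "m2", "f", "m2.obj")` coexist, and `resolve(s, "m1", "f")` never consults m2's bucket.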
That "M" tag is not really a mangled name; it's just something we append to the diagnostic message, but it gives you an idea of where to look for the definition of that function.

Before going to performance, I just want to add that the module M ownership part of the mangling is actually something done by the linker, not by the front end, because if the front end had done it, we would have committed too early, and the linker wouldn't have found what it needed. That said, you can see the division of labor: the parser does its part, associating information about where things come from, and then the linker picks up the pieces from there. When it's looking for information about where a symbol comes from, it starts from: which module? And that dramatically narrows down where to search; you can imagine where the performance gains come from.

What you have to remember is that in what I call the old days of #include, #include is just copy-and-paste of your source files, and the C++ model of "you don't pay for what you don't use" is violated big time. Imagine you have a simple hello-world program: you include <iostream>, and then you just say `std::cout << "Hello, world!"`. Well, it turns out that <iostream> contains a bunch of stuff we don't actually use in the body of main. But because inclusion is copy-and-paste, the parser and the semantic analysis have to go process all of those things, and template machinery actually gets instantiated even if you don't use it. So in the old world of #include, of textual inclusion, you actually do pay for a lot of things that you do not use. Modules fix that.

The other thing is, if for some reason you find yourself in a situation where you can't use named modules (and I really want you to use named modules; that's where you get the real stuff), you can use header units. Header units are a kind of fancy name for PCHs, pre-compiled headers. They give you some of the benefits of modules, not the real stuff; but when you do actually use named modules, you get spectacular compile-time improvements. On the numbers we'll show you later, from the current state of the MSVC compiler: we haven't had the 25 or 30 years of experience optimizing this that we have with PCHs; we've just focused on implementing the basic functionality. We will work on optimizing more use cases, but what you get with just the basic implementation is already very, very promising. So here, Cameron will walk you through the performance examples.

Thank you, Gaby. As Gaby alluded to, #include has overhead; let's figure out how much overhead that actually is. These numbers were taken from Bjarne's paper on integrating a minimal support for the standard library as a module, and in particular they were gathered with Visual Studio 16.11, so all of the optimizations we're going to talk about throughout this talk are part of that release. Let's focus on what we're trying to measure here: it's this green square down here, `int main() { std::cout << "Hello, world!"; }`, the canonical C++ hello-world program everybody has probably written at some point. Let's start with the first comparison that logically makes sense: #include versus header-unit import. As Gaby mentioned, these header units are really just a formalized PCH, so they store some extra things, in particular macros, and we'll talk about why that's an overhead later. But here is a fair comparison between the two; they should ideally have exactly the same contents. When you switch to header units, you get a 5x speedup, 5.13 to be precise. That's already pretty good.
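The three forms under comparison look roughly like this; a sketch only, since building the `import std;` form requires a toolchain with the standard library module available (for example, recent MSVC with an appropriate /std flag and the std module installed).

```cpp
// 1) Textual inclusion: everything <iostream> drags in is re-parsed,
//    whether main uses it or not.
// #include <iostream>

// 2) Header unit: a formalized PCH; faster, but macros and include
//    guards still have to be materialized on import.
// import <iostream>;

// 3) Named module (shown active): the standard library arrives as one
//    pre-built semantics graph, and you pay only for what you look up.
import std;

int main() { std::cout << "Hello, world!\n"; }
```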
But when you compare the #include of the needed headers, and by "needed headers" here we just mean `#include <iostream>`, because that's really all you need for `std::cout`, against `import std;`, you get a 17.55x performance speedup. This number is really great, but it's even more lopsided than it looks, because when we say `import std`, you don't just get <iostream>; you get the entire standard library. So we're not even close to comparing apples to apples. If we want to get closer to that, we have to look a little harder and do some comparisons where, honestly, the numbers speak for themselves. If we #include all of std, that is, every single standard library header, which would mimic the contents of `import std;`, you get a 66.51x speedup. So forget about #include; that's just totally gone. But when you look at the comparison between named modules and header units, you would think the results should be pretty similar, and it turns out that's not the case: you get a 7.11x speedup just by using named modules over importing header units for all of std.

There are a couple of reasons for this. Header units have overhead, as I mentioned before. After you say `import <iostream>;`, what the compiler has to do, immediately after that semicolon, is import all of the macros, and this is non-trivial work: it has to create macro objects and paste all the text from the IFC into each macro, so there's a lot of overhead just materializing all of the macros from the IFC. On top of this, headers have include guards, which are not just `#ifdef HEADER_H`; usually they come in the form of `#pragma once`, and `#pragma once` has a special implementation depending on which compiler you're using, so the import mechanism for an imported header's include guard may carry additional overhead on top of the
macro import cost. And finally, importing declarations from a header unit does something that named modules don't have to do: merging declarations with any existing declaration already in the translation unit, because code like this has to work. Say I have some header that does `#include <vector>`, the copy-and-paste textual inclusion of the header; later, another header says `import <vector>;`, and finally, in my translation unit, I instantiate `std::vector`. Some things need to happen in the front end to make that work. The compiler must provably establish that the `std::vector` you are instantiating is the very same one that was textually included by `#include <vector>`. So the module machinery has to match it for ODR based on the structure of the class; and not only the class itself, but all of its base classes, all of its special member functions, all of its default arguments. There's a lot of overhead involved in mixing #include with import, and really all of that overhead centers on the choice to use imported header units.

So let's focus on named modules in general, because they give us much better perf. When we're talking about ODR with named modules, the comparison is very simple: if you already have that name in your translation unit, you immediately issue an ODR violation. You don't have to do any structural comparisons or try to merge declarations; none of that is needed with named modules, and that's a lot of what contributes to the speed win from header units to named modules.

So how do we achieve performance in the compiler front end? We're going to dive into a code sample. Back in 2018, Kenny Kerr, the author of C++/WinRT, came to me with a complaint.
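As an aside, the #include-then-import mixing pattern that forces all that merging work can be pinned down in a sketch like this (file names are hypothetical, for illustration only):

```cpp
// header_a.h
#include <vector>        // textual inclusion: declarations parsed in place

// header_b.h
import <vector>;         // header unit: declarations arrive from an IFC

// consumer.cpp
#include "header_a.h"
#include "header_b.h"
std::vector<int> v{1, 2, 3};  // the front end must prove the textually
                              // included std::vector and the imported one
                              // are the same entity: class structure, base
                              // classes, special member functions, and
                              // default arguments all get matched for ODR.
```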
Kenny's complaint was that modules perf wasn't where it should be: we should be able to do a lot better; we should be competitive with PCH. Reluctantly, I agreed, and much to my dismay, I had to implement a lot of really painful optimizations around this. He gave me this code sample, and it's very simple: all it does is read an RSS feed, co_await the result of that feed, and print it all out; a pretty simple application. And this is the graph we ended up with for performance. The scale on the left-hand side is in seconds, and the x-axis is commits that went into MSVC, so lower is better in this case. I implemented four optimizations, which took us from about six seconds of import time down to about 1.8.

The first of those is delay loading. Delay loading is a pretty simple concept: the idea is that when you import a module, all it should conceptually do is populate names in scopes. You don't have to materialize anything, because the semantics of name lookup will trigger materialization later. When I talk about materialization here, I mean the process by which the compiler reads semantic information from disk and creates a symbolic entity in the compiler that it can later use to generate code. So, name population: the module on the left-hand side has a couple of f's in various scopes, and it translates to the graph on the right-hand side with these buckets. We have the global scope at the top; inside that global scope we have `void f`, and we have a bucket named N, which is a namespace, and inside that bucket another `void f`, and so on and so forth. Like I said, name lookup drives materialization, so when you say `import m;` here, the compiler says: okay, a lookup of N::f really becomes a select over the graph.
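A toy model of that laziness (purely illustrative; MSVC's real representation is the IFC graph): the import step records only (scope, name) pairs, and the expensive materialization runs the first time lookup actually touches a name.

```cpp
#include <map>
#include <string>
#include <utility>

// Imported names start as unmaterialized entries; lookup materializes on demand.
struct LazyScope {
    // (scope, name) -> has the full symbol been built yet?
    std::map<std::pair<std::string, std::string>, bool> names;
    int materializations = 0;  // count of expensive "disk reads" performed

    // 'import' only populates names: cheap, no symbol construction.
    void populate(const std::string& scope, const std::string& name) {
        names[{scope, name}] = false;
    }

    // Name lookup drives materialization: build the symbol on first use.
    bool lookup(const std::string& scope, const std::string& name) {
        auto it = names.find({scope, name});
        if (it == names.end()) return false;  // not declared in this module
        if (!it->second) {                    // first use: materialize it
            ++materializations;               // (stands in for reading the IFC)
            it->second = true;
        }
        return true;
    }
};
```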
When that lookup happens, all we really have to do is retrieve the entry `void f`, take all of its semantic information, materialize it from the IFC, create a symbol for it, and off you go. This is what we refer to as on-demand materialization.

Template specializations: staying in the context of delay loading, let's switch gears a little and talk about template specializations, because that on-demand loading optimization has actually been in the compiler since the Modules TS era; this particular optimization is one we implemented for C++/WinRT. Every compiler is going to have some representation similar to this for templates: when you create a template, the compiler creates a map from specialization arguments to what kind of specialization it is. In this case we have a bunch of explicit specializations and then an implicit instantiation. For correctness, the compiler has to know when a template is ODR-used, so it can figure out whether you're using an explicit specialization or whether it needs to instantiate the primary template, and the compiler is already really good at this. So if we take advantage of that and apply an optimization on top, all we really have to do is say: instead of materializing all the explicit specializations up front, we go ahead and just reserve all of those slots, and say, come back later; we'll resolve this thing when it is actually ODR-used. This is the map that it creates right here. On the import side we have `import m;`, and when the user says, I want to specialize S<char>, the compiler says: okay, I have that template specialization, it's a delay-loaded specialization, so I need to ask the modules code to go materialize it for me. You give it the ID of that specialization type, and it comes back with a symbol; that's how we get this resolved explicit specialization here, while the other ones remain reserved; we haven't done anything to them yet.
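A sketch of that reservation scheme (a toy stand-in for MSVC's internal tables): explicit specializations imported from a module begin as reserved slots keyed by their arguments, and ODR-use is what promotes a slot to a resolved symbol.

```cpp
#include <map>
#include <string>

// State of an imported explicit-specialization slot in the template's map.
enum class SlotState { Reserved, Resolved };

struct TemplateMap {
    // specialization arguments (e.g. "char") -> slot state
    std::map<std::string, SlotState> slots;

    // Import reserves slots without materializing anything.
    void reserve(const std::string& args) {
        slots.emplace(args, SlotState::Reserved);
    }

    // ODR-use of S<args>: returns true if a reserved explicit specialization
    // was found and materialized; false means instantiate the primary template.
    bool odr_use(const std::string& args) {
        auto it = slots.find(args);
        if (it == slots.end()) return false;
        it->second = SlotState::Resolved;  // materialize from the IFC now
        return true;
    }

    bool is_resolved(const std::string& args) const {
        auto it = slots.find(args);
        return it != slots.end() && it->second == SlotState::Resolved;
    }
};
```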
just still reserved we haven't done anything to them yet and of course implicit instantiations are the same as they had been before so it's worthy to note that uh this optimization um relied on a data structure that was coin well that was defined by sean parent in cppcon 2019 and it turned out that that data structure uh when i saw it it was the exact right application or this was the exact right application for that data structure so go to cppcon you learn stuff and uh that gets us here so our first optimization using right data structures so when we were benchmarking the modules code we found out that we're using a lot of stood maps all over the place and it turns out that stood map when it allocates keys they're not very close to each other so when you need to iterate a whole bunch of keys that results in a pretty large perf overhead you know what if we try to use something different so like what if we try to use bucketed hash tables and it turns out that gave us just enough of a perf when to justify switching all of them to bucketed hash tables and it wasn't just on this repro that you were getting the benefit it was across the board all of the all the modules repros we tried after switching to bucketed hashes um gained gained at least some perfect increase serialization techniques so ifc is memory mapped in the compiler for faster random access because when we're materializing semantic information that information is kind of spread across the file at various points so you have to be able to bounce around the file very quickly and memory mapping is a perfect application for that kind of thing uh it was it was initially thought that um when we implemented the ifc we needed to kind of validate the contents that they haven't been doctored in some way so what we did when we exported an ifc is that we summed up the contents we created a hash of that stored the hash in the ifc and it's still there today but then on import what it what the compiler would do is it 
would first read that hash, then read through the bytes of the file, compute a hash of all of it, and compare it to the one stored in the IFC, just as a quick validation that nothing had been tampered with. What if we turn that off? That's a pretty big perf win, and here's the reason why: the overhead was not in computing the hash, which was completely negligible in this context. The real overhead was that we had to page in the entire IFC just in order to sum up its bytes. So we went over to our friends on the C# team, because they had something similar a while back where they'd validate a hash stored on assemblies, and we asked, hey, what did you do about this? They had opted to ditch that strategy too, because of perf overhead, for exactly the same reason, in favor of trusting the application itself, and we decided to do the same thing. But we still provide a mechanism in the compiler to check the IFC on import, so if you're still concerned about the IFC being doctored, you can throw that switch and you get a first-level validation that nothing in the IFC has been changed in some harmful way. We recommend at least trying to check that hash at some point, even if you do it through a third-party tool, because it doesn't have to be the front end that checks it.

Measure everything, all the time: that's the mantra of anybody working on performance in an application. And after we measured a lot, one last time, we found that it was partial specializations this time. The previous optimization I talked about applied only to explicit specializations, because that mapping was obvious, but partial specializations are a little bit different, because they don't quite map to an exact argument, and in
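The tamper check described above can be sketched as follows. This is a toy model: FNV-1a stands in for whatever hash the real IFC format stores, and the function names are invented. The key point it illustrates is that the arithmetic is cheap; the cost the talk describes came from paging in every byte of the file just to feed the hash, which is why the check became opt-in.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy content hash: FNV-1a over the file bytes.
std::uint64_t fnv1a(const std::vector<unsigned char>& bytes) {
    std::uint64_t h = 1469598103934665603ull;   // FNV offset basis
    for (unsigned char b : bytes) {
        h ^= b;
        h *= 1099511628211ull;                  // FNV prime
    }
    return h;
}

// Opt-in validation: by default the compiler trusts the IFC; only when the
// user throws the switch does it page in the whole file and compare hashes.
bool validate_ifc(const std::vector<unsigned char>& contents,
                  std::uint64_t stored_hash, bool check_requested) {
    if (!check_requested) return true;
    return fnv1a(contents) == stored_hash;
}
```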
fact, the way they're modeled in most compilers is that each partial specialization has a map of its own, of the non-dependent specializations of that partial specialization which are instantiated. So they're a little bit more complex, but it was actually worth it to delay-load them, and that got us to our 1.8x speedup.

Okay, thanks, Cameron, for the detail, finishing up the journey from the not-very-competitive implementation I had back then to where you took us. It was very instructive learning for me; we had a long conversation about what needed to be done, and Cameron was very perseverant and got us there.

Now, most of what we have talked about so far is about the traditional toolset command line: things like the parser, the code generator, and the linker. But these days we work in some kind of integrated environment; we have the IDE experience. There are several IDEs out there, and if you're also into cross-platform development, then you actually have not just one compiler but several compilers in your environment, and in general people expect some kind of consistent experience across platforms. So you can easily see that in the IDE you're going to have several compilers trying to understand the same code base. What happens in that case? Well, if each compiler has its own format for storing all this metadata that is needed for modules, you can imagine that you get into a real mess, and you may not even get the increased performance we were talking about earlier. So if you're developing an IDE and you want to bring the kind of performance that we have proven we can have today, then it makes sense to actually share a common representation across compilers, or at least have all the compilers be able to emit this common binary representation. Tomorrow I'm going to talk a little bit more about ideas for how to get there, and my hope
is that in the coming months or years the C++ community gets together to define a common representation for these artifacts that you get from compiling modules; it would be a sea-changing event for the C++ community. If all the compilers agree on one format, then you get really good benefits: a good experience in the IDE, very snappy. The other thing is that in the old world of header files, people systematically complained about header files taking a lot of compile time, so compiler writers developed the notion of PCHs. They tend to cover a section of code where you write a bunch of #includes, and the compiler loads those things, processes them, and saves the result to disk. What does saving the result to disk mean? For some compilers it means: just dump the state of the process right now, and the next time, just load the process state back. That goes very fast, because your operating system knows how to load those things very quickly, and it works. The downside is that what you dump on disk is the state of the process at that time, with all the other stuff going on around it, so you can't move PCHs around and you cannot share them with other projects. If you've used vector many times across different projects, you still cannot share the PCHs across those projects, because each project has its own memory state when the compiler saves the PCH. Some other compilers say, oh, I'm going to save some structured data to disk instead, and that can give some kind of performance, but if you want the ability to share across translation units and across projects, then you need a common representation. So I guess what I'm trying to say here is that the same kind of increased performance that you get with the batch compiler, the one that generates code, you can also bring to the IDE if you get all the compiler
writers to agree on some kind of common abstract representation. You, C++ developers, deserve that; you should talk to your compiler vendors, including the MSVC folks: just go on their website and keep asking until they deliver. And don't tell my employer that I told you that. The other thing is that when you're in the IDE, just writing imports like the example earlier, or when someone hasn't yet built a module that you have imported, behind the scenes the IntelliSense engine needs to systematically compute the dependencies. You don't want to be maintaining the dependencies by hand; nobody wants to do that. So the development environment needs a way of computing those dependencies systematically for you, and we come back to the same problem we had earlier with the representation of C++ semantics: for dependencies, it wouldn't make much sense for each compiler or build system to have its own way of representing them. You want to be able to share that information, because it is the same program; it doesn't matter which compiler you use, your program is your program, and it has the same dependencies no matter what compiler you use. So one thing that I hope you're going to get is a shared format for representing the dependencies of C++ programs, and thankfully the Kitware folks have put forward a schema that all tool vendors are encouraged to use to report dependencies. The MSVC compiler has implemented that format, and the specification of that format will be part of the upcoming modules tooling technical specification. WG21 has been busy: a study group was created to help move the tooling ecosystem forward, especially because modules challenge the traditional header-file model, so the upcoming technical specification for modules tooling will recommend this format,
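For a concrete picture, the Kitware-proposed dependency format (WG21 paper P1689) is JSON describing, per translation unit, which module names it provides and which it requires. The fragment below is a sketch from memory of the paper's shape, not an authoritative excerpt; consult P1689 itself for the exact schema and field set.

```json
{
  "version": 1,
  "revision": 0,
  "rules": [
    {
      "primary-output": "m.obj",
      "provides": [
        { "logical-name": "m", "is-interface": true }
      ],
      "requires": [
        { "logical-name": "other" }
      ]
    }
  ]
}
```

A build system reads one such file per translation unit, joins `requires` edges to `provides` edges across the project, and from that derives the build order for module interfaces.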
and we expect that tool vendors will get involved; most of them are already part of the conversation going on. So the IDE needs to do its part. That is mostly advice to tool vendors, but what you as C++ community members should do is get away from some of the old habits we learned in the past few decades. So, I don't know about you, Gabby, do you feel like using PCHs anymore? No? Okay, so don't use PCHs anymore; we have a better tool, and it's standardized now. In particular, you should start with named modules: practice them, use them, and use them frequently if you can, because the opportunity to improve your project's hygiene vastly outweighs any possible benefit you could get from using a PCH. For one, modules can be reused across projects, because the IFC is a portable format that we designed ourselves; you can move it from machine to machine and it'll just work, unlike PCHs. You'll also often see that a module provides initialization guarantees that PCHs may not be able to clearly define, and this is actually defined in the standard itself. Finally, PCHs in MSVC are notoriously large, because a PCH is just a memory dump of the compiler state which is then memory-mapped back in. You'll see PCHs which grow up to two and a half gigabytes sometimes, and sometimes you can't even remap them back in because they're so big, which is a huge problem. IFCs are generally magnitudes smaller than that; we've seen 2.5-gigabyte PCHs translate to 600-megabyte-or-less IFCs. And if you're really stuck on header files and in a hurry, you can try header units: you get some of the benefits of named modules, but not all of them. Our recommendation, for sure, is to start with named modules if you can.

Okay, so this has been a very long journey; for me it has been about six years, for Cameron four years, and we learned a few lessons along the way.
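The "start with named modules" recommendation above looks like this in practice. A minimal two-file sketch (file names, the helper function, and the build lines are illustrative; this needs a C++20-modules-capable toolchain, so it is shown here rather than as a single runnable unit):

```cpp
// math.ixx -- module interface unit (.ixx is MSVC's convention)
export module math;

export int square(int x) { return x * x; }

int helper(int x) { return x + 1; }  // not exported: importers never see it

// main.cpp -- consumer; no textual inclusion, no macro or include-order issues
import math;

int main() { return square(7) == 49 ? 0 : 1; }
```

With a recent MSVC this builds along the lines of `cl /std:c++20 /c math.ixx` followed by `cl /std:c++20 main.cpp math.obj`; the compiler also emits a math.ifc which, unlike a PCH, is a structured artifact rather than a process-memory dump.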
When I started, the MSVC compiler used what I call ancient technologies to process template definitions, and it wasn't actually the only compiler; there were a couple of other commercial compilers out there doing the same. What they do is that when you write a template definition, they store it as a sequence of tokens, and this goes very fast. Then, when you want to specialize a template, like vector of int, they just replay the token stream, set up whatever scopes and names are needed, and reprocess everything. So you run the parser every time you want to specialize a template. Please don't do that; if you're an implementer out there doing that, please consider moving to a graph-based representation of semantics. Parsing C++ code takes a nontrivial amount of time, and in the case of MSVC, if you're still using yacc, well, it's not a good place to be. So we decided to move away from yacc processing tokens for template definitions: instead of storing tokens, you store the syntactic, grammatical constructs and then instantiate from there. That experiment took about six months. There are still, beyond MSVC back then, a couple of other compilers doing token replay; please abandon token streams and yacc-style parsers, because otherwise you will not get the speedups that you actually can have.

One last thing that was quite interesting: usually, every time we get a new C++ standard, we don't erase the previous one; we only add or improve. So the existing compilers already have their frameworks, and sometimes their authors insist: this is how we used to do things, so you should do it that way. In implementing modules, what we found is we need
flexibility; we need to be able to be flexible. For MSVC, for example, even simple things like how you specify options don't work very well, because by convention you use a colon to map a command-line option to the argument it takes. But on Windows, for example, the colon can also delimit the drive letter from a file path, and when using modules you need the ability to map a module name to where the compiler can find its metadata; if you have a bunch of colons in there, it doesn't work. So we had to go back and talk to the architects: hey, I understand why it was done this way, but I think we need to do something else. And then there are the lessons we learned especially with strong module ownership: we needed to do something with the linker. Usually when you say "we need to do something with the linker," everybody goes, what are we talking about? No, really, we needed to do something with the linker. So we had to sit down, talk a lot, and convince the linker people, because linkers sit very close to the operating system. The result: spectacular improvements, and increased ODR guarantees, as we showed earlier. So it pays off to actually be flexible, to listen to feedback and consider new ideas, and I think that C++ tool authors can do that for the benefit of the larger five-million-plus population of C++ programmers.

I think we're probably ready to take questions, but before that I just wanted to briefly mention some other talks by Microsoft folks. Tomorrow I'm going to talk about how to represent C++ programs in memory and on disk, prompted by the modules implementation. Sunny is going to talk about static analysis and how you can do checking with modules and static analysis. Herb is going to offer some thoughts on
pattern matching, and Sy will talk about the opportunities for better error messages with the concepts we now have in C++. With that said, the floor is open for questions. Yeah, back there. Could you use the mic? Oh, there is none; then shout it and we'll repeat it.

Okay, so the question is: if we have a project that has hundreds or possibly thousands of .cpp files, do we have to scan them all in order to figure out the module dependencies? The answer to that is no, if the build system knows beforehand what your module interfaces are. We have an experiment with MSBuild where we just need to scan the module interfaces, and it turns out that scanning only those gives you enough dependency information for the rest of the program, because those are the only cases where you'll actually need to bottleneck the build system in order to build some kind of graph; all of the other translation units can be built in parallel, because they don't provide module interfaces themselves. In addition to that, even if you didn't give hints to the build system (this is my interface file, this is something else), say you're going to scan ten thousand files of a project: the scanning process is very fast, because the specification of modules was designed so that you don't actually need to do full preprocessing; you only have to go through the file. The preprocessor is line-oriented, so you just go through each line of source code and look at the first couple of words. Is it a hash? Okay, I have to do something, like we used to before. Is it "module"? Okay, just consider that line. Is it "import"? Yes. Anything else doesn't count; anything else doesn't provide any dependency information. So the scanning process is very fast.

[Audience:] Would that use, for instance, a specific compiler that basically supports modules, so you need to
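The line-oriented scan described above can be sketched as a toy scanner. This is only an illustration of the idea, not any real tool's implementation: a production scanner must also handle `module` declarations, `#include` lines, header units, comments, and string literals, none of which this sketch attempts.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Walk the source one line at a time; only lines whose first token is '#',
// 'module', 'import', or 'export import' can contribute dependency
// information, so everything else is skipped without any real preprocessing.
std::vector<std::string> scan_imports(const std::string& source) {
    std::vector<std::string> deps;
    std::istringstream in(source);
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream tokens(line);
        std::string tok;
        tokens >> tok;
        if (tok == "export") tokens >> tok;   // handle 'export import m;'
        if (tok == "import") {
            std::string name;
            tokens >> name;
            if (!name.empty() && name.back() == ';') name.pop_back();
            deps.push_back(name);
        }
        // '#' and 'module' lines would be inspected similarly; all other
        // lines are ignored outright, which is what makes the scan fast.
    }
    return deps;
}
```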
provide some version of a tool for the compiler that actually does this scan over however many lines; do you provide a tool like that? So the question is: if you have a build system like make and you need this scanning, is there a tool that actually provides that information, for example with MSVC? The answer is yes. In the case of MSVC, Cameron here is the one who actually added that extension to cl.exe: you just say cl /scanDependencies, and if you want fancier information you provide an option. Then the driver will run, and it will write, either on the standard output or to a file, the dependencies in the format I talked about earlier, the one pushed forward by the Kitware people. That's what is being used today, for example, by the MSBuild system, but you can imagine that a makefile could use it too, and CMake has the feature, driving Ninja or some other generator.

[Audience:] Is import std something that is only MSVC-specific, and is it currently implemented? And can you use the modules representation to traverse your program? So, on the first question: since 2015 the MSVC toolset has been experimenting with some notion of a std module that you can use. It is experimental, and it will evolve based on what WG21 is doing, so what is currently available in the MSVC toolset is very different from what is being proposed, and once it goes through WG21 the MSVC toolset will be updated to reflect that. The second question is whether you could use the IFC format to go through your function bodies and see where an exception is thrown, or whether a function does some exception handling. I'll say more about that tomorrow, but the basic answer is yes; I will say
more about that tomorrow. I think we have time for one more. Okay, so to repeat the question: how is the front end conveying this ownership information all the way to the linker? Basically, what the MSVC toolset does is this: we have a communication channel to the back end, which is our IL generated by the front end, and this IL was modified to add some ownership information about the module. So if I say void f in the module purview, the compiler front end will say, okay, when I give the back end this f (and of course it hands over its mangled name), it will attach the ownership information in the IL for the back end. What the back end then does is record that information in the COFF sections for the linker itself. If you are familiar with our object format, there are these COFF sections which talk specifically about things like the linkage of a particular function or some kind of symbol embedded in the obj, and it's in that section that we record what module it came from and whether it has internal linkage (module linkage, basically) or external linkage and should be treated with the strong ownership model. So it's all embedded in the obj file itself.

Complementing what Cameron just said: the MSVC compiler, the way it is structured, has a front end that runs and generates some kind of intermediate format, CIL. If you're familiar with Clang, for example, you know that Clang runs and generates LLVM bitcode, and then LLVM runs and does optimizations on it, so it has that kind of separation. And if you work on GCC (I used to work on GCC a long time ago), it doesn't use an intermediate format in a file, but it has an in-memory representation where GIMPLE is handed over to the middle end, which does its optimizations and then writes RTL, which is the kind of intermediate representation that the back
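The strong-ownership resolution described above can be modeled in a few lines. This is a deliberately simplified illustration (the struct and field names are invented, and real COFF symbol records look nothing like this): the essential idea is that the linker matches on the pair (mangled name, owning module), not on the name alone, so two modules can each own a symbol named f without colliding.

```cpp
#include <optional>
#include <string>
#include <vector>

// Illustrative model of strong ownership: the front end records, next to each
// symbol handed to the back end, the module that owns it; the back end then
// emits that alongside the symbol in the object file for the linker.
struct ObjSymbol {
    std::string name;          // mangled name
    std::string owner_module;  // "" means ordinary external linkage
};

// A strong-ownership-aware resolver: a reference binds only to a definition
// whose owning module matches, so identically named symbols from different
// modules never interfere with one another.
std::optional<ObjSymbol> resolve(const std::vector<ObjSymbol>& defs,
                                 const std::string& name,
                                 const std::string& from_module) {
    for (const auto& sym : defs)
        if (sym.name == name && sym.owner_module == from_module)
            return sym;
    return std::nullopt;  // unresolved: no definition with that ownership
}
```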
end generates code from. So what was done in the MSVC compiler can be replicated for compilers that use the ELF format, for example GCC or Clang. The idea is that instead of using name mangling to convey this information... name mangling is a wonderful hack (Bjarne, apologies for saying that), but I believe it was a hack that Bjarne invented in the early 80s because he couldn't do anything with the linkers at the time. We have been, since long before modules, in an era where we are doing a lot more than C used to do, and the linkers have been doing a lot more than what Bjarne had to deal with. So you can actually encode, in the format that you're giving to the linker, whether ELF or LLVM bitcode, that hey, this function, this symbol, happens to have these additional properties, and the linker, when it's trying to link, will consult that information. That technique could be used, say, for conveying exception specifications, that kind of stuff. Now, the linker at the end can add additional decoration, like I was indicating earlier, but based on the information that the front end has conveyed to the back end. All the compilers that I know of can do this: I work with four different compilers, GCC, EDG, MSVC, and Clang, and I am confident it can be done. The real question is whether there is a will to do it; I think there is a will, if we put enough pressure on to show that it is needed.

We're well out of time, so unfortunately no more questions. Thank you very much. Thank you.
Info
Channel: CppCon
Views: 5,243
Keywords: c++ talk, c++ talk video, cpp talk, cpp talk video, c++, cpp, cppcon, c++con, cpp con, c++ con, c++ tutorial, c++ workshop, learn cpp, learn c++, programming, coding, software, software development, cppcon 2021, bash films, c++ programming tutorial, c++ programming, c++ modules, cpp modules, c++ 20, gabriel dos reis, Cameron DaCamara, Implementing C++ Modules, software engineering, C++ toolchain, modules, c++ 20 modules, Digital Medium
Id: 9OWGgkuyFV8
Length: 66min 5sec (3965 seconds)
Published: Sat Dec 18 2021