CppCon 2017: Nir Friedman “What C++ developers should know about globals (and the linker)”

Reddit Comments

Figure there's no harm to mention: I'm the speaker, happy to answer any questions.

👍︎︎ 28 👤︎︎ u/quicknir 📅︎︎ Oct 29 2017 🗫︎ replies

Great talk - I learnt a lot!

👍︎︎ 7 👤︎︎ u/SittingOvation 📅︎︎ Oct 29 2017 🗫︎ replies

Could someone please explain the code sample at 20:52 with a "Marginally faster" title on it? I can't get my head around it. Given std::strings should read as static std::strings, what are all the templates and struct supposed to do? In the case if I need to export a global or two from my TU should I wrap them in such a struct?

👍︎︎ 1 👤︎︎ u/Jajauma 📅︎︎ Oct 29 2017 🗫︎ replies

Interesting and informative.

I've encountered many mysterious linking problem in my time; and I've rarely understood precisely what the issue is, and why the fix works (the fix usually being to change the order of linking). So the talk helped with that.

The take-home message seems to be to declare globals (if any) using inline in the header. That seems like an easy way to avoid some problems with linking.

👍︎︎ 1 👤︎︎ u/blind3rdeye 📅︎︎ Oct 31 2017 🗫︎ replies

I try Jajauma's https://github.com/Jajauma/GlobalsDemo on ArchLinux: gcc is 7.2.1 while clang is 5.0.1. Both run OK for initial crash version:

# ./Boom
String too long for short string optimization
String too long for short string optimization
👍︎︎ 1 👤︎︎ u/nanxiao 📅︎︎ Jan 17 2018 🗫︎ replies
Captions
Okay so, hello everyone. Welcome to the talk. What C++ developers should know about globals. And in parenthesis, the linker. Alternate title for this talk, things I never wanted to know about globals and linkers but was forced to find out anyway. That's probably a more accurate summary of how I came by this knowledge. I'm not a build engineer. I'm not a linker dev. I actually work in high frequency trading. So I came by this knowledge kind of incidentally. But I think that's okay. Most C++ developers in my experience are not experts on the linker. So, maybe this will make the talk a bit more relatable. Okay, so globals. I've never seen code, you know a code base, and thought to myself wow there's not enough globals here. If there were only more globals, this code would be so much better, please. So the talk's not an endorsement of yes, sprinkle globals everywhere once you know how to use them. But in my experience there are things that are best done using globals. So here's three examples that I've used firsthand often. So logging, intrusive performance profiling, and polymorphic factories. So, logging is kind of a canonical example. So let me just pick on intrusive performance profiling. You know, typically you'll have macros in your code that record cycle counters or other information. They can be enabled and disabled by macros. They have to record that information somewhere. And you don't really want to pass down references to this object where they get recorded all the way down and up your code base. It doesn't make sense. It doesn't affect the behavior of your code. So, it's just easier to use a global and less intrusive for the surrounding code. So, right off the bat, most people, when you talk about globals, they go straight to singletons. And actually globals and singletons are different, or at least in the definitions and usages that I've dug up. So, a singleton is a class that can only have up to one instance. Okay, it's a property of the class. 
A global is a variable that has global scope or global access. You know, more technically in C++ parlance, you could say it's a variable that is at namespace scope. Sometimes a variable can have global access without it being at namespace scope. That's okay. So these two are actually orthogonal. You could have any one or both without the other. A singleton that's not a global is rare. But that could exist as well. But lots of globals are not singletons. And in general, I'll come back to this point. But singletons are even worse than globals. And I discourage them. And I'll explain a bit why, later. Okay, so compiler versus linker. So C++ developers in my experience really, really like the compiler. They like to talk about the compiler, they like to argue about the compiler. You know, the compiler's getting updated all the time. We had C++11, then 14, then 17. It's really interesting. But the linker, it does get worked on, it does get updated. But it's sort of this thing that has been around for a very long time. C++ was always designed to work with C linkers. That was a big part of the original design. C++ code written today in C++17, you could probably use linkers, maybe before I was born as an exaggeration, but probably not by much. So, this leads to weirdness. And if you don't believe me that C++ developers really like the compiler more than the linker, anytime you Google for help, you'll see that any question that involves the C++ standard and compiling, and the compiler, you will find 10 people on Stack Overflow with 100,000 rep arguing about the correct interpretation of the standard. When you Google for help with the linker, usually you can't find any answer. And when you do find an answer, it's got like two up votes or something like that. So, it's a lot harder to find out how you should be doing things. Okay, so here's my example. So here's code that looks pretty innocent. 
Or at least it may look innocent to some of you if you haven't encountered this problem before. So let's start with a static library. So, I show the header and the cpp contents in the same code block, just for simplicity. So, in the header file, we're just including a string, we've got our pragma once, we declare a global with extern. And then in the cpp file we define the global after declaring it in the header file. So far so good. Okay, so now we have a shared library. And if you're a little fuzzy on static versus shared libraries, I'll also talk a little bit about the difference there shortly. So what does a shared library do? Well, it has a function that basically just gets this global that it gets from the static library. So the shared library depends on the static library. And then finally we have an executable. The executable depends on both the shared and static libraries. And it's just gonna try to print the string. It's gonna try to print the string that it gets directly from the static. And it's gonna try to print the string that it gets directly from the dynamic library, the shared library. Okay, so I'm curious. How many people would be able to build this from memory? I'm just curious. Hands? A few people, not too many. I mean, I had to look up all these commands. I mean, we're used to just writing some C++ code and then we just fire off our build command. We have a build engineer at our company. He tells us what to do or we use CMake. And I was very careful here. Many commands that you might issue the compiler, and by the compiler I mean basically, g++ or clang++. They actually invoke the compiler and then the linker, depending on what you're doing. So here, I've separated everything. So we had three things that we were interested in. The static, the shared, and the executable. So, I have six commands. One to compile the object code, and one to produce the actual target. So, first I compile the cpp file. 
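The three targets described above might look roughly like the following, condensed into one listing in the same spirit as the slide. This is a reconstruction, not the slide itself: the file names, the getter name `get_str_from_shared`, and the build commands in the trailing comment are assumptions, and the string contents are borrowed from the demo output quoted in the Reddit comments. Only the name `g_str` and the overall shape come from the talk.

```cpp
#include <string>

// ---- static.hpp: header of the static library (hypothetical file name) ----
// A real header would start with #pragma once.
extern std::string g_str;   // declaration only

// ---- static.cpp: the single definition of the global ----
// Deliberately too long for the small-string optimization, so it allocates.
std::string g_str = "String too long for short string optimization";

// ---- shared.cpp: shared library that depends on the static library ----
// Hypothetical getter name; it just hands back the global from the static lib.
std::string& get_str_from_shared() { return g_str; }

// ---- main.cpp: executable that depends on both libraries ----
// It would print g_str directly and then get_str_from_shared().

// Compiled as a single file this is harmless. The problem in the talk only
// appears when the pieces are split across targets, assumed here to be built
// roughly like this (six commands: compile, then produce each target):
//   g++ -fPIC -c static.cpp -o static.o
//   ar rcs libstatic.a static.o
//   g++ -fPIC -c shared.cpp -o shared.o
//   g++ -shared -o libshared.so shared.o libstatic.a
//   g++ -c main.cpp -o main.o
//   g++ -o main main.o libstatic.a libshared.so
```

In this single-file form both access paths reach the same, once-constructed string.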
Then I produce a static library with this command called ar. Next, I compile the code for the shared library. I link it against the static library. So, that's what that looks like. And then finally, I compile the code for the main executable, the object code. And I link the executable against the static and shared libraries. So, when I do this, and I run it, any guesses as to what prints out or what happens? Well, actually this program will segfault. So... Again, if you've encountered this problem before, maybe you kind of had a feel where this was going. But this program, this code, seems very, very innocent. And yet, this segfaults. And it actually only segfaults if you build it this way. If you only use static libraries, if you only use shared libraries, you're gonna be okay. Which in some ways makes it worse because you find the problem later. And this is probably a good place to point out that as far as this stuff is concerned, a class static variable is still a global. So that seems like a very reasonable thing from a software engineering point of view. If you are writing a class, let's say the class parses JSON. In JSON true is a literal that has to be parsed out as a boolean value. Maybe you would have a class static constant string which is set to "true". And that seems totally reasonable and harmless but that can cause these segfaults. And it does. Okay so, let's try to understand what's actually happening. So, I suspect that a lot of C++ programmers have written a class like this at some point in their lives. Personally, I call it Informer, because it informs on everything that it does. So here, when it's constructed, it will print. And it will print its address. When it destructs it will also print and print its address. And it also has a get method which just returns its address too, why not. So I make that change, and I also will change my main. I mean, there's no string to print out anymore. So I'll just print out the address. 
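A class like the Informer just described could be sketched as follows. The member signatures and the exact message text are guesses, since the slide is not part of the captions; the names `Informer` and `g_informer` do appear in the talk.

```cpp
#include <iostream>

// Sketch of an "Informer": it reports every construction and destruction
// together with the address of the object involved.
class Informer {
public:
    Informer()  { std::cout << "Informer constructed at " << this << '\n'; }
    ~Informer() { std::cout << "Informer destructed at " << this << '\n'; }

    // Returns the object's own address so callers can compare instances.
    const void* get() const { return this; }
};

// The global under investigation. <iostream> is included above it, so
// std::cout is usable by the time this constructor runs.
Informer g_informer;
```

Swapping this in for the string makes the double construction visible as two constructor messages with the same address, without the double free that crashes the string version.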
And let's try to see what's going on with these global objects. Okay, so we're not doing anything interesting anymore, like allocating memory. So we don't segfault. But here's what the program prints out. One expectation might have been that you'd have two copies of the global, which was actually a very reasonable expectation, but it's wrong. There aren't two copies of the global. But the constructor runs twice in the same spot. And then the destructor runs twice in the same spot. So, it's not very hard to see that this is gonna cause big problems. If you have a string, what's gonna happen is, you're gonna allocate some memory, set some pointers, then you're gonna allocate again and set the same pointers to the new allocation. So that memory is gone forever now, you leaked it. And then you're going to call delete on the pointers in the string. And then you're gonna do that a second time with the same pointer. So that causes a segfault. That's not a good thing to do. Okay so, this is where we take a quick detour and try to talk a little bit about the linker. Okay so, how does a linker work? And this is very, very simplified, I'm assuming. (laughing) I don't know. I assume it's simplified. This is kind of like a spherical cow model of the linker. You provide a link line. The link line has a bunch of targets. The linker marches from left to right. And you can think of the linker as like it keeps a little notepad or a table. And every time that it encounters a target, that target has a symbol table. And the symbol table basically says to the linker look, here's a list of all the symbols that I need, whose definitions I don't have but I refer to in code. And here's a list of things that I do have that I provide in case anybody else needs it. And it marches left to right. And every time a symbol is needed, it adds it to its little list. And every time that a needed symbol is provided by a new target, it then does one of two things. 
And this is the difference between static and dynamic libraries. If the library is static, then it basically copies and pastes the object code into whatever target you're trying to make. So when you link against the static library, you copy and paste the code, it becomes part of your executable. That's why you don't need to ship a static library with your executables. If it's a dynamic library, then it basically just checks the box, and it politely says, thank you, this symbol is gonna come from this shared library. And then it tells another entity, or depends on another entity called the loader. And the loader will fetch this at runtime by loading the shared library the moment the executable runs. So that's our static and dynamic libraries. Okay, so I mentioned symbol tables. Maybe we should take a quick look at a symbol table. So the easiest way to look at a symbol table is to use a little command line application called nm. And usually you'll want to pass the double dash demangle flag so you can read the output more easily. This is the symbol table... Actually not sure for which version of main. But the idea here, you can see there's a list of symbols, and they're marked with various letters on the left. Some of these letters are more important or less. I'm just gonna focus on three of them on this slide. So the U is undefined symbol. So this is basically saying I need this. So for instance, the static, sorry, the main executable doesn't define this global, g_informer. It needs to get it from somewhere else. W is a weak symbol. So this is a symbol that's defined. But it's basically one that's defined with, like a function that would be defined with inline. And inline, as we know, lets you have multiple definitions of the same thing. So the W is related to that. And T is just a regular defined symbol. Again, for our purposes here, the difference between W and T is not too important. But basically, needed symbols and provided symbols. 
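To see the three letters from the slide on something small, a translation unit along these lines could be compiled to an object file and inspected with `nm --demangle`. The function names are made up for illustration, and the letters in the comments are what GCC or Clang typically produce on Linux without optimization; they are not guaranteed, since an optimizer may inline `twice` and emit no symbol for it at all.

```cpp
#include <cstdio>

// Defined with inline: if the compiler emits it, it usually appears as W
// (weak), because other translation units may carry the same definition.
inline int twice(int x) { return 2 * x; }

// An ordinary function definition: usually listed as T (defined, in text).
int report(int x) {
    // std::puts lives in the C library, so this object file only refers to
    // it: the symbol shows up as U (undefined, needed from elsewhere).
    std::puts("report called");
    return twice(x);
}
```

Roughly: `g++ -c symbols.cpp -o symbols.o` followed by `nm --demangle symbols.o`, where `symbols.cpp` is a hypothetical file name.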
That's the key thing in this symbol table. Okay so, if we just take this very, very simple mental model, and we apply it to our example, what do we know? Well, when we build the shared library, we build it against the static library. And that means that the code from the static library gets copied and pasted into the shared, or at least the code with regards to the global. When we build the executable, if I were to go back and you looked at the link line, you might notice that I linked in the static first, and then the dynamic. And remember I said the linker just marches from left to right. So the moment that it encounters a symbol that's provided in multiple places, it's just gonna take the first one that it meets. And if it encounters it again, it doesn't care. So the main executable is going to get that symbol from the static, the way that I built it before. So it's also going to get a copy. We're actually gonna come back to what a copy means exactly in this case. So you know, you could ask well now that we know that the link order actually matters, what if we change the order of arguments? So here, what if I link dynamic and then static? Now when I run it, everything just works. So, I mean this is again, another way in which the problem can basically be hidden. If you get all of your linker order arguments right, you may be able to prevent these issues. But you're probably not staring at your linker command line order. So that's probably not a good way to solve the problem. Okay, so let's try to dig a little deeper and figure out what's actually going on. Where does a global actually get initialized? So here I use objdump and I just dump the assembly. And there's a lot of stuff. But if you search through carefully, you will find basically something like what I have here. So, there's this section which is kind of suggestively named global var init. And this is a section which is used for things that have to happen immediately when the shared library is loaded. 
And one of the things that you can clearly see there is a callq, a call to the constructor of Informer. Now, the shared library in both of our examples, the one that worked and the one that didn't, it was built the same way, right. So the shared library will always try to initialize the global. Okay, now how about the executable? Well, the executable will sometimes try to initialize it. And what I mean by that is if you link against the static first, and then you run objdump, you will find code like this in the executable. Again, it's very similar. It calls the constructor, one thing that you might also notice here is that it actually loads the destructor into this kind of variable __cxa_atexit, that makes sure the destructor gets called when the program ends. If you change the link order so that the shared library is first, the executable will not have this code. And then it doesn't run it. So, we actually don't get two copies of the global. What we get are... You only have one copy of the global, but the problem is there's more than one entity in your program that thinks it's their job to construct it and destruct it. So... In a way, this is worse. Instead of having two correct globals, which is not great if you actually want it to be a global. But now you have a global whose constructor ran twice in the exact same spot in memory and then destructed. That's pretty bad. Okay so, there are a lot of things you could say here, right. Like you could say, well only write globals in shared libraries. Or only use either static or shared linking, don't mix. You could say all those things. But what I would rather say is what if we just wrote the code so that this could never happen? Because a lot of times the people who might write the code are pretty far removed from the people responsible for building it. And at a large company, maybe you originally only used it as a shared library, you never found this problem, someone downstream of you used it as a static library. So... 
Robust code is better. So actually, a friend just sent me this link a day or two ago. So, in Boost.Log, there's this caveat. "If your application consists of more than one module, that use Boost.Log, the library must be built as a shared object." So I haven't actually dug into the code. I don't know if this is exactly the reason. But it sure smells like it. Especially since logging is a very common use case for globals. I mean, I would prefer to see this written, right. You can build against Boost.Log however you want and everything's gonna be fine. Okay, so what do we do about this? So in C++17, it's actually really simple. We have the keyword inline. Inline... You're probably more used to thinking of inline as well we can just put this in the header. That's true but, it also means that regardless of whether you put it in the header or the cpp, and I'll come back to that point, you only get one copy of the global. And this just works. I'm gonna come back to why this just works. Okay, what if you're not on 17? I'm not on 17. I'm sure many people here are not. So here's a lazy solution. Literally lazy. You basically have a static local and your function returns a reference to it. And that's it. So this was kind of popularized in the Meyers singleton. So there's some good things about it, there's some bad things about it. It's lazy, so it always gets constructed when you need it. But if you're replacing an existing global, you now have to add these parens everywhere. That could be annoying. Bigger concern, there are some performance questions with this. So first of all, you're gonna have a branch every single time you access this global forever. There's also gonna be some extra code there to actually atomically ensure that this is initialized only once in a thread safe manner. And also, for some people this may be an issue, for some it may not. 
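The two options just described, the C++17 inline variable and the lazy function-local static, might be written like this in a header. It is a minimal sketch: the accessor name `lazy_str` and the string contents are placeholders, and the inline variable needs a compiler in C++17 mode or later.

```cpp
#include <string>

// Option 1 (C++17): an inline variable defined directly in the header.
// Every translation unit that includes the header sees the same definition,
// and the toolchain is expected to keep a single instance of it.
inline std::string g_str = "String too long for short string optimization";

// Option 2 (pre-C++17): the lazy, Meyers-style accessor. The static local is
// constructed the first time the function is called, in a thread-safe way,
// which is where the per-access check mentioned in the talk comes from.
inline std::string& lazy_str() {
    static std::string instance = "String too long for short string optimization";
    return instance;
}
```

Call sites change from `g_str` to `lazy_str()` with the second option, which is the "add these parens everywhere" cost.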
But whenever you do something lazily in a complex program, you are taking the chance that it's going to happen at a really inconvenient time. So if you care a lot about performance, lazy may not be great. So, I've actually had plenty of issues with lazy things happening at the worst possible time. And it's not good. Okay, so what's one way to solve this? Well we could just stop being lazy. So this is a neat trick that's very simple but I haven't really seen suggested very many places or anywhere. So you basically still have your function, but now we just declare a very simple static global reference. So this is basically just a pointer. And we initialize it with a function. This just sort of makes everything work out. The function is forced to be initialized. And now this is not lazy anymore, which might be good or bad. And also, you're going to access through the reference, not through the function. So there's no boolean check anymore. The reference is basically a pointer. You have a pointer straight to your global. You just access it, no boolean. Okay. Now if you really, really, really care about performance when accessing a global, which might be a code smell in the first place, this mess is a little faster, produces slightly better assembly. And so what on earth am I doing here? Well, before I said hey inline just works, I'll explain why later. So if you take a step back and think about it, even before 17, linkers must have known how to do this correctly. And the reason why is because templates, generally speaking, have to be defined in header files. And a template, a class template, could have a static variable. And that static variable has to be handled correctly. So there's actually already a facility in linkers built in that lets you correctly build, correctly link as long as you use a static of a class template. But there's no easy way to access it. And it's a little funny in a way because templates are implicitly inline. 
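Both pre-C++17 tricks from this part of the talk can be sketched as header code. The slides are not in the captions, so the names `str_instance`, `g_eager_str`, and `StrHolder` are invented here; only `g_str` is the speaker's name. The second variant is a guess at the "marginally faster" slide, based on the description of a static member of a class template.

```cpp
#include <string>

// Variant 1: keep the function-local static, but bind a static reference to
// it at namespace scope. Initializing the reference calls the function during
// startup, so construction is no longer lazy, and later accesses go through
// the reference without the "already initialized?" check.
inline std::string& str_instance() {
    static std::string instance = "eagerly constructed";
    return instance;
}
static std::string& g_eager_str = str_instance();

// Variant 2: a static data member of a class template. Template statics may
// be defined in headers, and the linker already has to merge the copies,
// which is the facility the talk says inline variables later exposed.
template <class Dummy = void>
struct StrHolder {
    static std::string g_str;   // note the static keyword
};

template <class Dummy>
std::string StrHolder<Dummy>::g_str = "from a class template static";
```

With the second variant the global is reached as `StrHolder<>::g_str`; the template parameter exists only to make the class a template.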
So there's actually this implicit inline keyword here that we weren't able to make explicit until 17. As you can see, this is a lot of boilerplate. It's kind of annoying but like I said, it is a little faster than the previous slide. - [Man] Are you missing a static? I'm sorry? - [Man] Are you missing the static keyword? Oh, man that's a pretty bad bug, yeah. Thank you. So that should be a static std::string g_str. Okay. So... all of these techniques will basically help your globals not be double constructed, double destructed. If you're on 17, just use inline. If you're not on 17, pick one of the three. Okay so, the last thing that I'm gonna talk about, kind of briefly I guess is interdependent globals. So, if you just use one of those three tricks, it's actually very easy to write code that will always get constructed and destructed properly, no matter how you build, for one global. The bigger, nasty problem is what's called the Static Initialization Order Fiasco. So in C++ there's no guarantee in what order different translation units are loaded. So because of that, once you have globals that depend on other globals, particularly in their constructors and their destructors, you really have a mess. So, unfortunately there is no silver bullet to this that I know of. But what I'm gonna do is basically point out a few pitfalls and alternatives and if you keep the number of globals in your application small to start with, then applying these judiciously should probably help you avoid most SIOF problems. Okay so, the best thing to do is to just not have interdependent globals as much as possible. And earlier in the talk, I mentioned singletons versus globals, and I said singletons are even worse than globals. So what I mean by that is that when you write a singleton, you're taking all this useful functionality and you're locking it inside of a thing that can only have one instance. And that's very hard to justify. 
I mean, if you have a logger, why are you writing the logger as a singleton? There's no fundamental reason why you can't have two, or three, or four, or five loggers in a program. Maybe you want to have one global logger for convenience. So here's an example. Let's say I'm writing an IntrusiveProfileData. This is intended to be used as a global so that I can write my intrusive profiling macros. So in the first line, one option would be to just use the global logger and log something in the constructor. But now I have a dependency between two things that are going to be global. Another option is, you could just take the local logger class and say well because this intrusive performance data is itself a global, it's special. I'm just gonna give it its own logger instance and that may or may not be okay depending on your use case. But, it just completely removes that dependency, so. You'd have like one logging file for your main program. You'd have a different one for your intrusive profiler. And in many situations, that's okay. And that really actually makes things simple, simpler. Another thing, so when people talk about inline coming in 17, a lot of people say well it's great for header only libraries to define globals. That's true. But actually I'm gonna say I think you should always define your globals in header files. That may seem a bit weird. But because of SIOF, right. Consider that if you define a global in a cpp file, and you don't provide any kind of like special hook or function to force initialization of that global, there's no way that anyone else can ever use that global in their globals constructor. It's not possible. Because there's no way they can force that global to initialize first. If you define it in a header file, then you just pound include it. And because it's gonna show up before your global, it will always get initialized before yours. Okay, a couple of quick code examples. So, laziness and destructors. You have to be really careful. 
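The "give the profiler its own logger" idea from earlier in this passage could look something like the sketch below. `Logger` here is a stand-in that only collects lines in memory, and `g_profile_data` is an invented name; the real classes from the talk are not shown in the captions. The global uses a C++17 inline variable so it can live in a header, as the speaker recommends.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Stand-in logger: an ordinary class, not a singleton, so any number of
// instances can exist side by side.
class Logger {
public:
    void log(const std::string& message) { lines_.push_back(message); }
    std::size_t size() const { return lines_.size(); }
private:
    std::vector<std::string> lines_;
};

// The profiling data owns a private Logger instead of reaching for a global
// one. Members are constructed before the constructor body runs, so there is
// no ordering dependency on any other global.
class IntrusiveProfileData {
public:
    IntrusiveProfileData() { logger_.log("profile data constructed"); }
    const Logger& logger() const { return logger_; }
private:
    Logger logger_;
};

// Defined in the header; anyone who includes it sees this global before
// their own globals in the same translation unit.
inline IntrusiveProfileData g_profile_data;
```

Whether a separate log for the profiler is acceptable depends on the use case, as noted in the talk.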
Lazy globals are great for some things. They're not so great when you want to do something in a destructor. So here's an example where we have a lazy logger, okay. So it says constructed, logging, destructed. So in this sample program I have another global called Foo. Okay, Foo f, and then my main program tries to do some logging. So what actually happens here right? Foo gets constructed first because it's first. Then main runs. Main tries to use the logger, it does some logging. Now, at this point we fall off the end of main and all the globals get destructed. So, logger was constructed second. So it will be destroyed first. So the logger gets destroyed, and then after that, Foo gets destroyed. And that means that Foo is now trying to log to a logger that doesn't exist anymore. And that's no good. So there's that caveat with laziness. And finally, so here... If you have a header that defines globals, you should include it in your own header. And again, to see what I mean by that, let's look at this example. I've condensed this all into one file but it could be a multi file example where the translation units happen to be set up in this order. What happens here basically is that I've decided to pound include iostream in my cpp file. But in my header file, I define the global. And what happens is that actually the program will see my global before the iostream globals. And if my constructor happens to use that global, yet another way to get a segfault. So, the advantage of header files is they enforce ordering. And when you're dealing with globals, that's very, very important. Okay, so the end of the talk conclusion. Just knowing the very basics of how a linker works, having like a rough mental model, being comfortable with nm and objdump to debug these sort of things, very recommended. Anytime you're creating a non trivial global, so something that's not an integer or a boolean, use one of the techniques that I discussed. Either if you're on 17, just use inline. 
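The lazy-logger pitfall described earlier in this passage can be sketched as below. The class bodies are assumptions; only the names `Foo` and `f` and the constructed, logging, destructed sequence come from the talk. To keep the sketch well-defined, `Foo`'s destructor only prints and does not actually touch the logger.

```cpp
#include <cstdio>

struct Logger {
    Logger()  { std::puts("Logger constructed"); }
    ~Logger() { std::puts("Logger destructed"); }
    void log(const char* message) { std::puts(message); ++count; }
    int count = 0;
};

// Lazy global: constructed on the first call, not at program start.
Logger& logger() {
    static Logger instance;
    return instance;
}

struct Foo {
    Foo()  { std::puts("Foo constructed"); }
    ~Foo() {
        // Calling logger().log(...) here is the bug from the talk: if the
        // Logger was first used after Foo was constructed, it has already
        // been destroyed by the time this destructor runs.
        std::puts("Foo destructed");
    }
};

// Constructed before main, and therefore before the lazy Logger when the
// first logger() call happens in main.
Foo f;
```

Objects with static storage duration are destroyed in reverse order of construction, so a `main` that calls `logger().log("logging")` sees the Logger destroyed before `f`.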
Otherwise, use something else. Be aware of SIOF. Try to leverage header files to avoid dependency ordering problems. And just think very carefully before you introduce dependencies between globals. Because you're gonna have to think about that dependency forever if you introduce it. So maybe just better to not, so. Thanks very much. And, any questions? (audience clapping) - [Man] One small, additional point. As a useful thing, you can also just not have your globals destructed. If you do the lazy pattern with new to put 'em on the heap, your destructors aren't run. And it's not really a memory leak because the pointer's still there, and the entire process is coming down so the OS will reap it. It is substantially safer than having your destructors run. So a couple of thoughts on that. I'm not a huge fan of that personally. It makes it harder to find real memory issues. But even aside from that, memory is not the only resource. And actually, very common when you're writing a fast logger is you're not gonna log things immediately. You're gonna have a buffer. That buffer needs to be flushed. And that will happen in the destructor. But I agree. Depending on your use case, that may be the way to go. Next question. - [Man] Yeah, just one addition. The way you described a linker, the ordering resolution is specific to Unix linkers? Yes, sorry. - [Man] The linker doesn't work that way. And actually LLD doesn't work that way either. So that's... Is that the clang linker? I forgot the name. Yeah. Yeah, so the clang linker introduced a small wrinkle where it still goes left to right as far as I know. But it picks up defined symbols before they're actually needed. So you don't need to worry that every single symbol table that defines something is after where that symbol is needed. But as far as I know that's the only difference. - [Man] Yeah, exactly. With traditional Unix linkers, you can get in the situation where you have to specify a library multiple times. Yes, yeah. 
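The audience suggestion from the first question, a lazy global allocated with `new` and never destroyed, might be sketched like this. `BufferedLogger` and `leaked_logger` are invented names, and the buffer is there to illustrate the speaker's objection: whatever is still buffered at exit is never flushed, because the destructor never runs.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical buffering logger: messages are held back until flushed.
struct BufferedLogger {
    void log(const std::string& message) { buffer.push_back(message); }
    void flush() {
        for (const std::string& line : buffer) std::puts(line.c_str());
        buffer.clear();
    }
    ~BufferedLogger() { flush(); }   // normal cleanup path
    std::vector<std::string> buffer;
};

// Lazy and intentionally leaked: the object lives on the heap and nothing
// ever deletes it, so no destructor runs during program shutdown. Other
// globals can safely use it from their destructors, at the cost that the
// final flush() above is skipped unless someone calls it explicitly.
BufferedLogger& leaked_logger() {
    static BufferedLogger* instance = new BufferedLogger;
    return *instance;
}
```

A program using this pattern would need to call `leaked_logger().flush()` itself before exiting if the buffered output matters.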
- [Man] On Windows and with LLD you don't have that. Right, I actually meant to prefix the talk. I'm kind of sorry. So, these issues are, as people probably realized, Linux or MacOS specific, I think that the Windows situation for this stuff is actually a lot better, that it's harder to create these globals foolishly on Windows. But I don't really know. I'm not on Windows much, so. - [Man] Hello, just wanted to mention one, maybe slightly neat idea. But now with the constexpr, that overhead where we have to ensure that the constructor's only called once. That overhead doesn't need to be paid because it's essentially the old C style type initialization. Yeah, but I don't think you can mark the constructor for something like a string constexpr right? 'Cause it can do a heap allocation. - [Man] Yeah, for the constructor it wouldn't be possible. But well, it might not be possible in all cases. But in the case that it can, it's now a way to also be able to not pay for that. Yes, yeah that's true. - [Man] Hi, I have still feelings that this is a problem of a build system. Because, I find it elegant way to demonstrate the problems with global variables. But you have duplicated symbols because you link static library to dynamic, and then you link again the static library to the executable. So if you have the functions there and you use both in executable and dynamic library, you get build error to what I understand. So I think the root of the problem is linking twice to the static library. Yeah, I mean. It's one of the roots in the sense that if you didn't do that, you wouldn't observe the problem. The problem is that basically if you don't do that, and this is actually, the build engineer at my company said something very similar to this, the problem is that like then what you're basically saying, and he says this happens a lot in practice that, whoever's distributing the shared library will say okay I need this library with the global in it. 
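The constexpr point raised from the audience a moment ago can be illustrated with a type whose constructor is constexpr. The names `Counter` and `counter` are made up. The claim in the comments is hedged on purpose: compilers such as GCC and Clang typically perform this initialization at compile time and then need no run-time "initialized yet?" guard, but that is an optimization to verify in the generated assembly, not something this sketch proves.

```cpp
// A literal type: its constructor can be evaluated at compile time.
struct Counter {
    constexpr explicit Counter(int start) : value(start) {}
    int value;
};

// Same shape as the lazy accessor, but the static local is initialized from
// a constant expression. Such an object can be constant-initialized before
// the program runs, so the thread-safe guard needed for a std::string
// (whose construction may allocate) is typically not required here.
Counter& counter() {
    static Counter instance{0};
    return instance;
}
```

As the speaker notes, this does not help for types like `std::string` whose construction may allocate.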
But now it's your responsibility as the end user to provide it. Because it has globals and we're not gonna link it in because you might get duplicates. So now it's your problem. So, you can do things this way and if you fully control everything from end-to-end, it's one way to go about it. But I don't think it's optimal. I think each target could be built by someone else. And each target should include its own dependencies. And if you think that those two are both good requirements, then this kind of thing can happen. But again it's a matter of approach. I just prefer to write code that will always work. Thank you. Yeah, not a problem, yep. - [Man] In the very first example, when I saw that we have this object initialized twice from the main executable and from the shared library, this address of this object. What address is it? Is it in the segment of the main executable? Is it in the segment of dynamic library? Or is it something completely different? So in this example, the segment rule... In the example that breaks, the global will live in the main executable. And I didn't really have time to get into this, but basically when the loader loads a shared library, it will just quietly discard any symbols that it's already seen before. And the global itself is a symbol. So this is actually why you don't get two copies of the global. The loader will load the shared library. It sees the global symbol. It's already got the global symbol. It ignores it. But then it still runs the initialization code. And since that global symbol's already defined, it runs it on the same one. So, in short it would have been in the static, in the example, sorry, in the main executable in the example that I gave. The one from the shared library would physically exist inside the space of the shared library. But it would never have been used by the program. - [Man] Saying with roles of the dynamic library using dlopen? That is another caveat that I didn't mention up front, maybe I should. 
I mean, I didn't get into dlopen with this talk. I've used it before but I'm not very familiar. So I actually don't know the answer to your question. Okay, thanks very much. (audience clapping)
Info
Channel: CppCon
Views: 42,322
Keywords: conference services, capture presentation slides, conference filming services, conference live streaming, conference recording services, conference recording, event videographers, Computer Science (Field), conference video recording, event video recording, conference videography services, CppCon 2017, record presentation slides, conference video recording services, Bash Films, C++ (Programming Language), nationwide conference recording services, Nir Friedman
Id: xVT1y0xWgww
Length: 35min 3sec (2103 seconds)
Published: Sat Oct 28 2017