Okay so, hello everyone. Welcome to the talk, What C++ Developers Should Know About Globals, and in parentheses, the linker. Alternate title for this talk: things I never wanted to know
about globals and linkers but was forced to find out anyway. That's probably a more accurate summary of how I came by this knowledge. I'm not a build engineer. I'm not a linker dev. I actually work in high frequency trading. So I came by this knowledge
kind of incidentally. But I think that's okay. Most C++ developers in my experience are not experts on the linker. So, maybe this will make the
talk a bit more relatable. Okay, so globals. I've never seen code, you know a code base, and thought to myself wow
there's not enough globals here. If there were only more globals, this code would be so much better, please. So the talk's not an endorsement of yes, sprinkle globals everywhere
once you know how to use them. But in my experience there are things that are best done using globals. So here are three examples that I've used firsthand often: logging, intrusive
performance profiling, and polymorphic factories. So, logging is kind of
a canonical example. So let me just pick on
intrusive performance profiling. You know, typically you'll
have macros in your code that record cycle counters
or other information. They can be enabled
and disabled by macros. They have to record that
information somewhere. And you don't really want to pass down references to this object where they get recorded all the way down and up your code base. It doesn't make sense. It doesn't affect the
behavior of your code. So, it's just easier to use a global and less intrusive for the surrounding code. So, right off the bat, most people, when you talk about globals, they go straight to singletons. And actually globals and
singletons are different, or at least in the definitions
and usages that I've dug up. So, a singleton is a
class that can only have up to one instance. Okay, it's a property of the class. A global is a variable
that has global scope or global access. You know, more technically
in C++ parlance, you could say it's a variable that is at namespace scope. Sometimes a variable
can have global access without it being at namespace scope. That's okay. So these two are actually orthogonal. You could have any one or
both without the other. A singleton that's not a global is rare. But that could exist as well. But lots of globals are not singletons. And in general, I'll come back to this point. But singletons are even
worse than globals. And I discourage them. And I'll explain a bit why, later. Okay, so compiler versus linker. So C++ developers in my experience really, really like the compiler. They like to talk about the compiler, they like to argue about the compiler. You know, the compiler's
getting updated all the time. We had C++11, then 14, then 17. It's really interesting. But the linker, it does get worked on,
it does get updated. But it's sort of this thing that has been around for a very long time. C++ was always designed
to work with C linkers. That was a big part of
the original design. C++ code written today in C++17, you could probably link it with linkers from, maybe before I was born
is an exaggeration, but probably not by much. So, this leads to weirdness. And if you don't believe
me that C++ developers really like the compiler
more than the linker, anytime you Google for help, you'll see that any question that involves the C++ standard and compiling, and the compiler, you will find 10 people on Stack Overflow with 100,000 rep arguing about the correct
interpretation of the standard. When you Google for help with the linker, usually you can't find any answer. And when you do find an answer, it's got like two up votes
or something like that. So, it's a lot harder to find out how you should be doing things. Okay, so here's my example. So here's code that looks pretty innocent. Or at least it may look
innocent to some of you if you haven't encountered
this problem before. So let's start with a static library. So, I show the header and the cpp contents in the same code block,
just for simplicity. So, in the header file, we're just including a string, we've got our pragma once, we declare a global with extern. And then in the cpp file
we define the global after declaring it in the header file. So far so good. Okay, so now we have a shared library. And if you're a little fuzzy on static versus shared libraries, I'll also talk a little bit about the difference there shortly. So what does a shared library do? Well, it has a function
that basically just gets this global that it
gets from the static library. So the shared library depends
on the static library. And then finally we have an executable. The executable depends on both the shared and static libraries. And it's just gonna try
to print the string. It's gonna try to print
the string that it gets directly from the static. And it's gonna try to print
the string that it gets directly from the dynamic
library, the shared library. Okay, so I'm curious. How many people would be able to build this from memory? I'm just curious. Hands? A few people, not too many. I mean, I had to look
up all these commands. I mean, we're used to
just writing some C++ code and then we just fire
off our build command. We have a build engineer at our company. He tells us what to do or we use CMake. And I was very careful here. Many commands that you
might issue the compiler, and by the compiler I mean basically, g++ or clang++. They actually invoke the
compiler and then the linker, depending on what you're doing. So here, I've separated everything. So we had three things
that we were interested in. The static, the shared,
and the executable. So, I have six commands: for each, one to compile the object code, and one to produce the actual target. So, first I compile the cpp file. Then I produce a static library with this command called ar. Next, I compile the code for the shared library. I link it against the static library. So, that's what that looks like. And then finally, I compile the code for the main executable, the object code. And I link the executable against the static and shared libraries. So, when I do this, and I run it, any guesses as to what prints out or what happens? Well, actually, this program will segfault. So... Again, if you've encountered
this problem before, maybe you kind of had a
feel where this was going. But this program, this code, seems very, very innocent. And yet, this segfaults. And it actually only segfaults
if you build it this way. If you only use static libraries, if you only use shared libraries, you're gonna be okay. Which in some ways makes it worse because you find the problem later. And this is probably a good
place to point out that as far as this stuff is concerned, a class static variable is still a global. So that seems like a very reasonable thing from a software engineering point of view. If you are writing a class, let's say the class parses JSON. In JSON true is a literal
that has to be parsed out as a boolean value. Maybe you would have a
class static constant string which is set to true. And that seems totally
reasonable and harmless but that can cause these segfaults. And it does. Okay so, let's try to understand
what's actually happening. So, I suspect that a
lot of C++ programmers have written a class like this
at some point in their lives. Personally, I call it Informer, because it informs on
everything that it does. So here, when it's
constructed, it will print. And it will print its address. When it destructs it will also
print and print its address. And it also has a get method which just returns its
address too, why not. So I make that change, and I also will change my main. I mean, there's no string
to print out anymore. So I'll just print out the address. And let's try to see what's going on with these global objects. Okay, so we're not doing
anything interesting anymore, like allocating memory. So we don't segfault. But here's what the program prints out. One expectation might have been that you'd have two copies of the global, which was actually a very
reasonable expectation, but it's wrong. There aren't two copies of the global. But the constructor runs
twice in the same spot. And then the destructor
runs twice in the same spot. So, it's not very hard to see that this is gonna cause big problems. If you have a string, what's gonna happen is, you're gonna allocate some memory, set some pointers, then you're gonna allocate again and set the same pointers
to the new allocation. So that memory is gone
forever now, you leaked it. And then you're going to call delete on the pointers in the string. And then you're gonna
do that a second time with the same pointer. So that causes a segfault. That's not a good thing to do. Okay so, this is where
we take a quick detour and try to talk a little
bit about the linker. Okay so, how does a linker work? And this is very, very simplified, I'm assuming. (laughing) I don't know. I assume it's simplified. This is kind of like a spherical
cow model of the linker. You provide a link line. The link line has a bunch of targets. The linker marches from left to right. And you can think of the linker as like it keeps a little notepad or a table. And every time that it
encounters a target, that target has a symbol table. And the symbol table basically
says to the linker look, here's a list of all
the symbols that I need, whose definitions I don't have but I refer to in code. And here's a list of things that I do have that I provide in case
anybody else needs it. And it marches left to right. And every time a symbol is needed, it adds it to its little list. And every time that a
needed symbol is provided by a new target, it then does one of two things. And this is the difference between static and dynamic libraries. If the library is static, then it basically copies
and pastes the object code into whatever target
you're trying to make. So when you link against
the static library, you copy and paste the code, it becomes part of your executable. That's why you don't need to
ship a static library with your executables. If it's a dynamic library, then it basically just checks the box, and it politely says, thank you, this symbol is gonna come from this shared library. And then it tells another entity, or depends on another
entity called the loader. And the loader will fetch this at runtime by loading the shared library the moment the executable runs. So that's our static
and dynamic libraries. Okay, so I mentioned symbol tables. Maybe we should take a quick
look at a symbol table. So the easiest way to
look at a symbol table is to use a little command
line application called nm. And usually you'll want to pass the --demangle flag so you can read the output more easily. This is the symbol table... Actually, I'm not sure for
which version of main. But the idea here, you can see there's a list of symbols, and they're marked with
various letters on the left. Some of these letters are
more important or less. I'm just gonna focus on
three of them on this slide. So the U is undefined symbol. So this is basically saying I need this. So for instance, the static, sorry, the main executable doesn't define this global, g_informer. It needs to get it from somewhere else. W is a weak symbol. So this is a symbol that's defined. But it's basically one
that's defined with, like a function that would
be defined with inline. And inline, as we know, lets you have multiple
definitions of the same thing. So the W is related to that. And T is just a regular defined symbol. Again, for our purposes here, the difference between W
and T is not too important. But basically, needed
symbols and provided symbols. That's the key thing in this symbol table. Okay so, if we just take this very,
very simple mental model, and we apply it to our example, what do we know? Well, when we build the shared library, we build it against the static library. And that means that the
code from the static library gets copied and pasted into the shared, or at least the code with
regards to the global. When we build the executable, if I were to go back and
you looked at the link line, you might notice that I
linked in the static first, and then the dynamic. And remember I said
the linker just marches from left to right. So the moment that it encounters a symbol that's provided in multiple places, it's just gonna take the
first one that it meets. And if it encounters it again, it doesn't care. So the main executable is going to get that
symbol from the static, the way that I built it before. So it's also going to get a copy. We're actually gonna come back
to what a copy means exactly in this case. So you know, you could ask well now that
we know that the link order actually matters, what if we change the order of arguments? So here, what if I link
dynamic and then static? Now when I run it, everything just works. So, I mean this is again, another way in which the
problem can basically be hidden. If you get all of your
linker order arguments right, you may be able to prevent these issues. But you're probably not staring at your linker command line order. So that's probably not a good
way to solve the problem. Okay, so let's try to dig a little deeper and figure out what's actually going on. Where does a global
actually get initialized? So here I use objdump and I just dump the assembly. And there's a lot of stuff. But if you search through carefully, you will find basically something like what I have here. So, there's this section which is kind of suggestively named global var init. And this is a section
which is used for things that have to happen immediately when the shared library is loaded. And one of the things that
you can clearly see there is a callq, a call to the constructor of Informer. Now, the shared library
in both of our examples, the one that worked and
the one that didn't, it was built the same way, right. So the shared library will
always try to initialize the global. Okay, now how about the executable? Well, the executable will sometimes try to initialize it. And what I mean by that is if you link against the static first, and then you run objdump, you will find code like
this in the executable. Again, it's very similar. It calls the constructor. One thing that you might also notice here is that it actually registers the destructor with this function, __cxa_atexit, which makes sure the destructor gets called when the program ends. If you change the link order so that the shared library is first, the executable will not have this code. And then it doesn't run it. So, we actually don't get
two copies of the global. What we get are... You only have one copy of the global, but the problem is there's
more than one entity in your program that thinks it's their job to construct it and destruct it. So... In a way, this is worse. Instead of having two correct globals, which is not great if you
actually want it to be a global. But now you have a global
whose constructor ran twice in the exact same spot in memory and whose destructor then ran twice as well. That's pretty bad. Okay so, there are a lot of things you could say here, right. Like you could say, well only write globals
in shared libraries. Or only use either static or
shared linking, don't mix. You could say all those things. But what I would rather say is what if we just wrote the code so that this could never happen? Because a lot of times the
people who might write the code are pretty far removed from the people responsible
for building it. And at a large company, maybe you originally only
used it as a shared library, you never found this problem, someone downstream of you
used it as a static library. So... Robust code is better. So actually, a friend just sent me this link a day or two ago. So, in Boost.Log, there's this caveat. "If your application consists
of more than one module that use Boost.Log, the library must be
built as a shared object." So I haven't actually dug into the code. I don't know if this
is exactly the reason. But it sure smells like it. Especially since logging is a very common
use case for globals. I mean, I would prefer to
see this written, right. You can build against
Boost.Log however you want and everything's gonna be fine. Okay, so what do we do about this? So in C++17, it's actually really simple. We have the keyword inline. Inline... You're probably more used
to thinking of inline as well we can just put this in the header. That's true but, it also means that regardless of whether you put it in the header or the cpp, and I'll come back to that point, you only get one copy of the global. And this just works. I'm gonna come back to
why this just works. Okay, what if you're not on 17? I'm not on 17. I'm sure many people here are not. So here's a lazy solution. Literally lazy. You basically have a static local and your function returns
a reference to it. And that's it. So this was kind of popularized
as the Meyers singleton. So there's some good things about it, there's some bad things about it. It's lazy, so it always gets
constructed when you need it. But if you're replacing
an existing global, you now have to add
these parens everywhere. That could be annoying. Bigger concern, there are some performance
questions with this. So first of all, you're gonna have a
branch every single time you access this global forever. There's also gonna be
some extra code there to actually atomically ensure that this is initialized only once in a thread safe manner. And also, some people this may be an issue, for some it may not. But whenever you do something lazily in a complex program, you are taking the chance that it's going to happen at a
really inconvenient time. So if you care a lot about performance, lazy may not be great. So, I've actually had
plenty of issues with lazy things happening at
the worst possible time. And it's not good. Okay, so what's one way to solve this? Well we could just stop being lazy. So this is a neat trick that's very simple but I haven't really seen suggested very many places or anywhere. So you basically still have your function, but now we just declare a very simple static global reference. So this is basically just a pointer. And we initialize it with a function. This just sort of makes
everything work out. The function is forced to run during static initialization. And now this is not lazy anymore, which might be good or bad. And also, you're going to
access through the reference, not through the function. So there's no boolean check anymore. The reference is basically a pointer. You have a pointer
straight to your global. You just access it, no boolean. Okay. Now if you really,
really, really care about performance when accessing a global, which might be a code
smell in the first place, this mess is a little faster, produces slightly better assembly. And so what on earth am I doing here? Well, before I said hey inline just works, I'll explain why later. So if you take a step
back and think about it, even before 17, linkers must have known
how to do this correctly. And the reason is that templates, generally speaking, have to be defined in header files. And a template, a class template, could have a static variable. And that static variable
has to be handled correctly. So there's actually already a
facility in linkers built in that lets you correctly build, correctly link as long as you use a static of a class template. But there's no easy way to access it. And it's a little funny in a way because templates are implicitly inline. So there's actually this
implicit inline keyword here that we weren't able to
make explicit until 17. As you can see, this is a lot of boiler plate. It's kind of annoying but like I said, it is a little faster than the previous slide. - [Man] Are you missing a static? I'm sorry? - [Man] Are you missing
the static keyword? Oh, man that's a pretty bad bug, yeah. Thank you. So that should be a
static std::string g_str. Okay. So... all of these techniques will basically help your globals
not be double constructed, double destructed. If you're on 17, just use inline. If you're not on 17, pick one of the three. Okay so, the last thing that I'm gonna talk about, kind of briefly I guess, is interdependent globals. So, if you just use one
of those three tricks, it's actually very easy to write code that will always get constructed
and destructed properly, no matter how you build, for one global. The bigger, nasty problem is what's called the Static
Initialization Order Fiasco. So in C++ there's no guarantee in what order globals in different
translation units are initialized. So because of that, once you have globals that
depend on other globals, particularly in their constructors
and their destructors, you really have a mess. So, unfortunately there is
no silver bullet to this that I know of. But what I'm gonna do
is basically point out a few pitfalls and alternatives and if you keep the number of
globals in your application small to start with, then applying these judiciously should probably help you
avoid most SIOF problems. Okay so, the best thing
to do is to just not have interdependent globals
as much as possible. And earlier in the talk, I mentioned singletons versus globals, and I said singletons
are even worse than globals. So what I mean by that is that when you write a singleton, you're taking all this
useful functionality and you're locking it inside of a thing that can only have one instance. And that's very hard to justify. I mean, if you have a logger, why are you writing the
logger as a singleton? There's no fundamental reason
why you can't have two, or three, or four, or
five loggers in a program. Maybe you want to have one
global logger for convenience. So here's an example. Let's say I'm writing
an IntrusiveProfileData. This is intended to be used as a global so that I can write my
intrusive profiling macros. So in the first line, one option would be to
just use the global logger and log something in the constructor. But now I have a dependency
between two things that are going to be global. Another option is, you could just take the local
logger class and say well because this intrusive performance
data is itself a global, it's special. I'm just gonna give it its
own logger instance and that may or may not be okay depending on your use case. But, it just completely
removes that dependency, so. You'd have like one logging
file for your main program. You'd have a different one
for your intrusive profiler. And in many situations, that's okay. And that really actually
makes things simple, simpler. Another thing, so when people talk about
inline coming in 17, a lot of people say well
it's great for header only libraries to define globals. That's true. But actually I'm gonna say I think you should always
define your globals in header files. That may seem a bit weird. But because of SIOF, right. Consider that if you define
a global in a cpp file, and you don't provide any
kind of like special hook or function to force
initialization of that global, there's no way that anyone else can ever use that global in their global's constructor. It's not possible. Because there's no way
they can force that global to initialize first. If you define it in a header file, then you just pound include it. And because it's gonna
show up before your global, it will always get
initialized before yours. Okay, a couple of quick code examples. So, laziness and destructors. You have to be really careful. Lazy globals are great for some things. They're not so great when
you want to do something in a destructor. So here's an example where
we have a lazy logger, okay. So it says constructed,
logging, destructed. So in this sample program I
have another global called Foo. Okay, Foo f, and then my main program
tries to do some logging. So what actually happens here right? Foo gets constructed first because it's first. Then main runs. Main tries to use the logger, it does some logging. Now, at this point we
fall off the end of main and all the globals get destructed. So, logger was constructed second. So it will be destroyed first. So the logger gets destroyed, and then after that, Foo gets destroyed. And that means that Foo
is now trying to log, in its destructor, to a logger that doesn't exist anymore. And that's no good. So there's that caveat with laziness. And finally, so here... If you have a header that defines globals, you should include it in your own header. And again, to see what I mean by that, let's look at this example. I've condensed this all into one file but it could be a multi file example where the translation units happen to be set up in this order. What happens here basically is that I've decided to pound include
iostream in my cpp file. But in my header file,
I define the global. And what happens is that actually the program will see my global before the iostream globals. And if my constructor
happens to use one of those iostream globals, that's yet another way to get a segfault. So, the advantage of header files is they enforce ordering. And when you're dealing with globals, that's very, very important. Okay, so the end of the talk conclusion. Just knowing the very basics
of how a linker works, having like a rough mental model, being comfortable with nm and objdump to debug these sort of things, very recommended. Anytime you're creating
a non trivial global, so something that's not
an integer or a boolean, use one of the techniques
that I discussed. Either if you're on 17, just use inline. Otherwise, use something else. Be aware of SIOF. Try to leverage header files to avoid dependency ordering problems. And just think very carefully
before you introduce dependencies between globals. Because you're gonna have to think about that dependency forever if you introduce it. So maybe just better to not, so. Thanks very much. And, any questions? (audience clapping) - [Man] One small, additional point. As a useful thing, you can also just not have
your globals destructed. If you do the lazy pattern with new to put 'em on the heap, your destructors aren't run. And it's not really a memory leak because the pointer's still there, and the entire process is coming down so the OS will reap it. It is substantially safer than having your destructors run. So a couple of thoughts on that. I'm not a huge fan of that personally. It makes it harder to
find real memory issues. But even aside from that, memory is not the only resource. And actually, very common when
you're writing a fast logger is you're not gonna
log things immediately. You're gonna have a buffer. That buffer needs to be flushed. And that will happen in the destructor. But I agree. Depending on your use case, that may be the way to go. Next question. - [Man] Yeah, just one addition. The way you described a linker, the ordering
resolution is specific to Unix linkers? Yes, sorry. - [Man] The Windows linker doesn't work that way. And actually LLD doesn't
work that way either. So that's... Is that the clang linker? I forgot the name.
Yeah. Yeah, so the clang linker
introduced a small wrinkle where it still goes left
to right as far as I know. But it picks up defined symbols before
they're actually needed. So you don't need to worry that every single symbol table
that defines something is after where that symbol is needed. But as far as I know
that's the only difference. - [Man] Yeah, exactly. With traditional Unix linkers, you can get in the
situation where you have to specify a library multiple times. Yes, yeah. - [Man] On Windows and with
LLD you don't have that. Right, I actually meant
to prefix the talk. I'm kind of sorry. So, these issues are, as people probably realized, Linux or MacOS specific, I think that the Windows
situation for this stuff is actually a lot better, that it's harder to create these globals foolishly on Windows. But I don't really know. I'm not on Windows much, so. - [Man] Hello, just wanted to mention one, maybe slightly neat idea. But now with the constexpr, that overhead where we have to ensure that the constructor's
only called once. That overhead doesn't need to be paid because it's essentially
the old C style type initialization. Yeah, but I don't think you
can mark the constructor for something like a
string constexpr right? 'Cause it can do a heap allocation. - [Man] Yeah, for the constructor
it wouldn't be possible. But well, it might not be possible in all cases. But in the case that it can, it's now a way to also be
able to not pay for that. Yes, yeah that's true. - [Man] Hi, I still
have a feeling that this is a problem of the build system. I find it
an elegant way to demonstrate the problems with global variables. But you have duplicated symbols because you link the static
library into the dynamic one, and then you link the static library again into the executable. So if you have functions there and you use them in both the executable
and the dynamic library, you get a build error, from what I understand. So I think the root of the problem is linking twice to the static library. Yeah, I mean. It's one of the roots in the sense that if you didn't do that, you wouldn't observe the problem. The problem is that basically
if you don't do that, and this is actually, the build engineer at my
company said something very similar to this, the problem is that like then
what you're basically saying, and he says this happens
a lot in practice that, whoever's distributing the shared library will say okay I need this
library with the global in it. But now it's your
responsibility as the end user to provide it. Because it has globals and
we're not gonna link it in because you might get duplicates. So now it's your problem. So, you can do things this way and if you fully control
everything from end-to-end, it's one way to go about it. But I don't think it's optimal. I think each target could be built by someone else. And each target should
include its own dependencies. And if you think that those
two are both good requirements, then this kind of thing can happen. But again it's a matter of approach. I just prefer to write
code that will always work. Thank you.
Yeah, not a problem, yep. - [Man] In the very first example, when I saw that we have this object initialized twice from the main executable and
from the shared library, this address of this object. What address is it? Is it in the segment of the main executable? Is it in the segment of dynamic library? Or is it something completely different? So in this example, the segment rule... In the example that breaks, the global will live
in the main executable. And I didn't really have
time to get into this, but basically when the loader
loads a shared library, it will just quietly discard any symbols that it's already seen before. And the global itself is a symbol. So this is actually why
you don't get two copies of the global. The loader will load the shared library. It sees the global symbol. It's already got the global symbol. It ignores it. But then it still runs
the initialization code. And since that global
symbol's already defined, it runs it on the same one. So, in short it would
have been in the static, in the example, sorry, in the main executable
in the example that I gave. The one from the shared library would physically exist inside the space of the shared library. But it would never have
been used by the program. - [Man] Same thing with loads
of the dynamic library using dlopen? That is another caveat that I didn't mention up front, maybe I should. I mean, I didn't get into
dlopen with this talk. I've used it before but
I'm not very familiar. So I actually don't know
the answer to your question. Okay, thanks very much. (audience clapping)
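The lazy and eager techniques described in the talk, plus the C++17 fix, can be sketched roughly as follows. This is a minimal single-file sketch under assumptions, not the slide code: `Informer`, `g_informer`, and `g_str` follow names used in the talk, while `lazy_informer` and the `constructions()` counter are hypothetical additions for illustration. In a real project these definitions would sit in a header.

```cpp
#include <cstdio>
#include <string>

// Informer reports its own construction and destruction, as in the talk.
// constructions() is a hypothetical helper added here so the behavior
// can be checked programmatically.
struct Informer {
    Informer()  { ++constructions(); std::printf("ctor %p\n", static_cast<void*>(this)); }
    ~Informer() { std::printf("dtor %p\n", static_cast<void*>(this)); }
    const void* get() const { return this; }
    static int& constructions() { static int count = 0; return count; }
};

// Lazy: the object is built on the first call, and every later call
// still checks whether that has already happened.
inline Informer& lazy_informer() {
    static Informer instance;
    return instance;
}

// Eager: initializing a reference from the function forces construction
// during static initialization; later accesses go through the reference.
static Informer& g_informer = lazy_informer();

#if __cplusplus >= 201703L
// C++17 and later: an inline variable; every translation unit that
// includes this definition refers to the same object.
inline std::string g_str = "hello";
#endif
```

Call sites that used a plain global keep writing `g_informer.get()`, with no parentheses on the name, which is the convenience the talk attributes to the reference variant over the purely lazy one.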
Figure there's no harm to mention: I'm the speaker, happy to answer any questions.
Great talk - I learnt a lot!
Could someone please explain the code sample at 20:52 with a "Marginally faster" title on it? I can't get my head around it. Given that the std::strings should read as static std::strings, what are all the templates and struct supposed to do? In case I need to export a global or two from my TU, should I wrap them in such a struct?
Interesting and informative.
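On the "Marginally faster" slide: going by the talk's own explanation, the template and struct exist only because a static member of a class template may be defined in a header, and linkers already merge such definitions the way they merge inline functions. A rough reconstruction under assumptions follows; it is not the slide itself. The names `StringGlobal` and `Dummy` are made up, and only `g_str` and the `static` keyword pointed out by the audience come from the talk.

```cpp
#include <string>

// The template parameter is never used for anything. It exists only so
// that `value` is a static member of a class template, which may be
// defined in a header without violating the one-definition rule.
template <class Dummy = void>
struct StringGlobal {
    static std::string value;  // the `static` that was missing on the slide
};

// Definition of the static member, also placed in the header.
template <class Dummy>
std::string StringGlobal<Dummy>::value = "hello";

// Convenience reference so call sites can write g_str directly.
static std::string& g_str = StringGlobal<>::value;
```

Per the talk, this much boilerplate is only worth considering before C++17; with C++17 an inline variable expresses the same intent directly.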
I've encountered many mysterious linking problems in my time; and I've rarely understood precisely what the issue is, and why the fix works (the fix usually being to change the order of linking). So the talk helped with that.
The take-home message seems to be to declare globals (if any) using inline in the header. That seems like an easy way to avoid some problems with linking.
I tried Jajauma's https://github.com/Jajauma/GlobalsDemo on ArchLinux: gcc is 7.2.1 while clang is 5.0.1. Both run OK for the initial crash version: