CppCon 2018: Geoffrey Romer “What do you mean "thread-safe"?”

Video Statistics and Information

Video

Captions Word Cloud

Reddit Comments

I think this is one area where C++ should look at how Rust addresses the issue. Data race safety is enforced in the safe subset of the Rust language. A similar type of enforcement is available in a (data race) safe subset of C++. (A subset that excludes raw pointers and references.) (Shameless plug alert.)

First of all, you want to note the distinction between a type being safe to pass (by value) to another thread and being safe to "share" with another thread. (Rust calls these "Send" and "Sync" traits.)

And rather than just "documenting" the safe "passability" or "shareability" of a type, it can be "annotated" in the type itself. This allows thread objects to ensure/enforce at compile-time that none of their arguments are prone to data races. And any type that needs it can be annotated by wrapping it in a transparent "annotation" template wrapper.

For example, it's not really safe to share std::vector<>s among threads, because any thread could obtain an iterator to the vector and inadvertently dereference it outside the period when it's safe to do so. But you could imagine a vector type that (is swappable with std::vector and) does not support (implicit) iterators, and so would be more appropriate for sharing among threads. (And recognized as such by the thread objects.)

You could even imagine a data type that safely allows multiple threads to simultaneously modify disjoint sections of a vector or array.

I think we really want to get beyond having to explicitly deal with mutexes. Just like how std::shared_ptr<> (and std::unique_ptr<>) is premised on the notion that it's not a good idea for the lifetime of a dynamic object and the accessibility of that object to be manually coordinated by the programmer, it's similarly not generally a good idea for the synchronization of the object to be manually coordinated (with its lifetime and accessibility) by the programmer. I think the obvious progression is to have reference types that automatically (safely) coordinate lifetime, accessibility and synchronization of dynamic objects.

👍︎︎ 8 👤︎︎ u/duneroadrunner 📅︎︎ Nov 10 2018 🗫︎ replies

is vector::operator[] guaranteed to be "thread safe"? I don't see anything on it that says it's safe to call concurrently. He uses this in an example (on a non-const vector) talking about how it's safe, but I don't see how that's guaranteed.

relevant part of the talk: https://www.youtube.com/watch?v=s5PCh_FaMfM&t=23m14s

edit: I'm asking VERY SPECIFICALLY about whether the code inside std::vector::operator[] is guaranteed to be thread safe.

The const qualified version certainly is, but what about non-const qualified version?

👍︎︎ 2 👤︎︎ u/Xaxxon 📅︎︎ Nov 10 2018 🗫︎ replies

Captions

- Uh, my name's Geoff Romer, I'm a software engineer at Google where I work on improving the C++ development experience, especially as it relates to concurrency. I represent Google on the concurrency study group of the C++ standards committee. I wrote our C++ concurrent programming guide and I'm one of the engineers responsible for our C++ style guide. Today I'm gonna be talking about how to talk about thread-safety in C++. And please feel free to jump in with questions at any time. So when we're dealing with multi-threaded code, we often toss around the term thread-safe. It sounds reassuring, right? We have threads and this is C++ so we need all the safety we can get. So thread-safety sounds like a good thing. But what does it actually mean? If I see a function or a class or an object that says it's thread-safe, what does that allow me to do? So this is how Posix defines thread-safe. It says a thread-safe function can be safely invoked concurrently with other calls to the same function or other thread-safe functions. And that sounds reasonable, but what about code like this? For all the examples that I'll be showing, we're assuming that thread one and thread two are functions that might run concurrently. And Posix says that memcpy is a thread-safe function. So is this code safe? Hands up anybody who thinks this thread is safe. Good, yes it's not. It's definitely not safe. We have two concurrent rights to the same data, namely out. And those rights could clobber each other unless there's more synchronization. But how 'bout this? Who thinks this code is safe? Who thinks it's not safe? Okay. This code is safe, at least as far as I know. And all we've done here is changed that output parameter into a local rather than a global. So thread-safety isn't just about functions, it's also about their inputs. And what does thread-safety mean when we talk about thread-safe types? So here's one partial example, partial answer, from a talk that Herb Sutter gave in 2012. And incidentally, that talk is well-worth watching. There will be a link in the slides. So bitwise const or internally synchronized actually gets at a lot of the key ideas. But this is obviously nonsense. I'm being kind of unfair here. The point of Herb's talk was that both const and mutable imply important thread-safety requirements. And that slide was a way of summing that up with a memorable joke and obviously it worked because here I am six years later talking about it. But I think part of what was going on is that Herb was using the term thread-safe with a different meaning when he was talking about const than when he was talking about mutable because more precise terminology wasn't available to him. So in this talk I'm gonna present the terminology that we use at Google to document and reason about thread-safety and C++. So what does thread-safety mean? We'll start with what safety means. When we talk about thread-safety, what are we trying to be safe from? Anybody, what are we trying to be safe from? Any guesses? I hear data races. Anybody else? Bugs, corruption. Okay. I am simultaneously alarmed and relieved not to have heard the incorrect answer that I have my next slide about, which is race conditions. The trouble with race conditions is that the term just means that there's some valid thread timing or sequencing where the program doesn't do what you want it. And that covers basically every kind of concurrency bug other than a deterministic deadlock. So it would be nice to be safe from race conditions, but it's not clear how we would actually do that. Whereas when we talk about terms like memory safety, type safety or signal safety, we're not just talking about being safe from certain kinds of bugs, we're talking about safety from some specific locally avoidable bug pattern. Race conditions aren't locally avoidable in general, so I don't think race conditions is the answer. But I did hear data races. We're getting warmer. So just to review, what is a data race? This is a simple example. We have one thread that's trying to modify an int while another that's trying to print it. And that's what a data race is, two operations that happen concurrently, while one of them modifies an object while the other is accessing it. Which is pretty straightforward. But here's another example. This is pretty much the same code, but now we're working with a std string instead of an int. But again we have two concurrent operations accessing the same object and one of them is a modification. So is this a data race? Who thinks this is a data race, hands up? Okay, I am going to disagree with you, and the standard actually disagrees with you too. The standard gives a precise definition for the term data race, which mostly looks like what we'd expect. A data race is when two potentially concurrent non-atomic actions conflict, and they conflict if one of them modifies data access by the other. But it's not talking about objects here, it's talking about memory locations. In standardees, a memory location is basically an object with a built in type: integers, floats and doubles, pointers, anumes, things like that. Object of class types are never memory locations and in particular, a std string is not a memory location. So according to the standard, this second example is not a data race, or at least we can't say for sure that it's a data race without breaking open the abstraction and peeking into the implementation details of std string. It's still a bug, and it's still undefined behavior, but we just can't call it a data race. And that seems silly because from the programmer's point of view, it's exactly the same mistake either way. So I talked to a bunch of concurrency experts about this but they were pretty much unanimous that no, even informally, we can't really call this a data race. At most we can say that this code will probably result in a data race for most plausible implementations of std string. But you never know if your library vendor is a lunatic, std string might be implemented with atomics and this might not result in any data race. But it would still be incorrect. Since we can't call this data race unfortunately, we need to introduce a new term. And the best term that I've found for this kind of bug is an API race. An API race occurs when the program performs two concurrent operations on an object when that object's API contract doesn't permit them to happen concurrently. An API race is always a bug, and in fact, it's always undefined behavior, just like a data race. So I claim that when we're talking about thread-safety, we're really talking about avoiding API races. And API races are a huge category of bugs. They include all data races and also all of those bugs that look like data races but involve object types and other more complex types. And yet, this definition is still concrete and local, unlike race condition because it's about misuse of a specific object. And consequently it seems possible to systematically avoid API races and maybe even to detect them when they occur. One other thing to notice about this definition is that in C++ we answer the questions about how an object is accessed and what its API contract permits by looking at its type. And that means that rather than talking about the thread-safety of functions the way Posix does, we should be focusing primarily on the thread-safety of types. But if an API race is a concurrent access in violation of the contract how do you know what kind of operations the object's contract permits? For example, does this code have an API race on shared widget? We're concurrently invoking foo and bar, but to figure out if this is an API race, I have to go check the documentation for foo and bar to see if they can be invoked concurrently. And if this were a more realistic example with a bunch of operations, I'd have a sort of n-squared problem of checking each pair of operations to see if all of them are allowed to be concurrent. But it gets worse. With this example, foo and bar aren't methods of the same class, they're methods of separate classes. But they're taking the same object as a parameter. I could still go check the documentation for foo and bar, but now they might not have any idea that each other even exists, much less be able to tell me whether I can call them concurrently. I could go look at the implementations and see what they do with that shared widget, but in order for that work, I would not only have to look at the bodies of foo and bar, I would have to look at all of the functions that they pass the widget onto and so on transitively. It would also mean that I would have to look at all possible future implementations of all those functions. Because if somebody makes a change to the implementation details in there somewhere, they're not gonna come make sure that my code is still safe after that change. And obviously that really doesn't scale well. So in order to cope with this, we need some help from the widget type. The most obvious way that the widget could help would be for it to guarantee that there are never any API races on widgets. If that's the case, we don't need to dig through the implementations of foo and bar because we know a priory that there's no way for them to create an API race on shared widget. And obviously this is a really useful property. And at Google unsurprisingly, we say that such types are thread-safe. And having that shorthand makes it really easy to document those types, and more importantly, makes it really easy to recognize them when you're reading the code. There's one caveat to mention here, which is that even for thread-safe types, destruction can't be concurrent with any other operation on the object. You need some logic outside the object to determine when all threads are done with it and so it's safe to destroy. Shared putter is pretty good for this. And there's some better libraries for it making their way toward the standard. So if thread-safe types never have API races, should we just make all types thread-safe? Problem solved, right? No. The thing about thread-safe types is that they're not free. You typically need a mutex or some other synchronization perimitive inside every instance of a thread-safe type. And that can create deadlock risks, and even if it doesn't, it increases the memory footprint of the object and it imposes some performance overhead on every operation. Now it's true that mutexes are pretty small, and acquiring an uncontended mutex is pretty cheap but the problem scales up as you compose objects together. So this example, we have a jobrunner class that's maintaining a set of the jobs it's running and a set of the jobs that it finished. And it's using a thread-safe type jobset for both of those sets but it still needs its own mutex to ensure that every job is always in the running set or the done set. That means that job runner is paying the storage cost for three mutexes, and the runtime cost of locking and unlocking them, but two of those three mutexes are literally doing nothing. The jobrunner mutex on its own ensures that only one thread at a time will access either set. So we would've been better off if we just used std set directly. And incidentally, for those of you who were at Herb's planary talk this morning, I tend to disagree with his first example because it has this problem. His example where he had a class with data and a mutex, but the client was responsible for locking it. I think that's problematic for this reason. And I can go into that a bit more later. As a consequence of this issue, we generally don't make types thread-safe unless their primary purpose is to big directly shared between threads and those threads can mutate the object. So for example, std mutex and std atomic are thread-safe types but ordinary library types like std string are not. Fortunately, it turns out that there's a way for a type to provide a lot of the benefits of being thread-safe while paying very few of the costs. This is essentially the same as the previous example, but now we're sharing an int instead of a widget. In this case, it's a lot easier to tell if there's an API race, because as we saw earlier, the language will say that API races on built in types or in other words data races, only occur when someone is modifying the object. And in C++ we have const, so the type system can keep track of which inputs a function can modify. And that means to figure out if this code is safe, we don't have to dig through the implementations of foo and bar because it's enough just to know their signatures. With these two signatures, the type system guarantees that shared int won't be mutated, and so there's no API race in this code. So the question is, can we apply the same reasoning to a widget, is this code guaranteed not to have an API race? Not in general, but it is guaranteed to be safe if widget provides the same safety guarantee as a built in type. In other words, if it guarantees that an API race can only occur if one of the operations is a mutation. And at Google we say that a type is thread-compatible if it makes that guarantee. This isn't quite as simple a guarantee as being thread-safe but it's pretty close and it's far easier to achieve because thread-compatibility composes. If all your members are thread-compatible, chances are your type is thread-compatible with no extra effort on your part. And chances are, your members are thread-compatible because nearly all -- or sorry, all built in types are thread-compatible, nearly all standard library types are thread-compatible, and all thread-safe types are thread-compatible by definition. You only really have to go out of your way to be thread-compatible if you have const methods or friend functions that modify some part of your physical state, or in other words, they're logically const but not physically const. And the type system normally stops you from doing that accidentally so the main things to watch out for are members that are marked mutable since that indicates code that is explicitly opting out of that const type checking. So for example, here we have a pretty stupid string view like type that computes its size lazily. And so that means that the size accessor actually modifies the size member even though the accessor is const because it doesn't modify the observable state of the object. And that means that we need to add explicit synchronization in the form of a mutex in order to make this class thread-compatible. As an exception to that rule, it's okay to have a mutex as a mutable member and in fact, mutex members usually should be marked mutable so that you could lock them in const methods like in this example. And that's good news because if making the mutex mutable meant that you need another mutex to protect it, then we'd be in trouble. More generally, you might be able to avoid adding synchronization if you're mutable members are thread-safe but only if you make sure that you're const methods never break your types and variants, even temporarily. And that can be pretty easy to mess up. So generally speaking, it's better to avoid that whole can of worms and just make your const operations be physically const wherever possible. Keep in mind though that if some of your object state is behind a pointer, then that state will behave as though it were a mutable member even though it's not marked with a mutable key word. So that's a situation that you need to keep an eye out for. As a final caveat, I should mention that we stole the term thread-compatible from the Java folks, but we gave it a stricter meaning. In both Java and C++, thread-compatible is the term for the baseline level of thread-safety that nearly all types should provide, but in Java it doesn't mean that concurrent reads are safe because Java has no language level concept of a read-only operation. Java doesn't have const. So just be aware that if you hear this term in the context of other languages it may mean something different. So we've seen that if widget is thread-safe, then this example is safe, and if it's thread-compatible, then we just need to make sure that neither foo nor bar are taking non-const references. But what if widget isn't even thread-compatible? In that case, unless widget is giving you some kind of custom guarantee, it's gonna be nearly impossible to be sure that code like this is safe as written. Instead, you have to use a mutex or some other synchronization outside the object to ensure that only one operation at a time has access to it. But if you can make sure that only one thread at a time accesses the object, that object is guaranteed not to be the sight of any API races, no matter what. As I mentioned before, the most common reason that types fail to be thread-compatible is because they have mutable members or in some other way, they have unusual or broken const semantics. And here is an example that is near and dear to my heart. We have here a counter struct that has a call operator that increments its c member. And the call operator is not const, so counter is a perfectly well behaved thread-compatible type. But when we wrap it in std function everything goes south. We're calling f in two different threads, and f is const, f is originally declared const, so we know that we're invoking a const operation, and yet this code contains an API race because the counter is getting incremented in two different threads with no synchronization. Even though std functions call operator is const, calling it concurrently can be an API race, which makes std function one of the few types in the standard that's not thread-compatible. And the reason is that std function stores the underlined function object as the moral equivalent of a mutable member. The good news is that std function is a rare exception. Most types are thread-compatible or have thread-compatible alternatives, so you're rarely stuck with a non thread-compatible type. As I said earlier, in C++ thread-safety is primarily about types not about functions, but there are some rare cases where you do have to start thinking about functions. So going back to our widget example, if the widget is thread-compatible and foo and bar take the widget by const reference, that's still not quite enough to guarantee that there's no API race between these two lines of code. Specifically, foo and bar might say that you're not permitted to invoke them concurrently at all, even if their inputs are different. For example, their implementation might look like this, where behind the scenes they're both mutating the same static int with no synchronization, and that would be an API race, if you called them concurrently. So when a function called can create an API race on a object that's not one of its inputs, we say that the function is thread-hostile. And notice this is a property of a function, not a property of a type. And it's virtually always because the function is accessing some data other than its inputs. When you're calling a thread-hostile function, all bets are off. You have to check the documentation to figure out how to call it safely, or better yet, don't call it at all, and find a better function with no behind the scenes inputs. Functions virtually never actually need to be written this way. It's almost always an accident or a mistake of some kind. By the way a lot of sources refer to functions that are not thread-hostile as thread-safe functions. And that's particularly common in sources that are focused on c like the Posix definition that I showed earlier. I recommend avoiding that terminology because people tend to assume that it means more than it does. Especially when you describe a member function as thread-safe. Furthermore, in modern code, you can and should just assume that every function is not thread-hostile unless it specifically says otherwise, so we don't really need a special name for functions that are not thread-hostile. With those definitions in hand, we now have a pretty simple procedure for reasoning about the thread-safety of line of code. A given line of code is guaranteed to have no API races if it doesn't call any thread-hostile functions, if there are no lifetime issues, and if each input is either not being accessed by other threads or it's thread-safe, or it's thread-compatible and not being mutated. And that's pretty much all you need in order to avoid API races in the first place, or at least track them down after the fact. There are a couple subtleties though, mostly around the issue of what counts as an input. So consider this example. Like almost all standard library types, vector is thread-compatible, not thread-safe. And here we have two different threads mutating the vector. So does this code contain an API race? Who thinks it does? Some of you. Yeah, I would argue no it doesn't. This code is safe because the threads are mutating different elements of the vector and those elements count as separate inputs. You have a question? - [Man] Yeah if you go back to the last slide a second, I don't know if I just missed this but could you just go like what you mean by an input being live again? - By an input being live? Just in the C++ sense that it hasn't been destroyed. As I mentioned earlier, you have to handle lifetime separately outside the type in some way. - [Man] Cool, thank you. - You're welcome. So as general principle, when a thread-compatible type exposes some of its sub-objects, like how vector exposes its elements, for thread-safety purposes you can treat each sub-object as a separate object. And you can also treat the remainder of the parent object as a separate object. And that applies not only to the elements of containers, but also the members of pairs and tuples, the values of types like optional invariant, and public data members of classes and struts. However, that only applies if those sub-objects are real objects that the API gives you direct public access to. And here's the exception that proves the rule. This is almost the same example as the previous one, only now we're working with bools instead of ints. And this example does contain an API race and that's because the elements of a vector bool aren't real objects, they're just notional boolean values that are represented using some unspecified implementation details inside the vector bool. The rule of thumb is that for thread-safety purposes, a sub-object is independent only if you can take its address or form a reference to it. And you can't do that with the elements of a vector bool, so this code is not safe. And that principle means that if you're writing a thread-compatible type you need to be thoughtful about when you're exposing pointers, references, or even const references to internal objects. If you do that, you have to clearly document which of your operations can read, mutate, or invalidate those sub-objects. Of course, you have to do a lot of that anyway, or your API will be confusing even for single threaded users but thread-safety raises the stakes. And by the same token, when you're using one of these types in a multi-threaded context you need to have a clear mental model of what operations on the parent object can access the sub-objects. So going back to the vector int example, there's a related issue that it highlights. Sometimes a non-const method isn't a mutation. In this case, the square bracket operator overload is non-const but it doesn't actually mutate the vector itself, it just gives non-const access to one of the vector's elements, to the calling code. So sometimes you can treat a non-const method as being non-mutating for thread-safety purposes but only if their API guarantees that they really are non-mutating. And the standard containers make that guarantee for the square bracket operator and for most of the other methods that you'd expect. So here's another way that the notion of an input can be tricky. In this example, we have two concurrent calls to f and they have no arguments in common. Nothing that's an argument to one is an argument to the other. And yet, this is an API race because they're both incrementing the second element of v concurrently because they're operating on overlapping ranges. You could argue that means that f is thread-hostile, but we would hope not because f is a perfectly reasonable and well-behaved looking function. This could have been a standard algorithm if I'd wanted to do it that way. So I claim that no, f is not thread-hostile, and the reason for that is that even though the second element of v isn't an argument to either of these calls, it's an input to both of them because it's clear at the point of use, at the point of the function call, that these functions are going to access that object. As other examples of the same kind of thing, an object might be an input without being an argument if it's a sub-object of an argument, or if it's pointed to by an argument, whether by a pointer or a reference or a smart pointer or what have you. The thing to be extremely careful of here is if you have a private class member that points to data that might be shared. Like in this example here. In this case, the widget has a hidden pointer to a counter but the counter isn't necessarily private to the widget because it gets passed in through the constructer. And that means that we can wind up in a situation like we see on the right where these two calls to Twiddle have an API race even though they have completely separate inputs. And that makes Twiddle a thread-hostile function. So it's much better to avoid having private handles like that, but if you can't, one option for dealing with that is to make sure that the shared data can't behave like an input for thread-safety purposes. And at a minimum that means that you need to make sure that the shared data has a thread-safe type. It's not enough to add a mutex to widget because the int is potentially shared by multiple widgets and maybe even by code that's completely unrelated to widget. So you have to switch to sharing data whose type is inherently thread-safe. So for example we could change this code so the counter is a pointer to an atomic int rather than and ordinary int. You also have to be very attentive to the risk of race conditions in a case like that if you have any invariants that relate the shared data to other parts of your program. So for example, that fix of turning the counter into a pointer to an atomic int, might be sufficient if the counter is just used for monitoring or something like that, but if the counter actually affects the logic of the program, I'd be very worried about this code, even if the counter were atomic. The other option for dealing with this situation is to make your type very very explicit about the fact that it points to external data and so that data is potentially input to any function that takes your type as an input. And that's what the iterator was doing in the previous example. An iterator type is a type that is very very explicit about the fact that it confers access to some underlying range and so if a user sees code that passes an iterator as an argument, they're not surprised when the underlying data gets accessed. As a final note, it's important to keep in mind that the more layers of indirection there are between the formal arguments and the actual inputs to your code, the harder it can be to determine whether the inputs are mutable because the const part of the type system essentially has a harder time getting a hold. So for all those reasons, it's generally better to keep the relationship between the arguments and the actual inputs as simple and as direct as possible. So much for the theory, how do we actually apply this in practice? When you're defining types, you should make sure that your types are at least thread-compatible if possible and you should avoid patterns that make thread-compatibility hard like mutable members. You should make your types thread-safe if you expect public mutable instances of the type to be accessed by multiple threads, but otherwise you don't need to worry about making a type thread-safe. You should always document if your type is thread-safe or if it's not thread-compatible because those are the unusual cases. Of course it's better to document all three cases and to document every type as being thread-safe, thread-compatible, or not thread-compatible. But if you omit thread-compatible, that's what readers will tend to assume. Assuming you make your type at least thread-compatible, you should be thoughtful about directly exposing any of your sub-objects because that requires you to make it clear to the user how the sub-objects relate to the parent object for thread-safety purposes. Question? - [Man] You seem to -- your classification is thread-safe, thread-compatible, and not thread-compatible. - Yes. - [Man] You're not saying a -- - I'm not saying thread-unsafe, yeah. Internally we've, uh, we've talked about thread-safe thread-compatible and thread-unsafe. The thing I don't like about that is that it's not a hierarchy. Every thread-safe type is also a thread-compatible type, but every thread-compatible type is not a thread-unsafe type so it's a little -- it seems a little simpler conceptually to me to just talk about thread-safe and thread-compatible as two strengthenings of the baseline fact that it's a type. Does that make sense? - [Man] Yeah, I buy it, cool. - So when defining functions, including member functions, you should avoid making them thread-hostile at all costs, just never do that. And that means that there should be no hidden, mutable shared state and you should be very very careful about having private pointers to data that might be shared across threads. If you have a thread-hostile function that you can't fix, you should document very clearly that it's thread-hostile and explain how to use it safely or point readers to safe alternatives. When you're writing concurrent application code, avoid calling thread-hostile functions, and make sure that all inputs to a given piece of code are either thread-safe, not being accessed by other threads, or thread-compatible and not being mutated. And usually the best way to do that is to make sure that all shared objects are either thread-safe or thread-compatible and mutable. And if you need to share state that doesn't meet those requirements, define a wrapper type that does using a mutex. And in the very rare cases where you can't follow those guidelines, read the documentation and be very very careful. And with that, are there any questions? (applause) Thank you. (applause) - [Man 2] Question, why did you not talk about re-entrancy concepts? - Sorry why didn't I talk about what? - [Man 2] Re-entrancy. - Re-entrancy? - [Man 2] Re-entrant. - Um, mostly because -- - [Man 2] It's very close to thread-compatible right? - Re-entrancy has, particularly it has some close connections to thread-hostility. Honestly the reason I didn't talk about it is just that it doesn't come up that much for us. Normally if a function has re-entrancy problems, it's probably gonna be thread-hostile too. So that's the main reason I didn't talk about it. I will say there's been some discussion recently on the committee mailing lists about what exactly do we mean by re-entrancy. Some people were thinking that it's just about functions and other people were thinking that it's about types. That like, whether a member is re-entrant is a question of whether you can re-enter a function on the same instance versus separate instances. And that's all pretty fuzzy, but again, in practice, if you're writing code in a concurrent world, at least we haven't found that re-entrancy is an issue that comes up very much. - [Man2] Okay thank you. - [Man 3] My question is about your definition of what is a data race. Let's consider a case when two threads set the same stream or something, of course -- - Sorry I'm having trouble hearing you. - [Man 3] So let's consider the situation, two threads set the same stream simultaneously and it is a data race. Then I add mutex to (unintelligible) that as a function, well from the definition, it's not a data race anymore. The question is if I add a mutex but there's only purpose to (unintelligible) an exception. If the mutex is hard to locate, is this situation, it a data race or it's kind of not a data race because well, simultaneous access to the object is generally considered the exception. - I'm sorry I'm having trouble hearing the question. - [Man 3] If I cover possible API race moved by some, will form its behavior, but just so an exception in our threads who is not capable to acquire it, is it a source? - If you have a mutex but it's just for -- - [Man 3] So exception, if it's locked. - Throwing an exception if it's locked? - [Man 3] Yes. - That, I think it would depend on the specifics of the situation, but it sounds like, so long as you, uh -- at the end of the day, an API race is just a situation where you have concurrent access that the object's contract doesn't permit. If you have a mutex guarding an object then whether you throw if the mutex is held, or block if the mutex is held, shouldn't matter so long as it enforces mutual exclusive access. - [Man 4] I kinda wanna take slight issue with one of your earlier points about not having the possibility of an API race if you don't have concurrent access. And it's sort of a special case but I've definitely run into it before. I would say you would be correct if that was qualified to say that the functionality does not explicitly or implicitly rely on anything that is tantamount to thread local storage. However, I have run into cases where you can get API races depending on which pseudo-thread context accesses the object first and this is particularly the case with certain types of call objects in the apartment model that might get your object implicitly loaded into it the first time that you access it. So my slight quibble with that is you have to qualify that by saying there's no implicit reliance on something like thread-local storage otherwise there still exists the possibility of an API race. And I guess the follow up to that is, is there any context in your internal nomenclature to mark something that would be potentially dependent on something akin to thread-local storage for that type of situation? - Um so I think I would be inclined to describe a situation like that by saying that the operations that potentially access that thread-local storage are thread-hostile. Because the -- at least if I'm understanding correctly, you're saying that you could wind up in a uh, in a race even if the inputs are different? - [Man 4] It's not a classic data race, it's more of your API race where the behavior of the underlying APIs might be different depending on which apartment context you had implicitly loaded into, which is based on the timing of which thread accesses the object first. - Mhmm. Yeah I guess uh -- - [Man 4] I mean you could construct a simple example where you had a thread-local storage variable and depending on which thread, it did something different. - Yeah, I guess I would say that I would classify that as a form of thread-hostility because it's about hidden implicit inputs, namely the thread-local storage, which isn't explicitly referenced by the function call. - [Man 4] Okay, I just wanted to note that you can have those even if you're doing effectively immutable operations against the particular object. - Sure, yeah. Once you're in the realm of thread-hostility, the whole -- basically thread-hostility is where the notion of types as an abstraction for dealing with thread-safety breaks down and you have to start looking at individual functions and things like immutability stop mattering as much. - [Man 4] So then would it be fair to say that you would characterize anything that's implicitly dependent on thread-local storage context as implicitly thread-hostile? - Um, if it's implicitly dependent in the sense that the contract forbids you from calling it on different threads then uh -- - [Man 4] In this case you can, you just have to be aware of -- - Well if the contract permits it, then yeah, we're no longer in the realm of API races, we're talking about race conditions where you have some, you potentially have some higher level logic bug that depends on threads or timing in some way. But you're not violating anybody's contract. - [Man 5] Hi, how does your classification differ from how the standard talks about thread-safety of its types and is the standard gonna move in the direction of what you're describing or -- - So the standard for the core language just talks about data races and built in types. So the core language doesn't really have to worry about these issues. The library standard does. And the library wording in this area is very mushy and like, even the people who wrote it are not very happy with it, but it actually attempts to capture exactly these concepts. It was co-written by a Googler who's aware of this concept. The intent roughly speaking is to say that standard library types are thread-compatible unless otherwise specified. - [Man 5] And the way that they talk, you mentioned the case of the vector bracket operation being a non-const operation but is not an API race. How do they express that and is that -- I'm just curious because your terms seem pretty clear. I'm just curious if the standard is going to move in the direction of describing types in this way and describing exceptions to them using this framework. - I'm not aware of any movement toward -- also, well, there's two parts to the answer. Regarding the square bracket operator case, there's wording, I believe it's in the general description of containers that says methods with these names are -- it doesn't say methods but, um, functions with these names uh, don't access or, don't mutate the object for purposes of determining a data race. That's one thing that I think is not ideal about the library wording, is that the library wording is still talking about data races rather than API races. The other half of your answer is there's one small way that the standard is starting to move in this direction. There's a proposal that is pretty well-advanced, although it's probably gonna go into the concurrency TS rather than the standard right away, where there will be an is race free trait that defaults to false, but defaults to true for const types and then is overwritten to be true for things like std atomic. And that intentionally reflects the notion that we can just assume that types are thread-compatible unless they say otherwise. - [Woman] Your talk was mostly theoretical. What about more practical advices, like what problems with straight (unintelligible) you have mostly productions or inproduction, like, I don't know, people using mutables or internal production or -- - Sorry I'm having -- can you move closer to the mic? - [Woman] I'll try to repeat. So your talk was mostly about theory, what about more practical advices like types don't use mutable because we had too many production incidents or something like this. - Yeah, um. I wouldn't say that uh -- I can't draw a straight line between much of this guidance and production issues. This is mostly about how to organize your thinking about concurrent code and make it tractable. I don't really have any specific practical advice to add beyond what I've already said, particularly the stuff from the last slide. - [Man 6] If you wanna take it offline afterwards, I have a fair chunk of practical advice from Google on this front. - Any other questions. Oh yeah. - [Man 7] Hi Geoffrey, um more of a question slash comment. Are you familiar with exceptions that distinguish between strong and basic exception guarantees? - Somewhat yeah. - [Man 7] I was just wondering if the term strong versus basic are better adjectives for what you're describing? - I don't immediately see any problem with calling this, like saying strong thread-safe and basic thread-safe, it's a little more wordy. I can't claim that these are, you know, the best possible terms. These are just terms that we've used at Google for quite awhile and they've worked. - [Man 7] Yeah and it's just a suggestion I guess, but like you said, when people say just the term thread-safe they mean different things. - Yeah, so Herb Sutter actually mentioned to me a couple of days ago that he prefers the terms internally synchronized in place of thread-safe and externally synchronized in place of thread-compatible. I'm not sure I like that terminology quite as much because types that are internally synchronized might not actually -- like, std atomic doesn't contain a mutex, and an externally synchronized type might not actually be externally synchronized at all. It might just not be accessible to other threads, things like that. But that's another set of terminology that exists for this. - [Man 8] A couple quick things. First you mentioned std function, you mentioned the problem that was there. I know your answer to this, but I think it would be great if you said it here. You consider std function to be const, correct? - Yes, I think this is a bug instant function, plain and simple. There have been attempts to fix it, time's starting to run out for C+ 20, but maybe we can make it. - Yeah and I guess, you mentioned that there were other places where we are not thread-compatible, did you have any other ones in mind, other than like, you know, vector bool or well, not vector bool, other than std function I should say. - Std function is the main one, is the main type that I know of in the library that's not thread-compatible. There's, I believe, there have been -- there's been at least one type in the standard, I think, it was some kind of reverse iterator something like that, something that was specified in terms of having a mutable member and thereby became thread-incompatible, but I think that was fixed and off the top of my head I don't remember which type it was and I haven't tracked it down. - [Man 9] Do you have any thought on how to doc something like std maps operator bracket, which is only thread-compatible when insertions do not occur? - Um. I don't think there's any good shorthand for that, I think documenting a situation like that is just gonna be a matter of writing complete sentences. I think the square bracket operator on maps is a fiasco in a bunch of different ways. I'd really like to see a proposal for overloading the square bracket equals, in other words, a mutating square bracket separately from an accessing square bracket. But that hasn't happened yet. But yeah in the mean time, operations that don't look like they mutate but do are very problematic when it comes to thread-safety for precisely this reason. Go ahead. - [Man 10] Obviously you talked a lot about how you guys talk about thread-safety with respect to objects. Is there any movement or impetence to attribute types appropriately and then do some compile type checking on that? And then just as a total aside note, I'm gonna take a slight disagreement with you on std function. I think std function internally has the equivalent of a private pointer and it has the exact same problems as any other class that has a private pointer, where the pointer itself can be const, but it can have mutable operations inside it, and if there were an effort to attribute code that would have to somehow be encapsulated where the type is const but the things that it could be pointing to may be non-const. - So, taking that sort of in reverse order, um, I see your point but I disagree with the notion that std function is essentially a pointer like type. And the reason I disagree with that is that std function's cpy deconstructor performs a deep copy of the underlying function object. Which is actually turning out to be kind of a problem because it means that you can't use std function to wrap a move only function object. So that to me makes it much more of a value type than a pointer type. And makes the behavior that showed const incorrect. As for attributing in this kind of thing, we haven't done any work in that direction, we do have some -- there's some thread-safety annotations that I think are now public in client, but that's more for marketing things like this data member needs to be protected, you need to be holding this mutex when you're accessing this data member, that kinda thing. And it's a pretty limited best effort kinda thing. I have a sort of pipe dream, but one of the reasons why the experts won't let me call an API race a data race is that they wanna reserve the term data race for things that TS can diagnose. I have a sort of pipe dream that maybe we could attribute types as being thread-compatible or thread-safe and then have TS diagnose those rather than waiting until they turn into actual data races but there are some formidable obstacles to actually making that work. Any other questions? Okay well thank you for coming. (applause)

Info

Channel: CppCon

Views: 21,557

Rating: undefined out of 5

Keywords: Geoffrey Romer, CppCon 2018, Computer Science (Field), + C (Programming Language), Bash Films, conference video recording services, conference recording services, nationwide conference recording services, conference videography services, conference video recording, conference filming services, conference services, conference recording, conference live streaming, event videographers, capture presentation slides, record presentation slides, event video recording

Id: s5PCh_FaMfM

Channel Id: undefined

Length: 53min 43sec (3223 seconds)

Published: Fri Nov 09 2018