A Programming Language for Games, talk #2
Video Statistics and Information
Channel: Jonathan Blow
Views: 85,068
Rating: 4.9216256 out of 5
Keywords: Video Game Development (Industry), programming language, Video Game Design (Field Of Study), Video Game Development (Conference Subject), c programming language, c++ programming language, Computer Programming (Conference Subject), Rust programming language, D programming language, Java (Programming Language), Compiler (Software Genre), Game Engine (Software Genre), Software Development (Industry)
Id: 5Nc68IdNKdg
Length: 90min 26sec (5426 seconds)
Published: Fri Sep 26 2014
This has been discussed heavily on /r/rust.
In Rust, closures have a different syntax from function declarations because, as /u/pcwalton says:
In Rust functions can be used wherever you can use a closure, but closures cannot be used wherever you can use a function. This is because a closure must carry around an environment containing the data it has captured (you can think of a raw function as having a pointer to a zero-sized environment).
After spending a little time thinking about it, all of this talk of closures strays a long way from his actual concern - namely the idea of migrating functionality to larger and larger scopes over time. In Rust you can simply nest any item (including a function) inside a function in order to limit its scope.
There's no need to introduce closures at all.
I thought this one was better than the previous one. I think the biggest issue is how his syntax and desired features will accomplish his big ideas for his language, but he is still in a brainstorming phase.
I think the things he describes about large functions and how to migrate their code into smaller functions are closer to the industry-wide best practices than he assumes, and he gives way too much credit to the idea that the stuff you learn in school is considered good practice in a professional environment. There is unfortunately a lot of unlearning that fresh graduates need to undergo when starting their careers.
When he talked about preferring a larger function over several smaller sub-functions and assumed it was controversial, it sounded to me like he discovered YAGNI. The sort of localized abstraction he talks about with moving inline code to local lambdas to module scoped functions to globally available functions is something I, and many of my coworkers, have done for years.
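To make the first step of that progression concrete, here is a minimal C++ sketch of inline code being lifted into a local lambda. The function names and the "sum the positive entries" task are invented for illustration and are not taken from the talk.

```cpp
#include <cassert>
#include <vector>

// Stage 0: the logic is written inline, twice, inside one larger function.
int totals_inline(const std::vector<int>& scores, const std::vector<int>& bonuses) {
    int total = 0;
    for (int v : scores)  { if (v > 0) total += v; }
    for (int v : bonuses) { if (v > 0) total += v; }
    return total;
}

// Stage 1: the repeated loop becomes a local lambda. It is still visible
// only inside this function, and [] means it captures nothing.
int totals_lambda(const std::vector<int>& scores, const std::vector<int>& bonuses) {
    auto sum_positive = [](const std::vector<int>& values) {
        int total = 0;
        for (int v : values) { if (v > 0) total += v; }
        return total;
    };
    return sum_positive(scores) + sum_positive(bonuses);
}

// Both stages compute the same result for a small sample input.
void check_stages() {
    const std::vector<int> scores{1, -2, 3};
    const std::vector<int> bonuses{4, -5};
    assert(totals_inline(scores, bonuses) == 8);
    assert(totals_lambda(scores, bonuses) == 8);
}
```

If a second function later needs the same helper, the lambda body can move out to a file-local function, and only after that to a shared header.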
Something about his emphasis on large functions made me uncomfortable. Whereas Carmack's email came across as balanced and very considerate of the issues involved, Blow's explanation gave me a vibe of, "This giant ball of mud function is less readable and maintainable, therefore it must be faster!" Carmack seemed to advocate something like: small pure function > large impure function > small impure functions, whereas Blow sounded like he prefers: large impure function > all. One example was his assumption that the lambda version would be slower without measuring it. (C++ lambdas are easy for the compiler to inline if it chooses to. Mind, it might be slower, but I'd want hard numbers first.) Carmack spent a lot more time developing his point in the email than Blow did in his video, so I hope it is just me being unfair and reading too much into it.
I do think guidelines like 'consider splitting up functions > X lines' are useful, as long as you realize that "no, that function is fine as it is," is a perfectly acceptable answer. Thoughtlessly splitting up a large function is much worse.
His syntax for functions and variables looks similar to Rust and many ML derived languages, and I approve.
His idea of generalizing captures to blocks is interesting (is it a big idea?), though I'm not sure how useful it would be in practice. I'm unclear on the semantics of an empty capture list. In C++, this means that a lambda captures nothing. It sounds like this might still be the case for Blow's language, but then for other blocks, it allows all variables in scope.
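For reference, this is how the capture list behaves in C++ today; a small sketch with made-up names (IntOp, apply, capture_demo). An empty list captures nothing, which is also why such a lambda can convert to an ordinary function pointer.

```cpp
#include <cassert>

// A plain function-pointer type: there is no room in it for captured state.
using IntOp = int (*)(int);

int apply(IntOp op, int x) {
    assert(op != nullptr);
    return op(x);
}

int capture_demo() {
    int offset = 10;

    // [] captures nothing, so this lambda converts to a function pointer.
    auto twice = [](int x) { return 2 * x; };
    IntOp as_pointer = twice;

    // [offset] copies `offset` into the closure object. A lambda with
    // captures does not convert to IntOp, because it carries state.
    auto shifted = [offset](int x) { return x + offset; };

    return apply(as_pointer, 4) + shifted(1);  // 8 + 11
}
```

Calling capture_demo() returns 19.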
There is a pragmatic reason for C++'s capture block coming first: it makes parsing easier. In positions where an array subscript operation is allowed, a '[' cannot begin a lambda, and in positions where it can begin a lambda, it cannot be an array subscript. His syntax seems like it might be ambiguous with other expressions. As a user of a language, I really don't care about the hell the compiler writer has to go through to make my programs work, except insofar as it hurts availability on various platforms, makes it easier to produce vexing parses, and makes tooling difficult. Since good tooling and joy of programming are features he wants his language to have, it might be worthwhile for Blow to consider more in depth the consequences of his syntax, and to do so sooner.
He seems to love to claim that he really does understand C++ better than the naysayers would claim, and I trust that he does, but I wish he would show that rather than tell. He keeps making simple mistakes that seem to indicate otherwise, and when called out on it, he just bashes people on Twitter and in these videos instead of offering corrections. Carmack and Sweeney seem much more careful and deliberate when making statements similar to Blow's. In this video, the incorrect claim that stood out most for me was that C++ lambdas can allocate memory behind your back. I assume he confused lambdas with std::function (something I've seen several people do), whose job it is to unify function pointers, member function pointers, lambdas (potentially with captures), function objects and any other callable entities into one object. This sort of type erasure often (but not always) requires some dynamic allocation to store details of the underlying function-like entity. But std::function isn't the type of a lambda, and C++ lambdas won't allocate hidden heap memory.
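One way to check this yourself is to count calls to the global allocator. The sketch below replaces the global operator new for that purpose; the counter and the helper names are made up for the example. The direct lambda performs no allocation. What std::function does with a capture this large depends on the standard library and the optimizer, so the second number is only reported, not promised.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <new>

// Count every call to the global allocator so hidden allocations show up.
static std::size_t g_allocations = 0;

void* operator new(std::size_t size) {
    ++g_allocations;
    if (void* p = std::malloc(size ? size : 1)) return p;
    throw std::bad_alloc{};
}
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Takes the callable by its real type, so there is no type erasure.
template <typename F>
int call_direct(F f) { return f(1); }

// Allocations made while building and calling a lambda directly.
std::size_t allocations_for_lambda() {
    std::array<int, 64> big{};  // deliberately large captured state
    big[0] = 41;
    const std::size_t before = g_allocations;
    auto lambda = [big](int x) { return big[0] + x; };
    const int result = call_direct(lambda);
    assert(result == 42);
    return g_allocations - before;
}

// Allocations made when the same kind of lambda is wrapped in std::function.
// The count is implementation-dependent.
std::size_t allocations_for_std_function() {
    std::array<int, 64> big{};
    big[0] = 41;
    const std::size_t before = g_allocations;
    std::function<int(int)> erased = [big](int x) { return big[0] + x; };
    const int result = erased(1);
    assert(result == 42);
    return g_allocations - before;
}
```

allocations_for_lambda() returns 0 because the closure object lives on the stack like any other local. Printing allocations_for_std_function() on your own toolchain shows whether its small-object optimization covered the capture.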
Which leads to the next concern. He wants functions to look like lambdas, but regular callable functions are very different beasts to lambdas with captured environment variables at a system level. How does he plan to store them? Many languages require a hidden heap allocation, something I'm sure Blow would find unacceptable. C++ lambdas are syntactic sugar for a function object; the capture list becomes the constructor and member variables, and the body becomes operator(). This avoids allocation, but puts the burden on the developer to not do silly things like capture stack variables by reference that will go out of scope. I'm not sure how he plans to have his version avoid implicit allocations or make those allocations clearer to the programmer. He handwaves a lot of these concerns as easy problems to resolve later, but I'd think that avoiding allocations is a big feature (big idea?) of his language that he would want to work out early.
I'm beginning to believe that a 'Big Idea' language is a language with an idea Blow doesn't like. All languages have big ideas. It is interesting that he dismisses Rust's focus on memory safety as a Big Idea, but then later seems like he is taking the first step down that road when he wants his compiler to detect some lifetime violations that C++ currently doesn't check. It will be interesting to see how far he walks down that path. For AAA games, when the safety checks get in the way of performance, the safety checks need to go, but it would be nice to have a language with great syntax for custom memory layouts (like he describes in his first video) that can go pretty far in protecting the programmer from mistakes.
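To spell out the function-object point, and why lifetime mistakes are possible, here is a hand-written approximation of what a C++ compiler generates for a lambda. The class name ScaleAndAccumulate is invented; real closure types are unnamed and their exact layout is up to the compiler.

```cpp
#include <cassert>

// Roughly equivalent to:  [scale, &total](int x) { total += scale * x; }
class ScaleAndAccumulate {
public:
    ScaleAndAccumulate(int scale, int& total) : scale_(scale), total_(total) {}
    void operator()(int x) const { total_ += scale_ * x; }

private:
    int scale_;   // captured by value: a copy lives inside the object
    int& total_;  // captured by reference: only valid while `total` is alive
};

int desugar_demo() {
    int total = 0;
    int scale = 3;

    auto lambda = [scale, &total](int x) { total += scale * x; };
    ScaleAndAccumulate functor(scale, total);

    lambda(1);   // total is now 3
    assert(total == 3);
    functor(2);  // total is now 9
    return total;
}
```

Nothing here touches the heap. The reference member is where the burden falls on the programmer: C++ will not complain if the object outlives `total`.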
His insistence that games need global variables (presumably mutable global variables; I haven't seen complaints that constants like pi are global) seems contrary to his desire to make concurrency easier. I'm curious how he plans to resolve that. I'd also love to see Carmack's or Sweeney's current thoughts on globals. Blow sounds rather positive about them, whereas in 2005, Sweeney treated them as an occasional necessary evil.
I'm still eager to see Blow continue these discussions, as I still believe that a good game programming focused C++ replacement is welcome. I've also been encouraged by the responses over at /r/rust and their openness to consider Blow's criticisms and add features suitable for game programmers.
Edit: silly typo, In C++ [] captures nothing, not everything
Blow and I don't agree at all, at least with the first thing he brings up. He's arguing that you shouldn't refactor large (impure) functions into smaller functions for the following reasons:
You can't know when the smaller function is called.
You can't know what state the program needs to be in before the function is called.
Commenting to give the reader that information is not reliable because comments are hard to write (?) and may end up not being accurate.
As for the first point, if you need to know that a function is only called in one place, you can declare it static, and define it as close as possible to the place where the function is called. If you need to know where a static function is called, you can just search for it within the file and you can know it won't be called anywhere else because the linker won't expose it. (I'm talking about normal C-style functions here, but I think you would accomplish roughly the same thing when working in classes in C++ by using private member functions.)
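A small sketch of that pattern, written C-style but compiled as C++; the helper and its caller are invented for the example. In C++ an unnamed namespace gives the same internal linkage.

```cpp
#include <cassert>

// Internal linkage: the linker does not expose this symbol, so every call
// site is guaranteed to be in this one source file.
static float clamp01(float t) {
    return t < 0.0f ? 0.0f : (t > 1.0f ? 1.0f : t);
}

// The only caller, defined directly below the helper.
float fade_alpha(float elapsed, float duration) {
    assert(duration > 0.0f);
    return 1.0f - clamp01(elapsed / duration);
}
```

Searching the file for clamp01 finds every use, and another source file that tries to call it will fail to link.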
The second bit is often (not always) easily avoided by leveraging pure functions and immutable state as much as possible. This is one of the messages from that Carmack email (no one tell Blow that Carmack now does a lot of Haskell programming), and confusingly, Blow agrees later that that's true. Given that all of his example functions are startlingly imperative, I don't know if this is something he's really internalized. A lot of the issues he raises are pretty easily fixed by embracing purity whenever possible.
As for the last point... I'm not even sure how to approach that. Why are we even commenting at all if comments are hard to write and inaccurate? Surely modern languages won't have any way to write comments since they just don't work.
Also, Blow's infatuation with globals is confounding. It's fine if your application needs "global" state, just put it in a structure that you pass around to every function that needs it. This has a lot of advantages: if you design your program without any global state whatsoever (outside of whatever global state the C language mandates), you have a lot of guarantees about your program's behavior. You now know that no function modifies your global state unless it has a (non-const) pointer to the state, you know that your global state is re-entrant, etc. You also may have more solid guarantees about thread safety, depending on how you designed the API for the global state object, of course. And this lets you spin up multiple "copies" of the global state within the same process, which is probably less important for games but rather important for other kinds of development. (His agreement with Carmack about the long functions is especially odd in this case, given that Carmack has talked about how bad global variables are, and how many troubles they invariably cause you, and how he appreciates that Haskell doesn't let you do something that has been empirically observed to be a bad idea.)
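As an illustration of what I mean, here is a minimal sketch; GameState and the function names are hypothetical and not from any real engine.

```cpp
#include <cassert>

// Hypothetical "global" state bundled into one struct.
struct GameState {
    int score = 0;
    int frame = 0;
};

// Read-only access: through a const reference this function cannot modify the state.
int displayed_score(const GameState& state) { return state.score; }

// Mutation is visible in the signature via the non-const pointer.
void advance_frame(GameState* state, int points) {
    assert(state != nullptr);
    state->frame += 1;
    state->score += points;
}

// Two independent copies in one process, which true globals would not allow.
int two_worlds_demo() {
    GameState live;
    GameState replay;
    advance_frame(&live, 10);
    advance_frame(&live, 5);
    advance_frame(&replay, 1);
    return displayed_score(live) * 100 + displayed_score(replay);  // 15 * 100 + 1
}
```

two_worlds_demo() returns 1501: the two states never interfere, and a reader can tell from each signature alone which functions are able to change them.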
His points about the maturation cycle of software are interesting, though.
Like last time, I have trouble relating to much of what he said. Then again, I don't do game engines/loops. So, maybe there is a point in focussing on game dev for his language project.
Two things I noticed:
In the talk he says something about C++ lambdas possibly incurring a heap allocation. This is not the case. Only when you wrap a function object in a std::function might you experience heap allocation overhead, and that depends on the size of the function object and whether your standard library has implemented an optimization for small function objects. I mean, they removed std::reference_closure from an early C++11 draft because std::function was believed to be implementable with similar performance characteristics. Also, you can use lambdas directly (without having them wrapped in std::function), and you should if you want to enable the optimizer to inline these kinds of functions.
I also asked myself how he wants to have his cake and eat it, too. I'm talking about the type of lambdas/closures being the same as the type of normal functions. Sure, he could make all function pointers fat and optionally use the remaining space to refer to the closed-over variables somehow. But what are the semantics? Does it refer to a stack frame, and is it hence not allowed to outlive the scope it was created in? If so, this will open up a whole lot of lifetime issues. You just don't know whether it's okay to hold on to a function/pointer because the type doesn't tell you. Or does he want to use the extra space in a fat pointer to refer to a heap-allocated block that is considered "owned" by the function? I don't think so. Blow dislikes heap allocations too much for this approach. So, my claim is, you actually don't want these things to have the same type. It might work well for things like C# delegates (although garbage collection in C# conveniently helps avoiding the lifetime issue) but not for arbitrary closures where possibly more than a single pointer-sized variable is "closed over" (if you don't like heap allocations, that is).
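To show what I mean by a fat function pointer, here is a rough C++ sketch. This is my own guess at one possible representation, not anything Blow has described, and FatFn and the helper names are made up.

```cpp
#include <cassert>

// Hypothetical "fat" function value: a code pointer plus an optional
// environment pointer. Plain functions leave env null.
struct FatFn {
    int (*code)(void* env, int arg);
    void* env;

    int operator()(int arg) const {
        assert(code != nullptr);
        return code(env, arg);
    }
};

// A plain function adapted to the fat calling convention; it ignores env.
int add_one(void*, int x) { return x + 1; }

// A "closure" body: env points at the captured variable.
int add_captured(void* env, int x) { return x + *static_cast<int*>(env); }

int fat_pointer_demo() {
    int offset = 10;  // lives in this stack frame
    FatFn plain{add_one, nullptr};
    FatFn closure{add_captured, &offset};

    // Both values have the same type, so the type cannot say that `closure`
    // must not outlive `offset`. Returning `closure` from here would dangle.
    return plain(1) + closure(1);  // 2 + 11
}
```

fat_pointer_demo() returns 13. The uniform type is convenient, but it is exactly what hides the lifetime question from the caller.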
so i'm an hour in and i noticed there's an inconsistency between declaring a datatype and declaring a function. in that the type of the data has to be between the equal sign and the colon, and the definition happens after the equal sign. whereas the type of a function sits in the definition area.
I fear that he will be unable to keep the promises he makes. He has a utopian view of what his language would be but ignores the finer details. In this talk he dismisses the question of how he'd handle the payload difference between a function and a closure.
I'd like to hear more about the game industry requirements. He's disregarded GC and RAII, so what is his replacement (in the first talk he said exercising the code and debugging, so malloc/free?). He mentions vector3 and says there are reasons for many implementations.
Personally I'd rather see him improve the state of developing languages (Go, D, Rust [preferably D]) than build a language from scratch. Fork D and strip out what you don't like, but take advantage of the existing compatibility with C (no foreign function interface) and the expansion of compatibility with C++ (which it sounds like his language will require). The license for D is permissive (Boost) so it shouldn't be an issue and it works with LLVM.
Does he really say near the end that the largest function in The Witness has 14000 or maybe 17000 lines? I fear I might need to get a hearing aid.
(time-jump link: https://www.youtube.com/watch?v=5Nc68IdNKdg&feature=player_detailpage#t=5191)