A Programming Language for Games, talk #2

Reddit Comments

This has been discussed heavily on /r/rust.

In Rust, closures have a different syntax from function declarations because, as /u/pcwalton says:

one of the design goals of Rust is to make things that have different machine-level semantics look different.

In Rust, functions can be used wherever you can use a closure, but closures cannot be used wherever you can use a function. This is because a closure must carry around an environment containing the data it has captured (you can think of a raw function as having a pointer to a zero-sized environment).

After spending a little time thinking about it, all of this talk of closures strays a long way from his actual concern - namely the idea of migrating functionality to larger and larger scopes over time. In Rust you can simply nest any item inside a function (including other functions) in order to limit its scope:

fn foo() {
    struct Bar { x: i32 }
    {
        fn baz(b: Bar) { /* ... */ }
        baz(Bar { x: 1});
    }
    // baz is not visible here
}
// Bar and baz are not visible here

There's no need to introduce closures at all.

👍︎︎ 6 👤︎︎ u/bjzaba 📅︎︎ Sep 28 2014 🗫︎ replies

I thought this one was better than the previous one. I think the biggest issue is how his syntax and desired features will accomplish his big ideas for his language, but he is still in a brainstorming phase.

I think the things he describes about large functions and how to migrate their code into smaller functions are closer to the industry wide best practices than he assumes, and he gives way too much credit for the stuff you learn in school being considered good practice in a professional environment. There is unfortunately a lot of unlearning that fresh graduates need to undergo when starting their careers.

When he talked about preferring a larger function over several smaller sub-functions and assumed it was controversial, it sounded to me like he discovered YAGNI. The sort of localized abstraction he talks about with moving inline code to local lambdas to module scoped functions to globally available functions is something I, and many of my coworkers, have done for years.

Something about his emphasis on large functions made me uncomfortable. Whereas Carmack's email came across as balanced and very considerate of the issues involved, Blow's explanation gave me a vibe of, "This giant ball of mud function is less readable and maintainable, therefore it must be faster!" Carmack seemed to advocate something like: small pure function > large impure function > small impure functions, whereas Blow sounded like he prefers: large impure function > all. One example was his assumption that the lambda version would be slower without measuring it. (C++ lambdas are easy for the compiler to inline if it chooses to. Mind, it might be slower, but I'd want hard numbers first.) Carmack spent a lot more time developing his point in the email than Blow did in his video, so I hope it is just me being unfair and reading too much into it.

I do think guidelines like 'consider splitting up functions > X lines' are useful, as long as you realize that "no, that function is fine as it is," is a perfectly acceptable answer. Thoughtlessly splitting up a large function is much worse.

His syntax for functions and variables looks similar to Rust and many ML derived languages, and I approve.

His idea of generalizing captures to blocks is interesting (is it a big idea?), though I'm not sure how useful it would be in practice. I'm unclear on the semantics of an empty capture list. In C++, this means that a lambda captures nothing. It sounds like this might still be the case for Blow's language, but then for other blocks, it allows all variables in scope.
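For reference, the C++ side of that comparison can be sketched as below. The function name `add_with_captures` and the variables are made up for illustration; only the capture rules themselves are standard C++11. It says nothing about how Blow's language would behave.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical helper: exercises the three common capture forms.
int add_with_captures() {
    int a = 1;
    int b = 2;

    auto none    = [](int x) { return x + 1; };   // [] captures nothing
    auto by_copy = [a, b] { return a + b; };      // holds copies of a and b
    auto by_ref  = [&a] { a += 10; return a; };   // refers to a; must not outlive it

    // Only a capture-less lambda converts to a plain function pointer.
    int (*fp)(int) = none;
    static_assert(!std::is_convertible<decltype(by_copy), int (*)()>::value,
                  "a capturing lambda has no function-pointer conversion");

    int from_copy = by_copy();  // computed from the copies made above
    int from_ref  = by_ref();   // modifies the original a through the reference
    assert(a == 11);
    return fp(0) + from_copy + from_ref;
}
```

The `static_assert` is the point most relevant to the thread: once a lambda captures anything, it stops being interchangeable with an ordinary function pointer.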

There is a pragmatic reason for C++'s capture block being first: it makes parsing easier. In positions where an array operation is allowed, it cannot begin a lambda, and in positions where it can begin a lambda, it cannot be an array operation. His syntax seems like it might be ambiguous with other expressions. As a user of a language, I really don't care about the hell the compiler writer has to go through to make my programs work, except insofar as it hurts availability on various platforms, makes it easier to produce vexing parses, and makes tooling difficult. Since good tooling and joy of programming are features he wants his language to have, it might be worthwhile for Blow to consider more in depth the consequences of his syntax, and to do so sooner.

He seems to love to claim that he really does understand C++ better than the naysayers would claim, and I trust that he does, but I wish he would show that rather than tell. He keeps making simple mistakes that seem to indicate otherwise, and when called out on it, he just bashes people on Twitter and in these videos instead of offering corrections. Carmack and Sweeney seem much more careful and deliberate when making statements similar to Blow's. In this video, the incorrect claim that stood out most for me was that C++ lambdas can allocate memory behind your back. I assume he confused lambdas with std::function (something I've seen several people do), whose job it is to unify function pointers, member function pointers, lambdas (potentially with captures), function objects and any other callable entities into one object. This sort of type erasure often (but not always) requires some dynamic allocation to store details of the underlying function-like entity. But std::function isn't the type of a lambda, and C++ lambdas won't allocate hidden heap memory.
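A minimal sketch of that distinction, assuming nothing beyond the standard library: the closure is an ordinary local object sized by its captures, while `std::function` is a separate wrapper type. The names `Sizes` and `measure_callables` are hypothetical, and whether the wrapper uses the heap for a given target is up to the library implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

struct Sizes {
    std::size_t closure;  // bytes occupied by the lambda object itself
    std::size_t erased;   // bytes occupied by the std::function wrapper
};

Sizes measure_callables() {
    double scale = 2.0;
    int offset = 3;

    // The closure is a plain local object that holds its two captures.
    auto lam = [scale, offset](int x) { return x * scale + offset; };

    // Type erasure: depending on the library, the target may live in an
    // internal buffer or in dynamically allocated storage.
    std::function<double(int)> erased = lam;

    assert(lam(5) == erased(5));  // both paths compute the same value
    return Sizes{sizeof(lam), sizeof(erased)};
}
```

Printing the two sizes on a given toolchain is a quick way to see that the lambda and the wrapper are different objects with different storage.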

Which leads to the next concern. He wants functions to look like lambdas, but regular callable functions are very different beasts to lambdas with captured environment variables at a system level. How does he plan to store them? Many languages require a hidden heap allocation, something I'm sure Blow would find unacceptable. C++ lambdas are syntactic sugar for a function object; the capture list becomes the constructor and member variables, and the body becomes operator(). This avoids allocation, but puts the burden on the developer to not do silly things like capture stack variables by reference that will go out of scope. I'm not sure how he plans to have his version avoid implicit allocations or make those allocations clearer to the programmer. He handwaves a lot of these concerns as easy problems to resolve later, but I'd think that avoiding allocations is a big feature (big idea?) of his language that he would want to work out early.
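The desugaring described above can be written out by hand. This is an illustrative sketch: `ScaleBy` and `scale_both_ways` are hypothetical names, and the real closure type generated by a compiler is unnamed.

```cpp
#include <cassert>

// Hand-written counterpart of: [factor](int x) { return x * factor; }
class ScaleBy {
public:
    explicit ScaleBy(int factor) : factor_(factor) {}    // capture list -> constructor
    int operator()(int x) const { return x * factor_; }  // lambda body -> operator()
private:
    int factor_;                                         // captured variable -> member
};

int scale_both_ways(int x) {
    int factor = 3;
    auto lambda = [factor](int v) { return v * factor; };
    ScaleBy functor(factor);

    // Both objects live on the stack and behave identically.
    assert(lambda(x) == functor(x));
    return functor(x);
}
```

Had `factor` been captured by reference, the member would be an `int&`, and returning the object from the function would leave it dangling. That is the burden on the developer mentioned above.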

I'm beginning to believe that a 'Big Idea' language is a language with an idea Blow doesn't like. All languages have big ideas. It is interesting that he dismisses Rust's focus on memory safety as a Big Idea, but then later seems like he is taking the first step down that road when he wants his compiler to detect some lifetime violations that C++ currently doesn't check. It will be interesting to see how far he walks down that path. For AAA games, when the safety checks get in the way of performance, the safety checks need to go, but it would be nice to have a language with great syntax for custom memory layouts (like he describes in his first video) that can go pretty far in protecting the programmer from mistakes.

His insistence that games need global variables (presumably mutable global variables, I haven't seen complaints that constants like pi are global) seems contrary to his desire to make concurrency easier. I'm curious how he plans to resolve that. I'd also love to see Carmack's or Sweeney's current thoughts on globals. Blow sounds rather positive about them, whereas in 2005, Sweeney treated them as an occasional necessary evil.

I'm still eager to see Blow continue these discussions, as I still believe that a good game programming focused C++ replacement is welcome. I've also been encouraged by the responses over at /r/rust and their openness to consider Blow's criticisms and add features suitable for game programmers.

Edit: silly typo. In C++, [] captures nothing, not everything.

👍︎︎ 13 👤︎︎ u/oracleoftroy 📅︎︎ Sep 27 2014 🗫︎ replies

Blow and I don't agree at all, at least with the first thing he brings up. He's arguing that you shouldn't refactor large (impure) functions into smaller functions for the following reasons:

  • You can't know when the smaller function is called.

  • You can't know what state the program needs to be in before the function is called.

  • Commenting to give the reader that information is not reliable because comments are hard to write (?) and may end up not being accurate.

As for the first point, if you need to know that a function is only called in one place, you can declare it static, and define it as close as possible to the place where the function is called. If you need to know where a static function is called, you can just search for it within the file and you can know it won't be called anywhere else because the linker won't expose it. (I'm talking about normal C-style functions here, but I think you would accomplish roughly the same thing when working in classes in C++ by using private member functions.)
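A small sketch of that approach, with made-up names (`clamp_health`, `apply_damage`): the helper has internal linkage, so any caller must be in the same file, and it sits directly above its only call site.

```cpp
#include <cassert>
#include <vector>

// Internal linkage: invisible to other translation units, so a text search
// of this one file finds every caller.
static int clamp_health(int hp) {
    if (hp < 0) return 0;
    if (hp > 100) return 100;
    return hp;
}

// Externally visible entry point and the only caller of clamp_health.
int apply_damage(int hp, const std::vector<int>& hits) {
    for (int h : hits) hp -= h;
    int result = clamp_health(hp);
    assert(result >= 0 && result <= 100);
    return result;
}
```

In C++ an unnamed namespace gives the same guarantee as `static` here.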

The second bit is often (not always) easily avoided by leveraging pure functions and immutable state as much as possible. This is one of the messages from that Carmack email (no one tell Blow that Carmack now does a lot of Haskell programming), and confusingly, Blow agrees later that that's true. Given that all of his example functions are startlingly imperative, I don't know if this is something he's really internalized. A lot of the issues he raises are pretty easily fixed by embracing purity whenever possible.

As for the last point... I'm not even sure how to approach that. Why are we even commenting at all if comments are hard to write and inaccurate? Surely modern languages won't have any way to write comments since they just don't work.

Also, Blow's infatuation with globals is confounding. It's fine if your application needs "global" state, just put it in a structure that you pass around to every function that needs it. This has a lot of advantages: if you design your program without any global state whatsoever (outside of whatever global state the C language mandates), you have a lot of guarantees about your program's behavior. You now know that no function modifies your global state unless they have a (non-const) pointer to the state, you know that your global state is re-entrant, etc. You also may have more solid guarantees about thread safety, depending on how you designed the API for the global state object, of course. And this lets you spin up multiple "copies" of the global state within the same process, which is probably less important for games but rather important for other kinds of development. (His agreement with Carmack about the long functions is especially odd in this case, given that Carmack has talked about how bad global variables are, and how many troubles they invariably cause you, and how he appreciates that Haskell doesn't let you do something that has been empirically observed to be a bad idea.)
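As a rough illustration of the state-passing style, assuming a hypothetical `GameState` struct (the comment speaks of pointers; references are used here, and the guarantees are the same): the signature alone shows which functions may modify the state.

```cpp
#include <cassert>

// Hypothetical "global" state, held in an ordinary struct instead of globals.
struct GameState {
    int score = 0;
    int frame = 0;
};

// Takes the state by const reference: the compiler rejects any write to it.
int bonus_for(const GameState& state) {
    return state.score / 10;
}

// Takes the state by non-const reference: mutation is visible in the signature.
void advance_frame(GameState& state, int points) {
    assert(points >= 0);
    state.score += points;
    state.frame += 1;
}
```

Because nothing is global, two `GameState` objects can coexist in one process without affecting each other, which is the "multiple copies" advantage mentioned above.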

His points about the maturation cycle of software are interesting, though.

👍︎︎ 8 👤︎︎ u/[deleted] 📅︎︎ Sep 27 2014 🗫︎ replies

Like last time, I have trouble relating to much of what he said. Then again, I don't do game engines/loops. So, maybe there is a point in focussing on game dev for his language project.

Two things I noticed:

In the talk he says something about C++ lambdas possibly incurring a heap allocation. This is not the case. Only when you wrap a function object in a std::function might you experience heap allocation overhead. And that depends on the size of the function object and whether your standard library has implemented an optimization for small function objects or not. I mean, they removed std::reference_closure from an early C++11 draft because std::function was believed to be implementable with similar performance characteristics. Also, you can use lambdas directly (without having them wrapped in std::function) and you should if you want to enable the optimizer to inline these kinds of functions.
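Using a lambda "directly" usually means taking the callable as a template parameter. The sketch below contrasts that with a `std::function` parameter; all three function names are invented for the example, and it makes no claim about what any particular optimizer will do.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// The callable's concrete type is a template parameter, so the compiler sees
// the lambda body at the call site and is free to inline it.
template <typename F>
int sum_mapped(const std::vector<int>& xs, F f) {
    int total = 0;
    for (int x : xs) total += f(x);
    return total;
}

// The callable is type-erased: every call goes through the wrapper.
int sum_mapped_erased(const std::vector<int>& xs,
                      const std::function<int(int)>& f) {
    int total = 0;
    for (int x : xs) total += f(x);
    return total;
}

// Hypothetical check that both routes compute the same result.
bool both_agree(const std::vector<int>& xs, int k) {
    auto times_k = [k](int x) { return x * k; };
    int direct = sum_mapped(xs, times_k);
    int erased = sum_mapped_erased(xs, times_k);
    assert(direct == erased);
    return direct == erased;
}
```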

I also asked myself how he wants to have his cake and eat it, too. I'm talking about the type of lambdas/closures being the same as the type of normal functions. Sure, he could make all function pointers fat and optionally use the remaining space to refer to the closed-over variables somehow. But what are the semantics? Does it refer to a stack-frame and is hence not allowed to outlive the scope it was created in? If so, this will open up a whole lot of lifetime issues. You just don't know whether it's okay to hold on to a function/pointer because the type doesn't tell you. Or does he want to use the extra space in a fat pointer to refer to a heap-allocated block that is considered "owned" by the function? I don't think so. Blow doesn't like heap allocations enough for this approach. So, my claim is, you actually don't want these things to have the same type. It might work well for things like C# delegates (although, garbage collection in C# conveniently helps avoiding the lifetime issue) but not for arbitrary closures where possibly more than a single pointer-sized variable is "closed over" (if you don't like heap allocations, that is).
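To make the first option concrete, here is one possible shape for a non-owning fat function pointer in C++. `FnRef` and `use_while_alive` are hypothetical names and this is only a sketch of the idea, not anything Blow has proposed: the object stores a pointer to the closure and a pointer to a small trampoline that knows how to call it.

```cpp
#include <cassert>

template <typename Sig>
class FnRef;  // primary template intentionally left undefined

template <typename R, typename... Args>
class FnRef<R(Args...)> {
public:
    // Binds to an existing callable without copying or allocating.
    template <typename F>
    FnRef(F& f)
        : obj_(static_cast<void*>(&f)),
          call_([](void* o, Args... args) -> R {
              return (*static_cast<F*>(o))(args...);
          }) {}

    R operator()(Args... args) const { return call_(obj_, args...); }

private:
    void* obj_;                  // points at the closure, wherever it lives
    R (*call_)(void*, Args...);  // trampoline that restores the real type
};

int use_while_alive() {
    int base = 40;
    auto add = [base](int x) { return base + x; };
    FnRef<int(int)> ref(add);  // fine: add outlives ref
    assert(ref(1) == 41);
    return ref(2);
}
```

Nothing in the type `FnRef<int(int)>` records how long `add` lives. Returning `ref` from `use_while_alive` would compile and then dangle, which is exactly the lifetime problem raised above.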

👍︎︎ 3 👤︎︎ u/sellibitze 📅︎︎ Sep 27 2014 🗫︎ replies

so i'm an hour in and i noticed there's an inconsistency between declaring a datatype and declaring a function, in that the type of the data has to be between the equal sign and the colon, and the definition happens after the equal sign, whereas the type of a function sits in the definition area.

👍︎︎ 2 👤︎︎ u/nsaibot 📅︎︎ Sep 27 2014 🗫︎ replies

I fear that the promises he makes he will be unable to keep. He has the Utopian view of what his language would be but ignores the finer details. In this talk he dismisses how he'd want to handle the payload difference of a function and a closure.

I'd like to hear more about the game industry requirements. He's disregarded GC and RAII, so what is his replacement (in the first talk he said exercising the code and debugging, so malloc/free?). He mentions vector3 and says there are reasons for many implementations.

Personally I'd rather see him improve the state of developing languages (Go, D, Rust [preferably D]) than build a language from scratch. Fork D, strip out what you don't like, but take advantage of the existing compatibility with C (no foreign function interface) and the expansion of compatibility with C++ (which it sounds like his language will require). The license for D is permissive (Boost) so it shouldn't be an issue, and it works with LLVM.

👍︎︎ 2 👤︎︎ u/nascent 📅︎︎ Sep 28 2014 🗫︎ replies

Does he really say near the end that the largest function in The Witness has 14000 or maybe 17000 lines? I fear I might need to get a hearing aid.

(time-jump link: https://www.youtube.com/watch?v=5Nc68IdNKdg&feature=player_detailpage#t=5191)

👍︎︎ 1 👤︎︎ u/grumpy_orange 📅︎︎ Oct 08 2014 🗫︎ replies
Captions
Thank you, everybody, for coming to another one of these talks about language design. This one's going to be shorter than the last one and more specific. This one is about declarations, so we're going to go into some syntactic issues in the language about how you declare things like variables and functions, and factorability, which is the act of, you know, moving parts of your program around relative to each other. I would say refactorability, which is how I said that last time, but, you know, the word refactor sounds very serious. It sounds like, oh, my code is terrible and I had to refactor it, right? Which is a thing that we do, but I want to get across the notion that code transformations and factorings, for certain styles of programming, are fluid things that happen all the time, right? So that's the subject of this talk: to go into that a little bit specifically and talk about how a language could support that kind of programming in a way that is not really being supported by other languages. All right. Now, to kick it off, one of the things that I talked about last time, and one of the reasons the talk was really long last time, is because I felt the need to establish that there was a really big gulf between what are generally considered to be best practices among programmers generally (if you go read a programming book, what the author of that book will tell you is right about how to program, or if you go to programming school, what they will teach you), right, there's a big gulf between that and what high-end game programmers are finding empirically to be true when they try to solve really hard problems at high levels of performance. Now, I don't want to put forth the idea that game programmers have some unanimous idea of how to program well. That's obviously not true either, right? If you ask a lot of people you'll get different opinions. But it is generally agreed among most high-end game programmers that I talked to that at least some subset
of what are generally considered by the programming public to be best practices are actually disruptive. Now, I wanted to point at another example of somebody saying these things, right, so it's not just me. Or if this idea that best practices are not necessarily best is shocking, there's some other people who will give you their own viewpoint on that. So one of those people is Mike Acton. He gave what appears to be this really great talk at CppCon just a few weeks ago, and the slides are up here on slideshare.net. The video of the talk is not available yet, but I'm hoping that it'll be made available soon because it's a really good talk, and what it will show you is, when a serious game programmer actually needs to sit down and confront problems in reality, how that leads them to conclusions that are different than the ideas behind these best practices, right? You know, Mike is the engine director at Insomniac Games. He works on a number of major, well-regarded, big and complicated games and has to solve problems on those games, and so he's really worth paying attention to and worth taking seriously. So if you're one of those people, you know, if you're a Rust enthusiast or a D enthusiast, and you think, wow, this guy Jonathan Blow is kind of a wacko, not believing in all this stuff that everybody else tells me is the right way to program, you don't have to listen to me. You can listen to some other people too, who, again, I'm not going to pretend that Mike has the same viewpoint as me, but there are similarities, right? So he approaches game programming, or high-performance game programming, from this lens of what he calls data-oriented design, and I think it's a very valuable viewpoint, and I would encourage people to check out this talk and see what it says. Now, there's a problem. Okay, so I've said that these best practices are different from what people are finding in reality, and when it comes to programming languages, that leads to this problem in reality, which is that when someone
sits down to design a new language, they do it through the lens of idealism manufactured by this consensual belief in these best practices, right? So these languages are designed to support those best practices, or sometimes to enforce those best practices when it comes to issues like safety, for example, right? And again, the problem is that these so-called best practices are not actually compatible with what we are finding to be a good idea in high-end games. So these languages are trying to enforce things that are unproductive in the best case or destructive in the worst case, and I'm going to make at least a couple examples of that as we go later in the talk. But again, part of what I'm saying is just a recap of what happened last time. So if you saw last time's talk and you didn't quite get it, maybe after these few slides you can go back there and re-listen to some of the points that were made there with this understanding that there is very explicitly a cultural difference. And if you see something in one of these talks where it seems like I don't understand, you know, how you're supposed to wrap a pointer to be safe or something, it's not that. It's that I disagree, and that many people in games disagree, and we disagree because of empirical numbers that come back in our profile reports, right, or just the experience of how hard it is to develop big software. Okay, so talking about best practices and talking about programming school, I want to kick into a little segment, which may be a recurring segment in future talks, called What I Learned in Programming School, right? One of the things that I learned in programming school is that anytime you write a really big function, well, you know, functions can get really complicated and hard to understand, so if you write a big function and it looks like the function could easily be factored into smaller functions, each of which has an explicit and clear purpose, that you should do that, and that's a good style, and
it helps you program better, right? So here at the top I've got this function called major function, right, and here in comments, where it's got like comment minor function one, that's just a placeholder. Imagine there's a lot of code there that does some job that you could call minor function one, right? I'm using the comments just because there's not room on the slide to put like a whole long function. But you've got some code that does function one and some code that does function two and some code that does function three. In programming school I learned that anytime you have that, you always, or almost always, very strong recommendation, that you should immediately factor those functions. In fact, you shouldn't even write it like the top version; you should write it like the bottom version, where you identify what the functions are, then you write minor function one as some external function with a body, and here I've put the body as dot dot dot because who knows what it is. It's probably some non-trivial task, right? So you write those as separate functions, and then in the body of your major function you just call those functions, right? Now, there are a few reasons why that seems like a good idea, right? It's easier to understand the function as a unit, and so it seems to logically follow that then it would be easier to understand your program, since you've made part of your program easier to understand, right? You provide greater code reuse, because if minor function one was buried inside major function, nobody else could call it, right, but as soon as you move it outside, anyone else in your program could call it, and that's good, right? And also, by externalizing these functions, making them into their own objects (object is not the right word, but, you know, by making them into their own things with curly braces), you prevent the functions from flowing over into each other, which may help you structure the program cleanly, right? Like if all this stuff was in one big block function, then
you know, the lines could cross over each other and the functionality could blur, and before too long you may not even know that you had three minor functions, right? So that was what I was taught in programming school, and through experience I came to understand that this is wrong. And the reason is that all of those good things that I said about why you want to factor come along with negative consequences, and those negative consequences are often bigger than the good thing that you got. So, for example, by factoring minor function one out into its own body, you make it easier to understand minor function one, maybe. Maybe that's not even true, because maybe if you just isolated it well enough in the main body it would have been just as easy to understand. But, you know, maybe you think that makes it easier to understand, but it actually makes your program as a whole harder to understand, because you've increased the number of functions in your program, right, the number of ideas that you're juggling around. And when you factor that function out, knowledge about that code that was previously explicit in your program becomes implicit, and implicit knowledge is easy to lose. You have to, like, keep it in your working set all the time when you edit that code. It makes it harder to program, actually. So what are some examples of this implicit knowledge that makes this a bad idea to factor? Well, for example, who even calls this function, right? If this function is just inside a big block, if it's inside major function, then you know only major function can execute this code. It's not even calling a function; it's just inline, right? But if this function is out in a file somewhere, then suddenly it's an open question. Like, well, I don't know, I have to search through the code to know who calls this function, or I have to use a third-party tool to report on my code graph or something like that. So that makes things harder, right? Another question that becomes raised, or knowledge that becomes
implicit, is when the function is called, right? By which I mean: what needs to be true in the rest of the program while calling this function? The program could be in all kinds of different weird states, and, you know, what state does it have to be in, what are the preconditions, and all that stuff. And, you know, again, when it's embedded in a bigger piece of program, those questions aren't maybe completely answered, but the number of questions is reduced, because now instead of two functions that you have to answer that for, there's only one function we have to answer that for, and the fact that that other function is longer and more major may make it easier to answer. Now, when you do this factoring, you could try to provide this kind of information by commenting the function, but I claim that's not really a good idea. Okay, first of all, because writing informative and easy-to-understand comments is actually hard. How many of you have come to a comment later, even if it was written by you, when you're like, I don't understand what this is trying to tell me, or I'm confused about something and this comment isn't clarifying for me? That's actually a lot or most comments, right? But the other thing about comments is, when they try to state factual information about a program, your comment is essentially code that never runs, right? Your comment is trying to express constraints about relationships between things in the program, which is what code is, right? But as we all know, code that never runs probably has bugs. Your comments never run, so your comments probably have bugs, which means they misinform people, right? And also, because it never runs, your comments never get tested, which means their statement of what is true is going to diverge from what is actually true in the program over time, because the program is not a static thing; it evolves over the course of the development cycle. So the argument that you can make up for factoring and this loss of knowledge by inserting comments, I
would claim, is untrue. So to recap that: when you factor out a function, the result of correct use of that function often depends on the outer state of the program, right? And if it looks like you can understand that function as a separate unit, maybe that's really helping you understand it, or maybe that's just deceiving you, right, and there's something subtle about the function that you didn't understand that introduces bugs. And after years of programming experience I came to understand this at some subconscious level, but I never understood it fully consciously until I read an email that John Carmack sent to programmers at id Software in 2007 that lays it all out very specifically. Now, this is a really long email and it covers a lot of issues, so I'm not going to read the whole thing; it would take a while. But I'm going to isolate a few spots that are relevant to what I'm talking about. He starts by talking about a relatively small code base for a simpler problem, right, the Armadillo Aerospace rockets, and he says: the flight control code for the Armadillo rockets is only a few thousand lines of code, so I took the main tick function and started inlining all the subroutines. Right, that's the reverse of the factoring process we were just talking about. While I can't say that I found a hidden bug that would have caused a crash, literally, I did find several variables that were set multiple times, a couple control flow things that looked a bit dodgy, and the final code got smaller and cleaner. Right, so then later in the email he's talking about larger games like Rage, which was his project at the time, and he says: if something is going to be done once per frame, there is some value to having it happen in the outermost part of the frame loop, rather than buried deep inside some chain of functions that may wind up getting skipped for some reason. Besides awareness of the actual code being executed, inlining functions also has the benefit of not making it
possible to call the function from other places. That sounds ridiculous, but there is a point to it. As a code base grows over years of use, there will be lots of opportunities to take a shortcut and just call a function that only does the work you think needs to be done. There might be a full update function that calls partial update A and partial update B, but in some particular case you may realize, or think, that you only need to do partial update B, and you're being efficient by avoiding the other work. Lots and lots of bugs stem from this. Most bugs are a result of the execution state not being exactly what you think it is. And this last sentence is important, so I'm going to beat it like a dead horse: most bugs are a result of the execution state not being exactly what you think it is. And so when you factor your code into more functions than you needed, you encourage that kind of bug to happen, right? And this whole idea that I've spent the last five or ten minutes talking about is contrary to commonly accepted best practices. It's contrary to what I was taught in programming school, right, and thus it's contrary to the assumptions behind many modern programming language designs, which is one of the motivations, I claim, behind wanting to make a language that does what us game programmers actually want to do. Now, there's a caveat to this whole worry about factoring, which John himself wanted to make sure that I put forth, which is in the case of pure functions, so that's functions that don't have side effects, either to their arguments or global state of any kind, don't do input-output, probably, right, don't read global state of any kind. In the case of pure functions, you don't have any of those kinds of problems I talked about, like, you know, when during the program can this be run? Well, it can be run any time, so it doesn't matter. So you don't have those problems; you can factor more readily. You still might want to be a little careful about it, because you are, you know, increasing the number
of concepts in your program, but it's better in the case of pure functions, and that can be a goal of language design, as it often is, and we'll come back to that later. But going back to John Carmack's email for one last point, he says: I know there are some rules of thumb about not making functions larger than a page or two, but I specifically disagree with that now. If a lot of operations are supposed to happen in a sequential fashion, their code should follow sequentially. And, you know, John Carmack is one of the best programmers in games, if not the best programmer, so when he says something about how to program, I at least take it seriously, right? Fortunately, in this case he was saying something that I already agreed with, so it's easier to believe, but I didn't realize consciously that this is what I thought, right? And so I think there's a lot of value to people coming out and saying blatantly things like this, right, which is one reason why I want to put forth things like this in this talk. It's not to make C++ or D or Rust programmers mad. It's to say, no, really, this is the experience that we've gained over these years, and it's important to us to leverage this experience in whatever way we may find of programming better in the future. So now suppose we want to find middle ground sometimes. Suppose we've got a big long block of code, and we don't want to totally factor it apart into pieces and make it more confusing, but we want to reuse some code a little bit inside. Well, if our language supports locally scoped functions, then we can factor it into a local function instead of a global function, which to a small extent reintroduces those kinds of problems I talked about, but to a greater degree it mitigates them, right, which helps understandability of the program. So with a function in local scope, we know who calls it: it's somebody else in that scope. We may not know exactly who, if that scope is big, but it's greatly limited. It can't be anyone in the whole
program, right? If it's locally scoped to that function, then what states can the external program be in? Well, only the states that could be true for that external function. So it's much like the case when it's just an inline block. So I think locally scoped functions are very interesting for that fact: they aid you in keeping a clear understanding of program structure.

C++ didn't until recently provide locally scoped functions, but it has for a long time, along with C, provided local blocks. So you're just programming along a big linear sequence of statements, and you can open a new scope with curly braces and just start typing stuff, and you can declare some new variables in that scope, and then you close that with a curly brace, and whatever you declared inside there can't then be referenced from the rest of your function. So the concepts don't spill out. It gives you a way of increasing hygiene and keeping things contained, while keeping all the goodness of that inline code idea and without introducing the confusion of factoring things out. So local blocks, local scopes, are a good idea.

And there's another thing that they're good for, which I think a lot of people are not going to be convinced by in the beginning, but this is a real thing: they're good for when you're doing a series of similar actions that you don't want to factor into a common function yet, if ever. And so what I'm going to talk about next is when you're programming in an exploratory way. You're just getting started with these concepts that you're programming, so you don't know how they're going to come out, and you're just trying some stuff. So here I've got a code example. It's way simpler than what I'm actually thinking of. This is a simple analogy for what I'm actually talking about, which would not in any way fit on this slide, because big functions are big. So I'm making this array of character pointers, and each one is data for a person, so
the array is called people. And then I've got three blocks here that are all basically doing slight variations of the same thing: I'm creating a new character, I'm setting his name, and I'm adding him to the people array. And yes, if you're a C++ fanatic you'd probably put the name in the constructor and all that, but again, imagine that we're doing this for a situation that's more complicated than what is actually on the screen. So I do this three times: one time the character's name is Mike, one time it's Leon, and one time it's Josephine. And as I've commented at the bottom, an example this simple looks like bad, messy, ugly paste code, and it kind of is. You probably shouldn't ever type this into the computer. But in real and more complicated cases, where you're building a complicated program, you might not really know what you're trying to do, and the differences between each of these instances might be more than just setting the name. You might start out with 10 or 11 differences, and the differences might not even be the same; they might be disjoint for each one of these sets of curly braces. And then over time, as you gain experience with the concepts and wrap your head around them, you change around the abstractions a little bit, and then you figure out what the real differences are, and then you're ready to factor the code. But at this point, when you've just typed it in, if you don't understand your code, you don't want to factor it prematurely.

So maybe I've developed these concepts for a while, and I understand now clearly that all I need to do is change the name. So now I take the next step, which is: let me clean up that code by factoring it into a function, which here I've called maker, which does all that job of making a new character and setting the name and adding it to the array. Now that it's factored into one function, I can't have paste errors and that kind of thing, and I just call it
three times with the names. So that's better code, unambiguously, than on the previous slide. I would like to make a locally scoped, just regular old function, but in C++ I can't do that, so here I've done it with a lambda, which is what this syntax is, and we'll get more into this syntax later if you're not familiar with it. But anyway, I'm declaring a function that's allowed to access this array people by reference. Now, one reason I might not want to do a lambda like this in C++ is that performance is now questionable. I've heard reports that a lot of compilers will do, like, a heap allocation to store the data for the variables that the lambda captures, like people, and that's not really something you want if you don't want to take on that overhead at random points in your program. But in a language that's designed more around this concept, you wouldn't have that problem. So whereas you wouldn't always want to do this in C++, especially if you're doing something performant, in a newly designed language, or in a more modern language that cares more about closures, it's much more feasible.

Okay, so then the third step, again, is: well, maybe set up people is not the only function that wants to set up people. You might have a set up other people somewhere in your program, and now that wants to call maker, and so now we've factored it out all the way into global scope, and now anyone in our program can call maker. (Someone says "proofreading fail." I'll check that out later.) So anyone in your program can call this now, and hey, that's good stuff, maybe. Again, though, it brings in those complications that we were talking about earlier. Now, note that the syntax for the function declaration changed between when it was a lambda and when it's in global scope, and why was that really necessary? We'll answer that. But people might be having this question: why do it this way, in multiple stages? Why didn't you just write this version first, where you declare the function globally and then you just call it a
few times? And again, the reasons are that in a real-world situation you'll have a much more complicated set of constraints you have to deal with, which may be very hard to think about. There may be many functions, there may be many disjoint requirements that you have for each function, and if you type that last version first, or the analogue of that last version for a much more complicated scenario, you'll usually be wrong, and you'll usually have to spend a lot of time and effort rewriting it and refactoring all that stuff. You might have to tear it up and rewrite it three or four times. And I know this from experience, because I used to program that way and I used to have to rewrite that stuff all the time. Instead, if you start simply, with a linear set of statements indicating what you have to do, and then, as you discover what you really have to do and what the commonalities are, you factor things out, then what you're doing is what I'm talking about down here: you're letting the solution to the problem determine the structure of your program, rather than deciding the structure of your program a priori and then hoping that you're right, that it conforms to the solution to your problem. Because you're often going to be wrong if you try that. But the way that I'm talking about, where you write it at an inline level and then you gradually lift things out into existence in the global scope, gives you a smoother way to approach the problem, where you don't really have to rewrite things, and you just discover what your program wants instead of trying to tell your program what it should want.

This is a very important point. If you haven't heard this before, if it sounds strange, if you're hearing that and saying, oh, I don't really know if that's the right way to program, I can refer you to another person's work which is specifically about this. My friend Casey Muratori has a talk about semantic compression, or actually it's an article, and you can read it at this
location, and he talks all about a similar process to what I've just discussed, with a different example, a user interface example. So if it sounds weird, check this out. I promise it's a really good way to program. Once you learn how to program this way, a lot of things become a lot easier. Quality of life goes way up, because you don't have to tear up things that you wrote all the time.

Okay, now if we want to design a language that supports this kind of programming paradigm, we want to choose a syntax that's supportive, and it should probably diverge from the syntax of C and C++, because as we saw, they're not necessarily very friendly: as I move that function around, the syntax has to change. So let's start simpler than functions, though. Let's just start with declaring regular variables, scalars. My friend Sean Barrett, in response to the last talk, wrote in and said: hey, back when I was making a language, I came up with this assignment syntax where you use colon to mean you're declaring something and equals to mean you're assigning something. And it's this orthogonal set of symbols that you can use in different combinations to do the kinds of things you want to do in C or C++. So in this top line, f : float, it means I'm declaring the type of f. It's a float; I haven't assigned it a value. In the next line I'm saying f : float = 1, so f is a float with the value 1; I'm declaring f and I'm initializing it. In the third line, f := 1, it's like f is a type I'm not specifying, but it's equal to 1. In this case the type inference in the language would kick in and fill in the blank between the colon and the equals. In the last case I can say f = 1, and that just means, well, f had better already have been declared as an integer or some type that is compatible with the number 1. So, as with any syntax, it's a little bit alien if you haven't seen it, but once you're used to it, it makes a lot of sense. I don't want to make the case that this is
exactly the syntax that any new language for games should use. I don't want to do that at all, and that's not Sean's opinion either. There are reasons; for example, you might think that colon-equals is a little too much like equals, and that you might get confused or something. I mean, I think they're less like each other than equals and double-equals are in C and C++, but it's a legitimate thing to think about. You might think that it's too much punctuation in your program or something, I don't know. But what I do want to put forth is our target of consistency and understandability. We can benchmark any kind of syntax we come up with against whatever idea, whether it's this or some other idea, that we find has maximum consistency and maximum understandability. And we can even give up consistency and understandability in order to get things like brevity, but we should at least understand what we're giving up. And if you start with the syntax from something like C++, you'll never understand what you might be giving up, because it's so confusing already. Now, to be clear about that colon-equals: these two statements are the same. If I use this auto, like I have in C++, then f : auto = 1 is just the same as f := 1. There's an invisible auto in there.

So that was scalars. Now, if we're designing functions, how do you declare a function? We want a similar consistency, and we want to avoid meaningless syntactic changes that create drudge work as I gradually factor my program from its beginning state into its final state. So what are my desires? What do I want to be able to do? I want to be able to paste a function between local and global scope. I want to be able to switch that function between being named and being unnamed, a lambda (anonymous function is another name for that). And I want to switch between that function being a method or not, if methods are even really a thing in whatever language this is, and there's a good
reason to think that maybe we should get off the method train, or never have gotten on it in the first place. That's beyond the scope of this talk; that's probably next talk. So for this talk I'm talking about points 1 and 2: how do you just move this function around in your code without having to do meaningless syntactic changes?

So to illustrate that, I want to go to another example that, instead of just declaring a function, uses a function as an argument to another function, which we didn't do last time; last time we were just calling a function. So here again we're setting up our people. We've got an array of people, we're putting Mike, Leon and Josephine into that array of people. Now we've put this locally scoped lambda down here. Well, it's even more than locally scoped: it's written inline in the function call, which is as locally scoped as you can get. So we've got some function somewhere that's not defined here, called modify names, and it takes an array of things whose names we want to modify, and then it takes a function that it applies to the name of each person in this array. And here's what our function does. It has an empty capture, so it only refers to its arguments, of which there's one, called name, and what it does is look for the first occurrence of the letter I in that name, and if it finds the letter I, it changes that I to an O. Really simple function, tractable to type in at this one point in the code. I have no idea when I type this if anyone else in the code is ever going to want to do this stupid thing of changing an I to an O. Maybe they will, but maybe not. So it's best from a code hygiene standpoint if I keep it here. Now, what I've set up here may or may not be considered good style. The reason you might not consider it good style is just because maybe this is hard to read when it's smashed into this function call like this, I don't know. So you might prefer not to do this, but you might. That's a legitimate
argument to be had, and I think it has a lot to do with the syntax of the language containing the lambdas. But anyway, this is one of the things lambdas are supposed to be for: to be this light, throwaway thing where I define a function, I use it, and then it's gone, and nobody else has to think about it. That's sort of the original motivation of a lambda.

Okay, so now I want to use that same function twice, because I now not only have an array of people, I have an array of cities as well, and I want to perform the same operation on both arrays. So I've factored out my anonymous function into something with a name. I've called it I to O. It's still declared as a lambda, because you have to do that for local scope in C++. To be clear, this is still C++ code; it's not in our imaginary new language yet. Same function: it takes a name and it changes I to O in the name. And then we call this twice: we call it on people, we call it on cities. It's great that we factored it out instead of typing it twice, which would be horrible, because if we typed it twice we'd make paste errors, and might fail to update one as we modify the other, and all that stuff. So we want to factor it, but we only want to factor it as far as we need, which is right before we use it. However, all those performance caveats that I mentioned before about lambdas apply here too.

Okay. Ideally, in C++ I would like to do this. So I'm going to flip back and forth so you can see the difference between these slides. I would like to just declare this with the same syntax as I declare a globally scoped function, but I can't do that; I have to do this. Why? Because C++. And then again, at a later step we might want to go to file scope. So now we want to let other people in this file call I to O, not necessarily anyone in the program yet, because maybe I think that everyone who wants to modify names is contained within this file. So I'm still keeping hygiene. I'm declaring this as static, so that other people won't see me.
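The stages walked through here (inline lambda, named local lambda, file-scope static function) can be sketched in C++ roughly as follows. The slides are not reproduced in this transcript, so this is only an approximation under stated assumptions: the names modify_names, i_to_o, i_to_o_local, the stage_* functions, and the city names are all hypothetical, and plain strings stand in for the talk's character type.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the talk's "modify names": applies f to each name.
template <typename F>
void modify_names(std::vector<std::string>& names, F f) {
    for (std::string& name : names) f(name);
}

// Stage 3: file scope. `static` gives the function internal linkage,
// so only code in this translation unit can call it.
static void i_to_o(std::string& name) {
    std::string::size_type pos = name.find('i');
    if (pos != std::string::npos) name[pos] = 'o';
}

// Stage 1: the lambda is written inline in the call; nobody else can see it.
std::vector<std::string> stage_inline() {
    std::vector<std::string> people = {"Mike", "Leon", "Josephine"};
    modify_names(people, [](std::string& name) {
        std::string::size_type pos = name.find('i');
        if (pos != std::string::npos) name[pos] = 'o';
    });
    return people;
}

// Stage 2: the same body, now a named local lambda used on two arrays.
std::vector<std::string> stage_named_local() {
    std::vector<std::string> people = {"Mike", "Leon", "Josephine"};
    std::vector<std::string> cities = {"Paris", "Lima"};  // assumed sample data
    auto i_to_o_local = [](std::string& name) {
        std::string::size_type pos = name.find('i');
        if (pos != std::string::npos) name[pos] = 'o';
    };
    modify_names(people, i_to_o_local);
    modify_names(cities, i_to_o_local);
    people.insert(people.end(), cities.begin(), cities.end());
    return people;
}

// Stage 3 in use: any function in this file can now call i_to_o.
std::vector<std::string> stage_file_scope() {
    std::vector<std::string> cities = {"Paris", "Lima"};
    modify_names(cities, i_to_o);
    return cities;
}
```

The body of the function is identical at every stage; only the declaration syntax around it changes, which is the friction being described. Dropping the static keyword from i_to_o would be the final step to whole-program visibility.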
They can't call me, and my link time is improved over the whole program if I do this for a lot of functions, because there are fewer functions to confuse the linker. So that's a good idea. But then at some point, if I decide this becomes a really common function that really everybody wants to use, I can take away the static, and now the whole program can use this. So this is the cycle that this one bit of code has gone through, from being this little inline itty-bitty thing that nobody cares about to being a global concept that everybody cares about. How relevant this piece of functionality is to the rest of the code controls where it lives in the program. So my point here is that code has a maturation cycle that it goes through. You don't just write the final thing, usually, because you don't know what the final thing should be when you start. If you only have experience writing simple programs, or programs for school, where they give you a stated exercise and you just have to type in the answer, then you won't necessarily believe that. But for complicated programs solving hard problems in the real world, you can't just write the final thing. You have to work your way there from humble beginnings. And so your programming language should be designed to support the transition from those humble beginnings to the final product. C++ doesn't help you with that very much.

So let's switch to a different function. This is a very classic function, square, where I give it an argument x and I return the square of x, x times x. Here's how you define it in C or C++. If you want to pass that as an argument to a function, then here's how that function has to declare the argument type so you can type check it, or at least that's the classic C way, which you can also use in C++. Now suppose you don't want that to be a global function, though. Suppose you want it to be anonymous, like those earlier examples. Well, not only can you not quite declare the function in
the same way (you have to use this capture syntax and auto and stuff that don't really apply to the global function), the argument type for the person who takes the function as an argument has to be totally different. You probably have to use this std::function craziness, or you have to write something yourself, and as soon as you pull in std::function, you're in for a lot of complications that you may not be ready for, again having to do with compile time, quality of error messages, size and speed of code, understandability of the program. All these arguments that I made in the last talk come into play here.

Now, Rust has lambdas as well, and they do a little bit of a better job, but there are still weird, meaningless syntactic differences that would inhibit a similar kind of code migration. I apologize if I've made mistakes in this slide. I am not a Rust programmer, but this is my understanding according to the commonly available Rust documentation on the web. If I want to declare a function at global scope in Rust, like square, I use this syntax where I start out with fn, I declare the argument, I use this return value syntax that's really not different from the C++ lambda syntax, and then I give the body of the function. Curly braces are mandatory for the body of the function. But if I want a lambda, and then name it as a local variable inside a function's local scope, then I have to use the pipe symbols to declare the argument list. Why? I don't know. And then curly braces are not mandatory for lambda function bodies. I guess they would go here, right around the x times x. They're not mandatory, although you can use them, as long as you don't put a semicolon here. It's weird, right? It's, to me, relatively meaningless, and I'm not sure why they did it, unless they had problems parsing the syntax of their language and wanted to make it easier, to which I would say: implementation is always a concern, but the greater concern should be usability.
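For the C++ half of this comparison, here is a minimal sketch of the two parameter styles mentioned: the classic function pointer and std::function. The slide code is not in the transcript, so the names process_array_c, process_array_fn, and demo are hypothetical. One detail worth noting is that a lambda with an empty capture converts implicitly to a function pointer, so the mismatch in parameter types mainly shows up once the lambda captures something.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// A plain global function, declared the classic C/C++ way.
float square(float x) { return x * x; }

// Classic C style: the parameter is a pointer to a function float -> float.
void process_array_c(std::vector<float>& a, float (*f)(float)) {
    for (float& v : a) v = f(v);
}

// Type-erased style: std::function accepts any callable with this signature,
// including lambdas that capture, at some cost in compile time and overhead.
void process_array_fn(std::vector<float>& a,
                      const std::function<float(float)>& f) {
    for (float& v : a) v = f(v);
}

std::vector<float> demo() {
    std::vector<float> a = {1.0f, 2.0f, 3.0f};

    process_array_c(a, square);                         // named global function
    process_array_c(a, [](float x) { return x * x; });  // captureless lambda
                                                        // converts to a pointer
    float y = 1.0f;
    // process_array_c(a, [y](float x) { return x * x + y; });
    // ^ would not compile: a capturing lambda has no function-pointer conversion.
    process_array_fn(a, [y](float x) { return x * x + y; });
    return a;
}
```

Starting from {1, 2, 3}, demo squares the values twice and then applies x * x + y with y equal to 1, giving {2, 257, 6562}.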
But the good thing that Rust does is, at least to my understanding, the type of both these things is the same. So if you want to declare a function that takes an argument that could be either of these things, then the type of the argument that you declare in that function header (not header, the function declaration) is just like this. It's the same for both: you say it's some function, it takes a float argument, and it returns a float. So that's cool. That's better than C++ for sure. But the fact that the syntax is different still bothers me, and the fact that as I migrate a function from very local to very global I'm going to have to change the syntax still bothers me.

And to illustrate exactly why, I'm going to go back to another segment of what I learned in programming school, but this one is going to be a positive example: something they taught me that is actually true, mostly true, and is actually a useful understanding, which is that anonymous functions, or lambdas, are really not a big deal. Lambdas are just functions without names. You use them for convenience. You often use them in cases where you want to do something light and simple and fire-and-forget and frictionless. That's great. It feels good to use them. However, if you're designing a language, or if you're programming, and you think that they're a big deal, you're going to be bringing this mindset of encumbrance with you, and you're sort of working against the purpose of lambdas, and that's a problem. It's important not to do that, and I'll say why in a bit.

Now, the language I was taught first in college was Scheme, which is a very simplified version of Lisp. In Scheme you have this syntax where everything is parentheses, and if you're defining square, you would do it like this. The function is always the first argument within parentheses, and then the arguments to the function are
always the rest of the arguments in parentheses. So I'm defining a function called square that takes an argument called x, and x times x is how you compute it. You may notice there's no type information here. I would never recommend that people use Scheme for real purposes in the real world, precisely because it doesn't have type information, so you don't get static type checking; the syntax makes it hard to read sometimes; there are a lot of reasons. It's garbage collected, you know, my favorite reason not to use things in the real world. But it was a really cool language for learning principles of programming in a way that's different from how you see them in a language like C. So at first, though, it looks like the syntax might be a little different. Like, okay, if I want square to be a lambda, then I have to do this, and well, at least all this part looks exactly the same, but this part to the left is a little different, including the word lambda, and isn't that weird? Well, a little bit. But actually what you learn as you learn more Scheme is that define is syntactic sugar. It's a macro that actually expands to this line that I have on the bottom: define, symbol, lambda (or a value, or whatever you put on that right side; if you're defining a function, it's this). So what does that mean? Well, the syntactic difference between a define and a let is just the word define or let. The rest of it is exactly the same. So in a language like Scheme, lambdas are just not considered a big deal. They're organic, they're part of how you do stuff, and you just happen to give functions a name a lot of the time if you want to refer to them. Now, part of the reason it's so natural in Scheme and Lisp, which has had this since the 1960s, is that because the language is garbage collected, you can just make a closure that holds the values of your variables and a pointer to your outer scope, and your outer scope will stay alive
for as long as you want. And the thing that makes lambdas seem a little bit complicated in modern languages is that if you assume the outer scope is alive, that may be a mistake, because a lot of the time you don't want the outer scope to be on the heap. We usually want it to be on the stack, actually, if we want our language to be performant. So you have to capture things off the stack or whatever. But the thing to keep in mind is that that's just an optimization making it more complicated than the foundation is conceptually. Lambdas are not new top-secret government technology introduced by C++. They've been around since the frickin' 1960s, before I was born, and I'm pretty old at this point. So please understand that. And I think that should be reflected in the language. So if we concoct a basic function syntax, our lambdas shouldn't look different from our non-lambda functions. Or, lambda is not even the right word: our anonymous functions should not look different from our named functions.

So here's a function in C; here's how I would declare square. And in our proposed language, I'm going to propose, and again this is not a serious proposal for all time, this is only for the purposes of this talk, to illustrate a consistent and understandable function syntax that doesn't change depending on where things are in the program. So at top level or local scope, I'm going to claim I can do the same thing. I can type exactly the same series of characters. I can say square colon-equals (that was from our slide with Sean's ideas before, where the colon says I'm declaring something and the equals says I'm also giving it a value), and that value is some function that takes an argument that's a float called x and returns a float, and here's the body of the function: return x times x. So that's not that different from a lambda in C++, a few relatively minor syntactic differences, but we're using it everywhere. So if we go to
some random code and we call some function where we want to pass this as an argument, what we type for that argument is exactly the same as what's up here. The only difference is the name assignment, this square colon-equals. That's all; the rest of it is exactly the same. And again, if it seems a little verbose, recall that in most of these new languages you can elide this return value if you want to. Whoops, so how you spell both. You can elide this return value and have it be implicit. I'm not sure that's a good idea, because that return value can be valuable documentation about what your function does, but if you want to do that, you can.

So now let's look through the possibilities for how you might use this with the colon and equals that were introduced earlier. I can say some value f is being defined, so this is a new symbol, as a function that takes a float argument and returns a float, but I haven't given it a value. There's no equals here, so I'm not saying what function this is, I'm just saying f is some function with this type. Or I can say f colon-equals, where we're giving it a value of that type with this certain body. Now note that we haven't specified the type in here, because that would be really verbose, but because our function definition also says the type, it's clear. So the implicit auto in here basically would say this, if we bothered to type it all out, and we probably never would. So what's cool about this is the difference between a type declaration and the definition is just that we add the body on the end. It's easy to understand. You don't have to remember, like, oh, the name kind of goes in the middle of the function between the type and the body, or whatever. You're just tacking stuff on, and that's cool. If f is already declared and it's just some variable, and you have another variable, an existing lambda of the same type, you can assign f to that, just with equals, like it would work in any language. Or if you didn't have it in an
existing variable and you wanted to type it explicitly, you can type a thing, and the thing that you type is exactly the same thing that you would type if you were defining this as a function at local scope or at global scope. It's the same, same, same, same. No complications. Because the simpler your language is, the nicer it is to program, usually (you can go too far, but usually the nicer it is), and the fewer complications you have to absorb into your mental model, which means actually that we're free to spend that complication budget on other things that might give us new features. That's my claim, anyway.

So lastly, then, if I wanted to pass this as an argument to some other function, like we were doing with that function that changes I's to O's in people's names, then say we've got some function that does that, called process array. You pass it an array of floats, and it's going to call some function on every float, like it's going to square every float or whatever you want to do. Well, here is the type of that function argument. It's not some crazy std::function thing. It's the same thing that we typed up here to declare the function. It's as straightforward as you can get. Here's the body, whatever that would be, which we don't need to go into.

So somebody might say: isn't this a relatively surface-level subject? This is just about syntax, and syntax isn't important. After programming in C++ for a while, I think syntax is actually very important. This is about that joy of programming, which, as I said last time, is very important to the morale of the programmer. If you are joyful while you're programming, if you enjoy what you're doing, then, again, I claim, and in my experience and in the experience of many of my friends, you're going to be much more productive, or at least you'll have higher quality of life, which we all want. Just like I want to believe that my life has meaning, I want to feel that when I'm entering all these
keystrokes, those keystrokes have a meaning and a purpose. I don't want to feel like I'm doing meaningless drudge work just to move a function from one place to another, or just to move a variable from one place to another. That's annoying, and after a long time it grates on you. Header files are maybe the worst offender in that regard, but once you get rid of header files, there are other offenders that now become the worst offenders, and this is one of them.

So that was anonymous functions, but they didn't really have closures; they didn't have a capture. Whenever we had a capture, it was empty. So now let's talk about capture. So if I'm in C++ and I want to define a function (I'm just calling it f now to fit it on a slide, but if you look at the body, it's a lot like square, where I'm multiplying x by x, but now I'm adding some y), where does y come from? Well, it's not in the function arguments. It's still only a function that takes one argument. Where that y comes from is that the value gets captured at the time that I declare the function: I take on this value of y. And so what that means is I can pass this to that function that wanted things like square as an argument, but I've snuck in an extra value here, this y. Now I'm going to suggest, again, what seems like a minor syntactic change, but that actually, I think, turns out to be kind of major and to inspire really interesting and useful ideas, which is part of what I love about thinking about programming languages: we get these surprises, like, oh, I thought this was a tiny thing, but it turns out to be big. So instead of capturing y over on the left, I'm going to say that we're capturing y over on the right. And the rest of the syntax is just what we proposed: we've got colon-equals, we've got the arrow for the return value. So again, this is the type of the function, this is the capture, and this is the body
of the function. Now, there are a bunch of reasons why, I claim, this is a better order to declare things in. One is that you don't really want the capture to be considered part of the type of the function. You want the type, first of all, to be cleanly on the left, so that if you want to change this from an assignment into a declaration, you just put a semicolon here and delete the rest of the line, and that's a very fast copy-and-paste operation. You don't have to do all sorts of weird edits in the middle of the line. So that's useful just from a pragmatic, day-to-day, using-an-editor standpoint. But also, if you pass this to a function, what does that function care about? Well, if you're overly constrained, in a language like C++, by default it would care about the capture, because that determines how much storage the function takes. But in usage we actually don't want anyone to care about that, which is why std::function provides type erasure, to get rid of the capture. So what we really care about is how many arguments it takes and what the return type is. That's what we want the type checking in our language to do for us: to make sure, when we provide a function, that that function is what the callee expects you to pass it, or what the struct member expects to be stored on the struct, or whatever. So we want the type by itself. The capture is not part of the type. Instead, you can think of the capture as a property of the code block, and not of the function header or the type declaration of the function.

Now, once you think of the capture as being part of the block, you can apply it to any block. Suddenly you're like: oh, I have an inline block in the middle of a function, why couldn't I put a capture on that? Well, what would that mean? Because it's not really a function, it's not really capturing a value; that's a value that that block could already access. So what's the point of that? The point of
that is that a capture restricts you so that you can't read any other variables, just like the lambda wouldn't be able to, right? So what happens to me all the time (it happened just the other day) is that I'm writing a bunch of straight-line code, and then I have some local block in that code that I decide it's time to move into a function. That local block might be 50 lines, 100 lines; it might be pretty long, right? So what I do is I mentally think, okay, this block, when I move it out of the scope of this function, is going to need to take some arguments to supply the local variables from that function scope that the block used to be accessing. And I think I know what those were: it's this variable and that variable, only two, right? So I factor that out, I define the function, I hit compile, and I get a ton of compile errors, because there weren't really two variables, there were like seven, and I didn't realize it. And so now I've got kind of a mess, because C++ compilers are not very good at continuing after giving you the first error, so often you can't even see which variables you needed in one shot, and you have to fix them one or two at a time and recompile, and fix a couple more and recompile, and eventually you get them all, right? But imagine that instead of factoring into a function straight away, you could just open up a little capture in front of the block, just by typing the little square brackets with some variables in them. That would be a much smaller syntactic change, and it would tell you if you're right about which variables that code uses. And if you're wrong, the error doesn't happen until the line in the block where the first one that you missed was used. And then hopefully, I claim, because the situation is simpler, it might be easier to make a better error message; or maybe making a better error message is just a factor of the compiler author caring more. Anyway, so if you could do that, it 
would be a smaller syntactic change to figure out what the arguments are going to be. Once you've got that figured out, maybe, okay, you were wrong about that and there were seven arguments, but you think there's not a good reason for two of those arguments; you should only have five. So before you've moved that code, while it's still next to the other code in the function, you can do some refactoring to move some of those dependencies out of that block and not require those arguments anymore, right? Then once you've done that, and you've got this as a self-contained unit that you're satisfied with, you can move it out. So this capture can help you with this migration from local inline code to a function without being so crazy about it, in the case of a complicated function. Now, this basically just says what I just said, so, yes. By the way, basically everything I just said you can achieve by doing a lambda right where that block is and using that as a temporary stage. C doesn't support that, but C++ does. But the problem is that, for a little bit at least, you've got a confusing situation where you're going past the lambda, right? Because you're defining the lambda and then at the bottom is the call to the lambda, so you're taking code flow that used to be linear and making it nonlinear for a bit. And then maybe you check that in and go home and leave it there, or maybe there are some unresolved questions about how to factor it better and you leave it there, and so for a while you've actually made your code more confusing. If you do the capture instead, you've not made the code more confusing; you've made it less confusing, because it just continues in a straight line. People might be thinking I'm talking a little too abstractly here, so here's some actual code. You've got some big function; a lot of code happens up here, then we set up our people array, and here's our block where we create a 
character named Larry and we add him to the array. It's not a function, but it's got a capture, and what this capture says is: I assert that no variable from the outer scope gets accessed aside from people; if anything else gets accessed, then give me a compile error. And this is nothing new. This is exactly the behavior that we get with functions, but we're extending it to act on local scopes. It's not a complication; actually it's more of a generality, because now we've taken this capture concept and made it orthogonal to the function concept, right? So it simplifies our programming language to do that. Then we go on and lots more stuff happens. So now we've got this whole sequence of stages that we can lift a block of code onto, anywhere that we want: from a block with no capture, to a block with a capture (where we care about what's inside and are learning about what's inside), to a function that takes arguments and is locally scoped, to a locally scoped function with a name, or a globally scoped function with a name. In the syntax proposed here, both of those are exactly the same. So capture... and "capture" is the wrong word now, right? Because capture implies I'm copying these data into my closure, but what it really starts meaning in this proposal is that the subset of the outer lexical scope that I would normally be able to access is now being pared down strictly to this list of variables, and the actual copying becomes an implementation detail based on whether this capture is for a function or not. Now, once you've got this, you might think, oh wow, doesn't that help me make things thread-safe? Because usually when I write code that's supposed to go in a thread, especially if it's complicated, it probably starts out life as single-threaded code, because I wanted to write it and debug it in the simplest possible configuration. And in fact I probably left debug hooks in, where I can compile my code in a single-threaded mode that 
just keeps that code synchronous, because if something's wrong you want to go back to the simplest possible situation, right? So most code starts life as single-threaded, and occasionally goes back to life as single-threaded within a multi-threaded program. Now, because the code started single-threaded, who knows what it's doing, right? If I write an audio system or something and it's originally synchronous, it might be grabbing some local variables, or talking over the network if the programmer was crazy, I don't know, and all that stuff is going to be bad if I then put it into a thread that's asynchronous with the rest of my code. Usually when I put something in a thread I have to audit all the code in that thread; I have to know for sure that there are no thread-safety bugs in the code. And this is harrowing in C and C++, because you can kind of do it, but the bigger the code is, and the more callees the code has (which, by the way, might be external libraries that you don't have the source code to), the lower your morale goes about that code, and the more likely it is that you're making a mistake and there actually is a thread-safety bug in there somewhere that you're not seeing. So it would be great if we had more weapons to bring to bear on this situation that C and C++ do not provide, and maybe capture is in fact such a weapon, and we can use it to help us do throwaway asynchrony. So maybe, if we manage to resolve all kinds of other questions, like how you specify what this means, you can have a very lightweight syntax where you say, I'm going to do some block of code in an async way, and that block has this capture, which means I'm assured that it doesn't look at any outer variables. So now when I audit this code I only have to audit it for a very limited set of things, and in fact that can actually get better, as I'll go into further in a bit. So maybe we can get, in an extreme case, more carefree about doing things like 
an asynchronous for loop. What this means is, instead of doing one to ten sequentially, I do them all in parallel, and hey, I do this code, but that code is hardened for threading, at least somewhat, by this capture, right? So again, this capture documents data dependencies; it helps us make them explicit. Threaded code won't compile if we violate the capture, so this helps us tremendously with debugging our programs for thread safety. That's my claim; I haven't implemented this, but this is my claim. Now, we don't just have to put this capture on blocks within functions; we can put this capture on functions in global scope, which helps us document and audit that code too. So here I've got a pathfinding function. It takes a grid that we're going to find a path through and where in the grid we want to start, and it returns some path results, and it's got a capture. What does it capture? Well, I've got this debug config to help visualize the pathfinding while I'm in development, and maybe when I compile a release build I even compile this out, so the capture would be empty. But for now I'm going to say I want to be able to access this, and aside from accessing that, I can't actually touch any other globals. So even though this pathfinding routine may get very complicated, I have an insurance policy about what it touches, and that's cool, right? And if I don't want that insurance policy, if I'm just starting out and I want to just make it work and I don't want compile errors, I just don't put the capture; I just do this and I just type my code. If I never want the capture, I never write one, and I'm flying free, I'm going commando. But if I want this added insurance policy, I can add it optionally. And again, that's great, because it's a concept that applies to functions and now it's orthogonal: I can use it anytime, anywhere. I could probably even use it at the statement level if I want to; I probably don't need curly braces. Now, one thing about capture is, if we use the 
C++ semantics for it, then the problem is that this is only a local insurance policy, right? What this means when you define a lambda is, I'm only restricting these values within the local scope, but if I call some other people, they're not subject to the same requirements. So if we want a really serious version of this, we can introduce another syntax, which basically changes the single bracket to a double bracket, or whatever other notation you might like. What this says is: this is contagious, like const in C and C++, so that any callee is subject to this restriction as well. If I try to call another function and it does not also provide a guarantee that is a subset of my hard capture guarantee, then we'll get a compile error, right? Now, I'm not really a big fan of const in C++ and C. I think it's actually a giant pain in the ass. A lot of the time you end up in all these situations where you have const strewn all through your program, and then you need something not to be const and you have to change tons of things. So I don't like that. But it seems to me that this kind of opt-in mechanism would work a lot better than const, because when you make things "const correct" (that's the term for it, which has nothing to do with program correctness, so it's a propaganda term), you're not getting that much of a serious real-world benefit out of it. You're sort of getting the benefit that people aren't going to change what's inside this value, if it's a pointer, and I usually don't have that problem, so it's not that great. You're getting maybe the vague, fluffy assurance that sometimes the compiler could do a better job optimizing the code, but in most cases I don't care, so I don't want to poison my whole program with that. But here, I've had a lot of agony trying to make things thread-safe, pulling my hair out because I didn't know if my code was thread-safe, and this really would 
make me feel better. So as an opt-in safety mechanism, I would be all over this. Again, someone's asking about why the double brackets. The difference is just that, if I really want this to be an orthogonal concept to what happens in functions, then single brackets only apply to my current function, just like they would for a lambda, right? If a lambda calls somebody else, that somebody else can access any global variable; they're not subject to that capture, and they might have a totally different capture. And what we're saying here is that anybody I call, once I pass this entry point, can never access anything besides the scope defined here. It's a very, very hard restriction, so that's why it's double brackets rather than single brackets. Anyway, like I said... oh, well, there's another benefit to this, which is not just generally that it helps us with things. As I said at the beginning of the talk, one of the times when you don't have to worry so much about factoring code out is when you're factoring it into a pure function, right? Because pure functions don't have these problems about what the external state of the program is, and they don't have problems like what side effect is happening, because they don't have side effects. And it doesn't really matter who's calling it, because it can't do anything to the rest of your program. It matters if you want to change the body of the function, because then you have to sort of know what impact that's going to have, but the problems are a lot smaller. So the thing about that is, in C and C++ you can declare with const that your function is kind of pure, but that won't prevent you from reading global variables, right? So what captures do is help you ensure that your theoretically pure function actually is pure, which for a small function may not be that hard to verify, but for a big function that has macros in it or calls a lot of other functions 
it could actually be pretty hard to verify. C and C++ don't let me do that, but with a capture on that function we can do it trivially, and that is pretty cool: with a hard capture, like I was saying, not a soft lambda-style capture. Now, another way of phrasing what I was doing with the other functions, functions that read global state or that have side effects, is that we're helping to make those impure functions more like pure functions, because we're documenting their data dependencies, which will help us reduce their data dependencies, because they're made very explicit in the code. And in fact you might have a good reason to factor a big function into smaller functions if one of those smaller functions has many fewer data dependencies and is therefore easier to audit; that's a real reason to factor something into smaller functions. So again, code has a maturation cycle, and this proposed syntax understands that maturation cycle and tries to make it go well. You write code without the captures; you put the captures in when you want. It's completely opt-in. Now, this is one idea I have about making a language around what we serious game programmers want to do. If you look at a lot of other modern languages, they have weird propaganda that in my opinion is not based in reality, like: you shouldn't ever declare global variables. So in Java you can't declare globals. People want globals, so what do they do? They make a static class and use that as globals. It's stupid that they have to do it; the language is making them jump through hoops. In Rust they had this policy that you can't have globals, and now apparently that's changed a little bit and they're saying, okay, you can access globals, but you have to do it inside unsafe, because using globals is unsafe. And that's not really true, right? I think this is kind of dogmatic propaganda that someone decided is good programming style. It's not supported by empirical data that I've 
ever seen, or that's unambiguous or anything like that. Of course you can argue for anything in a perfect world, so I'm also not saying that's conclusive. But I think this is a kind of decision that's made, like I talked about last time, through this justification of, oh, you're preventing bad programmers from making mistakes and wrecking your code. I don't care so much about that. I want to help good programmers do the best that they can do, right? When a language does this to me, when it says you can't use global variables, I feel like it's treating me like a child, and I want it to stop. It's lowering my quality of life; it's making me miserable if I have to type unsafe everywhere I use globals. I use globals a lot, especially in higher-level gameplay code that doesn't have to be so thread-safe, that doesn't have to be so performant; you might be using globals all the time. And in Rust, if you put unsafe around all that, then you're throwing away Rust's safety mechanisms, which in theory you want. So these languages have this weird succumbing to propaganda that I would rather not do, even though I just spent a few slides talking about the problems with global variables. Games want global variables. De facto, they use a great truckload of global variables. If you've never seen how many globals are in a game, a big game especially, take a look sometime. It's a lot, especially when you count all the texture maps and meshes as global variables, which usually we would factor into some other management class, but seriously, there's just a lot of global data in a game. And so the case I'm trying to make is that a lot of these new languages are kind of going overboard by making global variables this forbidden fruit. Instead, we can gain safety when we want it, and only when we want it, and therefore not induce friction on the rest of the 
language, by using this kind of capture idea and making it a lot easier to audit the leaf code of our game, which is the code that needs to be performant and thread-safe and all those things. So my case is that a language needs to understand the maturation cycle of code that I've discussed through several examples here. Our development cycles are really long, and if we have to pay for safety through the entire development cycle, that constant friction adds up. It drives up the cost, and the cost may be so big that it prevents us from finishing the game. If we're trying to push a big boulder up the hill, the more friction there is on that pushing, the harder it is to push, the lower the probability is that we will finish, and the lower our quality of life will be. So if you can optionally pay for increased safety when you're ready to ship, and only then, or only when you're trying to debug the program, that's a win, and we want a language that helps us do that, not a language that's trying to force us into paradigms we don't want and don't believe in. So that's about it. To close, I'm going to go back to another quote from that same John Carmack email, where he gives an extreme example of what I'm talking about here. He says: most of you have probably read various popular articles about the development process that produces the Space Shuttle software, and while some people might think the world would be better if all software developers were that careful, the truth is that we would be decades behind where we are now, with no PCs and no public internet, if everything was developed at that snail's pace. Now, I don't exactly want to draw too close a parallel between the Space Shuttle software development process and these more modern languages that try so hard to produce safety, like Rust or Haskell or something else, but there is an analogy to be made. There is this induction of friction throughout the entire development process in the name of safety, and I claim that 
that's inappropriate for game programming in most cases. The safety induced by those languages, and thus the friction applied to us as programmers while we're trying to use them, is less than with this crazy Space Shuttle development process (which you should read up on sometime if you haven't heard of it, because it's serious), but the same principle applies, and I would rather not live in that world where I have to fight that friction all the time. So those are my thoughts for today. Thank you for attending the talk; it'll be up on YouTube before too long. If you have thoughts, you can email me at language at thekla.com. This is not a mail alias; again, this is just a one-way email address for now, but I've gotten a lot of very thoughtful emails from people who are very interested in making a language for games, so far probably about a hundred and twenty-five emails, and some more have probably come in over the past hour. So if you have thoughts, if you want to say some of these ideas were good, if you want to say they're terrible and they suck, email me at language at thekla.com and we will talk about it. Thank you, and I'll take some questions for a while. "Why not use contracts instead of captures?" So contract-oriented programming is a whole thing that actually gets really complicated. Again, I talked about this concept of Big Idea languages last time. Contract-oriented programming is a big idea. There may be sub-ideas from that big idea that are very useful and obvious to us now. However, as far as I'm concerned, it's not proven on big projects of the kind that we do in games, which have to be performant and have to talk to hardware. And when I say performant, I don't mean what a web developer would think of as performant; I mean using the computer as fast as you humanly possibly can, right? So our concerns are just different from the concerns of the people making these kinds of languages. If a contract-oriented 
person is saying, but no, all these things go in as zero-cost abstractions and they won't affect performance, that's not exactly true. There's a C++ talk by Chandler Carruth about so-called zero-cost abstractions in C++. I don't have the URL handy, but in it he talks about how cost creeps in quite easily and quickly, and how it's actually quite hard to get rid of the cost when you would think that your abstractions are zero-cost. So yeah, take a look at that; maybe I'll dig out the link for my next talk. But my point is, again, with contract-oriented programming I don't think we should say, hey, let's all dive into the pool, I hope there are no sharks in there, especially because the effort that would go towards implementing that clearly does take away (because you've only got a limited amount of effort to build anything) from the effort required to do things that game programmers probably directly care about. So until we agree as a community that something is a really good idea, we probably shouldn't do it, especially if it's complicated and controls the shape of the whole language. "I'm a strong believer in code maturation cycles. Do you think that this cycle could be applied to memory and types, for example garbage collect until you specify how to control the memory?" I don't know about garbage collection specifically, and the reason is that usually the way you go from garbage-collected code to non-garbage-collected code is that first your code isn't marked up at all, and then later you have to find all these specific places that aren't marked up where you have to free memory, and that seems error-prone to me, and it just seems friction-full. Which is not to say that it's a bad idea; you could try it. I've never tried it, I don't know how it goes, but maybe you could do that. I just feel like, in the earlier part of this talk, I focused 
on having a syntax where, as you migrate code around, as you change its level of exposure to the main program, the syntax stays mostly invariant, and as you go from non-garbage-collected to garbage-collected, the syntax probably doesn't stay invariant. Now, maybe in many cases you can make the syntactic changes small, as I was talking about in last time's talk, but I've never tried it. So it's an interesting idea; you could try it sometime. But I still would want to get rid of all the garbage collection before I ship, that's for sure. "Do I have an idea for how to make memory management errors easier to fix?" I talked about that at length in the last talk; if you didn't see that one, I went through a number of different ideas, actually. "How do I think templates, generics, and function overloading would work, given this syntax?" That is an interesting question. Probably, if I can fit it all in, the next talk is going to be about the combination of method versus function and overloading versus not overloading, because there's an inconsistency in the concept of overloading that interferes with this migration, which is: if you're going to declare a function in local scope and assign it to a variable, like f equals lambda of whatever, now you can't overload that function, because you would have to assign f to some other value, and most languages that I've ever heard of don't let you overload regular old variables. You have to use this function idea, this weird extra binding of a function to a name (I was going to say extrajudicial binding of a function to a name, which maybe I'll keep, I don't know). So there's this weird thing that we don't think about: the only reason why overloading works is that functions are really different from regular variables. And maybe you want to make functions more the same as regular variables, so maybe you don't really want overloading. But then if 
you want to get rid of objects and inheritance, maybe you do want overloading again. I'll talk about all that next time. "What are my thoughts on providing more support for control of memory management, like specifying arenas and all this kind of stuff?" Yes: game developers, especially high-end game developers, want to control memory allocation all the time. They want to specify their own allocators that have different properties. Any language should support that. I actually don't think that's one of the harder problems, so I'm deferring it until later, but we should ensure that we don't lose that capability. We should ensure that we don't land on a language design where that capability becomes hard or messy or whatever, because it's fundamental to what we want to do. "Would I have built-in types like vector3, vector4, matrix3, matrix4?" I think that would be a good idea, but one of the fundamental questions for next time is what is really required to make that kind of type useful to everybody, and that is contained in this idea of objects and methods and whatever, and whether you drank the object-oriented Kool-Aid. Someone's talking about coroutines being super useful. I actually have no experience with coroutines in real production shipping code. I would be interested in hearing experiences from anyone out there who uses coroutines and finds them robust. For example, I've heard that folks at Q-Games lately use coroutines a lot and swear by them; that was one of the things that came in by email. I'd be interested in hearing about that. "Any thoughts on namespaces, modules, etc.? Modules in Rust and Python are giving me friction when I'm used to organizing with namespaces in C++ and C#." Okay, I don't really have C# experience. I have a lot of C++ experience, and my experience there is that there are too many names for the same thing. You have file scope, which is sort of a namespace of its own, but not really. You have a struct or class, which has its own 
namespace, right, implicitly, and then you have a namespace namespace, which is sort of like a global static struct with no members, in which you declare things by putting them in curly braces, which is a pain. I don't really like C++'s that much, but I like Rust's even less, because Rust has this thing where, anytime you're declaring a struct or a module or anything, because of this nebulous thing about safety that is not really justified by real-world concerns that I've ever seen, at least not in my field (maybe there's research in other fields showing this to be so, but not in games), they've decided that these things should not be public by default. A member of a struct should only be seen by code in that struct; a member of a module should only be seen by code in that module. So now when you write code you have to put pub, for public, in front of every freaking member of a struct. It's a nightmare. I mean, it's just tedious, right? But I don't want tedium, I want quality of life. I want to feel like my language is working with me instead of against me. There are many good things about Rust; again, I don't want to come off as a total Rust naysayer. The language has a lot of good things about it. That is not one of them. "One of the most important factors in adopting a new language is considering what the standard library would provide. Given that this proposed language is for games, what would be included in the standard library?" I think there should be one. I think it should provide some basic stuff, like for example matrix and vector types. I think you can't just put matrix and vector types in there and expect them to be the best thing for everybody to use. I think it requires some thought, and it requires some thought not at the level of the standard library but at the level of the language design, and I would like to talk about that in a future installment. Because, man, if we had a vector3 that was standard in C++ or in games, it would make integration 
with libraries and with other people's code so much easier. Oh my god, it would be great. But there are good reasons why we don't, actually, and why there hasn't been some committee declaring that the game industry shall now use this one specific vector3, so I would like to get into why that is next time. Someone is saying that because of the way D's modules work, their date/time module was a single 30,000-line file. Cool. I don't necessarily think that's bad, actually, unless it impacts your compile time heavily or something. Just like I don't think functions have to be small (my biggest function in The Witness is like 14,000 lines, I think, or maybe it's 17,000 lines), I don't think files have to be small. Trying again: "How is your capture different from call-by-reference and restricting scope access? I think the captures should be function arguments." Maybe you could do captures as function arguments, but the problem with those function arguments is that if somebody else wants to take your function as an argument and apply it, as you often do with lambdas, then you'll get a type-checking mismatch, because those additional arguments, which are now out parameters or whatever, do not match the type declared in the function. And you want the generality to have different captures without a type mismatch. So I think it's a better idea not to pretend they're parameters; I think it's better to call it a capture. However, maybe we decide that's not true, and it simplifies the syntax to put the capture in the parentheses, just at the end: maybe after you're done with all your function arguments you put "and captures these things". That would be cool too. It might be more readable, it might be less readable, but it wouldn't really apply to blocks anymore, because blocks don't have argument lists, and you'd have to say that blocks have argument lists, which might confuse things. It might be better; I don't know. It's a cool idea; it should be 
thought about. "Have I given thought to using compile-time interpretation of code to generate generics?" Not necessarily to generate generics; I don't know if that's necessary. But yes, in general, yes. Named parameters were not discussed this time. "Games often have insane pipelines and require intimate knowledge of the memory layouts of the data in the engine. Would you like to see a language where the compiler is required to make available the memory layouts of data structures, so you can make tools that are more tightly in sync with the actual engine code?" Yes, and not even just for external tools, but for the debugger, right? When you compile your program, there should be some specification in your program, in greater detail than we have in C++ today, about where data resides and what exactly it is. I think that can only come from looking specifically at what we want. I gave some hints about a little bit of it before, having to do with the stack and heap, but that's just to do with allocators. Generally, with regard to your program as a whole, you might benefit greatly from having this kind of data embedded in the program. Maybe it's in a metadata file alongside the executable that gets output; I don't know. We should talk about it. I haven't thought about that seriously, but I guarantee some people in the games industry have, and I guarantee some people are already doing it, actually. So that is an area for future discussion. Thank you, everyone, for coming by. I'm apparently going to do more of these talks, because I've at least got an idea for one or two more, so I look forward to doing those in the future. I'm going to post this one up on YouTube sometime, maybe tomorrow, we'll see. Thank you again, and again, email me at language at thekla.com with comments, suggestions, etc. Good night, good evening, have a good morning, whatever part of the world you're in, whatever time it may be.
Info
Channel: Jonathan Blow
Views: 85,068
Rating: 4.9216256 out of 5
Keywords: Video Game Development (Industry), programming language, Video Game Design (Field Of Study), Video Game Development (Conference Subject), c programming language, c++ programming language, Computer Programming (Conference Subject), Rust programming language, D programming language, Java (Programming Language), Compiler (Software Genre), Game Engine (Software Genre), Software Development (Industry)
Id: 5Nc68IdNKdg
Length: 90min 26sec (5426 seconds)
Published: Fri Sep 26 2014