"Noether: Symmetry in Programming Language Design" by Daira Hopwood (2013)

Captions
My name's Daira. I've invented a programming language called Noether, but what I'd like to do today is try to convince you of the importance of symmetry in programming language design. So I'm not just describing Noether; I'm using Noether as an example of this approach to designing languages and how it helps. I'm going to talk about that in general, then go through the sub-languages of Noether in order of decreasing symmetry, and maybe, if we have time, some discussion of how to implement things efficiently. (Can everyone hear me, by the way? Is that better now? Thank you.)

So what do I see as the problem facing programming language designers? Computers have become huge in terms of their power and storage capacity, and we need languages that can express powerful abstractions in order to reduce complexity. This quote by Dijkstra, which is too long to read out, is wonderful: he was saying in 1972 that the power of computers had increased a thousand-fold, but they had become too complicated. Since then they've increased by another million-fold, and storage for a given price has improved by, I think, seven orders of magnitude. Have languages and tools improved enough over that time? I don't think so. I think we're still in a software crisis, because our programs are too large and complex, and we don't know how to write correct, secure, and reliable programs — importantly, we don't know how to write them at economical effort. To get back to the political point about the NSA: the NSA has been successful in exploiting flaws in widely used software because the tools and languages we have are not good enough to write secure software in practice. You can if you spend enough effort — if you do the kind of things that, say, NASA does for the software used on the Space Station, then you can write secure and reliable software — but no one does that in practice apart from NASA and a few others.

So I claim that correctness, security, and reliability are three sides of the same coin — a metaphorical coin that can have three sides. We have techniques that help with this, like pure functional programming and object-oriented programming, but we need to understand how they help, and I think the way they help is by imposing symmetries. The kind of symmetry you might have learnt about in junior high school is geometrical symmetry: if you change a shape in a particular way, you get a similar shape. Similarly, you can change a program in a particular way and get a similar program — in the simplest case, a program that does the same thing. The concept also applies in physics: there's charge–parity–time (CPT) symmetry — change everything to antimatter, mirror things in space, and reverse the direction of time, and everything works out the same.

In programming you have lots of very interesting symmetries that apply, or nearly apply, to practical programming languages. Some of them don't actually apply to the full languages people use in practice, because those languages have particular warts and special cases, and what I'm claiming is that we should pay attention to those warts and special cases and try to reduce or eliminate them. Some examples: you've got confluence, which says that in a pure functional language you can evaluate things in different orders and get the same result, and you can move pure functions to different parts of a program and they'll still give the same result. You can have symmetries associated with particular operators and functions. You can rename variables, and you can add and remove annotations — that's an interesting category of symmetry. And there's functional abstraction: you can take inline code, extract it to a function, and call that function, and that gives you an equivalent program.
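The functional-abstraction symmetry above can be sketched in a few lines — this is ordinary Python, not Noether syntax, with hypothetical names chosen for illustration:

```python
# Sketch of the functional-abstraction symmetry: inlining a computation
# and extracting it into a named function yield equivalent programs
# (in a pure setting, with no side effects to reorder).

def norm_inline(x, y):
    # Inline version: the computation appears directly in the body.
    return (x * x + y * y) ** 0.5

def square(n):
    return n * n

def norm_extracted(x, y):
    # Extracted version: the same computation, factored through a function.
    return (square(x) + square(y)) ** 0.5

# The two programs are observably equivalent.
assert norm_inline(3, 4) == norm_extracted(3, 4) == 5.0
```

In a language with side effects or reflection, extracting code into a function is no longer guaranteed to preserve behaviour — which is exactly the kind of symmetry breakage the talk is about.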
Similar symmetries hold for macros, modules, and type synonyms. The obvious one at the end, which is almost too obvious to see, is comments: you can make arbitrary changes to the comments in a program and it won't change anything about what the program does. (That, by the way, is not true in C, because of trigraph processing.)

Okay, so many language features and properties interfere with symmetries that we would like to have, and some of those interfering features are really quite important — things like side effects, state, failure handling, and concurrency. Some of the others we can leave out of a language perfectly well: no one uses dynamic binding any more. Implicit coercions — if you've ever used a language with implicit coercions like JavaScript (I know JavaScript is everyone's favourite language to bash), the implicit coercions cause a fair proportion of the bugs you end up introducing. There are features that are unhygienic in the sense that they break abstraction: for example, in some languages a function is opaque, but in others you can extract its code, change its code, or reflect on it, and that breaks symmetry. And of course there's anything that violates memory safety. You can also end up misdesigning a feature so that it doesn't have the symmetries you would naively assume it has.

So one of the things we can do is just leave these features out, which gives us a simpler language. If we possibly can leave a feature out, we should. But what about the ones we really need, like side effects, failure handling, and concurrency? I'm going to claim that we do need concurrency — I know some languages do without it, but it's becoming more and more important. What we can do about this is to define stratified languages: languages with different levels of complexity.
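The implicit-coercion problem mentioned above can be made concrete. JavaScript silently coerces `"1" + 1` to `"11"`; a language that refuses implicit coercions forces the programmer to state the intended conversion. Python happens to behave that way for this case, so it serves as a sketch:

```python
# A language without implicit coercions rejects a mixed-type expression
# rather than guessing what was meant.
try:
    result = "1" + 1          # Python refuses to coerce across types
except TypeError:
    result = None
assert result is None

# The conversion must be made explicit, so the intent is visible:
assert int("1") + 1 == 2      # arithmetic was intended
assert "1" + str(1) == "11"   # concatenation was intended
```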
At the innermost level you have, say, a pure functional language in which all functions terminate; at the outer level you have mutable state, concurrency, failure handling — everything you would expect in a full language. My claim is that for practical programs you can usually write most of your code in the inner languages and use the outermost one as a coordination language, and that this gives you a program that is overall easier to reason about. Obviously this isn't a new idea: it's the idea behind Haskell, and in some sense the idea behind Erlang, which has a functional language to define processes and then a concurrent coordination language. That's the idea we're using, but we're going to take it further: at each level of the language we add features, and in doing so we break one of the symmetries we had at the previous level.

Some symmetries are so useful that we shouldn't break them at all. For example, the actor locality laws are a set of laws that define, or constrain, how information and authority propagate through a program. This was work done by Carl Hewitt and colleagues in the 70s, I think, and the laws are really very important for constraining what programs in a language can do, so we want to retain them in the full language. There's a trade-off between keeping the number of levels small, to keep the language understandable, and optimizing the curve that gives you a given expressiveness for a given symmetry — but we can always collapse levels later. I'm going to be short on time, so I'll skip some of this; I really encourage people to read through the slides in detail, because they're very dense, as you may have noticed, and they'll be on the web.

There's another issue that arises when you have a stratified language: are you retaining the properties of the inner languages when you compose them in the coordination language? It's important that you do, because otherwise you lose the advantages of stratification. As an analogy: if you consider the inner languages to be an ordered state of matter, like a solid, and the outer languages to be a gas, then if you move a solid within a gas it retains its shape and rigidity. Similarly, you want highly ordered subprograms, when moved around within the coordination language, to retain their properties — and it's easy to end up breaking that. I'm going to skip this slide.

Okay, so the Noether language. It's an object-capability language — I don't know how many of you are familiar with that, but it's an approach to security which basically says that if you have a reference to something, then you have the authority to operate on that thing. It avoids having a separate layer of access control, because the flow of authority within the program reflects its functionality; you don't need to add extra access control afterward. It's parallel, and it's concurrent — I've separated those two because concurrency is really complicated to reason about, but parallelism for efficiency can be done in a pure functional setting, and if we separate the two we're optimizing the symmetry-versus-expressiveness trade-off. And I'm only going to talk about semantics; I don't really care that much about syntax, and we could use the syntax of any language we like. Those are the reasons why I think a new language is needed.

Why is it called Noether? Emmy Noether was a mathematician who did most of her work in the 1920s and 1930s, and she was well known for her study of symmetry in physics and in mathematics. There are actually some technical reasons why her work applies directly to Noether: we can consider the inner languages as a Noetherian rewriting system, and we can also use analogies with physics, as we have done.

I'm going into a lot of detail on this slide, and there's quite a bit of mathematics on it; I encourage people to look at the mathematics later, but it's not practical for me to explain all of it here. Basically, what we're doing is modelling the execution of the language as a rewriting system: we rewrite terms to simpler terms in order to model evaluation. We need to go into this much detail, and take this much care, in order to make sure we get the symmetry properties we want — it's a very easy thing to make mistakes with. I'll just point out a bit of the notation I'll be using later: this three-line squiggle, the equivalence operator, is one of the most important things, because it says that two terms — two programs — are equivalent, and equivalent in any context in which you might use them.

Noether is a non-deterministic language in general, and the way it handles failure is that any program is allowed to fail: if it's deterministic, it gives a single answer or it fails. I think that's necessary because we need to be able to model cases where real computers fail to satisfy the ideal model of a computer — they run out of memory, or the network breaks, or there's a hardware failure, or whatever. If we include that within our model of the language, then we're being more realistic.

So there are symmetry properties that always hold — basically renaming — and properties that might hold for particular subsets. A lot of this is basic computer science, really, but we need to be careful about the case of failure, because a lot of the work done in fundamental computer science doesn't consider failure very widely: it considers terminating languages that always give you a result, which isn't actually very realistic.
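The idea above of modelling evaluation as term rewriting — repeatedly rewriting terms to simpler terms until a normal form remains — can be sketched with a toy evaluator. This is a minimal illustration, not Noether's actual rewriting system:

```python
# Toy sketch of evaluation as term rewriting: terms are nested tuples,
# and redexes are rewritten to simpler terms until a normal form
# (a plain number) remains.

def rewrite(term):
    if isinstance(term, int):
        return term                       # already in normal form
    op, a, b = term
    a, b = rewrite(a), rewrite(b)         # rewrite subterms first
    if op == "+":
        return a + b
    if op == "*":
        return a * b
    raise ValueError("stuck term: no rewrite rule applies")

# (1 + 2) * (3 + 4) as a term, rewritten to its normal form:
assert rewrite(("*", ("+", 1, 2), ("+", 3, 4))) == 21
```

In this pure fragment the subterms could be rewritten in any order (or in parallel) and the normal form would be the same — that is the confluence property discussed earlier.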
These are properties that might hold for a pure language, and this one, unfoldability, is interesting. Basically it says that if you have a function definition f(x) = E(x) — where E(x) means some expression that uses x — and you have s and t which are equivalent expressions, then calling f with the expression s gives you the same thing as expanding the function body with the equivalent expression t. That's called unfoldability, and it's basically the essence of pure functional programming. Sometimes people say that referential transparency is the essence of functional programming, but the two do not actually imply each other in a non-deterministic language.

Right, so this is the structure of the overall language. We have the most structured subset at the top left there, which is Noether-P, and that already includes parallelism, because it turns out that parallelism doesn't break any symmetry properties: a parallel program can do exactly the same as a serial program and be just as easy to reason about, because the parallelism is in some sense automatic. Then we add features in turn. We add failure handling, and that breaks execution-order independence, because obviously when a failure happens affects which value you get. These two languages, the red and orange ones, are not even Turing-complete — and you can do a lot in languages that aren't Turing-complete. Most programs, or most subprograms, are intended to terminate. Even when you have a large program — say, a web server — each request to the web server is intended to terminate, and almost all of your code is in the handling of those requests; the bit that loops over requests is probably in a framework. Then we add some less interesting features. Then we add mutable state — and once we add mutable state we no longer have a functional language, so we lose all the symmetries associated with functional languages.
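The unfoldability property described above can be sketched concretely — a minimal Python illustration with hypothetical names, not Noether notation:

```python
# Unfoldability sketch: for a pure function f(x) = E(x), calling f on an
# expression s gives the same result as substituting any equivalent
# expression t directly into the body E.

def E(x):              # the function body E(x)
    return x * x + 1

f = E                  # f is an abstraction over the body E

s = 2 + 3              # two equivalent expressions: s ≡ t
t = 5

# Calling f on s  ==  unfolding the body with the equivalent t.
assert f(s) == E(t) == 26
```

With side effects this breaks down: if evaluating s printed something or mutated state, substituting t for it would change the program's behaviour even though the values agree.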
After that we add concurrency, first in a form that's very easy to reason about, because it's basically static dataflow. One of the earlier talks was saying that we want fixed dataflow networks in order to be able to reason about them — yes, that's true, but it doesn't mean the whole language has to be restricted to fixed dataflow. Then on the outside we have an event-loop language. I don't know how many of you know about event-loop concurrency, but it's basically the model that's used — or going to be used — in JavaScript in lots of web applications, and it's used in languages like E and, sort of, Erlang. It's a very expressive model, but it's not the same as shared-state concurrency with threads and locks, because those turn out to be impossible to reason about. I know some people have tried, with type systems that try to ensure correct locking; I don't think it works very well. Influences on Noether: all of these, but we haven't got time to go through them.

A lot of the languages that have a reputation for being safe — things like Haskell or OCaml — have loopholes. Here are two examples. In OCaml there's a module in the standard library that does marshalling and unmarshalling, also called serialization. It's unsafe, and you can very easily crash the whole program by using it. Similarly, in Haskell you have unsafe things like unsafePerformIO. It's really good that they have "unsafe" in their names — no one is trying to mislead anyone, and you have to import them from an unsafe module — but they can still crash your program. A subtle point here is that these primitives are used all over the Haskell libraries, so even if you have a pure functional program, the libraries you use are going to be using them under the hood, and you have to rely on the library writers having used them correctly. If you look on the Haskell wiki, there's a bunch of conditions for how to safely use unsafePerformIO, and they're very compiler-dependent and implementation-dependent. The basic problem here is a lack of expressiveness in how to do these things safely: what these programs are trying to do needn't be unsafe. Similarly for the OCaml example — there are ways to do type-safe serialization and deserialization, but this particular OCaml library didn't use them.

So designing a language in terms of symmetries keeps us honest in language design. For example, suppose we want to avoid implicit coercions. There are behaviours we might include in our language that are like implicit coercions, but because they don't look like what we recognize as an implicit coercion from other languages, we don't notice that that's what we've effectively added. That kind of thing will end up breaking symmetries, and then we'll have to either explicitly include the feature at another level of the language or avoid it entirely. As I say, it keeps us honest.

Even in safe languages, certain kinds of language properties make it much more difficult to avoid bugs. For example, if you have a language with modular arithmetic or floating-point arithmetic in place of integers: the mental model in your head of what addition means is basically mathematical addition, and if you implement that using an operator that actually does modular or floating-point addition, you're going to make mistakes. There's an actual mistake of this kind in Secure ECMAScript: there was a corner case at the top of the range of integers that can be represented exactly in floating point, and the problem here is divergence from intuition.

I'll just say how Noether addresses that: don't have operations that do the wrong thing. This is actually quite hard, but remember that operations are allowed to fail, so instead of doing modular arithmetic, for example, we can offer an addition that fails rather than give the wrong answer. Noether also supports programming without state — state is one of those things where, if you get into an inconsistent state, it's very difficult to recover, whereas if you didn't use state in the first place, you have no inconsistency — and transactionality. Transactions are probably familiar from database programming, and to some extent they've started to be adopted into language design, in things like software transactional memory, but they haven't made much of an inroad into commonly used languages; they're still a research project. A big reason for that is performance: most software transactional memory systems, if you run them on two processors, are, I don't know, ten or twenty times slower than running on one. But we can do transactional programming efficiently, and if I have time I'll explain how.

A lot of the compromises made in language design are because of performance, or perceived performance. If you remember the slide from near the beginning, where the performance of computers had increased a million-fold in 30 years, you might think that no one should care about performance any more. But, as Dijkstra said, we're trying to solve more difficult problems. On your smartphone, for example, you've got all these graphics and bells and whistles running at the same time as computing something — the actual computation may be quite trivial, but because of all the bells and whistles it's still not necessarily as fast as you perceive it should be.
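The "operations are allowed to fail" idea above can be sketched with a checked addition, using the floating-point corner case near 2**53 (the limit up to which IEEE-754 doubles represent every integer exactly — the range involved in the Secure ECMAScript bug mentioned). This is an illustrative Python sketch, not Noether's actual arithmetic:

```python
# A checked addition that fails instead of silently returning a wrong
# answer, rather than doing modular or rounding float arithmetic.

def checked_add(a, b):
    result = float(a) + float(b)
    # Fail if the float result does not equal the exact integer sum.
    if int(result) != a + b:
        raise OverflowError("result not exactly representable")
    return result

assert checked_add(2, 3) == 5.0

# 2**53 + 1 rounds back to 2**53 in float64, so this must fail loudly:
raised = False
try:
    checked_add(2**53, 1)
except OverflowError:
    raised = True
assert raised
```

The point is that the operator's behaviour matches the mathematical mental model everywhere it succeeds, and diverges from it nowhere — divergence is turned into failure, which the transactional failure handling can then deal with.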
There's another quote I love, about security. Hoare gets it spot on here: he says that every syntactically correct program should give a result or an error message that is predictable and comprehensible in terms of the source language. If you need to understand the implementation in order to understand the errors your program is giving you, then you don't have much chance of fixing the bug. It's possible to design that problem out of languages, but there are still far too many languages in common use where that's not done — I mean, C is still very commonly used, but it's not just C; there are lots of failure cases even in languages perceived to be safer.

Okay, so let's go into detail about the actual languages that are part of Noether. We start off with a deterministic, strict, pure functional language in which everything terminates. Remember that failure is always an option: we either succeed and give a single answer, or we fail in finite time. Failing in finite time doesn't mean failing in a practical length of time, but my claim is that if we write in terminating languages, then even though we're not guaranteed to terminate in a practical length of time, we're less likely to write bugs that give us infinite loops, because we can't express them in the language. So we will understand the performance better, even though the language isn't controlling performance per se. (We can also add security features that allow us to control performance, but that's a separate problem.)

As more justification for allowing programs to fail: symmetry is associated with equational reasoning. A program is deterministic if its outcome and side effects depend only on its input state. Determinism is a weaker property than being pure functional: you can have a deterministic program that still uses mutable state and still has side effects, but they're the same side effects every time you run it. So Noether-0 is deterministic, and quite a few of the larger sub-languages are also deterministic.

We've been through time and location invariance before: if you move a pure program to a different place, it computes the same result, and that also applies if you move it to a different task in the concurrent languages. If you remember, earlier we were talking about coherence — the property that structured subsets retain their properties — and that's an example of it. (How are we doing for time, by the way? Five minutes more — okay.)

Because we don't have mutable state, and because all programs terminate, we can treat definitions within a program as equations. For example, here we have a program that doesn't terminate, and so would be excluded from this subset; because it doesn't terminate, this attempt at equational reasoning fails — we subtract loop(x) from both sides and get zero equals one, and that fails precisely because the program didn't terminate. So the property that Noether-0 actually has is definiteness: a program gives a specific result, or it fails. You might say: if we're not enforcing success, what's the advantage? Well, the advantage is that we still have the unfoldability reasoning property, which holds even in the case of failure — the same unfoldability property we discussed before; I haven't updated the slide to say that. If a program fails, we can recover from that. If it goes into an infinite loop, detecting that is possible, but you have to use something like a timeout, and a timeout makes your program non-deterministic, so you lose the symmetry property either way.

So how do we prove termination? We do it automatically — termination proving is a well-understood problem.
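The non-terminating example above, written out (in hypothetical syntax, since the slide isn't visible in the transcript), shows why equational reasoning is only sound for terminating definitions:

```latex
\begin{align*}
\mathrm{loop}(x) &= \mathrm{loop}(x) + 1
  && \text{(a non-terminating definition)} \\
\mathrm{loop}(x) - \mathrm{loop}(x) &= \mathrm{loop}(x) + 1 - \mathrm{loop}(x)
  && \text{(subtract } \mathrm{loop}(x) \text{ from both sides)} \\
0 &= 1
  && \text{(contradiction: the equational step was invalid)}
\end{align*}
```

Because loop(x) never denotes a value, subtracting it from both sides is meaningless — which is why Noether-0 excludes such definitions and can then safely treat the remaining definitions as equations.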
A lot of people have the impression that non-Turing-complete languages are useless. Actually, you can say so much without Turing completeness that it's likely to be only a tiny part of your program that can't be automatically proven terminating. (I've already talked about bugs that cause excessive run time.)

The next thing we add to this strict, sequential, pure functional language is parallelism. This is an approach that was originally researched in the Cilk language — now an Intel project called Cilk Plus — called fork-join parallelism, and its key property is that it allows you to run things in parallel with exactly the same semantics as a serial program. Actually, for the original Cilk that statement wasn't quite true, because a program could have race conditions, and where it has race conditions it doesn't behave the same way as a sequential program. But because we're enforcing these extra symmetry properties, we can guarantee that there are no race conditions. And not only does this give us parallelism, it's actually quite efficient in practice: there's been lots of research on how to do this well using work-stealing schedulers, and they produce — well, not optimal results, but results within a constant factor of optimal scheduling. So we have identical parallel and serial semantics, and the restrictions we make relative to Cilk aren't going to be a problem, because we'll add full concurrency later. (We're running out of time — this slide is a justification for wanting explicit parallelism annotations rather than making everything implicitly parallel.)

The next thing we add is local mutable state — mutable state that only applies within a function. Because it only applies within a function, we could compile it out: we could produce an equivalent pure functional program. So although it makes that particular function more difficult to understand, it's not making the whole program more difficult to understand.
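The fork-join property above — parallel execution with identical serial semantics, given purity — can be sketched with Python's standard thread pool (a stand-in for Cilk-style scheduling, not how Noether or Cilk actually implement it):

```python
# Fork-join sketch: for a pure function with no shared mutable state,
# a parallel map gives exactly the same result as the serial program.
from concurrent.futures import ThreadPoolExecutor

def f(n):               # a pure function: no side effects, no shared state
    return n * n + 1

xs = list(range(10))
serial = [f(x) for x in xs]

with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(f, xs))    # "fork" the calls, "join" on results

assert parallel == serial               # identical parallel/serial semantics
```

If f mutated shared state, the two results could differ depending on scheduling — the race conditions that original Cilk permitted and that the enforced symmetry properties rule out.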
It does break some symmetries, though. There's an interesting symmetry called the Landin correspondence, which is basically about inlining functions, and it doesn't quite hold once we have the local mutable state — so we have lost something.

The next thing we add is failure handling. Failure handling in Noether is transactional: as in a database transaction, if you get an integrity violation, the transaction you're in is rolled back. In Noether that applies to every function call — I'll explain later why that isn't as expensive as it sounds. This is not like exception handling: with exception handling, by the point you get the exception, the state may already have been updated, and because you lose information about exactly when the exception happened, it's actually very difficult to recover from that inconsistent state. In Noether the rollback happens automatically. C++ programmers call this the strong exception guarantee, and in Noether you have it automatically for every function. We lose another symmetry about evaluation there — we're not going to have time for that.

Then you add Turing completeness, and then laziness, and then static names — which are like names in Oz, if you know what those are — and then reducers, which are a weaker form of state: they're easier to reason about than full mutable state. And then we add full state, but only task-local state: as in languages like Erlang, which use message-passing concurrency, there is no state that is shared between different tasks and can be accessed directly. Transactions — I'm obviously not going to have time to justify why transactions are not too expensive, so you'll have to read the slides afterwards; there's a detailed justification there.
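The transactional failure handling above — every call either completes or leaves state untouched — can be imitated with a snapshot-and-rollback wrapper. This is a simple sketch of the guarantee, not Noether's (much cheaper) implementation strategy:

```python
# Sketch of transactional failure handling: a call snapshots the mutable
# state it may touch and rolls back on failure, so the caller never
# observes a half-updated state (the "strong exception guarantee").
import copy

def transactional(fn):
    def wrapper(state, *args):
        snapshot = copy.deepcopy(state)     # save state before the call
        try:
            return fn(state, *args)
        except Exception:
            state.clear()
            state.update(snapshot)          # roll back on any failure
            raise
    return wrapper

@transactional
def transfer(accounts, src, dst, amount):
    accounts[src] -= amount
    if accounts[src] < 0:
        raise ValueError("insufficient funds")   # fails mid-update
    accounts[dst] += amount

accounts = {"a": 10, "b": 0}
try:
    transfer(accounts, "a", "b", 50)
except ValueError:
    pass
assert accounts == {"a": 10, "b": 0}    # rolled back, not {"a": -40, "b": 0}
```

With plain exception handling, the caller would see the inconsistent intermediate state (`"a"` at -40); here the failed call is invisible except for the failure itself.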
The next thing we add is dataflow concurrency — but only deterministic concurrency: the restriction here is basically that because you're only ever waiting for one possible event, everything is deterministic. And then finally we add the whole thing: message-passing concurrency and event loops. The language we have at the end here is still quite easy to reason about; it's similar to languages like E and Joule and Erlang. It's still completely memory-safe, and it doesn't have race conditions — or at least it doesn't have low-level race conditions. Well, it's never easy to write secure programs, but you're more likely to manage it than in some other languages.

Oh, and there's a distributed extension, which has the same semantics as the language on a single machine. It uses a protocol called Ken, which is almost magical in its ability to recover from failure — but you can read about that later. And there are lots of security features, confinement and so on. Okay — questions? Let's take questions at lunch.
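The event-loop concurrency model at the outermost level — tasks that interact only by sending messages, never by touching each other's state — can be sketched with Python's asyncio. This is an illustrative analogue of E/Erlang-style message passing, not Noether's event-loop semantics:

```python
# Event-loop sketch: tasks communicate only via message queues, never by
# sharing mutable state directly, as in E- or Erlang-style concurrency.
import asyncio

async def doubler(inbox, outbox):
    while True:
        msg = await inbox.get()        # wait for exactly one kind of event
        if msg is None:                # sentinel: shut down
            return
        await outbox.put(msg * 2)

async def main():
    inbox, outbox = asyncio.Queue(), asyncio.Queue()
    task = asyncio.create_task(doubler(inbox, outbox))
    results = []
    for n in [1, 2, 3]:
        await inbox.put(n)             # send a message to the task...
        results.append(await outbox.get())  # ...and await its reply
    await inbox.put(None)
    await task
    return results

assert asyncio.run(main()) == [2, 4, 6]
```

Because each task owns its state and handles one message at a time, there are no low-level data races; the remaining hazards are higher-level ordering issues, which is what makes the model comparatively easy to reason about.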
Info
Channel: Strange Loop Conference
Views: 1,967
Rating: 4.5789475 out of 5
Keywords: programming language, Noether, security, trust, concurrency
Id: jm5qWioS9S0
Length: 44min 20sec (2660 seconds)
Published: Sat Feb 27 2021