Martin Odersky: Scala with Style

So what I want to do here is essentially give you the talk I gave at Scala Days. That was last week; it was a great conference, and I think everybody at the conference really enjoyed not just this talk but all of the talks, which were amazing. The keynote I gave there I want to give here as well, because I think there may be something interesting in it for you too. I assume that most of you here are Scala programmers, so we're going to talk about "Scala with Style", or what style to use in Scala.

To give you a little perspective: in this talk I'll argue that we are in a transition period between two programming paradigms, from the imperative, object-oriented programming we have used so far toward more and more functional programming. In the end I believe it won't be a transition where one completely replaces the other; we will see a fusion of the two. And in this period of upheaval, which we are seeing right now, many questions of programming technique and programming style have to be revisited, so I want to give you my perspective on that.

I'm actually pretty old, old enough to know the last period of upheaval, in the eighties, when object-oriented programming took over. In the last part of the eighties I was a grad student, and before that I was an undergrad, and I still remember when this Byte cover came out. That was in '81; I was on an internship at Siemens, I think, and I read it and was fascinated but also thoroughly confused. Methods, messages, objects: what were they talking about? It was very, very weird, but somehow fascinating. That was the whole issue of Byte magazine, which was the premier magazine for developers of the time, about Smalltalk. So now I have a question for you: what was the first object-oriented programming language? Yes, that was it; you didn't fall for Smalltalk, even though I had sort of lured you toward
that. It was Simula 67, right? That was a good decade before Smalltalk came out, and Smalltalk was the second. So the question is: what did these languages have in common, or rather, why did object-oriented programming become popular? To answer that question, it helps to see what Simula 67 and Smalltalk have in common. Why did they become popular? Was it because of encapsulation? Both Smalltalk and Simula have encapsulation, but I don't think so. Code reuse? No. Dynamic binding? Probably closer, but not directly. Dependency inversion? That one came much later. The Liskov substitution principle? The open-closed principle? You've got to be kidding. I believe it was because of the new things you could do with an object-oriented program. That's not to say these principles are not important; of course they are important. But to get adoption, the initial adoption, you need to fulfill a need that the incumbent cannot fulfill.

So what was that need? It was essentially about the relationship between methods, or operations, and data structures. The traditional approach is epitomized by a linked list: a linked list has only two cases, the list is empty or non-empty, and then you have an unbounded number of operations. You can map over a list, you can reverse it, print it, get an element, insert, and so on. When Simula came out, it was, as the name implies, originally for simulation, and simulation is slightly different from this model. There you have only a fixed number of operations; these operations typically advance your simulated objects to the next step, print them, display them in some way, copy, and aggregate. But there is an unbounded number of things you can simulate: a possible implementation of the simulation class could be a car, a road, a molecule, a cell, a person, a building, a city; it could be anything, really. And that was a new challenge, because you suddenly had an unbounded number of possible data types that all were
responding to the same API, to the same protocol. OK, so that was Simula. Smalltalk was for GUI widgets, and there we see the same situation: a fixed number of operations. On a GUI widget you can redraw it, compute the bounding rectangle, move it, and so on, and you have an unbounded number of possible widgets: windows, menus, sliders, videos, curves, images, anything really. Smalltalk came out as a language on the Alto computer, the first personal computer, right here at Xerox PARC, and that was new for several reasons: one was that it had a bitmap display, and the second was that it had a mouse. For these reasons you could suddenly present a large number of different forms to the user; previously, with ASCII terminals, you were very limited, it was all letters. So that was a new thing, and Smalltalk was a language for it.

Both simulation and GUIs have in common that they need a way to execute a fixed API with an unknown implementation. It certainly was possible to do this in a procedural language such as C, but it was very cumbersome. I remember that at the same time IBM tried to compete against Windows with the operating system OS/2, which had a window manager called Presentation Manager, written in the procedural style, and it was actually very hard to program compared to something in an object-oriented language. So I believe that's why everybody, or at least many people, migrated to object-oriented programming at first: to do these things was just so much easier.

So what does this have to do with functional programming? I believe that, just like OOP did then, functional programming has lots of methodological advantages. I don't need to convince you that in writing concurrent programs you will reduce your error rate, and that you can gain better modularity, raise your level of abstraction, get shorter code, and increase your developer productivity. All these advantages are real, but I
think they alone are not enough for mainstream adoption. A counter-argument would be: after all, Lisp has been around for 50 years, so there was time enough to prove the methodological advantages of functional programming, and it hasn't happened. I believe that doesn't mean these points are not true; I firmly believe they are true. But to get mainstream adoption you need a catalyzer, something that starts initial adoption beyond what we see now. I think the adoption so far is super encouraging, but we need to take the next steps and reach out to other groups of programmers, until afterwards the other advantages become clear to everyone. And I believe we have this catalyzer now, because the catalyzer is essentially the two forces that drive a lot of software complexity today, which come from hardware: multi-core, and cloud computing.

On the hardware side, for the first time in a long time, Intel is not giving us the hardware that people actually asked for. I think everybody would like to have a 50 GHz Intel i7, but we don't get a 50 GHz Intel i7; instead we get 6 cores or 8 cores, because that's what can be provided. Furthermore, this development has led to a new mode where you say: if you need more compute power, you fire up more servers; you distribute your load over an elastic number of servers in a data center, in the cloud. That's the new model, that's how you get more power, but of course it affects your programming model. You can't just rely on doing everything sequentially anymore; the more servers you spread your application over, the more you have to think about handling parallelism, handling distribution, handling possible failures of your servers, and so on. And I believe that's the triple challenge we're facing today: parallel, how to make use of multi-core CPUs and GPUs and clusters; asynchronous, how to deal with asynchronous events; and distributed, how to deal with delays and
failures. It turns out that mutable state is actually a liability for each one of these, because it causes the tough problems of cache coherence (how do you invalidate your caches if there's a change somewhere over the network?), of races, of versioning, and so on. So the essence of functional programming is that, instead of doing stepwise modifications of mutable state, which are problematic, we want to concentrate on transformations of immutable values.

Once we are there, the question is: what about objects? Should we forget about object-oriented programming and all program in pure functional languages like Haskell? Some people would recommend that; I'm not one of them, because I think that what we have learned about object orientation in modeling systems and in decomposing systems stays very valid. A lot of the object-oriented techniques really apply to systems architecture, and turning things from imperative to functional affects that very little. It's fundamentally the question of where we put things, how we structure a system, what goes where. In the end you need to think about these things, because otherwise you will end up with a single flat global namespace where everything lives, and we know that can't scale to systems beyond a certain size, which is way too small.

So if we want to keep objects, we see that we have to realign what an object provides a little bit. Previously, objects were characterized by state, identity, and behavior. But now, if we want to become functional, we have to change that. I think state is no longer a property necessarily associated with an object: think of Java's String, which has no mutable state in it and is a perfectly good object, a very nice one at that. Structural equality is, as we know, much better than identity; how many of us have been bitten by the fact that comparing two strings in Java might be true or false depending on lots of other circumstances in the system? And we definitely do want to concentrate on functionality.
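To make the structural-equality point concrete, here is a minimal sketch in plain Scala; the values are made up, nothing here is from the talk's slides:

```scala
// In Scala, == on objects means structural equality (it delegates to equals),
// while eq compares reference identity, like Java's == does for objects.
val a = new String("hello")
val b = new String("hello")

val structurallyEqual = a == b  // true: same characters
val sameReference     = a eq b  // false: two distinct objects

// An immutable object such as a String can be shared freely:
// no method call can ever change what "hello" means.
```

This is the Java pitfall the talk alludes to: with reference identity, the answer to "are these equal?" depends on how the objects happened to be allocated, not on what they are.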
OK, so we want to combine functional and object-oriented programming, and I believe that technically this is actually quite possible. The challenges are more social; they lie in the way one community sees the other. There's how many functional programming people see object-oriented programming, and there's how many object-oriented programming people see functional programming: a little mad scientist, where you don't know whether it's crazy, or dangerous, or both. And that's where we are, whereas really it sits between the two chairs, and sometimes it's hard to sit between two chairs; I'd much rather have it this way, sitting and looking into the sunset. But to get there we need to get rid of some baggage first, misconception by misconception: previous conceptions such as that objects need mutable state, or that they need reference identity, and things like that.

Good. Scala, then, is a bridge between those two paradigms, and to do this it tries to be several things at the same time. It tries to have a large span, because it bridges two schools of programming that have been very, very different. It tries to be as orthogonal as possible, so as to bridge this span with as few concepts as possible while still keeping the whole span. It tries to be expressive. And it tries to be unopinionated: it doesn't really care where on this bridge you are, whether you're on the imperative side or the functional side or somewhere in the middle; the language itself lets you do it either way, so it naturally adapts to many different programming styles. When I started out with Scala as an academic, I thought: well, this is great, this is clearly the best of all worlds. Afterwards I had to learn that not necessarily, because then you have the big problem that, since you have a choice of such a broad range of styles, which one do you pick? It's a hard problem for many people. So certainly Scala is not a
better Java, and neither is it a Haskell on the JVM; I want to rule out the two extremes. It's definitely something else. I believe that what we will see emerge, though we have to work to get there, is a new fusion of functional and object-oriented programming, and what I want to do in the rest of this talk is take some tentative steps toward this fusion: not a big design, just answers to some very concrete questions, and also asking you what the answers could be, because sometimes one can very well disagree on them.

So I want to start with six guidelines which I believe help in writing good Scala programs, and if you disagree I would certainly be very happy to discuss them with you at question time at the end of the talk, or before we leave here.

The first one, I believe, is not controversial: keep it simple. That's always good. Just because Scala is a language in which you can master wonderfully tough problems and really complex systems doesn't mean you need to bring all these mechanisms to bear. Pick the simplest thing that does the job, the simplest thing that goes from A to B without many contortions.

Number two is, in a sense, a special case of that: don't pack too much into one expression. Here is some code I picked from a repository that shall remain unnamed. It's a big expression, a good five lines, and it takes a while to wrap your head around what it does, because you get very little help: it's just a sea of filters and flatMaps and things like that. It's amazing what one can get done in a single statement, but that doesn't mean you have to do it. So I tried to refactor this thing, guessing a little at what it could be, and here's an alternative way to express the same. I've broken it down to say: sources is the class path, filtered; then we define a workspace root; and then we define a method
filesOfEntry that, given an entry, gives you all the files that belong to it; and finally you have sources.iterator flatMapped with filesOfEntry: for each element in sources, give me all the files, and concatenate them. Now it's much more legible, and it's very easy to do this in Scala, because Scala lets you write both vals and defs inline. That's one of the great things in Scala: you can break out not just values but behavior, things that get called, and you don't need to go through all the overhead of adding a new method. If you do that in Java you're out at least ten lines of code; here it's a single line, and you just put it right where you need it. So please do that. A good guideline for Scala code: every line defines something, and you name what you define. You don't just string together a long sequence of combinators just because you can.

Number three: prefer functional. Functional programming means programming without side effects, so your functions should be pure, going from input to output. I don't think I need to convince you of that; there are good references that carry the same message. So in this corner, prefer functional means: by default, use vals, not vars; use recursion or combinators, not loops; use immutable collections; and concentrate on transformations, not CRUD (create, read, update, delete) code. But preferring functional means that sometimes we can break the rules, and it's also important to know when it's OK to break the rules; I don't want to be dogmatic. Scala has a mutable part, and that's not just there to pull in all those Java programmers and then wean them off. I believe the mutable part is there for a purpose. Sometimes mutable gives better performance, in particular if you're dealing with collections that are very, very large, a significant percentage of your total available memory, say.
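A hypothetical reconstruction of the refactoring guideline "every line defines something, and you name what you define"; the class-path strings and the file names below are invented stand-ins for the real repository code:

```scala
// Instead of one five-line expression chaining filters and flatMaps,
// each step gets its own named definition.
val classPath = List("/ws/app/src", "/tmp/cache", "/ws/lib/src")

val workspaceRoot = "/ws"                        // named intermediate value
val sources       = classPath.filter(_.startsWith(workspaceRoot))

// A local def: behavior broken out and named, right where it is used.
def filesOfEntry(entry: String): List[String] =
  List(s"$entry/A.scala", s"$entry/B.scala")     // stand-in for a directory walk

// For each source entry, all its files, concatenated.
val allFiles = sources.iterator.flatMap(filesOfEntry).toList
```

In Java, breaking out `filesOfEntry` would mean a new method with its own boilerplate; here it is one line, placed exactly where the reader needs it.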
Then you may not want to be immutable, because immutable structures are known to consume more memory than mutable ones. And sometimes, not that often, but sometimes, mutability adds convenience; here are some points where adding a var actually adds convenience.

So my fourth advice would be: don't diabolize local state. Why does mutable state lead to complexity? The complexity is really in the interaction between different things; that's always where complexity comes from: from interactions between different strands that you can't control very well. In the case of mutable state, that interaction is clearly that different program parts interact through these variables in a temporal fashion: you have to make sure that a variable is set before it is read, or that a lock is grabbed before you enter a critical region, and so on. That's the tricky part, because this temporal ordering is not manifest in the program code; it's a property that arises from program execution, and that is very hard to track. That also means that state is the more harmful the more elements of your program it touches. Or, as the counter-conclusion: if state is local, it's actually not that harmful at all. It's maybe a nuisance, you might not like it for aesthetic reasons, but you can't really argue that it complicates your program very much. So local state is less harmful than global state.

Here's an example where I actually sometimes use local state; it's from, I think, the classfile parser. You see code like: var interfaces = the interfaces parsed from the class header, and then: if the thing is an annotation, add ClassfileAnnotation to my interfaces. So it's essentially telling you: interfaces is what's in the class file header, but with a second modification: if the thing is an annotation, then ClassfileAnnotation is also part of the interfaces. That's what this says. Now, that uses
a var and an assignment. What would it take to write this purely functionally? Well, not that much. You could write it this way: val parsedInterfaces = the interfaces parsed from the class header, and then interfaces is: if the thing is an annotation, parsedInterfaces plus ClassfileAnnotation, else parsedInterfaces. Fair enough; it's a small difference, but the functional version is clearly longer than the non-functional version, and not necessarily clearer. So I believe it actually comes down to a question of naming. If you really want to drive home the point that there is a difference between parsedInterfaces and interfaces, then do it the functional way; it's perfectly good even though it's longer, and you shouldn't always pick the shortest version. But if you think this distinction is actually a nuisance, and all you're interested in is the final interfaces, then I would say you don't need to jump through these hoops; you can just use a local variable. The qualification is that the variable really should have a very clearly delineated scope: ideally, after these two lines, the method or block should finish and just return interfaces. Then it's clear that nothing bad will happen with this variable. If it were a global variable, I would say no, by all means no, because a global variable is exposed: you don't know what code interacts with it, and even if your current version is clean and nothing interacts with it, it's an invitation for the next guy or gal who comes along to actually change that and add some interaction.

Good, more local state examples. Here's another one where local state is useful. Say you have a sequence of items with price and discount attributes, and at the cash register you want the sum of all the prices and also the sum of all the discounts you receive. That's easy, you just do a map and a sum twice: totalPrice is items, mapped with price, summed,
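Stepping back to the classfile-parser pattern for a moment, here is a minimal sketch of both versions; `ClassInfo`, `parseHeaderInterfaces` and the annotation name are invented stand-ins, not the real compiler code:

```scala
case class ClassInfo(isAnnotation: Boolean)

def parseHeaderInterfaces(c: ClassInfo): List[String] = List("Serializable")
val classFileAnnotation = "ClassfileAnnotation"

// Version 1: a local var with a clearly delineated scope.
def interfacesVar(c: ClassInfo): List[String] = {
  var interfaces = parseHeaderInterfaces(c)
  if (c.isAnnotation) interfaces = classFileAnnotation :: interfaces
  interfaces  // the var never escapes this method
}

// Version 2: purely functional, at the cost of a second name
// to distinguish parsedInterfaces from the final interfaces.
def interfacesVal(c: ClassInfo): List[String] = {
  val parsedInterfaces = parseHeaderInterfaces(c)
  if (c.isAnnotation) classFileAnnotation :: parsedInterfaces
  else parsedInterfaces
}
```

Both give the same result; the choice is whether the extra name carries information or is just a nuisance.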
and totalDiscount the same with discount. But now suppose I say: do the same with just one sequence traversal; maybe items is an iterator you can traverse only once. What do you do now? Well, you could do it in a purely functional fashion; the canonical version would be, I believe, a foldLeft, and that's the code you would get. In the end you define both totalPrice and totalDiscount as a foldLeft: the initial values are 0 for both, and your function takes the current totalPrice and totalDiscount and an item and adds things up. It's doable, but it's not super pretty. Another thing that is notable here is that the flow of data is not very apparent: here is what you define, those are the initial values, you have to match them up by position, then they flow into this function somehow, and the result finally goes over there. It's all rather convoluted. And I must say that if you write the imperative version, things are a bit clearer: you just define the variables totalPrice and totalDiscount, and then you iterate through the items, adding the item price and the item discount. So if the functional version is in this case more convoluted than the imperative one, why not use the imperative one? If this is a local computation, and totalPrice and totalDiscount are just variables used to produce the result, don't shy away from considering this solution as well.

The counterpoint, since I said state is the more problematic the more global it is, is that mutable objects are typically much more critical and problematic, because mutable objects tend to encapsulate global state. The state is visible to everyone that can get access to the object, and that could be a large graph. "Encapsulate" sounds good, but it does not make the global state go away; there is still a lot of potential for complex entanglements.
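Returning to the totals example for a moment, a sketch of the two single-pass versions; `Item` and the numbers are made up:

```scala
case class Item(price: Int, discount: Int)
val items = List(Item(100, 10), Item(200, 30), Item(50, 5))

// Purely functional: one foldLeft over a pair of accumulators.
// The accumulators are matched up by position, which is the part
// the talk calls convoluted.
val (totalPriceF, totalDiscountF) =
  items.foldLeft((0, 0)) { case ((tPrice, tDiscount), item) =>
    (tPrice + item.price, tDiscount + item.discount)
  }

// Imperative with local vars: the data flow is arguably easier to
// follow, and the vars never escape the method.
def totals(items: List[Item]): (Int, Int) = {
  var totalPrice = 0
  var totalDiscount = 0
  for (item <- items) {
    totalPrice += item.price
    totalDiscount += item.discount
  }
  (totalPrice, totalDiscount)
}
```

Both traverse the sequence exactly once, so either works when `items` is an iterator that can only be consumed once.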
So one thing here: when you say "don't use, or be careful with, mutable objects", it's actually not so clear what that means. What is a mutable object? Is it an object that contains vars? Not necessarily. How about this one: we have a class BufferProxy that takes a buffer, an ArrayBuffer of T, with a put method that forwards to the append method, and a length method. Is that mutable? I would say yes. So you could say, OK, let's modify the definition: maybe we should count mutable structures just like vars, so an object should not contain vars, and should not contain collections or any other structures that are mutable. But then what about this: here we have a class Memo, and that class contains something which is definitely mutable, a mutable Map that serves as a cache. Apply does a getOrElseUpdate: given a key x, it looks into the memo map; if the key is present it returns the current value; if not, it uses the function to compute the value for the key, stores it in the map, and returns it. Is that object mutable? If I create a new instance of Memo, what do you think? Who thinks it's mutable? OK, who thinks it's immutable? Half-half; you don't agree, and the answer is: it depends. If the functions you pass to this object are immutable, then this object is indeed immutable. So if you create an object like new Memo with the increment function, that object is immutable. Why? Well, the definition I propose here is that an object is mutable if its functional behavior depends on its history. What that means is: if the result of a method, and whatever other effects it might have, depends on what happened to your object before that method was called, then the object is mutable; if it makes no difference, meaning the method returns the same result every time you call it, then the object is immutable. Now, this Memo object will return the same result every time we call it.
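A sketch of the Memo class as described; the increment function stands in for an arbitrary pure function:

```scala
import scala.collection.mutable

// Internally mutable, but if f is pure, the functional behavior of a
// Memo instance does not depend on its history: same key, same result.
class Memo[K, V](f: K => V) {
  private val cache = mutable.Map.empty[K, V]
  def apply(x: K): V = cache.getOrElseUpdate(x, f(x))
}

val inc = new Memo[Int, Int](_ + 1)
val first  = inc(41)  // computed via f
val second = inc(41)  // looked up in the cache: same result
```

By the talk's definition this object is immutable when `f` is pure, even though it contains a mutable map; pass a side-effecting `f` and the object becomes observably mutable.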
It will do it via different routes: the first time it will take the increment function and use it to compute the result, and once it has a key it will simply look the key up, but it will be the same result. Hopefully the second time around it will return the thing faster; well, not for the increment function, but for more complicated functions it will be faster. On the other hand, things get more murky if I use a side-effecting function for this Memo object, because then, if you analyze things carefully, you can observe a side effect: namely, a counter gets incremented the first time you call the memo with a certain key, because the function gets evaluated, but the second time the counter stays the same, because the value is just looked up in the map. So a memo of mutable functions is itself mutable; mutability pollutes the object a little bit.

Good, last advice: don't stop improving too early. Often I have found that I could shrink code by a factor of 10 and make it more legible at the same time, but I didn't do that in a single step; I did it step after step after step, and often the steps were spread out over several days or weeks, while in the meantime I did other things. Typically what would happen is: I have a rough solution, I see I can improve it this way, I'm happy and go away with a better solution; then I come back a day or two later, have another look at my great solution, and say: ah, but I know another way to make it even better. And I could iterate several times until I had something that is really super tight and good, and that's OK. Some people are frustrated by that: why can't I have the best solution immediately, every time? I think that's unavoidable, and maybe it's the wrong way to look at it. The way to look at it is: isn't it great that you can derive the pleasure of improving your solution several times? Not just once;
I can actually have this pleasure several times, and I think that's a perk for every one of us. I don't propose busywork, just changing from one solution to another, but often these refactorings give you clearer code, and in Scala you can go a long way; I think we should embrace that. It's a good thing, so keep going.

In the last part of my talk I want to present a set of choices where one can often argue for one side or the other; there are different styles to present things, and we can discuss when which style is appropriate. The first choice is infix operations versus dot. As you know, Scala unifies operators and method calls: every operator is a method, and that means you can also use every method that takes a parameter as an infix operator. So you have a choice: do you use the dot-method-parenthesis syntax, or the infix operator syntax? Which one would you choose here? No contest: you would choose the infix one, the plus; nobody would write plus as a method call. What about this one: would you write xs map f inline, or xs.map(f)? Who would write xs map f, the infix form on the left? And the dot-method form? OK, interesting. At Scala Days I think we had slightly more votes for the dot-method form, so the west coast seems to be a little more operator-centric than the east coast, or maybe it was flipped. Anyway, there I saw about 40% infix operator, 60% method, and here it's maybe one third to two thirds if I make a rough estimate. OK, what about this? Again I have infix operators, but now I have a sequence of them, and on the right-hand side we have them lined up with dots. Who would write it like on the left? Wow, three. OK, well, if you look at legibility, I believe the left-hand side loses clearly: it's very hard to see, in a long sequence like that, what are the verbs and what are the subjects and objects, because you'd have to count. xs is the first
one, flatMap is a verb, fun would be a subject, filterNot is a verb... it gets very hard to keep track; the dots give you much more guidance. OK, but the problem is that it's an annoying choice to have to make, so let me propose, just for the sake of fixing one standard, something to fix it. The first part, I think, is uncontroversial: if a method name is symbolic, we use infix. For alphanumeric method names, I believe one can use infix, as an optional choice, if there is only one alphanumeric operation in the expression. Something like mapping add filter is OK, and I typically use that myself, in particular if the method name is short, like is, or contains, or add. But once there is more than one, I would prefer chained dotted calls. If we wrote this out in infix, we would have a problem: you would read "mapping add filter", and here it would throw you, because you would parse filter as a verb, but it can of course also be used as a subject; I feel it's a perfectly good name for a subject. So we rely on our parsing of these things, on implicitly knowing what the combinators are, and that's an assumption that might be true for you but not for your readers: newcomers to the codebase might really not know what the typical combinators in Scala are called, so they would have a big legibility problem here.

Choice number two: alphabetic versus symbolic. Well, that's an old one; I think we fought these battles last year, but we are mostly beyond them now, mostly settled on a way to decide between the two. We know that identifiers in Scala can be alphanumeric or symbolic, so operators can be either kind of method. How do you choose? Do you write xs map f, or xs with some symbolic operator? Do you write vector + mx, or vector add mx? Yeah, you would probably use a plus here, right, so you would use the symbolic name. Do you write xs foldLeft
(z)(op), or (z /: xs)(op)? Who writes xs.foldLeft(z)(op)? Yeah. And who writes the other? A few of you. OK, I was guilty of introducing that one, because it's really cool; so let me explain why it's really cool. foldLeft is actually very hard to parse. Somebody told me: look, I always thought it's called foldLeft because the operation goes from right to left, but that's actually wrong, it goes from left to right; it's confusing. The other confusing thing about foldLeft is that when you write the applications out, the z actually ends up on the left-hand side: it's z op xs(0), op xs(1), op xs(2), and so on, but in the call, the z comes after the xs; that's the other thing that's very confusing. Whereas with /: everything is lined up right, and the great intuition is: think dominoes. That thing is a domino, it starts with the z, and it goes right, and that's the whole thing. Brilliant; I thought it was brilliant. But you see, that's the fallacy: you think it's brilliant, and I'm sure you can think of some operators like that yourself that are brilliant, but your audience has no clue what you thought, and they are just completely puzzled. That's why I think, in retrospect, even though maybe some of you are using this operator now, and maybe after this talk quite a few people here will be convinced that it's a cool operator after all, even if you think it's brilliant, and it probably is, your typical readers have no clue what this thing should mean, and you probably won't have a chance to educate them all. So it's much better to stick to alphabetic names. Good. Here's another one where I was more lucky. After the /: debacle and several other debates, I was really loath to introduce another symbolic operator, but I did it anyway.
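To pin down the operand order that makes foldLeft confusing, a small sketch; subtraction is used because it makes the grouping visible, and only the foldLeft form is shown runnable since the /: alternative was deprecated in later Scala versions:

```scala
// xs.foldLeft(z)(op) computes op(op(op(z, x0), x1), x2):
// the zero ends up on the LEFT of the chain, even though it is
// written after xs in the call. The old (z /: xs)(op) syntax
// lined the operands up in the same left-to-right "domino" order.
val xs = List(1, 2, 3)
val folded = xs.foldLeft(0)(_ - _)  // ((0 - 1) - 2) - 3
```

With an associative, commutative operator like + the order is invisible, which is exactly why the naming confusion persists.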
And I think that was a complete win. Everybody immediately took to the triple question mark — it's the right thing. With ??? it was clear what it meant: what else could it mean but that this thing is undefined and you have to define it? Furthermore, I think it's good because, had we picked something like todo or undefined, that's a name that drowns in a sea of text, whereas ??? sticks out — you see immediately: "here I still have work to do." That's the advantage. So to sum it up, my advice would be: use symbolic operators only if the meaning is understood by your target audience beforehand, if the operator is standard in your application domain (the typical example being mathematical operators), or if you would like to draw attention to the operator — that was the case for ???, which sticks out better than an alphabetic name would. And if you rely only on reason three, that's a very risky thing; you probably want at least one of the other two as well. Good. Choice number three: loops, recursion, or combinators? Often, for the same functionality, you can use any of the three. Here we have a simple loop: it steps through the numbers up to a limit, testing a qualifies predicate, and it gives you the first number that qualifies, or limit if none of them do. You can do it that way, or you can use recursion. Here's the purely functional solution for the same thing: recur takes an integer argument; if that's greater or equal to limit, or i qualifies, return i; otherwise return recur(i + 1); and you start with recur(0). Or you could use predefined combinators — that would also do the trick: go from zero until limit, find the one that qualifies, which gives you an optional result, and getOrElse returns limit if the option is empty. Which one would you choose here — the loop, the recursion, or the combinators?
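The three versions just described can be sketched as follows; the qualifies predicate here is an assumed stand-in for whatever test the real program needs:

```scala
object FirstQualifying {
  val limit = 10
  // Sample predicate — an assumption for illustration.
  def qualifies(i: Int): Boolean = i * i > 20

  // 1. Imperative loop.
  def withLoop: Int = {
    var i = 0
    while (i < limit && !qualifies(i)) i += 1
    i
  }

  // 2. Purely functional recursion.
  def withRecursion: Int = {
    def recur(i: Int): Int =
      if (i >= limit || qualifies(i)) i else recur(i + 1)
    recur(0)
  }

  // 3. Predefined combinators: find gives an Option;
  //    getOrElse supplies limit when nothing qualifies.
  def withCombinators: Int =
    (0 until limit).find(qualifies).getOrElse(limit)
}
```

All three compute the same first qualifying number; the difference is purely one of style.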
A couple chose the loop. Somewhat more, the recursion. And the combinators by a large margin — that drowns out the rest. Okay, let's do another one. How about xs map f, versus a while loop that does the same thing, versus a recursion that does the same thing? Well, I guess that's pretty clear: everybody would pick map; I don't even dare to ask. How about this one, then: xs grouped 2 toList map { case List(x, y) => x * y }. Then we have the same thing with a loop: we have a buffer, and we step through the list, and as long as it's not empty we take two elements off the list, put their product in the buffer, and advance by two elements. Or we have the same thing as a recursive function: if the list has two elements at the front, we take them, form the product, and recurse; if the list is empty, we're done. Which one would you pick here? Who would pick the grouped thing, the combinators? Okay, still a lot — a bit less than previously. The while loop? No, nobody wants the while loop. The recursion? Okay, good — you have good taste. So do I. The loop, I believe, is clearly the worst here: you don't need a while loop, it's just distracting, and it's also the longest of the three solutions. Even so — and I'll get to the performance question in a moment — I still wouldn't use the while loop. The grouped version is great if you know the combinators, because it uses some fancy ones: grouped, plus a construction with a pattern-matching closure, and things like that. It's great if you know the combinators and everybody else on your team knows them as well. That's the trap to watch out for: you know the combinators, but maybe your co-workers don't, and then they'd be needlessly puzzled by this code. And I believe the third one is actually underrated. I think one very good style of functional programming is essentially first-order functional programming: not using any higher-order functions, just recursive, pattern-matching equations — and that does the trick.
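The three candidates can be sketched like this; note that grouped would hit a MatchError on an odd-length list, so an even length is assumed here:

```scala
object PairwiseProducts {
  // Combinators: grouped(2) yields the list two elements at a time;
  // each pair is taken apart with a pattern-matching closure.
  def withGrouped(xs: List[Int]): List[Int] =
    xs.grouped(2).toList.map { case List(x, y) => x * y }

  // First-order functional style: recursive pattern-matching equations.
  def withRecursion(xs: List[Int]): List[Int] = xs match {
    case x :: y :: rest => x * y :: withRecursion(rest)
    case _              => Nil
  }

  // While loop with a mutable buffer — the longest of the three.
  def withLoop(xs: List[Int]): List[Int] = {
    val buf = scala.collection.mutable.ListBuffer[Int]()
    var rest = xs
    while (rest.nonEmpty) {
      buf += rest.head * rest.tail.head
      rest = rest.tail.tail
    }
    buf.toList
  }
}
```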
It's often a bit longer than the combinator version, but it might actually be easier to grasp for newcomers, and furthermore it tends to be more efficient as well. [Audience question] Right — if the recursion isn't tail-recursive and the list is very long, then the recursive solution here is not good; that's a potential downside. But often you can massage the solution so that it does become tail-recursive; that's the standard approach. So my only reason for the while loop would be: if the list is very long, use a ListBuffer; if not, use recursion — or, if you know the combinators, use them. Often these pattern-matching recursive solutions are tail-recursive, and you can verify that by just putting @tailrec on them; if they are tail-recursive, they run just as fast as a while loop, so performance is not really a reason to go for the imperative solution. (This particular one isn't tail-recursive, by the way.) So why does Scala have all three? Well, combinators are clearly in most cases the easiest to use — and anyway, they're all there in the library. Recursive functions are clearly the bedrock of functional programming, and you want them together with pattern matching, because that's much clearer and safer than tests and selections. And loops — well, maybe we could have done okay without loops, but they are familiar, and they're sometimes the simplest solution. So my recommendation would be: consider combinators first — most of you are already doing that. If that becomes too tedious, or efficiency is a big concern, fall back on tail-recursive functions. And loops can be used in simple cases, or when the computation inherently modifies state. Sometimes your computation wants to modify state — a mutable map or something like that — and if you do that anyway, you might as well write a loop. Loops might actually be the simplest: if you modify a lot of different pieces of mutable state in one iteration, then a loop can be a good choice.
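The @tailrec check just mentioned can be sketched on an accumulator-style rewrite of the pairwise-product function (this rewrite is an assumption, not code from the talk): the annotation makes the compiler reject the method unless every recursive call is in tail position, in which case the recursion compiles down to a loop.

```scala
import scala.annotation.tailrec

object TailRecDemo {
  // Accumulator-passing version: the recursive call is the last
  // action, so @tailrec verifies it and no stack frames accumulate.
  def pairProducts(xs: List[Int]): List[Int] = {
    @tailrec
    def go(rest: List[Int], acc: List[Int]): List[Int] = rest match {
      case x :: y :: more => go(more, x * y :: acc)
      case _              => acc.reverse
    }
    go(xs, Nil)
  }
}
```

Because it is tail-recursive, this version handles arbitrarily long lists without stack overflow, matching the while loop's runtime behavior.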
Choice number four: procedures or equals. Scala has special syntax for Unit-returning procedures. Here's a swap method that just swaps two elements of an array — the standard implementation — and you can write it as just the block, without the equals sign and without the ": Unit"; or you can write ": Unit =", and then it looks just like a normal function. Which one do you choose? Who chooses the first one? Quite a lot — and it's actually in the official style guide, too. Who chooses the second one? A bit less. Yeah. The first one was actually, I believe, a mistake — a historical accident. The reason procedure syntax was introduced is that very early on, when we did the first Scala tutorials, I was concerned that I'd have to explain Unit too early to Java programmers. We wanted to write something like a sort function, just to show how to map Java to Scala, and there was this Unit, and I would have had to make a whole detour to say: well, Unit is sort of like void, but not really, because it's a type with a value. I was scared of that, so I said: let's just hide the whole thing, sweep it under the carpet, have special syntax — it looks more familiar to Java programmers anyway. I believe that was a mistake, because it traded simplicity for ease of use, for familiarity. It's definitely more familiar, but it's not as simple: now you have two ways to do the same thing, and furthermore the "easy" one has a bad trap. Namely, when you write ": Unit" and then a block without the "=", the compiler gets thoroughly confused: it thinks you're actually defining an abstract method with result type Unit, and that the block should be a refinement of that result type. Hence the ambiguity.
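The two spellings of swap look roughly like this; the explicit ": Unit =" form is the one recommended here (and, for what it's worth, procedure syntax was later deprecated in 2.13 and dropped in Scala 3):

```scala
object SwapDemo {
  // Recommended: explicit result type and '='.
  def swap(arr: Array[Int], i: Int, j: Int): Unit = {
    val t = arr(i)
    arr(i) = arr(j)
    arr(j) = t
  }

  // The procedure syntax the talk advises against omits the '=':
  //   def swap(arr: Array[Int], i: Int, j: Int) { ... }
  // and the trap: writing ': Unit { ... }' without '=' parses as an
  // abstract method whose result type carries a refinement.
}
```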
For a long time in the Scala compiler you got absolutely confusing error messages from this — somewhere in the middle of the block, a message saying it wanted a colon there, or something like that, because the compiler would say: well, this is not a type refinement, these are statements, but I don't want statements here. We put a lot of work in — in 2.8 or 2.9, I'm not exactly sure when — to special-case this in the compiler, so that it now says something like "you probably meant to write an equals sign here." But it's still confusing, and I think it's not worth the added complexity this opens up. So my advice would be: don't use procedure syntax. And it turns out I've been told the style guide will be changed accordingly, and also that IntelliJ — which actually has an automated refactoring from the explicit form to the procedure form — promised me they'll turn it around, so in the next release you get a refactoring from procedure syntax to the explicit form. Good. Choice number five: private versus nested. Say you have an outer method that uses a helper method for some of its functionality. I've said before that it's actually really cool that Scala can define methods inline, so that they don't pollute the enclosing namespace, and with much less overhead. So the question is: should you always do this — always put the helper into the innermost scope — or are there reasons to sometimes put it into an outer scope? I think in most cases you want the inner scope. The outer-scope alternative would be: you take the method — isJava, say — make it private, put it at the outer level, and call it from there. Which one do you choose? I think in most cases you want the first one — put it where the usages are — with maybe one exception. You definitely want to nest if that means you can save a parameter: here I've modified isJava so that it now actually refers to owner.
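A sketch of the two placements — the names countJavaMembers, isJava, and owner are assumptions standing in for the talk's example:

```scala
object NestingDemo {
  // Nested: isJava captures `owner` from the enclosing scope,
  // so it needs no extra parameter.
  def countJavaMembersNested(owner: String, members: List[String]): Int = {
    def isJava(member: String): Boolean =
      owner == "java.lang" && member.nonEmpty
    members.count(isJava)
  }

  // Outer/private: isJava captures nothing, which is visible at a
  // glance — but owner must now be passed in explicitly.
  private def isJava(owner: String, member: String): Boolean =
    owner == "java.lang" && member.nonEmpty

  def countJavaMembersPrivate(owner: String, members: List[String]): Int =
    members.count(isJava(owner, _))
}
```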
With owner used inside it, there's a big advantage to keeping isJava where it is, because it means you don't have to pass owner into the method — you save a parameter, and that's always good. But say you don't capture anything; there is no reference to anything outer. Then you could argue there's a legibility problem the other way around, because when you look at this method you ask: is it affected by anything out here or not? For a small method like isJava that's very easy — you just scan the line and see that it doesn't refer to owner. But now suppose isJava is actually ten lines long; then it becomes a mental effort to see whether it uses owner or not. There you might have a case where it's actually preferable to move isJava out into a private method, because then it becomes clear that it doesn't need anything from the inner scope. So my recommendation would be: prefer nesting if you can save parameters — then it's a no-brainer — and also prefer nesting for smallish functions, even if nothing is captured. But don't nest many levels deep. That goes under "don't overdo it": if you nest many levels deep, it often becomes very difficult to hold in your head which method refers to what. Last choice: pattern matching versus dynamic dispatch. That brings us back to the object-oriented versus functional debate — the debate that began when object-oriented programming started. Say you have a hierarchy of shapes: a base class Shape, and subclasses Circle, Rectangle, Point — they're all shapes — and you want to write a method to compute the area of a shape. There are two ways to do that. You can write an area method using pattern matching.
Like this one: you just have a case for each of the different shapes, and you put the right formula in place for each. Or you could have area as an abstract method in class Shape that gets implemented by each subclass. Which one do you choose? Who would choose the pattern-matching version? Okay. And who would choose the method? Okay, a bit more than half — the method has it. That's actually an interesting question, and I've been going back and forth a little on it. First, before we get to the answer: why does Scala have both? Well, pattern matching comes from the functional side, and it's definitely very convenient — a lot of code becomes very concise and clear. Dynamic dispatch comes from object-oriented programming, and it's the core mechanism for extensible systems. So you want both. But how do you choose between them? I believe the answer is: it depends on whether your system should be extensible, and in what direction — which brings us back to the question of how object-oriented programming started. If you foresee extension with new data alternatives, dynamic dispatch is better. For instance, if we added a class Triangle to our shapes, it would be very smooth: we'd just add a new case class with its own implementation of the area method, and we'd be done. Whereas if we used pattern matching to define area, we'd have to define the class and also go back to the area method and add the case — two locations instead of one, which is harder to keep in your head. On the other hand, if you foresee that you would typically like to add new methods later, then I believe pattern matching is advantageous, because otherwise you end up touching all of these classes, adding the new method to each of them, whereas with pattern matching you have only a single point that you need to extend.
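The two designs side by side, as a minimal sketch with assumed formulas (Point is given area 0):

```scala
object ShapeDemo {
  // Pattern matching: all the area logic sits in one place.
  sealed trait Shape
  final case class Circle(r: Double) extends Shape
  final case class Rectangle(w: Double, h: Double) extends Shape
  case object Point extends Shape

  def area(s: Shape): Double = s match {
    case Circle(r)       => math.Pi * r * r
    case Rectangle(w, h) => w * h
    case Point           => 0.0
  }

  // Dynamic dispatch: each subclass carries its own implementation,
  // so adding a new shape touches only one location.
  sealed trait OShape { def area: Double }
  final case class OCircle(r: Double) extends OShape {
    def area: Double = math.Pi * r * r
  }
  final case class ORectangle(w: Double, h: Double) extends OShape {
    def area: Double = w * h
  }
}
```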
If the system is closed and complex — so you don't foresee any extensibility requirements — then I believe one should also choose pattern matching. That's something we've been changing in the Scala compiler: a compiler is a closed and complex system; typically you don't add three new syntax cases every week, and the functionality is pretty fixed. The current version actually uses dynamic dispatch a lot, and I'm shifting it over to pattern matching, because that gives me a single point where I can understand the logic. The problem with dynamic dispatch is that the logic is really dispersed. Say I want to know — not the area of a shape, but, say, the symbol that corresponds to a type. If I have a single definition that says, for this type it's that, for that type it's that, and so on, that's something I can grasp in my head. If I have to go through many hundreds of lines of class definitions covering the different types, it's much more difficult to collect that knowledge into a single core definition. That's why I actually believe that, when in doubt, you should prefer the pattern-matching solution: it gives you a single point where you can understand the logic. You have an objection? [Audience: what about exhaustiveness checking?] Good point, yes: with virtual methods the compiler will inform you if you forgot one, whereas with pattern matching it will do the same thing, but only if the base class is sealed. On the other hand, if you have a closed system, then I would very much encourage you to seal your base classes, because that's when things really start to work with your pattern matches as well. Okay — what if you want to extend in both dimensions? That's the opposite of the complex, closed system: you want to add both new methods and new data types. One thing we've found to work pretty well in that case is to put the pattern-matching solution inside a class that you can inherit from.
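A minimal sketch of that idea — the ShapeHandler name and the fallback-to-super structure are assumptions about the pattern being described:

```scala
object HandlerDemo {
  trait Shape // deliberately not sealed: new shapes may be added
  case class Circle(r: Double) extends Shape
  case class Rectangle(w: Double, h: Double) extends Shape

  // Base handler: pattern matching over the known shapes.
  class ShapeHandler {
    def area(s: Shape): Double = s match {
      case Circle(r)       => math.Pi * r * r
      case Rectangle(w, h) => w * h
    }
  }

  // A new data type added elsewhere in the system...
  case class Triangle(b: Double, h: Double) extends Shape

  // ...handled by a subclass that falls back to super for old shapes.
  class ExtendedShapeHandler extends ShapeHandler {
    override def area(s: Shape): Double = s match {
      case Triangle(b, h) => b * h / 2
      case other          => super.area(other)
    }
  }
}
```

Both dimensions stay open: new methods go into new handler members, new shapes into new handler subclasses.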
So we'd have something like a ShapeHandler, which has an area method dealing with all the known shapes, and you can add further methods to it. Then, if new shapes get added somewhere else in your system, there can be an extended shape handler that handles the new shapes and falls back to the super handler for the shapes already seen in the base class. That's one possible solution; there are many others — it's a well-studied problem, called the expression problem. The expression problem is essentially about extensibility in both data and operations, typically in statically typed object-oriented systems. The name was coined by my colleague Phil Wadler: he chose it because his standard example was arithmetic expressions — syntax trees describing arithmetic expressions — and of course it's about the expressiveness of languages as well. So, conclusions. We've seen lots of puzzling choices. I didn't want to hide them from you, so I went right into them and gave you some of the criteria by which one might choose one way or the other. And I think it's natural to have all these choices, because after all we're breaking new ground here. My main advice from the talk would be: keep things simple; think of good names — I think naming is really crucial in everything, so think of many good names for the intermediate results in your programs; and have fun. If you follow these three, you're doing all right. Thank you. So we have some time for questions — fifteen minutes or so. [Audience question] The question was: should you prefer large interfaces with many methods, over small interfaces plus implicit decorators that provide the rest of the functionality? I would say the latter: I believe a good design should have small, modular interfaces, with the rest of the functionality going into these decorators.
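A sketch of such an implicit decorator, using the implicit-class syntax that comes up a bit later in the Q&A; the interface and method names are assumptions:

```scala
object DecoratorDemo {
  // A deliberately small interface.
  trait Stack[A] {
    def push(a: A): Stack[A]
    def pop: (A, Stack[A])
    def isEmpty: Boolean
  }

  case class ListStack[A](elems: List[A]) extends Stack[A] {
    def push(a: A): Stack[A] = ListStack(a :: elems)
    def pop: (A, Stack[A])   = (elems.head, ListStack(elems.tail))
    def isEmpty: Boolean     = elems.isEmpty
  }

  // Extra convenience methods live in a decorator, not the interface.
  implicit class StackOps[A](val s: Stack[A]) {
    def pushAll(as: Seq[A]): Stack[A] = as.foldLeft(s)(_.push(_))
    def size: Int =
      if (s.isEmpty) 0
      else { val (_, rest) = s.pop; 1 + rest.size }
  }
}
```

The core trait stays tiny; any implementation of Stack gets pushAll and size for free through the decorator.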
Now of course the follow-on question is: why don't the Scala collections have that design? Well, they weren't designed that way, and I believe the answer is that we laid the groundwork for it only much later: in 2.10 we got convenient, essentially free implicit decorators, with value classes and implicit classes. Before that, the overhead was quite large — both in the notation needed to actually write these implicit decorators and, what's more important for collections, in performance, because every time you called a method you would have created a new wrapper object, and we didn't want to pay that price. Say you call something like drop(2), which is very fast on lists — actually creating the wrapper would have slowed that down by several times. That's why we didn't do it. But now that we have the machinery, if we got a chance to redesign the collections and make them more modular, I think that would be a good idea. So the general advice would be: implicit parameters and decorators are great — they're really a great tool to modularize your system — so by all means consider them. [Audience question] Okay, so the question was: you've found that implicit parameters work really well, but implicit conversions pose lots of problems — what advice would I give there? I completely echo your sentiments. Implicit parameters generally work really well and, as far as I can see, are not really problematic in most cases. With implicit conversions, there's essentially only one case where they're unproblematic, and that's when they introduce new methods that you immediately consume. That's what I call implicit decorators: you essentially add new methods, the way you would in other languages with extension methods — it's an implicit that injects a method, and then you call it — and for that we now have special syntax since 2.10: implicit classes.
With an implicit class you can write that up directly. That leaves the third kind, where I have an implicit conversion from a type A to a type B that's just kept around as a convenience. There it's true that the more of these things you introduce, the more problems you cause yourself. With a couple of them you get away fine, but if you overdo it, things get ambiguous very quickly, and confusing very quickly. That's also the reason why this third kind of implicit conversion is now behind a lock — the lock is called a language import: you have to write import language.implicitConversions if you want to use it. And I would say yes, in most cases I would advise against using them. If you have a policy in your team — don't use what's behind that lock — then I think that's probably a good choice. You might have specialized reasons where you do want it, and of course you can get at it by importing, but by default I'd say it's a good choice not to use them. [Audience: but you cannot extend a case class.] You cannot extend a case class — yes, that's true; case classes really only form a flat set of alternatives. But what was the question? ... Okay, now I get it. I believe what you're getting at is: with dynamic dispatch you can have gradual refinements and overrides along a subtyping tree, and you can't do that with a plain pattern match, because it would mean case classes inheriting from case classes. That's true. So if you want not just a single method implemented over an essentially flat sum of types, but to override that method in further refinements, then dynamic dispatch is the choice for you. But I should say, if you do that a lot, your program code will also not be very legible.
very legible if this is something I mean a lot of overrides in many different cases are typically something that is not very legible because you not only have you have you have to look at all the possible cases now but you might think you have something because you look at in the base class and then actually in the subclass no it's not the implementation you were thinking of so I would do that I mean there's certainly reasons for doing it this way typically performance so if you have a clear contract of what your method is if you have special implementations and special special implementation subclasses that just do the same job but faster nothing against that but I think one has to again see what what what are the application areas of public applicability here so would for monadic operations one can express them with for yield or with map flatmap which to prefer I think that for many people flat map is kind of cool but also confusing what it is so I would say once you have flat maps in your sequence I would typically use a for yield instead it's simpler most people most people understand this better so technically it means if you have more than one generator and you need a flat map then you could use a fork for for expression and it would be typically more legible so nothing against map and filter but flat map let's keep it a secret oh okay so the third alternative would be to use the same thing with type classes so we'd have to see Beaufort area yeah so type classes I didn't think of type classes because so I imagine that what you are after is to have a type class area of area a computer that essentially has implementations for circle rectangle and the other shapes that's a slightly different behavior because then your area actually depends on the static type of something not on the runtime type so it's a different ballgame so my recommendation here would be if your architecture is such that the dispatch always depends on the static type so you you know that that 
That goes a little bit in the direction of the first question, which asked: should we put everything in one big API, and thereby make it overridable and dynamically dispatchable? The answer there was: prefer type classes — but of course that means these operations have the same implementation for all dynamic subtypes of any given static type. If you can assure that, then I'd say type classes are right. In this case I didn't assume that, so I didn't consider type classes. [Audience question about macros] Yeah — I think we had Eugene speak last week at a meetup, and maybe he's already said a bit about this. We now have a roadmap for what we want to do with macros. The current state is: macros are in 2.10, but as an experimental feature. Experimental means they might go away, or be fundamentally changed, in future versions; one shouldn't rely on them for anything crucial that has to be carried forward. Experiments are fine, but currently not much more than that. I believe what we'll end up with, in one of the future versions, is a restricted version of macros which I call blackbox macros. A macro is just a function: it takes arguments that are expression trees, and it gives you back a new expression tree. And the crucial point, which I believe makes a big difference, is whether this function is type-checked the way it's declared, or the way it's expanded. Say we have a macro that takes an expression yielding a String and is declared to give back an Object; when you expand the macro, the expansion might actually give you back something much more specific — a List, or a Boolean, or things like that. The question is: which of the two types does the type checker see — Object, the declared return type of the method, or the more specific type of the expansion?
If the answer is the second one — and that's the current state of experimental macros — then that opens the door to a lot of fancy tricks with types, because macros can essentially produce new stuff that only has to type-check upon expansion. The declared type can be Any, nobody cares, and that means you open the floodgates for language extensions as wild as you can imagine. Unfortunately, it also means there's no chance you can support this in plain editors or IDEs, because type-checking a program would mean expanding its macros — you'd have to expand all macros before you can type-check. So I think we've said: well, this was great while it lasted, but we don't want to standardize on it. Instead we've designed blackbox macros, where the type that's declared for the macro is the type you get. That means IDEs can type-check your program perfectly well without macro expansion, and it also means that macro expansion is there to improve your program in some way without fundamentally changing it. Macros that remain possible under this new scheme include, for instance: taking a foreach over a range and replacing it with a while loop; or checking whether a spore — there was a talk about spores, which are closures without escaping references — has escaping references, and giving you a warning or an error if it does; or optimizing your collections so that loops are fused, things like that. All these things will remain possible, but you will essentially not be able to undermine the type system with macros. And I think that's a good compromise between power, tooling, and understandability. Did that answer your question? Okay, good. [Audience question] Why did we choose not to have lazy evaluation by default? There are many problems with lazy evaluation. The first is, of course, that it's incompatible with side effects.
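That interaction with side effects shows up even with Scala's on-demand lazy val: evaluation is deferred to first use, so a side effect in the initializer fires at a possibly surprising time. A small illustration (the logging setup is an assumption):

```scala
object LazyOrderDemo {
  val log = scala.collection.mutable.ListBuffer[String]()

  def run(): List[String] = {
    log += "start"
    lazy val x = { log += "init x"; 42 } // not evaluated yet
    log += "before use"
    val y = x + 1                        // initializer runs here
    log += s"used x, y = $y"
    log.toList
  }
}
```

The "init x" entry appears after "before use": the initializer ran only when x was first demanded, not where it was written — exactly the temporal-ordering surprise being described.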
If you ask Simon Peyton Jones, he says that's actually good: one of the best things about lazy evaluation in Haskell, he says, is that it has kept them honest about side effects — Haskell has a very strict regime where any side effects have to go in a monad. If you have that, lazy evaluation is the right thing. But if you come from a language like Scala, which allowed side effects from the beginning, universal lazy evaluation is fundamentally incompatible with that. Even if you just use lazy vals in Scala — and I would encourage you to, it's good practice — it means you should be very careful about mixing them with side effects, because it can be rather surprising when things get evaluated, and side effects are all about temporal ordering. The other thing that's problematic with universal lazy evaluation in particular is memory graphs. When you evaluate everything lazily, your loop or recursive computation essentially gets unrolled, so that you get the full graph of all the unrollings in your data structure, and that can of course be huge. Haskell programs can sometimes be rather difficult to debug in their space behavior, and again, that's not what we wanted for Scala. So those are, I think, the two main reasons. I actually believe we hit a pretty good compromise by giving you lazy evaluation on demand, and I personally haven't really seen many reasons to go beyond that, to be honest. [Audience question about JavaBeans] Yeah — you shouldn't ask me, I never wrote anything with JavaBeans. You can use the annotations for that, but probably my advice would be: use Play and Akka, and don't use JavaBeans. [Audience question] So the question was: can we drop the static-typing requirement locally, and then get dynamic types? In a sense we have that already.
We have Dynamic. You can use Dynamic — it's again behind a lock, because it gets dangerous, so you have to write import language.dynamics — but then Dynamic is perfectly good: it gives you dynamic types, and any method call with any name on a Dynamic value is dynamically resolved. And to get back to static typing we also have a good means, and that's pattern matching: I can take an object I know nothing about, and once I've put the patterns in place, within each pattern I typically know the types of my variables. So with those two, I think we have it. I haven't seen that much reason to go further — except, as you say, when interfacing with something that is inherently dynamic, JSON or JavaScript as a whole, or things like that; there I believe it's very important, and that's also why we added Dynamic. There was a very nice talk about Scala.js at Scala Days by Sébastien Doeraene that met with a lot of interest, and Dynamic is essentially a cornerstone of that, because it lets us talk to JavaScript programs directly. And if they happen to have types — that project actually makes use of the TypeScript type annotations which now also exist for JavaScript — then we use them; if you don't have them, we fall back to Dynamic. So for these scenarios I believe it's actually very good to have Dynamic, and it's promising. One last question. [Audience question] Do we plan to extend macros to do aspect-oriented programming over whole programs? Probably not. There are some things we're looking at that we haven't figured out yet. What you would like to do is something like bean getters and setters — essentially code generation of methods. Currently that's special-cased: if you write @BeanProperty on a field, you get a getter and a setter. It would be nice to be able to generalize that, to bring it within the capabilities of a macro; that takes load off the compiler — the compiler doesn't have to special-case it anymore.
And it opens the door for other people to do stuff like that. But I think we'll probably want to keep it at that — not go to pointcuts and things like that; I don't think that will happen. Okay, I think we should wrap up then. Thank you very much for being here. [Applause] [Music]
Info
Channel: InfoQ
Views: 97,887
Rating: 4.9505663 out of 5
Keywords: martin odersky, scala, style, scaladays, functional, programming, imperative, sf scala, marakana, creator of scala
Id: kkTFx3-duc8
Length: 74min 54sec (4494 seconds)
Published: Tue Jul 02 2013