KotlinConf 2017 - The Cost of Kotlin Language Features by Duncan McGregor

Captions
[Music] Anyone else hate USB-C? So, hello - I'm Duncan McGregor, a freelance developer based in London, or London, England, as it's known over here. I can't find the mouse pointer - hold on, hold on, it's on the wrong side of the screen. There it is. I started programming when I was 14, which was young in those days but is now apparently the average age of a JavaScript developer. That was in 1981. Then some time passed, and we discovered Kotlin - that was two years ago, while I was working at Springer Nature, the academic publishers. For a while we had to ignore an edict that we could use any statically typed JVM language as long as it wasn't Kotlin, but that was reversed, and Springer Nature are now enthusiastic converts to the joys of Kotlin and regular hosts of the London Kotlin meetup - and I see that some of my homies are in the house.

As you can see from my visual aid, this session is on the cost of Kotlin language features. I assume you're here because you care about cost - or, I suppose, couldn't care less about Kotlin code generation. Let me give you some hard-won advice: don't care. Everything will be mostly fine if you don't care; it's almost always okay, and when it isn't, you'll be able to fix it. If you do care, you'll end up in a maze of twisty-turny passages. There will always be another little thing to investigate, another result that you don't quite understand, another technology that someone suggests you should be an expert in. I wasn't grey before I started looking at this topic; my eyes worked unaided - which is a point, I don't have my glasses on - and I didn't have this bald patch. In fact, if I were you, I would leave now. The other sessions look good - I hear Jake Wharton is an industry thought leader. It isn't too late. If you stay, there will be bytecode. All right, you can't say you weren't warned. Block the doors - we're in this together now. And by the way, you'll notice that isn't my picture, but the placeholder looks so much more
approachable and confident than I am that I thought I'd leave him in.

Okay, so why do I care about the cost of Kotlin language features? Well, I'm a physicist by training, and that training develops an inquiring mind and a sense of beauty. I fell in love with the way that software allows me to express my creativity with almost instant gratification - and earn big money from blue-chip companies. So most days I spend my time writing exquisitely crafted code to do pretty arbitrary things. Now, if you're anything like me, this code offends you: not because it isn't documented, not because it appears to be just making sure that a string is uppercase and ends with a single suffix of "banana", not because it isn't indented quite right, but because the function isn't a single expression. If the aim isn't to make everything a single-expression function, I don't want to play the game. And that local variable, the upper-cased one, is the thing standing between us and a single expression - and let is our weapon in the fight against local variables.

As I say: exquisitely crafted code to do pretty arbitrary things. But in the back of my mind there's this nagging doubt - am I setting myself up for a performance hit, with the procedural programmers laughing behind my back, along with the functional programmers who are laughing in my face? Is let free enough to scatter all over the code, or do I have to save it for best, for situations where making something a single expression actually improves it? So the primary aim of this exercise is to put my mind at rest by answering the question: are there language features that we should habitually avoid? As I said in the introduction, the short answer is no, because you shouldn't care. But I can also report - as your carer, as in, I care so you don't have to - that I didn't find any features that I would habitually avoid. But you can't get out now, because we locked the doors. Of course, this only applies to the places I looked. I didn't look at every language feature, and I only looked at
Java on one JVM, one particular Kotlin version, on one particular OS - and there are lies, damned lies, and statistics about micro-benchmarks. So the secondary goal is to equip you to investigate the costs of features in your own environment - say, coroutines on Android on Kotlin 1.2 - should you care, which you shouldn't. As they say: give a man a fish and you feed him for a day; teach him to try to investigate performance on the JVM and you have a client for a lifetime.

So, there are a lot of Kotlin language features, and, full disclosure, I've somewhat dubiously included the standard library in the definition. How did I go about choosing these ones? Well, frankly, bearing in mind that the primary objective of this was my peace of mind, they were the things that I wondered about. A better post-hoc rationalisation is that I assumed that a Kotlin translation of imperative Java code would compile and run in the same way, so I looked for those areas where Kotlin, or I, do things differently from Java. If you want to follow along, by the way, get your laptops out: the project, costings, is on GitHub - the code behind these slides is all there, along with the data that I've collected so far.

So, diving in: let. We're going to look at let first because it's simple, and we can use it to illustrate predicting performance by looking at generated code rather than benchmarking. You might ask why we'd want to guess rather than benchmark - well, we'll see later on that running benchmarks is harder than you'd think, so often we should just go for the quick win. If we're not going to be overwhelmed by bytecode, we need to keep things simple, so let's start with the simplest function we can think of: it takes a parameter, assigns it to a local variable, and then returns that local variable - you'll see when we get on to benchmarks why we're using this form. I'm going to compare that to a version which basically just uses let to push an expression into a
block.

So much for thinking about let - I promised there would be bytecode, and now is the time. The bytecode generated for the local-variable version is this, and if you're not used to reading bytecode - and I was one of you innocents before I set off on this journey - the key thing to bear in mind is that the JVM is stack-based, so the parameters for every operation and the results returned are passed on a stack rather than in registers. The code here is, first of all, taking the value of local variable 1, which is the state parameter, and putting it on the stack; it's then taking that value off the stack and assigning it to local variable 2, which is v in this case; then it's taking the value of local variable 2, which is v, and putting it on the stack; and then it's issuing a return, which takes a value off the stack and returns it. You'll get to love this stuff.

Let's have a look at the let version. Well, there's obviously more of it - more than fits on the slide - so what we're going to do is compare the two side by side. If you look at this with the help of the colour coding, you'll see that they effectively do the same thing: there's the loading of parameter 1 at the top, in buff, and there's the returning of the contents of a local variable down at the bottom, in green. And there's this bit in the middle which, if you remember your mnemonics, is effectively taking a local variable and assigning it to another local variable - which is ironic, because we removed a local variable in the source. Where does it come from? Well, I think it's the parameter of the let block, which has been expanded inline, and it's kind of needed there to keep the line numbers and debugging happy, because you expect to be able to debug into a let statement and have breakpoints set at those points. You can even see this NOP, for no-op, instruction that's sitting there just in order to hang something on a line number.
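The two functions being compared can be sketched like this (a minimal reconstruction - the exact names in the costings repo may differ):

```kotlin
// Baseline: assign the parameter to a local variable, then return it.
fun baseline(state: String): String {
    val v = state
    return v
}

// The let version: the same work as a single expression, with let
// standing in for the named local variable in the source.
fun withLet(state: String): String = state.let { it }
```

Both versions load the parameter, shuffle it through a local slot, and return it; the extra store/load and NOP that let introduces are exactly the instructions HotSpot optimises away.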
And there's another view of this code that even has go-tos in it, where they just go to the next statement, so this is by no means an optimised version. So the let version has the same code as the local-variable version, with some more code in the middle, and it's rare that adding code to a code path speeds it up - so we might expect the let to be slower than the imperative version. And here is a typical benchmark result for these functions - we'll look in a minute at how to generate them - but they appear to be the same to the naked eye, even if I get up close. In fact I've got 5,000 seconds of running for each one of these benchmarks, and the stats say that the baseline is statistically detectably faster, by less than 0.1%. How come? Well, the benchmarks are only measuring what HotSpot has had time to work its magic on, and so all those loads and stores and effectively nugatory instructions have been optimised into the same running code by the time we actually get to measure it. In my experience this is pretty common, and it's a thing that will come up again in this session.

Okay, so that was nice: we don't have to worry about let. What about that other key Kotlin feature, null safety? But first - no sooner are we done with bytecode - it's time to talk about micro-benchmarks. Now, you may have used a profiler to examine the performance of your code in situ; they typically work by sampling what method is being executed at any one time, then using that to develop a picture of which ones are taking a lot of time. You might then have tried to say, well, this bit of code seems to be slow, let's try to measure this one thing and run it on its own - and found that you're really not sure whether the code is executing at all. It's really hard to time things on the JVM, and you'd have this problem in spades. So they developed the Java Microbenchmark Harness, JMH, to address this issue. Here's a benchmark written for JMH. JMH is kind of like JUnit.
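A minimal JMH benchmark in the shape described - a state object passed in, a result returned - might look like this. The class and field names here are mine, the annotations are JMH's real ones, and the sketch needs the org.openjdk.jmh dependency (and JMH's code generation) to actually run:

```kotlin
import org.openjdk.jmh.annotations.Benchmark
import org.openjdk.jmh.annotations.Scope
import org.openjdk.jmh.annotations.State

// The state object: JMH constructs it and passes it in, so HotSpot
// can't constant-fold the input away.
@State(Scope.Benchmark)
open class BenchState {
    var greeting = "hello"
}

// Classes must be open, because JMH subclasses them to augment them.
open class LetBenchmark {
    // Returning the result stops HotSpot eliminating the call as dead code.
    @Benchmark
    fun baseline(state: BenchState): String = state.greeting
}
```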
It goes through a bunch of classes, finds the things annotated - in this case with @Benchmark - and runs them. What it does is take that benchmark method and run it as many times as it can in a second, and it uses the number of iterations that it managed in that second as a measure of throughput. Then it does that again for a number of seconds - you can set it; I typically use 500 to get more statistically accurate results - and generally only after, say, 20 warm-up iterations, to allow HotSpot to really get its teeth into the code. Now, in order to make sure that your code is actually run, benchmarks should operate on some sort of state object that's passed in to the benchmark, and they should return results from the benchmark. If you do those two things, then it's JMH's job to make sure that this code is actually invoked, so that HotSpot can't cheat and say, well, nobody used the result, so I don't have to do anything.

Here's the equivalent Kotlin code, and you'll see these opens - we had to make the class open, and that kind of reveals that JMH actually subclasses your classes to augment them before it runs them. And here are some results of these two benchmarks: the top one is the Java and the bottom one is the Kotlin. You can see, first of all, that the performance is in operations per second - we're running about 20 million invocations of this benchmark method a second, basically. There are little error bars on here, about one red pixel wide, which says I've done a good job of measuring; for the statisticians - should there be any who haven't cut their own throats by now - those are the 99.9 percent confidence intervals, which means that we're 99.9 percent confident that the measurements are inside those two red bars that you can't see because they're so close together. But the important thing is that we can see that the Java code is apparently faster than the Kotlin. It may be obvious by eye that the Java is quicker here, but I wanted a way to check that
whatever claims I made here could be verified, either as I got more data or in different environments. So I wrote some code that would combine the results of many benchmark runs and then run JUnit tests to verify assertions about those results. Here's one such test: it says the Java baseline is faster than the Kotlin by between 3 and 4 percent. Again, for the statisticians, I think I did a two-tailed test at 95% confidence, which as I understand it means that with 95% confidence we can say the Java is more than 3% faster, but with the same chance of error we can't say that it is 4% faster - which isn't the same as actually showing that it isn't 4% faster, but it's almost the same.

Why is the Java quicker? I suppose it's possible the title of this slide gives it away. Here's the bytecode for the Java, and now that we're all experts you can see that it is taking the first parameter, putting it on the stack, and then returning the value on the stack - the same pattern we saw in the let example, but without the additional local variable. The Kotlin? Well, that still has that code at the bottom, but you can see these extra instructions: because we marked that state parameter without a question mark on the end, it's a non-nullable parameter, so Kotlin has to check it. If you explicitly check with the bang-bang operator or whatever in Kotlin, then you have a line number to pin the check on; but in this case the check is buried by the compiler, so it actually has to tell you which parameter was wrong - and that's our load-constant of "state" there, the name of the parameter. Now, this tax of three to four percent is levied on every non-nullable parameter of public Kotlin functions. You can switch it off with a compiler option, and it isn't present on private methods, as those can only be called by Kotlin, which will know whether it's safe. Three or four percent may sound high, but this won't make all of your code three or four percent slower - that would depend on your code.
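To make the mechanism concrete, here is a sketch of the two cases (hypothetical functions, not from the talk). The exact intrinsic called varies by Kotlin version (checkParameterIsNotNull in older releases, checkNotNullParameter in newer ones), and the checks can be switched off with the -Xno-param-assertions compiler flag:

```kotlin
// Public function, non-nullable parameter: the compiler emits a null
// check before the body, loading the constant "name" so that the
// exception can report which parameter was null.
fun publicGreet(name: String): String = "hello, $name"

// Private function: callable only from Kotlin, which already knows the
// argument can't be null, so no check is emitted.
private fun privateGreet(name: String): String = "hello, $name"
```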
The number of parameters you have, the number of public methods you have, and what else you're doing in the method all matter. And it's worth saying that I really haven't found any good way of accounting for this in JMH benchmarks, so if you compare Java and Kotlin performance you've always got this in the back of your mind as part of the effect.

So, with that warning ringing in our ears, we're going to go on to compare Java and Kotlin performance in the area of string interpolation. Here we can actually see what a state object looks like: again, the annotations tell JMH that it's a special thing, and in this case we've got two public fields, a greeting and a subject, and we're going to combine them in the canonical fashion. We do it in both languages, with Java at the top and Kotlin at the bottom. I don't know what happened to the labels on there, but it does look like Kotlin is significantly slower than Java, and the stats agree with us - in this case it's between five and eight percent. In fact this is an area where I think Kotlin could do better. I can see by the look in your eyes that you've had enough of bytecode for now, so instead of showing bytecode, this is the Java code equivalent to the bytecode that's generated for that statement. The problem, I think, is that append of the empty string on the first line there - you can see how that would come about if you were to parse the template string and work out how you're going to build a StringBuilder from it, but it turns out to be quite harmful, certainly in short cases like this. Which is ironic, because Kotlin does manage to append a single space character rather than a string in the middle, which Java doesn't. So you might say, well, could Kotlin do better if it didn't do that? And if you were to run this code you'd see that it actually can - it would actually run quicker than the Java. So come on, Kotlin compiler writers! So, should you avoid string interpolation in Kotlin? Well, no, no.
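Roughly, the variants in play look like this (my reconstruction from the description above, for the Kotlin version under test - the actual generated code may differ in detail):

```kotlin
// The interpolated source...
fun interpolated(greeting: String, subject: String) = "$greeting $subject"

// ...compiles to something like this: note the append of a leading
// empty string, which is the harmful part in short cases, though the
// single space is at least appended as a char rather than a String.
fun asCompiled(greeting: String, subject: String): String =
    StringBuilder().append("").append(greeting).append(' ')
        .append(subject).toString()

// Hand-coded concatenation, with no empty-string append.
fun handCoded(greeting: String, subject: String): String =
    greeting + " " + subject
```

All three return the same result; only the generated code, and hence the throughput, differs.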
But if you do have issues with performance, this would be an area where a little bit of hard-coding might be worth your time. One more thing to mention is that the compiler is pretty smart when it comes to optimising constant string interpolation. The top version is a constant, so it doesn't bother to build a StringBuilder - it just loads it from the constant pool. The second version is optimised too; the third version isn't, which surprises me a bit, because everything is private and immutable at that point, but I'm learning that vals are treated quite conservatively by the compiler in cases like this.

Okay, we're going to take a little break from code and talk about running benchmarks. Firstly, JMH is a little intimidating. You can't just drop it into your project, as it uses code generation, which means your benchmarks need their own Maven or Gradle build file - they need a separate project - and that makes benchmarking the code in your current project more complicated. And when you start running benchmarks on your developer machine, you're almost certain to see quite large error bars - we have one here - showing that the measured throughput is varying between runs, and this measurement error can make it quite hard to distinguish between an actual effect and statistical aberration. These are some earlier results, and you can see that it's hard, given the error bars, to say whether the top one is actually faster than the bottom one, because the bottom one could be faster or slower than the top. And these are throughput figures per iteration for a typical JMH benchmark run on my Mac: you can see wide variations in the run time - there are little dropouts, lots of spiky stuff, and then this huge hole here, and goodness knows what's going on with that red line over there. This is largely due to JMH having to share your processor with all the other things you know the operating system is busy doing for your happy convenience, and you can try
to stop using the machine while you're running the benchmarks, but they take minutes or hours to run a batch, and even then you discover your machine is helpfully picking up email or searching for new printers on the network. Google's software checks for updates every 3,623 seconds, which is both prime and a prime number of seconds more than an hour - they must be pretty pleased with that. The list goes on. My first attempt to deal with this was with statistics: to do more benchmark runs, to pull in more and more data, to try to combine the results, and then to find ways of discarding the clearly duff data - but that got complicated. I did consider punting the code to EC2 to run in different instances - I should say that other cloud computers are available - but then I realised that whilst it would remove the effect of my Mac's background tasks, it would add the effect of all the other VMs hosted on the same machine, so that wasn't a goer. So I've taken to running the benchmarks on a Raspberry Pi Model 3 B, booted to a plain bash prompt. The Pi is set to sync the results back to my Mac for analysis between runs, so we don't even have network traffic during the benchmarks, and it means I can run benchmarks in a clean environment while developing and analysing the results I've got - or maybe even earning money. If you're sharp-eyed, you'll have noticed that the blue line here is at maybe 1.3 times 10 to the 7, where we were hovering around 1.7 times 10 to the 8 on the previous slide, showing that the code appears to be only 13 times faster on my £2,400 MacBook Pro than on my £32 Raspberry Pi - although, to be fair, I did have to buy a case and a power supply for the Pi.

Okay, tea break over, back on your heads. Properties are a feature of Kotlin that isn't supported natively by Java. In Java you can either access a field directly, if you have a public field, or you can write a handwritten getter function. So we're going to
have those two things in a state object, and we're going to benchmark, effectively, accessing the field and accessing the getter. In Kotlin, well, we can't give direct access to a field, but in return Kotlin writes a method for us and allows us to pretend that we're accessing a field - that's our field-backed property here - and we can call a method as if we were accessing a field - that's computed properties, the thing I'm calling a method property there.

Let's see the benchmark results. Well, against my expectations, Java direct field access is no faster than the getter, which is surprising, because the getter is a method call on top of a field access - so again HotSpot must be doing a pretty good job of optimising that path. Evaluating a Kotlin property that returns a constant value is the same speed as well - that's the bottom one - which leaves the standard Kotlin field-backed property, which is faster than the others by about 1.5 percent, and I don't know how; nobody seems to know how. I won't subject you to more bytecode, but the field-backed property and the Java getter appeared to be running the same code to me. I had thought that it was a combination of a final method accessing a final field allowing HotSpot to gain a little extra traction, so I did a whole bunch of different variations; but the Kotlin immutable property and the Kotlin field-backed property are both quicker, and a final Java getter on a final Java property is, I think, the same speed as all the others. The real loser here is Java accessing a final field, where it turns out the JVM spec mandates that primitive or String fields whose values are known at compile time are inlined, even for non-static fields - which I didn't know - and that turns out, at least on my test system, to be slower than fetching the field. So that was a poor decision at some point, and the takeaway is: don't fear Kotlin properties. That was relatively painless - well, it was for me, anyway - so let's see if we can keep up the momentum.
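The access styles being benchmarked can be sketched like this (class and member names are mine):

```kotlin
// Java-style access, simulated in Kotlin: a directly accessible field
// (via @JvmField, which suppresses the generated accessor) and a
// handwritten getter.
class JavaStyle {
    @JvmField var field = "hello"
    fun getter() = field
}

class KotlinStyle {
    // Field-backed property: the compiler writes the accessor, and
    // callers pretend they're accessing a field.
    val fieldProperty = "hello"

    // Computed ("method") property: a getter with no backing field,
    // evaluated on each access.
    val methodProperty get() = "hello"
}
```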
First-class functions. By first-class functions I'm referring to Kotlin's ability to treat functions like any other value: to pass them in and out of functions, assign them to parameters and variables, and ultimately invoke them. So we need a function - in this case identity, a single-parameter function that returns its argument unchanged - and we're going to pass it to an inline function, here invokeWith, which invokes another parameter with its first argument, which made sense when I wrote it. We have benchmarks: the baseline is just calling the function, and the two variations are passing in a lambda that invokes identity, and passing in a function reference. If you take a moment to predict which is quickest, I'll set up a straw poll and we can vote. It was a trick question: they all essentially compile to the same code, bar the little local-variable and line-number juggling we saw in the let bytecode, and we know that HotSpot's pretty good at optimising that away.

There's a caveat, though. It's quite easy to take this code and extract our reference to identity into a variable, which looks innocent enough, and in fact it does slow things down - it's the bottom one there - but you'd be lucky to detect the slowdown on anything but a dedicated benchmark machine. Why the slowdown? Well, you can see here, in yellow, that we've ended up invoking the function through its interface rather than just inlining a call to it - and actually I find it amazing that that is as quick as it is, so tribute to the work put into the JVM. Unfortunately, there is a situation where this is much more detectable. On a whim I tried the same thing with Ints rather than Strings, and to my surprise it actually runs quite a lot quicker with Ints - but if you look at the throughput where we use a value of function type, it has dropped to less than half, which is awful. The problem here is that Kotlin doesn't have function types that take primitives, so in order to pass parameters and
return values from the function invoke, it needs to box up that Int into an object, and that can be quite a cost on the JVM - especially in places like this, where I assume escape analysis will show that it doesn't need to be put on the heap. Here the effect is large because I'm not actually doing anything else, so I wouldn't expect your application code to suffer such a drastic slowdown as this, but it's a good example of how small changes can have unexpectedly large results.

Okay, the next feature is mapping, and by mapping I mean the functional operation map, which applies a function to every item in a collection and then returns the result as a list. As with the first-class functions example, we're more interested in the overhead than in the function, so here it's just effectively copying the collection, and our baseline is going to be the old imperative way of doing things - because that's old-fashioned these days, isn't it, but you know, we all wrote code like that one time. And just so you know, we're running with 10,000 elements in an ArrayList of strings. Drumroll... you know, actually, that's quite disappointing for us brave new functional pioneers. We can take heart that if we're doing anything more complicated in our map function, then the cost of the iteration will be relatively less of the total, but even so we should understand what's going on here. Why is that slow? Well, if you drill down through map, eventually it ends up here, which is a for loop over an iterator, and iterators are quite slow because they check for modification of the collection every time they access an element - and a concurrent modification, where the list is immutable, really shouldn't happen. So we might try to write our own version of map this way, which would be nice. We can benchmark that, and it turns out to be pleasingly fast - almost as quick as our baseline - but it falls through the floor if you pass it a LinkedList.
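The hand-rolled map might look something like this (my sketch of the approach, not necessarily the code in the costings repo):

```kotlin
// Indexed access instead of an iterator: no per-element modification
// checks, so it's almost as quick as the imperative baseline on an
// ArrayList - but get(i) on a LinkedList walks from the head each
// time, which is why it falls through the floor there.
inline fun <T, R> List<T>.indexedMap(transform: (T) -> R): List<R> {
    val result = ArrayList<R>(size)
    for (i in 0 until size) {
        result.add(transform(this[i]))
    }
    return result
}
```

A check for the RandomAccess marker interface could dispatch between this and an iterator-based version, which is the fix for linked lists described next.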
Because it now has to walk the list for every get. So you can fix things for a LinkedList by providing different implementations for plain lists and random-access lists, but what we'd really like is a way of passing a function to a collection and having it invoked on every element in the most efficient way possible, at the sort of JVM level - internal rather than external iteration; Scala's Traversable versus Iterable, sort of thing. And such functions actually exist in JDK 8. I tried a few, and spliterator().forEachRemaining turns out to be even quicker than accessing our ArrayList with an index, and it's also good on linked lists. Now, the downside to this is that the transform can't be inlined - that crossinline modifier is telling Kotlin that we're prepared to accept the consequences of that, which are that we have to generate a class to hold the transform and we can't issue a non-local return. But what a non-local return from a map would mean, I don't know. To be honest, I hesitated to present this, because I haven't tried it in anger - I've only benchmarked it - but it may be a good tool to pull out of your box if you want the expressiveness of the functional operators with the speed of a good old-fashioned for loop. And finally on this topic, it's worth noting that all is not rosy in the garden of streams - which is pretty Zen, really, isn't it? The bottom bar is Java streams mapping identity over the same collection, which is slower than even our slow conversion by quite a considerable chunk.

Right, we are running out of time, because we started late, which is bad, so I'm going to skip default collections - you can speak to me later - and think about some other costs. Speed, or lack of it, isn't the only way that a language feature can cost; there are other dimensions. Compilation speed is a potentially significant cost. Those of us who lived through Scala seven or eight years ago may think you ain't seen nothing, but Kotlin, even with its incremental compilation, feels
slower to compile than Java. Actually, for me it doesn't, as I haven't written Java since being forced to buy a new MacBook to reduce compilation time to something acceptable - so for me the cost is that I just don't get the nine-hour battery life that I used to get out of my lovely MacBook Air. I haven't looked at the compilation speed of different features, but that might be an interesting experiment for somebody who would like to come to KotlinConf 2018. Code size is another factor. The Java class format is optimised for the needs of the Java language, not Kotlin, so the language designers squirrel away this sort of stuff - which looks, I think, worse than it is because of all that Unicode escaping - but the metadata is quite a significant part of at least small class files, especially if you favour small methods. I suspect, but I don't know, that the metadata is used in compilation and not really used at runtime; at the very least it's going to be pulled through file systems, it's going to end up in your jar files, and so on. These factors are probably not going to bother server-side developers, where the code is going to be sat in the same process for hours or weeks, but on small clients they may be an issue - and of course this cost in particular isn't ascribed to any particular language feature, but rather to the feature of using Kotlin rather than Java.

Okay, so we're on to takeaways - makes me hungry every time I read that slide. If you were paying attention in the introduction, you already know the conclusion: everything I examined is reassuringly okay. I didn't find any Kotlin language features that I would habitually avoid, even if I cared, which I'm gradually training myself not to. I ended up with a great deal of respect for the language team and the library team; as far as I can see, they've done a bang-up job. There are compromises, of course, and where it has to choose, Kotlin seems to favour safety and
predictability over performance - its use of fail-fast iterators, and (the one we didn't see) its returning linked sets and maps rather than HashSet and HashMap, being cases in point. But inlining and HotSpot mean that often it doesn't have to choose: it can be safe, predictable and performant. Inlining also turns out to reduce the effects of boxing and unboxing primitives, because where code is inlined the compiler can know that it's dealing with a primitive and not box it - but this can still catch you out, as we saw, and if you run into performance issues, this is a good rock to look under. Also note that, at least with HotSpot, your code may not be performant to start with, which can be an issue in little utilities, or if you're running in serverless environments where whole JVMs are born and die before they've had the chance to run into the sunny uplands. And I haven't looked at the situations without HotSpot: Android, JavaScript and LLVM. If you understand the language designers' trade-offs, then you can often improve performance by giving up some safety or predictability, and extension functions are lovely for allowing us to develop our own specialised versions of operations where we have different trade-offs from the ones the language designers picked. And I hope that I've equipped you with ways, or at least a mindset, to investigate performance in your own case, and with a benchmark harness to check that you're making improvements - should you care about costs, which you shouldn't.

So, thank you very much. I'd like to thank John Nolan, who did all the stats for this stuff, and I'd like to thank you for still being here. There are a couple of blog posts, ironically published at the point where this session was accepted, which was brilliant for me; you could look at Java bytecode and the benchmark harness; and this presentation is on Google Docs - there will be a short quiz. I'm Duncan McGregor, thank you very much. Do we have any questions? Oh yeah, I've got three minutes. Yes - sorry, no,
you see, there was so much stuff I meant to say. By the way, any questions that don't involve the sad phrase "coroutines"? But no - I mean, yes, there's so much stuff, sorry. Hello? [Question inaudible] So, a little bit - and unfortunately the slides we had to skip, on the LinkedHashSets and so on, are places where there's actually quite a considerable memory overhead, because you end up with linked lists of things. So a little bit, but not a great deal. Thank you very much. [Applause]
Info
Channel: JetBrainsTV
Views: 7,930
Rating: 4.7090907 out of 5
Keywords: KotlinConf
Id: ExkNNsDn6Vg
Length: 38min 51sec (2331 seconds)
Published: Wed Nov 15 2017