Effective Java - Still Effective After All These Years

Captions
Well, I put in my clicker. I decided, for no terribly good reason, to embarrass my son, who came with me tonight. This is Tim Bloch, and he wrote the front end of this Java application, which says that our house is currently consuming 430 watts, which is actually pretty good. If any of you have tried to reduce the energy consumption in your house, you know that 430 watts is pretty hard to achieve. We actually cut our electricity usage in half after writing this application. The hardware that we're using is called TED, The Energy Detective, from a company called Energy, Inc., so check it out.

But that's not what I'm here to talk about tonight. Tonight I have a feast of Java for you. At least I think I do. If we're real lucky, I'll even be able to use my clicker. All right, can everybody hear me, like in the back? Good. And my clicker even works.

As promised, I'm going to start with some appetizers, in fact two kinds of appetizers: visual things that don't do what they look like they do, and then code that doesn't do what it looks like it does. Let's start with the eye candy. Here we have the magic number from the Java class file format, CAFEBABE, and as you can see, the letters are tilted, right? The C is tilted this way, the A this way, the F this way, and so forth. Now I'm going to color the letters in with some light yellow ink, but I'm going to do it really gradually so you can see that I'm not moving them or anything. OK, now are they tilted? Let's remove the ink. There you go.

Or what about this? I have some dots for you on a checkerboard. How many colors of dots do you see? Shout it out. I'm not saying how many do you think there are; how many do you see? Three. Most people see three: light gray, mid gray, and dark gray. So what if I tell you they're all the same? And how do I prove it? I'm just going to dissolve away the checkerboard, leaving only the
dots: all the same. Bring back the checkerboard and they look different again. Now I'm going to tell you how it works. There are gradients in the squares. If you look at this square, it goes from dark to light, dark to light, dark to light, and dark to light; it's surrounded by a dark-to-light gradient, so it looks dark. And what about these light ones? They're surrounded by light-to-dark gradients. And the mid-gray ones, they're along the diagonals, where the colors of the squares don't change, so that's the actual color you see. And now that you know how it works, of course, you don't see it anymore, right? They all look the same color? No, that's not the way your brain works. But luckily, with the code, once you know how the trick works, you can actually avoid the problem in the future.

So let's do a couple of code puzzles. How many people here have done puzzlers before? All right. For those of you who haven't done them before, here's how it works. I'm going to show you a small program that fits on a slide, and then I'm going to walk you through what it appears to do, and then you will tell me what it actually does by show of hands. It will be multiple choice, four choices, and when you've voted I'll tell you what it really does. But it's not all fun and games: each of these has a moral associated with it. Usually it illustrates some sort of coding trap, and by telling you what's going on, you can avoid that trap in the future. And this isn't like a presidential primary: you do actually have to vote. I'm not going to go on until you vote. In fact, we have prizes. It's going to be a little bit tricky because there are only three puzzles, but keep track of how many you got right, on the honor system, and I'll flip coins to see who gets the prizes.

So without further ado, the first puzzle. This one is called "Life's Persistent Questions," and in this case the question is simple: yes or no? We have a static boolean method here called yesOrNo that takes
a String s and returns true or false depending on whether the string represents a yes answer or a no answer. You might write this for some sort of command-line application where the guy types in yes or no, or true or false, and it tries to be kind of flexible. So what does it do? It takes its input string and first translates it to lowercase, so whatever case it starts in, now it's all lowercase. Then, if the string equals yes or y or t, it replaces it with true. And finally we call Boolean.getBoolean to translate that string to a boolean, and we return that boolean.

So my question to you is: what does this program print? The program is very simple. It has only a main function, which calls println of yesOrNo on "true", followed by a space, followed by yesOrNo on capital Y, lowercase e, capital S. So basically we're parsing two strings with this method. By the way, you're allowed to talk amongst yourselves; you're not allowed to type it into Eclipse or anything like that. And then you'll tell me: does it print false false, true false, true true, or none of the above? I'll give you a moment to think about it. Don't yell it out; we'll take our votes in a moment. And by the way, none of the above could be, I suppose, false true; it could be throws an exception; it could be varies from run to run; something like that.

Ah, yes. In the past I've given some puzzlers that don't compile, and I found that the audience gets very, very angry. In fact, people stopped throwing overripe fruit and started throwing underripe fruit, and it was beginning to hurt. So from now on I promise, space cadet's honor, that all future puzzlers, including the three that I'm presenting this evening, will compile. Oh, I should say that some of them may call methods from java.util. They all begin with import java.util.*, but it's using my special white-on-white font, so you just can't see it. And when you're running it, it's just java SimpleQuestion. In each case you just up and run it, no command-line arguments.

All right, nobody's chattering and everybody's kind of nodding, so I guess you all got this one. How many people say it's choice A, false false? A smattering: two, three, maybe four. And by the way, raise your hand up high so I can count you. Have the courage of your convictions. I used to sing in choir when I was in college, and my choir director gave me very good advice. He basically said, if you're not sure you know the right note, sing loud. And to this day I follow that advice. So sing loud. How many people for choice A? Five of you. How many people for choice B, true false? One, tentative. How many people for choice C, true true? Half of you. So that means D must be approximately half of you, but raise your hands anyway. Yeah, actually D wins.

All right, so what actually happens when you run this? As a practical matter, it prints false false. And why does it print false false? Well, the Boolean.getBoolean method does not do what you think it does. Let's see what it actually does. Here's what it says: returns true if and only if the system property named by the argument exists and is equal to the string "true". So unless you can prove to me that your laptop right now has a system property on it called, whatever, "yes", whose value is in fact "true", or something like that, the thing is just going to print false, because there is no such named system property. It's a darn shame, isn't it?

So let's take another look at it. Indeed, we are calling that horrid method. So how do you fix it? Well, you could fix it like this: just call the method you meant to call, Boolean.parseBoolean, you know, like Integer.parseInt. But I don't think this is actually a really good way to fix this program, and the reason is this: you're doing an awful lot of machination just to make the string work for this method. Why use this method? This isn't really what you want at all. And in fact we're doing toLowerCase
here and then we're doing a case-independent comparison in here, so we're kind of doing the work twice, and it's just messier than it needs to be. So let's just do this: if we simply go to lowercase and then compare it with all of the strings that we're interested in, then it clearly works. And that's what we're after. We want not something that merely works, but something that clearly works, right?

OK, so the moral. Strange and terrible methods lurk in the libraries. There are some things in there that you don't know about, and you're better off not knowing about. I don't know why we didn't deprecate this one years ago. So anyone who has the power to do that, you OpenJDK committers out there, deprecate those methods. They're terrible. And by the way, there's one for every type, and they're in the types themselves, so there's Integer.getInteger just like there's Integer.parseInt. It's just awful.

If your code misbehaves, then make sure you're calling the library methods you think you are. Read the documentation. Click on it, hover over it. You're using these really nice development environments, so take advantage of them.

And this is really for API designers; I should have bold-faced this line. Don't violate the principle of least astonishment. The principle of least astonishment says that every method should do the least astonishing thing given its name and its arguments. It should not astonish its callers. In this case, I think it probably astonished most of you. It certainly astonished me the first time I called it accidentally.

And don't violate the abstraction hierarchy. Good software is hierarchical. You write low-level... I'll try again. You write low-level things, and on top of those, higher-level things, and so forth. So in this case you have the class Integer. That's really low-level; it's just a wrapper for an int. What's it doing calling out to Properties? Properties is high-level, so Properties should depend on
Integer, not the other way around, or Boolean, or whatever. And by the way, Properties is pretty well broken otherwise.

And finally, don't use similar names for wildly different behaviors. You've got Boolean.getBoolean and Boolean.parseBoolean, and one of them merely looks at the string and does some local computations, and the other one goes out to system properties. That's not a wise API design decision.

All right, I have one more puzzle before we get on to the main part of tonight's festivities. This one is called "Searching for the One," and aren't we all searching for the one? Anyway... I'm sorry, I don't even know why I said that. In this program we have an array of strings, 0 1 2 3 4 5, and we translate the string array into a list of Integers, and then we use a binary search to search for the one inside the list of Integers. See, searching for the one; it's kind of a pun, I guess. And we do it with a comparator, and here's the comparator.

So let's go over the code in detail, shall we? We've got our main method. We have our string array, and it contains the strings 0 1 2 3 4 5. Then we have a list of Integers, which we initialize to a new ArrayList of Integer, empty. We iterate over all the strings, and notice we're using that cool for-each. I like for-each a lot. I'm not biased in this matter, by the way; I just like it. And then we add to integers Integer.valueOf(s). Notice I'm not doing the same thing to you twice: I actually use Integer.valueOf, and I promise once again, space cadet's honor, that this actually does translate to an integer. So this becomes 0, this becomes 1, and so forth, and then we store those Integers into the list one at a time, so you get 0 1 2 3 4 5. And then finally we print out the result of calling Collections.binarySearch on this list for the integer one, and of course we're autoboxing that up to the capital-I Integer one, using our comparator here. And let's take
a look at the comparator. The comparator is quite straightforward. Its compare method, I should say, takes two Integers, and as you probably know, the definition of a comparator is that it should return a negative number, zero, or a positive number as its first argument is less than, equal to, or greater than its second one. So let's look at the definition carefully. If the first argument is less than the second one, we return negative one. Otherwise, if the first argument is equal to the second, we return zero. Otherwise we know the first is greater than the second, so we return positive one. Make sense?

So my question to you, then, is: what does this print? Oh, and I should tell you, just in case you don't know, the binarySearch method returns the index at which it finds the thing. So if it found one here, it would print out five. I've conveniently made it into an identity array, let's say. And if it does not find it, it returns, roughly speaking, negative the position at which you would put it. Actually, rather than negative, it's the bitwise complement, simply because negative zero is zero, and so if it's at position zero that it would go, it actually returns Integer.MIN_VALUE, for what it's worth. No, that's a lie: at position zero it returns negative one, because the bitwise complement of zero is negative one. Anyway, take a look at it, and then we'll go through your choices, which are 0, 1, minus 2, and none of the above. We'll have a moment.

All right, it's time to vote. How many people say choice A, that this will return and then print out 0? How many people for choice A? That's my answer and I'm going to stick with it. So we have like three, including me. Oh ye of little faith. How many people say choice B, position 1? That looks like a quarter of you for choice B. How many people say choice C, minus two? A scant quarter of you for negative two. And how many people say choice D, none of the above? Most of you. All right, so I'm going to say choice D wins, then.
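For readers following along without the slide, the program just described can be reconstructed roughly as follows. This is only a sketch pieced together from the spoken description: the slide itself is not part of the captions, so the class name SearchingForTheOne, the helper names parse and search, and the constant name AS_DESCRIBED are all assumptions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class SearchingForTheOne {
    // The comparator as described in the talk: negative one, zero, or
    // positive one. Note that i and j are Integer references, not ints.
    static final Comparator<Integer> AS_DESCRIBED = new Comparator<Integer>() {
        @Override
        public int compare(Integer i, Integer j) {
            return i < j ? -1 : (i == j ? 0 : 1);
        }
    };

    // Translates the strings to a list of Integers with Integer.valueOf,
    // one at a time, using for-each (hypothetical helper name).
    static List<Integer> parse(String... strings) {
        List<Integer> integers = new ArrayList<Integer>();
        for (String s : strings) {
            integers.add(Integer.valueOf(s));
        }
        return integers;
    }

    // Searches for the one; the int literal 1 is autoboxed to an Integer.
    static int search() {
        List<Integer> integers = parse("0", "1", "2", "3", "4", "5");
        return Collections.binarySearch(integers, 1, AS_DESCRIBED);
    }

    public static void main(String[] args) {
        System.out.println(search());
    }
}
```

The printed value is deliberately not annotated here. It depends on whether the Integer in the list and the autoboxed search key happen to be the same object, and that can differ between JDK releases.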
And in fact, it turns out that as a practical matter this program will print negative two. If you run this on any implementation of Java that I know, it'll print negative two. But in theory it's unspecified, and the reason for that is the comparator is broken. It violates its contract, and once you have an object that violates its contract, then all bets are off. So it's really an implementation detail that it prints out negative two; it could print out something else.

But let's take a look. Oh, and autoboxing is tricky, by the way. This program used autoboxing, and autoboxing can give you what we call the surprise left jab, and it does so here. So let's take a look at this. It seems obvious that this should work, right? I mean, how much clearer could this be? If i is less than j, return negative one; if i is equal to j, return zero; otherwise return positive one. What could possibly go wrong? Well, yes, I hear it from the front row: the answer is identity. The problem is that i and j are object references, and for upward compatibility we couldn't change the behavior of the double-equals operator on object references. It returns true if and only if two object references are identical; being equal isn't good enough. And in this case we know, by the specification of the libraries, that these two zeros... sorry, the two ones, the one that we pass in and the one that we get by calling Integer.valueOf, are actually different instances of the Integer one. It turns out, and this is very strange, that the specification of the method guarantees that if you call this Integer.valueOf method, it will not cache and reuse results. You're getting fresh, newly made Integers. Lord knows why, but the spec says so. So that's what's wrong with it.

And how do we fix it? Well, you could fix it in either of these two ways. What doesn't work is double-equals. So if we do the tests in the opposite order, we
say: if i is less than j, negative one; if i is greater than j, positive one; otherwise zero. Does that work? Yes, because both of these tests, these comparison operators, actually do auto-unboxing, so they work properly. Alternatively, we can replace the double-equals with a .equals, which forces a value comparison. But I don't like either of these solutions, and the reason I don't like them is that they kind of dance around the problem. They work, but they're very delicate code. Somebody could take a look at that and say, "Oh, why is he doing this check in the strange order?" and switch the order, and then it would break. Or someone could look at this and say, "Well, gee, why isn't he just using auto-unboxing all the time? It seems inconsistent that he's using it twice on the line, but here he's making an explicit call to .equals." So I think that the best solution is this: manually unbox those suckers and add a comment saying why you're doing it. "Unbox the arguments to force value comparison," perhaps even "instead of identity comparison," if the slide was a little bit bigger. So that's my preferred solution.

And what can we learn from this one? First of all, and the funny thing is I've been saying this for like five years, ever since we did autoboxing: autoboxing blurs but does not erase the distinction between Integers and ints, between boxed primitives and primitives. They're different, and the main ways in which they're different are identity and nullness. The boxed ones can be null, and the boxed ones have identity distinct from equality, whereas the primitives are pure value types. And then also know that only four of the six comparison operators actually work on boxed primitives. Less than, greater than, less than or equal, and greater than or equal work; all the ones that actually have a less-than or greater-than sign work. But double-equals and not-equals do not work on boxed primitives. And it is very, very hard to test for this, by the way. Suppose you sorted a list using
this thing. Would it work? Almost certainly yes. I can come up with strange inputs which would prevent it from working, but 99% of the time it would work. So there are plenty of comparators out there that are this broken, and you will never discover that they're broken until you're, I don't know, demonstrating it to your most important customer when the President and the Vice President are present at the mahogany table.

Also, another interesting lesson here is that even Josh and Neal Gafter make big mistakes. It turns out that that broken one, if you have a copy of the first printing of this fine book, you'll see it listed in the solutions section for an even more broken comparator: we fixed it using this "obviously correct" comparator. So Neal and I put the broken comparator into the book. Pretty bad, huh?

Anyway, on to... oh yeah, I already did this. On to the main course. The main course is a talk all about Effective Java, and what I've done tonight is pick a few interesting topics from the second edition of Effective Java, which I'm presenting. The book has a lot of stuff in addition to everything that was in the first edition. It has a whole chapter on generics, a whole chapter on enums and annotations, and one or more items on each of the other language constructs added in Java 5. The threads chapter was renamed Concurrency, in honor of the fact that we now have java.util.concurrent and that you really shouldn't be thinking in terms of threads anymore. Threads are a mere implementation detail for implementing concurrency. You should be thinking in terms of concurrency, and whenever possible you should be using higher-level concurrency abstractions of the sort provided by java.util.concurrent. All existing items have been updated to reflect current best practices, so I basically went through every word of the book. A few items were added in honor of new patterns, some of which I will tell you about tonight, and a
very few were deleted because they no longer seemed relevant. So anyway, that's the book, and of course I can't discuss all this tonight. The book is 278 pages long. The first had 57 items and the second had 78, which is kind of too bad. I always wanted to write a slim volume, and I don't think I can claim it's a slim volume anymore, but it's as slim as the language will let it be, I suppose.

So what's on the agenda for dinner, then? First I'm going to talk about generics, and this is by far the longest section of the talk. Then very short sections on enums, varargs, and concurrency. And finally a slightly longer section on a good way to do serialization that should be better known, but isn't.

So, on with generics. The first thing I'm going to tell you is perhaps the most important thing that I'm going to talk about tonight. If you come away learning only item 28, then the talk has been worth your while. You probably know that Java has what are called wildcards, to allow you to write APIs that are flexible. Why do you need them? Well, arrays in Java are covariant. So if I have a parameter of type array of Object and an array of strings, can I pass the array of strings in? Yes, I can: array of String is a subtype of array of Object. But suppose I have a List of Object as a parameter type and I want to pass in a List of String. Can I do that? No, I can't, because collection types, and generic types in general, are invariant. And there's one good thing about this, which is that it gives you better compile-time type safety. Suppose that I have this thing whose type is array of Object. I pass in an array of String, and then inside the function I store an Integer into it. What happens? Runtime error. Does anyone know what runtime error? Very close, ClassCastException; it's called ArrayStoreException. No, it's ArrayStoreException, I swear. If it's VMError, you've got a bad VM, and I can tell you where you can get a better one. But that said, basically
the arrays give you more flexibility but worse compile-time type safety. Generics give you great compile-time type safety, but unless you use those wildcard types, they don't give you the flexibility. So the wildcard types give you back the flexibility; they let you combine the flexibility with the compile-time type safety. But they're a little bit tricky to use. List<String> is a subtype of List<? extends Object>, whereas List<Object> is a subtype of List<? super String>. Pretty hard to remember, right? So I have for you a very simple mnemonic, and if you just learn this simple mnemonic, then you will never have to worry about it again. Here's the mnemonic, with a helpful visual aid. The mnemonic is PECS. It's short for "producer extends, consumer super." By the way, I trust you all recognize this man. Back then, I think he took steroids or something like that. Anyway, that is our governor, but when he was younger and buffer.

So what does this mean, producer extends, consumer super? It means that when I'm passing in a parameter from which I want to produce Ts, I want to get Ts from this thing, then I should use the type Foo<? extends T>: producer extends. Whereas if I'm passing in a parameter into which I want to put Ts, that is, the thing consumes elements of type T, then I want to use <? super T>. Producer extends, consumer super. And this only applies to input parameters. Don't try to use it on return types; I'll tell you why in a few slides.

But first let's try it out. Let's flex our PECS, shall we? Suppose I have an API that looks like this. I have a stack that has the usual stack methods, push and pop, and I want to add bulk methods: pushAll, which takes a collection of elements of type E and then pushes all of those elements onto the stack. I have Stack<E>, and I declare the method like this, with Collection<E>. Now suppose I have a stack of objects and I want to push
into it all the elements from a collection of strings. Should I be able to do that? Sure, they're all objects. But will it compile? No, it won't, because the Es don't match. I've got a Collection<String> and I'm trying to put it into a Stack<Object>. The Es don't match. Invariant typing: you lose.

So let's repeat our silly mnemonic, PECS: producer extends, consumer super. The question is this: this suggestively named src parameter, which is a source of objects of type E, is it producing objects of type E or is it consuming objects of type E? It's producing objects of type E. So our mnemonic tells us producer extends, so we should replace Collection<E> with Collection<? extends E>. Simple as that.

Now let's look at the popAll method, and that's sort of the opposite, right? In this case we have a method called popAll, which takes a collection, takes all the elements from the stack, and pops them into our collection, the suggestively named dst parameter, because we're putting things into it. Is it a producer of Es or is it consuming them? It's consuming them. So, PECS: producer extends, consumer super. We want to use Collection<? super E>.

And once we do that, what do we gain? Well, here's what we gain. Now if you have someone who has a Stack<Number>, not only can they call pushAll with a Collection<Number>, the exact match, but they can call pushAll with a Collection<Long>, since Long extends Number, or any other type that extends Number. So you get back all the flexibility you had with arrays, and more: you also get compile-time type safety. And what about popAll? Well, with popAll, if you have a Stack<Number>, with the old definition all you could do was pop it into a Collection<Number>, whereas now we can popAll from our Stack<Number> into a Collection<Object>, just as we should, because every Number is an Object. So that's basically all there is to it. If you
remember that mnemonic, PECS, you'll never be confused again about when to use extends or super.

Oh, and here's another little example of it. This one's maybe a bit trickier. We have a generic method; you all know what generic methods are. They're methods that have type parameters for the method rather than for the type. The method's called union, and it takes two sets, a set s1 and a set s2, and it returns the set that is their union. Right now all these Es have to match. So you take a set of strings and another set of strings, and you get back a set of strings. That's great. But what if you have a set of Integers and a set of Floats and you want to get back a set of Numbers? Will this definition work? No, it won't, because these generic types are invariant. So how do you make it work? Flex your PECS. So s1 here: are we using it as a source of elements of type E, that is, is it producing them, or is it consuming them? Producing. And what about s2? Also producing. So in both cases we should use <? extends E>. And what about the return type? Doesn't matter. Don't use the mnemonic for the return type. Always use the exact type for the return type, and I will tell you why in a couple of slides. I keep saying that, but I really will.

All right, so how should the declaration look? Like this: Set<? extends E> s1, Set<? extends E> s2, because both s1 and s2 are E producers. Now, why no wildcard type for the return value? Here's why. Why do we use wildcards? We use them for one reason, which is to make our APIs more flexible. The API should just work. For people who do the right thing, who don't try to pass in something that's going to blow up, it should just work. They shouldn't have to think about it. The clients of an API, the users of a library, should not have to think about wildcards. What happens if you return a wildcard type? You need variables of the wildcard type in order to store the return values, and it doesn't actually
give you any more flexibility. No, it really doesn't make your API any more flexible; it just makes it more difficult to use. So based on this principle, that wildcards should sort of vanish before the API user's eyes, you should never return a wildcard type.

So that about settles it, right? Well, no. A little bit of truth in advertising. Suppose we declare it this way and we actually try to use it to do just what I said: we have a set of Integers and a set of Doubles, and we take their union and try to store it into a set of Numbers. It doesn't always just work. Sometimes the compiler gives you this incomprehensible error message. It says: found Set<Number & Comparable<? extends Number & Comparable<?>>>, required Set<Number>. So what's really going on here? What's really going on is that whenever you call one of these generic methods, the compiler is doing what's called type inference. It's figuring out a type that works for E based on your arguments, and in this case it figured out the wrong type. You just found a limitation in the type inference algorithm that the Java compiler uses and the Java language specifies.

So what can you do when the compiler does not infer the correct type? Tell the compiler the correct type. It's called explicit type parameters. So you use this hideously ugly Union.<Number>union(ints, doubles). It is hideous. It turns out, actually, that God kills a kitten every time you specify an explicit type parameter. And luckily, we are doing some things in Project Coin and in future Java language development to reduce the number of times you'll have to use these explicit type parameters. I think they're incredibly painful, and I hope you never have to use any, but occasionally you will. So if you get a nasty error message like this, just say, "Darn compiler, darn language," and specify the thing explicitly, and try not to think about the kitten.

You know what, I'm actually going to take this one
offline, because it's a long talk as it stands. I will tell you one thing, which is that the type inference algorithm takes up like 23 pages in the Java Language Specification. So that's the short answer: basically, it's a difficult question. It's actually impossible to do type inference for complicated enough languages, and all you Scala programmers out there know that Scala has a much more complex type inference engine than Java's, which can do a lot more, but even it gives up occasionally. It can't infer every type you want it to.

All right, so here's a summary of our mnemonic in tabular form. Producer extends, consumer super. That is, if the parameter is going to be used to produce T instances, then you use Foo<? extends T>. If the parameter is going to consume T instances, use Foo<? super T>. And if you're the academic type, this is called covariant in T, <? extends T>, and this is called contravariant in T, <? super T>.

But I think this table raises a question, right? I've only filled in two of the four blanks. What happens if I have a parameter that I want to both produce and consume T instances? I want to put Ts into the collection and take Ts out of the collection. Then what do I do? Or what about, suppose I want to neither produce nor consume? You might say, well, why are we even passing in the parameter if you don't want to produce or consume? But it turns out that there are plenty of useful methods that neither produce nor consume. Suppose I write a method that applies a filter function and removes everything for which the filter function returns false. It's not actually putting any Ts in there, nor is it taking any Ts out. It's simply removing elements without looking at them, right? So that is neither producing nor consuming. Or suppose I have a method that reverses the order of a collection. I guess you could say it's producing and consuming, but it's only producing and
consuming from itself. So what types do you use for these? Well, here are the types you use. If it is both producing and consuming, you use simply Foo<T>; don't use wildcards. The intersection of the types Foo<? extends T> and Foo<? super T> is Foo<T>. So if it's both a producer and a consumer, Foo<T>, and that's called invariant in T, as I said before. If it's neither producing nor consuming, then it's Foo<?>, and that just means Foo of any type: since I'm neither producing nor consuming, I don't really care what type is in there; I can pass in a Foo of anything. And by the way, note well: Foo<?> is not the same as Foo. Foo without the question mark is a raw type and is unsafe. If you use it, the compiler can no longer reason about your program and cannot assure you of the type safety of the program. So if you want to tell the compiler that the guy can pass in a Foo of any type, you must say Foo<?>. All right, so that's the longest part of the talk, so congratulations on slogging through that one, and I hope you all remember to flex your PECS. It really will make your life as a programmer easier. One more on generics, and this is a pattern that I just love: it's how to write a container with an arbitrary number of type parameters. Most of the containers or collections in Java have a fixed number of type parameters, right? You've got your Collection; it has one type parameter, the element type E. You've got your Map; it has two type parameters, the key type and the value type. That's usually great, but sometimes it's not enough. What if you're trying to write a database row, right? It's got a whole bunch of columns. How many? Who knows; it differs from row to row, and they all have their own types. Suppose you want this all to be type safe. Is it possible? I didn't used to think it was possible, but it is, and here's how you do it. You use a new pattern called the typesafe
heterogeneous container pattern, or THC for short. It's a mind-expanding pattern. Now I will tell you all about it. So the basic idea is very, very simple. You don't parameterize the collection; you don't have, like, Collection<E>. Instead you parameterize the selector. And what is the selector? The selector is something that you present to the container in order to withdraw an element from the collection. The data is strongly typed at compile time, and you can have unlimited type parameters for the same collection, because you can have as many selectors as you like. So in our database row example, the row itself is not parameterized, but each column is. You have one column whose type parameter is String, because that column contains a string, one whose type parameter is BigDecimal, and so forth. So let's go into a little more detail. Suppose that I want to implement a favorites database. And what is a favorites database? Quite simply, it's a little database whose keys are Class objects and whose values are instances of that class. OK, so it can store your favorite String, your favorite Integer, your favorite Float, your favorite Class, whatever, and you see how that has arbitrarily many type parameters, one for each favorite you're storing. So let's look at the API. First, we have the class Favorites. It has two parameterized methods; notice that the methods are parameterized, not the type. What you do to put an element into our favorites database is you pass the Class object, which is its selector, and you pass the instance. And notice how we're type checking it: it's a Class<T> and a T instance. So that will not even compile if I tell it my favorite String is 43, because 43 isn't a String, so that's caught at compile time. And by the way, I see some of you guys taking notes, or maybe you're writing email to your girlfriend, but if you're taking notes, don't bother. All this stuff is online. If you type "typesafe heterogeneous container" into Google, it's on Google
already, this whole talk, and also it's in this fine book, at least one copy of which I'll be giving away at the end of the evening. Yeah, well, yeah, maybe I shouldn't be using String as my example, but the point is that as long as the types match, it compiles; if they don't match, it won't compile. And then how does the get method work? It takes the Class object, the selector, and it returns an instance of that class. And how does it look when you use it? Here's a little client program. It's really, really straightforward. I create a Favorites, which is a collection that holds multiple different types; that's where the "heterogeneous" in typesafe heterogeneous container comes from. And it's typesafe. So I put into it my favorite String, which as you all know is "Java", my favorite Integer, which is the hexadecimal number 0xcafebabe, and my favorite Class, which I guess is ThreadLocal tonight; could be anything. And then if I get my favorite String out, it's returned in a variable of type String, and that would not compile if I put, say, Integer here. Why wouldn't it compile? Because the types have to match, and they wouldn't match. My favorite Integer gets stored in an int; there I'm doing some auto-unboxing. And my favorite Class gets stored in a Class object, and by the way, there's our question mark, Class<?>; it turns out that that's kind of the best you can do. And then if I print them out, it'll print out whatever they are, the string, the integer, and the class: Java, cafebabe, and ThreadLocal. Great. And the implementation of this thing must be really complicated, right, because of what it's doing? No, this is the whole implementation. So let's read it very carefully. First of all, inside each Favorites object we have a map; I think I'm using a HashMap. And what does it map? It maps a Class of some type to an Object. Now, is that strong enough to enforce our guarantee that we only map the String class to a String object, the Integer class to an
Integer object, and so forth? Not at all, because you can't do it based on the ordinary collections, which only have a fixed number of type parameters. So basically this maps an arbitrary Class object to an arbitrary object, but we're only going to use it in this restrictive way. We are not going to put in mappings that don't meet our criterion. OK, and now let's look at the putFavorite method. As we said, it takes two parameters, of type Class<T> and T. If the type is null, it throws a NullPointerException, because that's not a legitimate type value. And the point is, we're only storing it into the collection, right? So if we simply tried to store it with null, would it work? Yes. And would it blow up later? We always want to blow up as soon as possible, and so the lesson there is: always validate your arguments. That's what I'm doing. And then we put into the map a mapping from type to instance. Do we know that instance is in fact an instance of the type? Yes, we do. And now we get it out. We get it out by passing the type object; we look it up in favorites. And what does this return? What is the type of favorites.get(type), the underlined expression? Object. Darn shame: we're not supposed to return an Object, we're supposed to return a T. So how do we turn it into a T? Well, we have a Class object of type Class<T> lying around, and it turns out that Class, as of Java 5, has a new method called cast, Class.cast, which is the dynamic equivalent of the cast operator in Java. So if you have a Class object and you call Class.cast on an object reference, what does it do? It checks if the reference is in fact an instance of that class. If it is, it simply returns it unchanged; if it isn't, it throws a ClassCastException. Right, so it's doing exactly what the cast operator does, but it's doing it dynamically, based on a Class object, rather than statically, based on the actual class that you've textually included in the program. And that's all there is
to it. That works. That's the typesafe heterogeneous container pattern, and you can use that to do databases and things like that that are type safe and actually work. So I commend it to all of you. All right, now on to a bunch of easy sections. You've made it through almost all the hard stuff. Truth in advertising once again compels me to tell you that the last section, on serialization, is a little bit tricky. So how are you all doing? Are you basically keeping up with the material? Excellent, good. So, enum types. This one's really easy: prefer two-element enums to booleans. It seems obvious, but I see people doing the opposite, so consider this one a reminder. Which would you rather see in code: double temp = thermometer.getTemp(true), or double temp = thermometer.getTemp(TemperatureScale.FAHRENHEIT)? This one tells me what kind of temperature I'm getting back; this one doesn't tell me much at all. Yeah, maybe your IDE, if you're lucky, will tell you when you hover over the thing, but what if you print it out, or what if your friend's IDE isn't as good as yours, or whatever? And some people say, "Oh, but Josh, this is just way too long compared to true", and I say, if you must, use static import, and then you can just say FAHRENHEIT. But I far prefer this, and this, I think, would be reason enough to use these two-element enums. But it turns out there's an even better reason, and that reason is that they evolve. So for example, suppose you start out with a temperature scale that includes Fahrenheit and Celsius, and then some science nerd wants Kelvin. No problem, just add a third one. If you'd used the boolean, you're stuck; there is no third value, right? And then suppose you realize that, my gosh, I have all these different temperature scales, and I'd like to be able to convert them to some common temperature. No sweat: add a method to the enum, you can do that in Java, called toCelsius that, you
know, no matter what temperature scale you have, will convert a temperature value of the scale represented by the constant into a Celsius temperature. So that's why you should almost always use two-element enums rather than true or false. See, I told you this part of the talk was easy. Now, varargs. I have a little useful pattern and a half for you to use with varargs. So here's the simplest case of varargs; this slide is basically just to remind you all what varargs are and what they do. Varargs allow you to pass a bunch of arguments of indeterminate length and do something reasonable with them. So in this case we have a method that takes a bunch of ints and returns their sum, right? static int sum, and the type of the argument is int dot dot dot, and that means it's zero or more ints, and it kind of boxes them up into an array for you. So how do we do it? We simply set sum, that is, the return value, to zero; we iterate using the for-each loop over all the ints that were passed in; in turn we add each one into sum; and finally we return the sum. So that makes sense to all of you? Great. Now suppose you want to write a varargs method that takes one or more arguments instead of zero or more. Why would you want to do something like that? Suppose you're writing a min function: min isn't defined if you pass in zero arguments. So in code reviews I've seen an awful lot of code that looks like this: if args.length equals zero, throw new IllegalArgumentException, "too few arguments". Otherwise, we set min to the first argument, then we iterate over all the other arguments, and if an argument is smaller than our tentative minimum, we assign it to the tentative minimum. And finally, having gone over all the elements in the array, we know we've found the true minimum and we return it. Does this work? Yeah, this works, but you can do better. What's wrong with it? The biggest thing that's wrong with it is that it fails at runtime if it's invoked with no arguments.
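The two varargs methods just walked through can be sketched roughly as follows. This is a minimal reconstruction from the spoken description, not the slide itself; the class name VarargsBasics and the main method are assumptions added so the sketch runs on its own.

```java
public class VarargsBasics {
    // Zero or more ints: the compiler packs the arguments into an int[].
    static int sum(int... args) {
        int sum = 0;
        for (int arg : args) {
            sum += arg;
        }
        return sum;
    }

    // "One or more" enforced only by a runtime check, as in the
    // code-review version described in the talk.
    static int min(int... args) {
        if (args.length == 0) {
            throw new IllegalArgumentException("Too few arguments");
        }
        int min = args[0];
        // Starts at index 1, so the for-each loop is not a natural fit here.
        for (int i = 1; i < args.length; i++) {
            if (args[i] < min) {
                min = args[i];
            }
        }
        return min;
    }

    public static void main(String[] args) {
        System.out.println(sum());         // prints 0
        System.out.println(sum(1, 2, 3));  // prints 6
        System.out.println(min(3, 1, 2));  // prints 1
    }
}
```

Note that a call such as min() with no arguments compiles fine and only throws IllegalArgumentException when it runs.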
Things should always fail as soon as possible. Wouldn't you rather have something that failed at compile time? I would. It's ugly: this explicit validity check is just nasty looking. And finally, it interacts poorly with the for-each loop, because we're going from the second element, that is, the element sub one, to the end, rather than from zero, so I didn't bother using the for-each loop. So what do we do? It turns out there's a really easy solution, and really pretty too, I think. Just declare it like this: int firstArg, int... remainingArgs. OK, now you cannot compile this with no args; it only compiles if you have at least one. And then you can use your for-each again, right? For each int in the remaining args, if it's less than the minimum, it becomes the new tentative minimum; finally, return it. And notice: no validity check anymore. I don't need it, because it can't fail at runtime; it fails at compile time. So that's the right way to require one or more arguments. I'm sorry, hold the questions, only because the talk is as long as it is. Normally I like to take questions during the talk, but I'm worried that I'm going to keep you guys here too late. All right, so here's a variant on that, and by the way, this is an optimization. This should only be used where performance is critical. If you do this and you haven't proven to yourself that performance in this case is critical, then you are doing premature optimization, which is the root of all evil, so don't do it. But if you have such a case: the problem with varargs is that it automatically creates an array and puts everything into the array, and it costs time and garbage collector pressure to create all these arrays, and sometimes you really can't afford that. In that case, what you do is, instead of having only one method, you have overloads that take one, two, three, four, and five arguments, and finally, if there are more than five, you default to the version with varargs. Because the way method overloading
works, if you can resolve a method without resorting to varargs, you do, right? So in this case, as long as you have five or fewer arguments, you'll end up using one of these methods. And by the way, these actually are the static factories for EnumSet. So EnumSet.of(Typeface.ITALIC, Typeface.BOLD) does no array allocations, because we wanted EnumSet to be fast as a bat out of hell, and it pretty much is. And the other interesting point is that the number of methods is a function of the way in which you expect the API to be used. So if you can look at a corpus of code and say that 95 percent of the calls have five or fewer arguments, then five is probably the magic number for you. So just look at the code and try to figure out how many methods you need. All right, so that's all I have to say about varargs. And now a concurrency item. Usually concurrency stuff is hard; this one's actually pretty easy, and it's about common abuses of ConcurrentHashMap. ConcurrentHashMap is a great class. Why is it great? It combines high concurrency, that is, lots of things can happen in parallel, with high speed. And how does it do that? It does it because Doug is very clever, and he wrote it very carefully; he spent a long time writing it. It uses fancy things inside, like lock striping and nonblocking algorithms, and the whole idea behind the concurrency utilities is that you and I don't have to know about all this stuff, because Doug knows about it. It allows us to basically multiplex Doug's knowledge. And in fact it pretty much makes the old-style synchronized collections obsolete. So you used to say Collections.synchronizedMap(new HashMap()); don't do that anymore. Just use java.util.concurrent.ConcurrentHashMap. In fact, import all that crap so you don't actually have to say the package name. But anyway, that's what you should use. But use it right. What do I mean by that? Well, first of all, never
synchronize on a concurrent collection, whether it's a ConcurrentHashMap or a ConcurrentLinkedQueue or any of those fancy Doug collections. Don't synchronize on them. They do their synchronization internally, and you synchronizing on it will have no effect on other ongoing operations on the thing. If you synchronize on a Hashtable, does that mean someone else cannot do a concurrent put? Yes, it does; you've blocked that thing cold. But there's a funny thing about synchronization. Some people think, oh yeah, synchronization is like concurrency. Uh-uh. Synchronization is the opposite of concurrency. Concurrency is lots of things happening in parallel; synchronization is me blocking you. To a first approximation; that's a little bit cutesy and not completely correct, but it's still worth remembering. So the point is that if I synchronize on a concurrent collection, other things can still happen, because the other methods that Doug wrote aren't looking at the lock on that collection at all; they simply don't use it. So here's an arguably broken method that's trying to do string interning. And by string interning, what I mean is we have a map from a String to a String, and the first time it sees each string value, it stores that string value in the map, and the second and successive times, it returns the stored string value. That's exactly what String.intern does for you, but here's an implementation of it. So how does this implementation try to work? It synchronizes on the map and gets something out of it; if it's null, it puts the thing in and sets the result that will be returned to the parameter that was passed in; and it returns that. But it's no good. It's synchronizing on the map; don't do that. So what should you do? Well, the problem is that we're trying to atomically combine two things, right: checking for the presence of this string in the map and putting something in if it wasn't already there. And how can
you do that on a concurrent collection? You can't; you cannot atomically combine stuff yourself. And so what they've done is they've done a whole bunch of atomic combinations for you, these little mini-transactions that have names like putIfAbsent. So here is a correct but suboptimal version of this interning utility. It says String previousValue = map.putIfAbsent(s, s), and that means: if there's no mapping from s at all, put a mapping from s to itself; otherwise leave it alone, and return whatever the previous value used to be. If the previous value is null, indicating that there was no entry for that string, then we have just put in the first entry for it, so we have done the actual interning and we should return our argument. Otherwise, we should return the previous value. Make sense? And what's wrong with it? The only thing wrong with it is that it calls putIfAbsent every time it reads a value, not only the first time, and it turns out that putIfAbsent is much more expensive. And more damning, it's not just expensive, but it causes contention. It turns out that when you're doing a get from a ConcurrentHashMap, it causes no contention whatsoever. Any operation, read or write, can go on in parallel with a get. It's like magic. So this is not the best way to do it. What is the best way to do it? This is the best way to do it. It's kind of like the double-check idiom, except it works. Actually, the double-check idiom works too, if done with volatile, so I'm just being cutesy tonight, I guess. But you say String result = map.get(s) from our ConcurrentHashMap. If it gets something, we're done; just return it. But suppose it doesn't get anything. Then we call putIfAbsent(s, s). Almost always the result will be null, but if there was a race condition, if two people are calling this at the same time and someone else got their value in first, then the result will be something other than null, and we should return that result. Does that make sense? So that is the correct way to do
it, and it turns out, by the way, that on my machine, which I have at home, I measured this one as 250% faster than the other one, besides being much better from a contention perspective. And one more solution, one that doesn't work at all. This code is very badly broken. So why am I showing you broken code? Because we found that 15% of all uses of putIfAbsent don't look at the return value. It's almost always wrong to call putIfAbsent and not look at the return value. So here's what you don't do: putIfAbsent(s, s) and then return s. You're returning the argument; you're always returning the argument, even if it was already present in the map. So this is doubly wrong: it's both expensive and it doesn't actually work as an interner. So don't do this. If you're calling putIfAbsent, you should almost certainly be looking at the return value. So, summary: the old-style synchronized collections are pretty well obsolete. Instead use ConcurrentHashMap and friends, but never synchronize on the entire thing; they don't work that way. Instead use putIfAbsent and friends, and use them properly: only use them after you've done a read first, because the reads are so much cheaper, and always check the return value. All right, now I have one last section, on serialization, and then a little tiny dessert course, but it's really short; it's a puzzle. So, serialization is fraught with peril. Anyone who's read the first edition of Effective Java, or who's ever heard me talk about the topic, knows that I feel that serialization is a dangerous thing. What's wrong with it? Many things. First of all, it causes implementation details to leak into public APIs. This is really bad, right? If I just say implements Serializable, all of a sudden all of my private fields become part of the serial persistent representation of this object. They're private fields, and they're part of a public API, which is my serial persistent representation. Can I ever change them? Well, yes, but it becomes incredibly painful. If you
want to see examples of code like this, look in the JDK for things that were made serializable before people really thought about what it meant. Look at BigInteger, where I was the one who... So that's one example of what happens. Another bad thing is that it's too magical: instances are created without ever invoking a constructor. When you deserialize an object, you call readObject and you get a new object of some type, let's say a new, I don't know, Foobie. It didn't call any of the Foobie constructors, none of them. It's like magic, and that's very bad, because when you're writing an object, what you tend to do is you have some invariants, right? You make sure that all of your constructors establish the invariants and all of your methods maintain the invariants, and you're done, right? That guarantees that all your invariants will be true for all time, right? Yes, unless you say implements Serializable, in which case you have this other sort of magical pseudo-constructor, which is readObject, and that can produce instances of your class that violate their invariants, and that can be very dangerous. It also doesn't combine well with final fields. It turns out that when you deserialize something, it can be impossible to get the correct value into some final field, because, let's say, it's something with a kind of runtime presence, like a thread or something. What can you do? You have two choices: make it nonfinal, or use reflection to change the value of a final field. Neither of these things is pleasant; you're caught between a rock and a hard place. So what is the result of all these things together? If you use serialization, you will suffer from increased maintenance costs, increased likelihood of bugs, and increased security problems. It's just a fact of life, pretty much. But it turns out there is a better way. You can avoid these problems, and you can do it using what I call the serialization proxy pattern. The basic idea is really unbelievably simple: simply don't
serialize instances of your class. Instead, serialize instances of an idealized representation of the state of your class. Make a little nested static class that does nothing but hold the state in its most concise form, and then reconstitute these little state mementos into actual instances of your class at deserialization time, using only the public APIs. And that's the magic there: using only the public API, right? No longer are we having deserialization automagically give us an instance of our class; we're calling a public static factory or a public constructor to get the instance. So let's look at it in a little bit more detail. The first step is to design the serialization proxy. It's a struct-like proxy class that concisely represents the logical state of the class to be serialized. Then declare it as a static nested class of the class that you're going to be serializing, and then provide a single constructor for the proxy, which takes as its argument a single instance of the enclosing class, in essence turning an instance of the class into its proxy. And there's no need for consistency checks or defensive copies here, no need at all. It's OK if somebody deserializes a broken instance of the serialization proxy. Why? Well, because the contents of it are just going to be used in calls to public methods, and those public methods are going to do the validity checks. So what do you do? You put a writeReplace method on the enclosing class, and literally it is this code; you can cut and paste this into every class that you want to do a serialization proxy for. The writeReplace method simply returns new SerializationProxy(this), so that translates the object into its serialization proxy. Then you put a readResolve method on the proxy. Do you guys know about writeReplace and readResolve, by the way? By show of hands, who here knows writeReplace and readResolve? OK. writeReplace and readResolve allow you to interpose method calls into the serialization chain, such that the way
writeReplace works is that when something is being serialized, before you return the serialized stream, you pass the object that's about to be serialized to its writeReplace method, and instead of serializing the object itself, you serialize whatever is returned by writeReplace. So in this case, what does writeReplace do? It says, hey, don't serialize the object; instead, serialize a new serialization proxy representing the object. readResolve is kind of the opposite operation, which is used not when you're serializing but when you're deserializing, and it says: after you've deserialized an instance of something, instead of returning it directly, call the readResolve method on it and return whatever the readResolve method returns. So in this case, you put a readResolve method on the serialization proxy that calls public API methods of the enclosing class to create a new instance that represents that proxy. And that's it. So how does it look in practice? Once again, you might think, oh, there's got to be a trick, it's really complicated. No, it turns out that there isn't, and here's how I'm going to prove it. This is a real-life example; this is actually EnumSet's serialization proxy. I came up with this pattern for EnumSet. It turns out that it's particularly useful for EnumSet, for a reason I'll tell you in a moment, but let's look at it first, just so I can show you that I have nothing up my sleeve, as it were. So what does this serialization proxy have? What is the idealized representation of an EnumSet? Well, it's a type, the enumerated type, and a bunch of elements of the type. So we have a private final Class instance, being the element type, and then an array of Enums, being the elements themselves. Why do I need the type? Well, what if the set is empty? If the set is empty, I don't have any elements of the type, so I don't know the type. It's the only way to know the type and thus offer runtime type safety for the enums. And it's not just runtime type safety, but it turns out
you need to know the type in order to perform the various operations on an EnumSet; it's just critical. So this is the idealized representation, that is, this is the serialization proxy. And remember we said it has one constructor, which takes an element of the set, sorry, of the enclosing class, which in this case is an EnumSet, and returns its serialization proxy. And what does it do? It simply copies the type from the EnumSet into its elementType field, and then calls the toArray method on the EnumSet to get all of the contents of the thing into elements. And notice, by the way, that this both uses public methods like toArray and looks at internal fields. It's all right if the serialization proxy constructor uses the internals of the enclosing class, but it's not all right if the readResolve method uses anything private. The whole idea behind this pattern is that the readResolve method, which translates instances of the serialization proxy into instances of the enclosing class, has to use only public API. So let's take a look. How does it work? Well, first we call EnumSet.noneOf(elementType); that's the standard static factory to create an EnumSet consisting of no elements of a given type. Then we iterate over all the elements in the elements array and add each one to the new set, and finally we return the result. And the last thing we need is a serialVersionUID, to appease the serialization gods. So that's it; this actually is the serialization proxy of EnumSet. And by the way, why is it such a good thing for EnumSet? I'll tell you why. It turns out that the class of the set can change between the time it's serialized and the time it's deserialized. How can that happen? It turns out that under the covers there are two classes that implement EnumSet; they're called RegularEnumSet and JumboEnumSet. If the underlying enumerated type has 64 or fewer elements, then you use a RegularEnumSet, which is just a wrapper for a long, where one bit represents the
presence or absence of every element. If, on the other hand, you have more than 64 elements in the underlying enum type, then you use a JumboEnumSet, which takes an entire array of these things. Now suppose that I have an enumerated type with, like, 60 elements in it, and I serialize an EnumSet of that type. Now suppose I add 10 more elements, so now it's got 70, and then I deserialize it. Well, I serialized it as a RegularEnumSet, but it deserializes as a JumboEnumSet. So that's kind of why I came up with this pattern, but afterwards we came to the conclusion that it was much more generally applicable; in fact, there are many cases in which you should use it. Is it a panacea? Of course not. There are no panaceas, especially for something as complicated as serialization. So what's wrong with it? First of all, it is completely incompatible with extendable classes, and by extendable, by the way, I mean externally extendable. It only works for a class that is kind of closed within a library. So if you look at EnumSet, yeah, I have two subclasses of it, but they're mine. You cannot extend EnumSet; if you could, then the serialization proxy approach wouldn't work. Also, it is incompatible with some classes whose object graphs contain circularities. This is actually a fairly complex topic, addressed in more detail in the book. But at any rate, EnumSets are kind of nice because there are no circularities, and all I mean by that is just that the elements of an EnumSet are just the enums; they cannot point back to the EnumSet. And also, it's a little bit more expensive: on my machine it adds about 15% to the cost of serialization and deserialization. That's not a magic number, 15, but it adds a little to the price. But where it is applicable, it is by far the easiest way to robustly serialize complex objects. It avoids all of those horrendous problems with serialization that I listed on the first slide. So that's it. That was a pretty long talk, so I'm
going to very quickly go over the key ideas in the whole talk. First of all, if you remember nothing else from tonight, remember PECS. Flex your PECS. All together now: producer extends, consumer super. There you go. And when a fixed number of type parameters just won't do for a collection, that is, if you need an arbitrary number of them, use the typesafe heterogeneous container pattern. Prefer two-element enums to booleans. Never, ever synchronize on a concurrent collection. That's the other one: if you only remember one thing from tonight, don't synchronize on concurrent collections. Instead use putIfAbsent, but generally read the value before calling putIfAbsent, and always look at the return value from putIfAbsent. And finally, when your plans call for serialization, remember the serialization proxy pattern. So that's it for this talk. If you want more, then you should get this fine book, now in its fine second edition. And we have one little bit of dessert, one more puzzle. Are you guys up for one puzzle? All right. I know this is a long talk, but anyway, here's our puzzle, called "When Words Collide"; I think there may be a typo in the title. This is a very strange puzzle, because we have two classes, PrintWords and Words, and here's how the game is played. We compile these two classes together. Then we recompile Words, so this is a second version of Words, and we run the class file for PrintWords that came out of the first compilation with the class file for Words that came out of the second compilation. And before you say, "But that's totally ridiculous, Josh, why would anyone do that?", that turns out to be kind of a microcosm of what happens all the time with libraries. You compile your client with version 1 of a library, and then the vendor releases version 2, and you should not have to recompile your classes. You should just be able to run them with the binary for the new version of the library, and everything should just keep running. So in this case, what does
our client do? It simply prints out three words: Words.FIRST, Words.SECOND, and Words.THIRD. In the original version of the Words class, the words are "the", null, and "set", and that's the actual null pointer: the null set. And in the new version, the words are "physics", "chemistry", and "biology". So the question then is, if I run the program, does it print "the null set", does it print "physics chemistry biology", does it throw an exception, or does it do something else entirely? Don't yell it out if you know; just think about it. Yes, they're compiled with the same JDK, any JDK. I heard someone asking the correct question in the front row, which is, does it inline the constants? Very good question. I could tell you, but that would take all the fun out of the puzzle then, wouldn't it? Okay, I think it's time to vote. All right, so how many people say this program will print "the null set", which would mean it inlines them? Oh, that looks like 60% of you. How many people say "physics chemistry biology", it doesn't inline them? 20% of you. How many people say it throws an exception? 5% of you. And how many people say none of the above? Five more percent of you. Well, it turns out that those final five percent of you were in fact correct. This program will always print "the chemistry set", which is kind of strange. Yeah, I see someone up here who got it. But anyway, let's take another look at the program. The intuition here is that constant variables are inlined. But what is a constant variable? I mean, "constant variable", isn't that a little bit of an oxymoron? Well, here's how a constant variable is defined, roughly: it's a final primitive or String variable (so you can't be a constant unless you're a primitive or String) whose value is a compile-time constant. It turns out you have to read all four or three of these sections of the JLS to actually get the whole definition. But it turns out that null is not a constant variable. Go figure, right? It turns out that it doesn't have a
constant pool entry. So basically it's any compile-time constant primitive or String, but not null. Note also that enums are never constant expressions. Now let's go back to our program: constant, not a constant, constant, which means "the" and "set" are in fact compiled into the program; null is not. So running against this, we print "the chemistry set". Mystery solved, right? But what can we do to fix this? I.e., this probably isn't the behavior that was desired. Well, here's a neat trick: if you want to prevent something from being inlined, simply call the identity method on it. So we have a method here called ident, on String, that simply returns its argument. Now we say public static final String FIRST = ident("the"). We're doing a method invocation, and once you do a method invocation, it's not a constant anymore. And this is a way to force the compiler not to inline things that shouldn't be inlined. But it raises the question, what should be inlined and what shouldn't be? And the moral basically is that you should only use constant variables for entities whose values will never change. So e, that's a reasonable constant variable. Pi, sure. Number of planets in the solar system? I think that changed, you know. So basically, if there's any chance the thing is ever going to change, don't use a compile-time constant. You can make it a public static final, but initialize it to something that's not a constant expression. And do be aware that null is not a constant. That's sort of counterintuitive, but it is true. And I guess that's it for tonight, so thank you very much for coming to the talk. And before we go, I have a couple of books to give out. So, on the honor system, I guess there were a total of three puzzles, right? Two before and one after. So who got all three right? Raise your hands. All right, two of you. Oh wait, three of you. All right, so we're going to have to do some coin tossing. Either way, the three of you who got them all right, come on up and we will distribute books, and the
rest of you, thanks a lot, and I hope to see you again soon.
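The serialization proxy pattern recommended near the end of the talk can be sketched roughly as follows. This is a minimal illustration, not code from the talk: the `Range` class, its fields, and the `RangeDemo.roundTrip` helper are hypothetical names chosen for the example. The idea is that the outer class writes a small proxy in its place, and the proxy rebuilds the real object through the ordinary constructor on the way back in, so invariants are checked again.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical immutable class, used only to illustrate the pattern.
final class Range implements Serializable {
    private static final long serialVersionUID = 1L;
    private final int lo;
    private final int hi;

    Range(int lo, int hi) {
        if (lo > hi) {
            throw new IllegalArgumentException(lo + " > " + hi);
        }
        this.lo = lo;
        this.hi = hi;
    }

    int lo() { return lo; }
    int hi() { return hi; }

    // Serialization writes the proxy instead of this object.
    private Object writeReplace() {
        return new SerializationProxy(this);
    }

    // Reject any stream that claims to contain a Range directly.
    private void readObject(ObjectInputStream in) throws InvalidObjectException {
        throw new InvalidObjectException("Proxy required");
    }

    private static final class SerializationProxy implements Serializable {
        private static final long serialVersionUID = 1L;
        private final int lo;
        private final int hi;

        SerializationProxy(Range r) {
            this.lo = r.lo;
            this.hi = r.hi;
        }

        // Rebuild through the normal constructor so invariants are rechecked.
        private Object readResolve() {
            return new Range(lo, hi);
        }
    }
}

final class RangeDemo {
    // Serializes to a byte array and reads the object back.
    static Range roundTrip(Range r) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(r);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Range) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Range copy = roundTrip(new Range(3, 7));
        System.out.println(copy.lo() + ".." + copy.hi());
    }
}
```

Because `readResolve` may return any instance it likes, the same mechanism is what would let a set serialized under one implementation come back under another, as in the EnumSet story above.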
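The putIfAbsent advice from the recap (read first, then call putIfAbsent, and always use its return value) might look like this in practice. The `Interner` class below is a hypothetical sketch written for this purpose, not the code shown on the slides.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical string interner built on a concurrent map, with no locking.
final class Interner {
    private static final ConcurrentMap<String, String> MAP =
            new ConcurrentHashMap<String, String>();

    private Interner() {}

    static String intern(String s) {
        // Cheap read first: in the common case the value is already present.
        String result = MAP.get(s);
        if (result == null) {
            // putIfAbsent returns the previous value, or null if this call
            // installed s. Ignoring the return value would lose the race.
            result = MAP.putIfAbsent(s, s);
            if (result == null) {
                result = s;
            }
        }
        return result;
    }
}
```

Two equal but distinct strings passed to `Interner.intern` come back as the same object, whichever one got into the map first.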
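The fix described for the final puzzle, routing each value through an identity method so that it stops being a constant expression, can be sketched as below. Only `FIRST = ident("the")` is spelled out in the talk; applying the same treatment to `SECOND` and `THIRD`, and the extra `INLINED` field added for contrast, are assumptions made for this illustration.

```java
// Sketch of a Words class whose fields are not compile-time constants.
final class Words {
    private Words() {}

    // A compile-time constant: javac copies the value into client class
    // files, so clients keep the old value until they are recompiled.
    // (Hypothetical field, shown only for contrast.)
    public static final String INLINED = "the";

    // Initialized by a method call, so not constant expressions:
    // clients read these fields from Words at run time.
    public static final String FIRST = ident("the");
    public static final String SECOND = ident(null);
    public static final String THIRD = ident("set");

    // Identity method: returns its argument unchanged.
    private static String ident(String s) {
        return s;
    }
}

final class PrintWords {
    public static void main(String[] args) {
        System.out.println(Words.FIRST + " " + Words.SECOND + " " + Words.THIRD);
    }
}
```

The difference only shows up under separate compilation: with this version, changing the words and recompiling `Words` alone changes what an already compiled `PrintWords` prints for all three fields.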
Info
Channel: UserGroupsatGoogle
Views: 141,649
Keywords: Joshua, Bloch, Effective, Java, Puzzlers, Silicon, Valley, Web, User, Group, JUG, Van, Riper, Kevin, Nilson, Marakana
Id: V1vQf4qyMXg
Length: 73min 58sec (4438 seconds)
Published: Thu Oct 22 2009