Clean Code - Uncle Bob / Lesson 2

Yeah, okay. So where did the Moon come from? "Came from the Earth." Probably; this is true, although it's a little more complicated than that. I remember my grade school science book gave us some options for where the Moon came from. One of the options was that the Moon was captured: it was an asteroid that the Earth simply captured with its gravity. Another option was that the Moon bubbled out of the Earth somehow, and to demonstrate that point my science book showed a picture of the Pacific Ocean with the Moon superimposed on it, because the Moon would fit right into the Pacific Ocean, so clearly that's where the Moon came from.

The problem of figuring out where the Moon came from is actually fairly complicated, because the Moon is a strange object. First of all, it does not orbit in the plane of the Earth's equator; it orbits in the plane of the solar system. So the angular momentum of the Moon is more strongly related to the Sun than it is to the Earth. That argues that the Moon did not come from the Earth. On the other hand, the orbit of the Moon is circular. It's very difficult for a captured body to have a circular orbit; any kind of captured body would have a wildly elliptical orbit. So how did we get this circular-orbit object if it didn't come from the Earth? And then you look at the composition of the Moon, and the composition of the Moon is virtually identical to the Earth's, right down to the isotope ratios, with one glaring exception, which is that the Moon has no iron and the Earth is mostly iron. So although the minerals in the Moon are very similar to the Earth's, it's missing the iron. If the Moon came from the Earth, some process must have filtered all of that iron out. And finally, the Moon is tidally locked to the Earth. It keeps the same face to the Earth at all times, and that can only happen if the orbit started much closer to the Earth and was pushed outward by a transfer of angular momentum. So what could satisfy all these constraints? Scientists have worked on
this for a very long time, and eventually, about 30 years ago, they came to this very interesting conclusion. It turns out that in every orbit, 60 degrees away, there are stable points in that orbit. So if you take a look at the Earth-Moon system and you go 60 degrees away from the Moon, there's a stable point in the Earth's orbit there. It's possible that another body formed there, a body maybe the size of Mars. And these stable points are metastable; they're not perfectly stable. So if you disrupt an object there, it will start to slide across the orbit until it impacts you. So it's possible that some object roughly the size of Mars, maybe four and a half billion years ago, formed at one of these stable points in the Earth-Sun orbit, got disrupted, moved in, and smashed into the Earth, melting both bodies. The two iron cores went to the center of one of the bodies, and there was a splash. That splash formed a ring. The splash was only the lightweight silicates, none of the iron. The ring coalesced into the Moon, and then the angular momentum coupling drove the Moon away from the Earth, tidally locked it, and circularized the orbit. So that's the current theory for what we think about the origin of the Moon.

Wouldn't it have been fun to be there, to watch an object the size of Mars smash into the Earth and form a ring that turned into the Moon? In those days the length of an Earth day was on the order of six hours, and the Moon was so close it would have filled the sky. Then the angular momentum coupling drove the Moon farther and farther away and tidally locked the Moon to the Earth, and the Earth slowed its rotation rate down to once every 24 hours. That process is continuing: the Moon is getting farther and farther away, the Earth continues to slow down, and it'll take about eleven billion years before the Earth finally tidally locks to the Moon. We don't have eleven billion years left, so we don't have to worry about that. Anyway, we're not supposed to be talking about this. I do want to
talk about comments. So, a complete change of topic: let's talk about comments.

What is the purpose of a comment? "To explain the code that you can't use the code to explain." Yeah, okay, fine, that's fine. The purpose of a comment is to explain the purpose of code if the code can't explain its own purpose. Can code explain itself? Well, generally it can now. By the way, that did not used to be the case. Fortran could not explain itself; there was no hope. One of the problems with Fortran is, does anybody remember the maximum length of a name in Fortran? Six characters, right. You could have a name six characters long, and that was it. In fact, most languages in the early days had a limit very much like that. I once worked on an assembler that had a limit of four. You can't get a lot done with four. BASIC, the original BASIC: one letter, one number, that was it. A1, B2, K5, those were your choices. So it's very difficult to get any kind of explanation done when you've got limits like that. Today's languages are remarkably rich. We can have names that are very long. We've got structure in these languages: classes and variables and methods and namespaces. So we've got all these tools that we can use to write code that explains itself.

I found this code on the internet. I don't know if it's real; I have seen enough like it to know that it could be real. "This is bad. Really bad. It's a really, really bad hack. If you're an employee of Intertrode Communication, then I'm really, really sorry that you have to maintain this. I was honestly planning on removing this tomorrow, but I've been known to forget things like this. It happens." No, it doesn't. Not if you are a professional software developer; it doesn't happen. You don't put an excuse like this in the code. Now, he might have done something stupid here. I'm not worried about the stupid thing he did. He was in the middle of a production crisis and he had a whole bunch of pressures on him, so he did something dumb. That was okay. The problem I have
is that he wrote this comment and made an excuse for what he did and checked it in. You just don't do that. Do not check stuff like that in. "So here's the thing. I can't seem to figure out why the account ID variable isn't set. I've looked and looked, but I gotta leave now." Well, okay, he had to leave. "Anyway, I found that I can just grab the account ID from the debugging logs. I suppose that to fix it you'd have to locate where it's clearing out the ID. Again, I'm sorry." Okay. So the problem I have with this comment is that it was irresponsibly checked into the production source code control system. I have no real objection to what he did to fix the problem in an immediate emergency; that's fine. But he should never have checked that in, and he should never have made this excuse, whoever that person was. And then he excuses the fact that he's not going to fix it later. That's a problem.

So I want to talk about comments, and I use that as a preface. Let's talk about the right and the wrong reasons to write comments. Nothing can be quite so helpful as a good comment. I say that up front because most of what I'm going to be doing is ranting about comments, so I want to make it very clear up front that not all comments are bad. Some comments are great. On the other hand, nothing can be quite so obscuring as a bad comment. Comments are not pure good. Many of us were taught that comments were pure good. We were taught in school, or we were taught by early books, that everything should be commented: a programmer should comment everything. Comment, comment, comment. We got to the point that we measured the quality of the code by counting the comments, and by the way, it's really easy to make high-quality code if all you're going to do is count the comments. This led to absurd kinds of comments like "i++; // increment i".

The proper use of a comment is to compensate for our failure to express ourselves in code. To the extent that we can express ourselves in code, we do not need comments. But
we cannot perfectly express ourselves in code, and so in those cases we need comments. Every use of a comment represents a failure, a failure to express yourself well in code. This is how I want you to look at comments. Comments are not the kind of thing that you pat yourself on the back for: "Oh good, I was a good programmer, I wrote comments." Every comment you write is a failure. Now, you will fail. You will fail to express yourself, and you will have to write comments. But you should look at every comment as an unfortunate necessity, not a great achievement.

Why? Well, one of the reasons is that comments lie. Now, nobody intentionally writes them that way. I guess some people do, but most people don't intentionally write comments that lie. But comments degrade over time. Let me ask you this: what color does your IDE paint your comments in? Green, like the grass. Oh, gray? Gray's even better. Soft gray. Ignore-me gray. I-can't-hurt-you gray. I-don't-bother-you gray. I'm-just-part-of-the-background gray. We paint our comments in colors that are easy to ignore because we don't want to see them. We want to look at the code; we don't want to read the comments. Comments get in the way, so we make them fade into the background. Now, I changed my IDE, and I have my IDE paint comments in bright red, big glaring red, because I figure that if someone wrote the comment I probably ought to read it. And then if I read the comment and I think it's useless, I will delete it. That helps me get rid of a lot of comments.

Comments silently rot because no one maintains them. We paint them in a color we can barely see, so of course we don't maintain them. And no one reads the comments when they're in the code. They're fiddling around in the code and changing stuff around in the code, and they don't read the comment that's up here. So eventually the comment starts to say things that have nothing to do with the actual code. Has anybody seen comments that tell you to do exactly the wrong thing? You
know, they're five years old, and they tell you, okay, this is what you've got to do, and when you read the code you think, well, if I did that, that would crash, right? But do we delete those comments? Do we fix those comments? Or do we leave them there because they're there for posterity? Has anybody seen a comment that has broken loose from the main body of comments and migrated down into the guts of the code, probably through copy/paste operations, where it hangs out like a piece of junk DNA? Just a fragment of a comment hanging out there, serving no purpose whatever, but no one will delete it.

Comments don't make up for bad code. Now, here's what happens. You write a bunch of code, and you look at the code after you're done and think, oh, this is awful. And then you start to write a comment: I'd better comment it, because this is really bad code. No, you should clean it. Don't comment the code; clean the code. Make the code express your intent. Put the effort into the code; don't put the effort into the comment. If you finally fail at cleaning the code, then write the comment. Most of the time, however, you will find a way to make the code express itself.

Explain yourself in code, not with comments. So here's an example. The top says "check to see if the employee is eligible for full benefits", and then below that there's this horrible boolean expression. Wouldn't it be nice if there were a function called isEligibleForFullBenefits, and then you could say "if employee.isEligibleForFullBenefits()"? That reads a heck of a lot better than that comment up there. It's so much better to use the names of functions and variables to explain the code than it is to use a comment to explain the code.

Now I'm going to walk through a set of comments that I think are... I use the word "good" here on the screen, but that's not quite right. These comments are acceptable, maybe. I would leave them; I would not delete them. I'm going to walk through those, and then I'll walk through a batch of bad comments.

Copyrights.
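As a rough sketch of the full-benefits example described above: the slide itself is not in the captions, so the Employee fields, the age threshold, and the eligibility rule below are assumptions made for illustration. Only the method name isEligibleForFullBenefits comes from the talk.

```java
// Hypothetical Employee type. The fields and the eligibility rule are
// assumptions for illustration; they are not taken from the slide.
class Employee {
    private static final int FULL_BENEFITS_AGE = 65; // assumed threshold

    private final boolean hourly;
    private final int age;

    Employee(boolean hourly, int age) {
        this.hourly = hourly;
        this.age = age;
    }

    // The method name now carries what the comment used to say.
    boolean isEligibleForFullBenefits() {
        return hourly && age > FULL_BENEFITS_AGE;
    }
}

class BenefitsDemo {
    public static void main(String[] args) {
        Employee employee = new Employee(true, 70);

        // Before: a comment has to explain the raw condition at the call site.
        //   // check to see if the employee is eligible for full benefits
        //   if (employee.hourly && employee.age > 65) ...

        // After: the call site reads the way the comment did.
        if (employee.isEligibleForFullBenefits()) {
            System.out.println("eligible for full benefits");
        }
    }
}
```

The boolean expression still exists, but it lives in one named place, so the call site needs no comment. The walkthrough of acceptable comments then starts with copyright notices.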
Well, okay, I don't know what you're doing in the Netherlands, but in the United States you've got to do this if you want to protect your code, and you have to have this kind of weird punctuation. So fine, we will leave the boilerplate up there.

Here's a comment. It's an informative comment, and it's really an interesting one. It says "returns an instance of the responder being tested", and then the function here is responderInstance. Now, you could ask this question: why didn't he call this responderBeingTested? Why didn't he name it better? Why did he have to have this comment there? And the answer is that he's using a design pattern. What design pattern is he using? Singleton, yeah. Now, you might not like the singleton pattern.

By the way, who's got the Design Patterns book and read it? Yeah, good. Okay, and the rest of you who do not have the Design Patterns book: you must go out and buy the Design Patterns book and read it and understand it, and then come back five years later and read it and understand it again. It is probably the most important book written within the last thirty years. Not because it says anything new, because there's nothing new in that book; it's all about old stuff. What the Design Patterns book does is take a bunch of old ideas that have been around forever, and are still around today, and give them names and canonical forms, so that you and I can discuss Singleton or Decorator or Visitor or Composite. We can talk about those things, and if you know these patterns, if they're in your head, then I can say to you, "You know, we could use a Visitor for that," and the whole design pops into your head: "Oh yeah, we could use a Visitor for that." We can talk about the design at a higher level of abstraction. We can discuss the patterns that we're going to use to solve a problem. If you are reading code and you see the word ReportVisitor, and you know that pattern, the design of that code pops into your head. If the author has been faithful to the
pattern, the design of that code pops into your head, and all of a sudden you know what to expect inside that code. It's very powerful, very important. Please know your design patterns. There are folks out there nowadays who are saying design patterns were so 1995, and now things are so much more modern, and our languages nowadays don't need patterns. It's complete nonsense. Don't fall for that silly trick. The design patterns are good things to know and good things to have in your head.

What happened here is that this guy is using the singleton pattern. The canonical form of the singleton pattern has the function using this naming pattern, responderInstance, so he could not give it a better name, and therefore he used a comment to back himself up. And that was fine. That's a comment that I would leave there. The pattern forced him to use this name, so he could not use a better name. Fine, he can use the comment.

Below that we see another kind of informative comment, which is to tell us what this regular expression matches. Now, regular expressions are hideous, horrible syntaxes that no one understands. They look like gobbledygook. You look at a regular expression and your eyes kind of go, "oh". So it's a really good idea to have a comment telling you what the regular expression is matching. And clearly you can see that his attempt here is to match a timestamp. He's going to match hours, minutes, seconds, and then whatever the heck that E is, then month, day, year. That's probably the time zone, right? So okay, that's fine. I'm glad he told me what he expected to match. But do you notice that the regular expression actually matches much more than just that? His comment is lying to me. The regular expression does not just match that timestamp; it matches lots more than that, because of these stars. So it's a two-edged sword: he was informative, but he was also telling me a lie.

Here's a comment that I would leave, and I love the phrasing. When programmers write comments they use such interesting grammar, right? "We are greater because
we are the right type." I love this. This is the imperial "we": we are not amused. So okay, what he's trying to tell us here is that this is the compareTo function in Java, and this is the canonical form of the compareTo function in Java. The first thing you check, of course, is that the incoming object is the right type. Okay, well, what do you return if it's the wrong type? What he said here is, look, if it's the wrong type, then I'm going to say that this object is greater than the incoming object. We are greater because we are the right type. Okay, fine. I could probably have phrased it better, but I at least understand the intent. I would leave that comment there. I might improve the wording.

"This is our best attempt to get a race condition by creating a large number of threads." Now, you can see what he's doing here. He's going to create a thousand threads, and then he's going to hope to get a race condition. Apparently he's got some multi-threaded thing and he's expecting some kind of race to happen. This is a terrible way to make a race condition, by the way. If you ever want to make a race condition, you don't do it this way, because this will just set up a nice little resonance amongst all the threads and they won't race properly. If you really want to test a race condition, you have to line up the threads right at the race point with some semaphores, and then release the semaphores and let the race continue. So he's using the wrong strategy, but at least he's telling me what he's going to do here. Now, he could have put this into a function named attemptRaceCondition. That would have been better. So I'm dubious about this comment. I think he probably should have put it into a better-named function, and then of course he should have learned how to actually create a race condition.

I wrote this probably about nine or ten years ago now, and the comments over here were an attempt to deal with this horrible optical illusion. Code is rife with optical illusions. We line things
up in code, there are repeated patterns in code, and they make your eyes twist and turn in strange ways. So it's very easy to create these bits of code that are virtually impossible to see, just because of the optical illusion. So you look in here and, holy cow, they're all assertTrues (at least there are no assertFalses in there), and a's and b's and ba's and ab's, and what the devil is going on? So I tried to explain over here what the intent of these comparisons was, and it's successful. On the other hand, it's a double-edged sword, because a reader is going to come along and try to read this code, and they will be drawn to the comments and they will ignore the actual code. If those comments are wrong, they'll never see the real code. So that's a bit of a problem. I have struggled with this batch of code for many years, trying to figure out a way to get it to be expressive. This is one of those examples where it's very hard to make the code express itself well.

Now here the programmer is warning you: don't run that test until you've got some time to kill. And you can see why. What is that, 10 million? He's going to write 10 million lines in a file. Well, that's going to take some time. Some of you may remember old JUnit, before we had all those @ signs. In JUnit, the way you turned off a test was to put an underscore in front of the name of the test, and that would turn the test off. So what he's telling us here is that this is a disabled test, and if you turn it on you can expect some pretty hefty delays. I would leave the comment in. Seems reasonable to me.

This is a comment that I encourage. You sometimes wonder what the age was of the people who wrote the Java library. Were they kindergarten programmers or something? Because sometimes the Java library is full of code that is very questionable. This is one of those cases: SimpleDateFormat is not thread-safe. They've got static variables
in SimpleDateFormat, and if you don't know this and you don't anticipate it, then you're going to get caught in some kind of concurrent update problem. So I always encourage people to put that comment in and say, "Remember that SimpleDateFormat is not thread-safe," just as a defensive measure.

When I wrote this slide, probably six years ago, I thought TODO comments were a really good idea. TODO comments were brand new in IntelliJ about six or seven years ago, and I thought it was so cool that you could put this TODO in there, and then push a little button in the IDE and get a whole list of all your TODO comments. That's just so cool. And now I realize that the word "to do" means "don't do". I finally understand that. So now I see code that is just loaded with TODO comments everywhere, and nobody ever does them. So nowadays I have a different rule for TODO comments. I will use TODO comments, I will put them in, but I will not check them in. TODO comments must either be done or deleted before I will check the code in, because once you check it in it turns into a "don't do". So that's my new rule for TODO comments.

I wrote this comment, oh, probably nine years ago. I was working on a system and I needed to put this trim in. Now, trim is one of those functions that occurs so often, people use it so often, that you end up ignoring it, because you just call trim on all kinds of things: if it comes in from the outside world, call trim on it. So I wanted to point out that this particular trim was actually really important. It was actually doing something semantic inside the algorithm. So I amplified that with a comment. I didn't want people to just ignore the trim.

Do you write Javadocs? Now, I think Javadocs are fine, especially if you are producing an API for the outside world. If you're going to be writing a whole nice library for the outside world to consume, then Javadocs are fine. But inside the team, if you're writing code that only
the team is going to see, you don't need Javadocs for that, because the team ought to know the structure of the code anyway, and code ought to have names that communicate pretty well. So I don't like Javadocs unless they are for external APIs, and even then I want them to be pretty minimal.

Okay, bad comments. And for this I'm going to sit.

The programmer here was talking to himself. Notice where this comment is sitting: it's in the catch block that does nothing. Now, that's weird all by itself. Notice what the code is doing. He's got some file that he opens up; it's a properties file. (Oh, thank you for that. Oh, that was for me. So that screen is okay? Yep, got it.) He's opening up a properties file, and then he's going to load the properties from that file, and then there's this IOException that he catches, and he says, "No properties files means all defaults are loaded." Where are all the defaults loaded? How do I know that's true? He says it here. He asserts that all the defaults are going to be loaded. I don't know that's true. I don't see the code that loads all the defaults. He's talking about some code somewhere else, some code that's in some different place. I don't know if that code is still there. I don't know if this is true or not. This helps me not at all. I understand why he did what he did: he's assuming that all the defaults are going to be loaded. But I don't know that that's actually taking place. So I looked at this and thought, well, this guy was just talking to himself. He was justifying why he was catching the exception and then simply ignoring it. What he probably should have done here is load the defaults. If he'd loaded the defaults here, then we wouldn't have to have the comment. It would be very clear what was going on. But he did this instead.

Now here's a comment that is harder to read than the code. "Utility method that returns when this.closed is true. Throws an exception if the timeout is reached." Okay. Now, first
of all, it's wrong. It describes the way this function works, but it describes it incorrectly, because there are ways to return from this function that it doesn't talk about. And second of all, it's easier to read the code. While we're not closed and the timeout's greater than zero, then wait 100 and decrement the timeout. That's pretty simple to read. So I think the comment is not only useless, it's worse than useless.

This kind of stuff makes me nuts. What motivated this guy to put a comment in front of every single variable? Not only that, they're Javadocs. These are Javadocs, and look at them. "The child Containers belonging to this Container, keyed by name." Now, first of all, the word "container" is a strange word, and he can't seem to make up his mind whether he wants to use "container" or "component", so sometimes he uses "component" and sometimes "container". He always capitalizes "Container"; I don't know why. It's a very strange thing. "The child Containers belonging to this Container, keyed by name." Well, it's a HashMap, so yeah, it's keyed by name. And what's the name of it? children. Okay, a bunch of children keyed by name. I don't need the comment to tell me that. "The processor delay for this component": backgroundProcessorDelay. The variable name says more than the comment does. "Lifecycle event support for this component": LifecycleSupport. "Container event listeners": listeners. "Loader implementation": loader. "Logger implementation": logger. All these comments are completely useless. I don't know why he put them in there. I believe he was motivated by some strange urge to comment everything, because comments are good. I would much rather those comments be ripped out of there.

I already talked about that one.

Has anybody seen a coding standard that mandates comments? Thou shalt put comments on every function. Thou shalt put comments on every class. This is stupid, and you must find the people who wrote that coding standard and inform them that this is stupid, because
you don't ever want to mandate comments. When you mandate comments, you get stupid comments. That's when people will write the dumbest doggone things. So here, this is stupid. Now, what really gets me about this one is that this is a Javadoc, right? And the purpose of a Javadoc is to run the Javadoc tool to generate the HTML that produces the nice little web page for you. If you take that comment out and run the Javadoc tool, it will generate virtually the identical HTML, because the Javadoc tool is smart enough to look at the class and say, okay, title, author, tracks, duration in minutes. It would miss a couple of things, I guess, but there's almost no difference in the HTML. And somebody told him he had to put that kind of crap in there.

Long, long ago, in the deep dark past, when we did not have source code control systems, we would put journal comments in the first part of the source file. This is something we all did. But nowadays we have source code control systems. What source code control system are you using? Git? Git. Good, good, that's the right one. Okay, so nowadays we can put all of our journal comments into Git. We don't have to put them at the beginning of the source file. So I hope nobody is doing this, and if you find any of that stuff in the source code, you can probably get rid of it. Now, what would happen if you deleted it? It would still be in Git. You're not really going to delete it, right? You delete stuff out of a source file and it stays in Git. You don't have to worry about it: "Oh Lord, I'm going to delete this stuff."

Don't do this. Please don't do this. It is the dumbest thing to do. It's also insulting. I know that's a default constructor. I'm a Java programmer. Who the hell do you think you're talking to? Now, this one was a little bit scary, because he got that last one wrong. A little copy-paste going on here.

I found this one in Tomcat, and I was fascinated by it, because I still don't understand the comment. It's been years now, right?
"Does the module from the global list <mod>" (in angle brackets; I don't know why the angle brackets are there) "depend on the subsystem we are part of?" I can't understand that comment. But I was able to use the IDE to extract out variables, so at the bottom you see how I extracted out the variables. I extracted out moduleDependencies from smodule.getDependSubsystems(). You can see here that smodule.getDependSubsystems() is there; I extracted that out into a variable called moduleDependencies. The IDE told me what the type was, so that's cool: ArrayList of module. Okay, cool. These are the modules, I guess, that we are part of, maybe; I don't know. And then I extracted out the other part, subSysMod.getSubSystem(), and I called that ourSubsystem, and the IDE told me that was a module. And now look at the if statement: "if moduleDependencies.contains(ourSubsystem)". I love the way this reads. I don't know if it matches the intent of the comment, because I still can't understand the comment.

Don't do this. It's stupid. You do not have to yell and then tell me that these are instance variables. I know that they're instance variables. You don't have to do that. Sometimes it is important to yell in a comment. Sometimes you are saying something that you really want to attract attention to, and then it is appropriate to put some kind of big flag in there like this. But if you use that for instance variables, no one will ever pay any more attention to your big red flags. So this is like the little boy who cried wolf. Don't do that stuff. It's pure clutter.

Does anybody do that anymore, the comments on the closing braces? Remember, there was a time when we didn't have IDEs, in the '80s, when we were coding in vi with text files. How did you keep track of your closing braces? This was one of the techniques. But it's probably not necessary anymore.

This is graffiti. You do not have to
tell me that you were here. The source code control system will remember, so if we need to know who to blame, we can find out.

Of all the sins of comments, commented-out code is the worst of them all. It is an abomination before nature and nature's God. When I see commented-out code, I don't read it. I don't try to understand it. I just delete it and get it out of the system. Commented-out code is a horror. Now, people get very upset about that: "You can't delete that code. Someone might need it one day." Well, if they need it, it's in the source code control system, so they can go back and look for it there. But I'm not going to tolerate it sitting in a module like this. Do I comment out code? Yes, while I'm doing experiments. But I will not check it in. I will not check in commented-out code. Look at this. What does it mean? "hdrPos = bytePos". First of all, is it code? It's got a semicolon after it, so that's a pretty good indication that it's code. Why is it there? Is there a variable called hdrPos? I don't know. "dataPos = bytePos", thank you. I don't know why that's there. I'm going to delete it. I'm going to get rid of it. Do not put commented-out code in. Don't check it in.

This is somebody who wanted to write a really pretty Javadoc, so he loaded it up with all kinds of HTML macros and crud like that, forgetting that the place you want the comments to be most readable is in the code. The fact that you've got a Javadoc tool that scrapes out HTML and does a nice little print job for you is fine. You can use that tool. But the place you want the comments most readable is in the code, so you don't want to obscure your comments with a whole bunch of HTML and make them unreadable, and force people to run the Javadoc tool and then look at it on the web page. If you're going to write a comment, make sure it's readable in the code. For that reason I don't tolerate HTML in my Javadocs. No one can put HTML in a Javadoc; you have to use some other mechanism to format it. So I will
tolerate a pre and end pre in the Javadoc as the HTML in the Javadoc which keeps all these fonts monospaced port on which FitNesse would run defaults to 8082 it does where do I see that 8082 where here is this 8082 there's a fundamental rule about comments right you never talk about code that's somewhere else if you write a comment you only talk about the code that's right there because if you talk about code that's somewhere else that code's going to change and your comment will turn into a lie for the same reason you never put where used lists in comments because those where used lists will change like crazy as well here's somebody who simply did not want to write the code they wanted to write about the code so they wrote a whole little essay I hope I drove the point home comments aren't bad but I don't want you writing comments by default I don't want you in the mindset of saying I've got to write a whole bunch of comments now if you're going to write a comment you need a good reason to write a comment because what you should be doing is making the code speak for itself to the greatest extent possible and then if you fail at that well then you'll maybe you'll have to write a comment how long should a source file be how many lines should there be in a source file this is not an easy question to answer it's not like you know the the one thing the one rule principle it's not like that this is something else so so I did a little study I took seven projects that I found off the off the internet and I analyzed their file sizes and I found something really interesting so here's the the seven projects one of them is JUnit this was probably 2001 mm maybe 2005 when I did this and notice that JUnit is like 6,000 lines of code and the the plot here is called a box plot so this is the smallest file this is the biggest file this is one standard deviation and the mean is right in the middle right so JUnit which is about 6,000 lines of code has a whole bunch of 
modules inside it but those modules the smallest module is four lines of code the biggest module is two hundred three hundred four five hundred lines of code the average module is around 20 30 40 50 lines of code and most of them are between 130 that's that's pretty good that's pretty standard notice there's a log scale here all right so as you move up it changes a lot FitNesse this is another project it's about 50 thousand lines of code so much larger than JUnit but almost an identical structure this is interesting because you would expect that large projects would have a different structure they'd have a different different statistical array of file sizes but that does not appear to be the case FitNesse much larger than JUnit has the same structure TestNG TestNG is a tool similar to JUnit it's got 72 thousand lines of code which makes you wonder what the hell it does and it has a wildly different structure interesting and so I'm not quite sure what that's about but the you know there's a big file in there looks like 1500 lines of code the average is like 40 lines of code but there's a huge standard deviation so he's got no regularity to the size of the files Time and Money that's about 6,000 lines of code that's Eric Evans' example of domain driven design very similar structure to FitNesse and JUnit JDepend JDepend is really old it was probably written in 98 about 7,000 lines of code and a slightly different structure looks like his average file size is about 120 he's got a couple of big files in there although they're not really big it's just that the average seems to be larger than usual and I think the reason for that is that he Javadoc'd everything and that just drove his file sizes up a little bit Ant 200,000 lines of code and Ant the average file size is around 200 lines that's pretty big he's got a couple of big files in there there is one that's 2,000 lines long Tomcat 384 thousand lines of code look at that man too average size is about 200 lines he's got 
a file in there with 5,000 lines in it now what I find interesting about this is that there's no correlation between file size and distribution right you've got a couple of big ones a small one here and a big one here and another small one there and a slightly larger one there all have roughly the same correlation we've got really big ones out here and a small one here that you know there's no correlation there so what that tells me is that file size is not a function of project size file size is a style that you can impose upon your system and since that's true what style would you like to impose on your system how big should your files be and it looks here like you can build a fairly significant system with files that are on the average 50 lines long most of them are less than a hundred lines that's probably nice so that might be a a goal to strive for I'll come back to that stuff because I want to do another statistical analysis this one this is an interesting analysis of the length of lines same projects but the length of the lines in the projects and look at the interesting correlation here every color is a different project but look at how they follow the same interesting curve now this is a histogram it's a histogram of line length so this is the number of these this is a line length down here and the vertical scale is the percentage of lines that have that length and notice that the vertical scale is a log scale so there's a lot of lines that have no length these are probably blank lines and and then they fall off pretty rapidly to a minimum here of about 12 and then there's this gradual climb look at the look at how tight that's clustered these seven wildly different Java applications but really tight clustering right in here with a peak and they all peak about the same place right around 30 35 something like that lines of code where 1% of the lines of code now are at this level and then it starts to fall off and this looks like a slow fall off but this is 
a log scale so it's actually a very rapid fall off you get to about here and realize that nobody wants to see any lines that are longer than about 80 so this is the interesting part of that curve and what I find fascinating about it is that it's that all seven of these projects follow the same curve so it seems that we have a preference for lines that are on the order of 30 to 40 characters long is that a change in well so that's interesting yes the screens are getting bigger is that having an effect on our line lengths and this argues that it's not because look at how look at how small that is that peak sits in there right around 40 35 40 right so the fact that our screens are getting bigger does not seem to be affecting the distribution of line lengths I find that to be very interesting now what does it tell us about what kind of guidance does it give us well I think the guidance is pretty obvious you want your lines to be on the order of 30 or 40 characters long you don't want very many that go beyond maybe 80 I actually have a a barrier put into my IDE at 150 I will not go beyond 150 and I do that because I believe it is rude to make your readers scroll to the right so all the code should fit on the screen and you should not make your your readers scroll to the right if you make them scroll to the right they won't and then you can hide all kinds of crap out there yeah do they have different functions yeah I don't know I didn't do that analysis it's an interesting interesting thing to do we have an awful lot of code now so there's very interesting analyses that we can do like this but I didn't do that one anybody else have a question on this good ok so then let's talk about names and see where's our names that goes to about here there we go early in programming we didn't have a lot of options with names like I told you before we were limited to like six characters nowadays that limitation is long gone so we can have names that are as long as 
we want and we name things we do a lot of naming in software we name files and directories and programs and classes and namespaces and variables and arguments we name all kinds of things and because we do so much of it we probably ought to be good at it so let's talk about some rules for naming things the rules that I'm going to show you here are old they've been around for a very long time they're derived from Tim Ottinger's list of naming rules that has been very popular circulated around the internet for years very obviously you want to reveal your intent in a name so I'm going to give you a couple of examples here is int d a good name for an elapsed time in days now your original thought would be well no of course not because d is just a one letter name that's awful but wait a minute how long should a variable name be what's the rule for the length of a variable name now consider the for loop for i equals 0 i less than 10 i plus plus do you want that i to be something other than i the answer to that is probably not and so there's there does seem to be a place for single letter variable names like i so what's the rule what is the rule for the length of a variable name and here's the rule that I use a variable name should be proportional to the size of the scope that contains it if the scope is very small like one line a single letter is fine you don't want to have anything else if it's a one line scope you don't want to have anything else a single letter is great d would be a perfectly valid name for a date if d existed only in a single line because you wouldn't lose the context you wouldn't need the name to remind you of anything the function call that generated the name would be enough long scopes need long names so let's walk through the hierarchy here inside of a an if statement you've got maybe a couple of lines in that if statement variables inside that if statement ought to be very short variables inside of a really tiny while loop should be very short 
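A minimal sketch of the scope rule described above, in Java. Every name in it (`NameLengthSketch`, `MAXIMUM_ELAPSED_TIME_IN_DAYS`, `totalElapsedDays`, `isOverdue`) is hypothetical and chosen only for illustration; none of them comes from the talk's slides.

```java
import java.util.List;

class NameLengthSketch {
    // Class-wide scope: the name has to carry its own context, so it is long.
    static final int MAXIMUM_ELAPSED_TIME_IN_DAYS = 30;

    // A function of a few lines: a one-word argument name is enough.
    static int totalElapsedDays(List<Integer> durations) {
        int total = 0;
        // One-line scope: a single letter loses no context here.
        for (int d : durations) total += d;
        return total;
    }

    static boolean isOverdue(List<Integer> durations) {
        return totalElapsedDays(durations) > MAXIMUM_ELAPSED_TIME_IN_DAYS;
    }
}
```

The single letter `d` lives for exactly one line, while the constant that any method in the class can see gets a name that explains itself without surrounding context.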
if you have a function and that function is four lines long the variables inside that function should probably be pretty short because it's four lines long maybe they'd have to be a little bit longer arguments would probably be a little bit longer a word would probably be good for an argument instance variables live inside a class they have a slightly longer scope they have the scope of the class so probably an instance variable should be longish two words maybe functions the arguments to a member function probably a word global functions global functions have a huge scope they better be very long global variables sorry global variables have a huge scope so they should probably be very long variables should have a length proportional to the scope that contains them what's the rule for functions exactly the opposite exactly the opposite the bigger the scope the smaller the name for a function and for very obvious reasons we would not want to call the open function if the name of the open function was open file and throw exception if not found as the scope of the function gets larger we want the name to shrink we want the name to shrink because we're going to call it more a function that lives in a large scope will be called from all over the place so we want to shrink the name down moreover if the function is in a large scope it must be abstract it must be dealing with a high level abstraction so we want the name to be short as the scope containing a function decreases the name starts to get longer so the instance methods of a class will probably have slightly longer names private functions called by public functions will have even longer names private functions called by private functions will have even longer names you can continue down that hierarchy for a very long time especially if you're extracting until you drop as you'll extract and extract and extract and all these extracted functions are going to be private and every time you 
go down another level the name gets longer and longer and longer and it gets longer because the function becomes more precise it does something really tiny really precise that you need words to specify so the name of a function is inversely proportional to the size of the scope that contains it what about classes same as functions size of a class name is inversely proportional to the size of the scope that contains it classes at the global scope have one word names derived classes have multiple word names inner classes have multiple word names as the scope shrinks the name grows so that's a reasonable rule for naming things or at least controlling name length therefore int d is not necessarily a bad name for this variable as long as the scope that it was contained in was small if the scope was long then elapsed time in days is perfectly reasonable what does that function do take a look at it spend a little time it gets them what does it get well it walks through a list and it interrogates the first element of each list and if that first element is a four then it adds it to the output list list one which it returns excellent good now we know what this does it scans a list for elements whose first element is a four and then it returns only those lists that have that okay good I think I got that right that's the same code but this changes everything the the names here tell you what's going on inside this function it's getting all the flagged cells oh there are cells and the cells are part of a game board and every cell has a status value and if that status value is flagged then we're going to return that cell so this function returns all the flagged cells now notice what this nice naming system does this naming system does more than just make it easier to understand the function it also tells you what program contains the function this program is in some kind of a game this function is in some kind of a game probably minesweeper if you know 
minesweeper of course nobody knows minesweeper anyway because nobody has that old desktop accessory from Windows anymore but okay a good system of names tells you not just about the function you're working on but it tells you about the entire context of the system so that's the power of a good system of names let's see you gotta watch out for this what's the first place where these two variables differ some of you probably got fooled by that this is really hard to see code is full of optical illusions and you have to be very you have to be cognizant of the fact you have to be aware of the fact that code can contain optical illusions and you have to fight against that this is really hard to see you're gonna be looking at here especially because of this and you also have to worry about stuff like this our modern IDEs have gotten pretty good at disambiguating 1 and l and 0 and O p and q but sometimes these letters and symbols differ by one pixel so you have to be careful about that sometimes it's hard to just differentiate them do you use prefixes any more like the and a I used to use this this convention all of my local variables began with a the indefinite article here all of the arguments to a function began with the and all instance variables began with its so itsName itsDate I used to use that all the time I've stopped using that now I used to use it in the 80s and in the 90s but I've stopped using those prefixes now because the IDEs are just so good at telling me that that's an instance variable or a function I can hover over it so I've kind of dropped the whole prefix idea these days it's probably not a good idea to use number series like this sometimes you have to fall back on it but it's better to give them reasonable names number series aren't a great idea watch out for noise words data and info data first of all you should realize data is completely redundant right of course it's data you don't need to say it's data okay 
product and now here what is the difference between product and product data is there some is there does this tell you anything about what what is different about it there's a type in the system called product there's another type in the system called product data there's yet another type in the system called product info what is the difference between them and these type names don't tell you the difference so that's a problem those are noise words I'm not going to talk about that because I'm going to come back to it later this I found in a real application and I find it scary as hell the the first line says there's a function that will return the active account get active account that's great and it returns an account good but now look at the next function get active accounts this is in the same system now that begs the question what the heck is that first one returning there can be more than one active account so what's the first one returning the third function is even scarier get active account info but it returns the list of accounts so these are not well disambiguated names the names are not telling you what's going on here and you look at them and go what that's not good that's a WTF make sure your names are pronounceable cooker three 'can't busy name genymdhms now you can see why the last one especially you can see why you used this name it's the generation timestamp year month day hour minute second makes perfect sense but you can't say it now you know do you guys pair program any pair programmers out there anybody here pair how often do you pair program not very many people doing it why not it's a good idea pairing allows you to share knowledge do you do code reviews who does code reviews ooh everybody has code reviews code reviews are very inefficient pairing is very efficient right how much how much time should you spend on a code review let's say that the original author required five hours to write the module how long should it take to review 
that module well some function of five hours maybe not exactly five hours but some significant function of five hours because if you're going to review the module you need to walk through the reasoning that the author originally went through now hopefully the author made that easy for you by refactoring it and cleaning up the code so fine but it's still going to take you some fairly significant amount of time to review that module if you if you review a module that took five hours to write and you review it in 15 minutes all you've done is look for semicolons so all you've done right and maybe it's some indenting and maybe a few naming conventions but you haven't done anything significant you haven't actually reviewed the code so you should look at code reviews as requiring roughly the amount of time it took to write the module well okay if you're going to spend that much time reviewing the code why wouldn't you pair pairing takes about the same amount of time but when you're pairing you are actively involved you are not passively reviewing you are authoring there's a much better way to share knowledge and contribute to a team yeah so the question was about noise words because he didn't want to talk about pairing the word exception appears in lots of different types in Java an illegal state exception illegal argument exception nullpointerexception so on does the word exception constitute a noise word in that case no it doesn't it's the it's the noun that is the base class they all derive from exception and then the derived classes have to add a word on to that to describe what kind of exception they are so that's not a noise word and and it is not noisy also because it tells you it's an exception all right now back to this pairing thing thank you for that we have to go out for lunch oh yeah it's 12:30 isn't it its feet is what I'm talking about pairing even though I don't yeah
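For reference, the renaming example from the naming discussion earlier in the talk can be sketched roughly as follows. This is a reconstruction from the spoken description, not a copy of the slide: `getThem`, `list1`, the value four, and the words flagged cells, game board and status value come from the talk, while `theList`, `x`, the constants `STATUS_VALUE` and `FLAGGED`, the enclosing class name, and the choice to pass the list as a parameter are assumptions made so that the sketch compiles on its own.

```java
import java.util.ArrayList;
import java.util.List;

class FlaggedCellsSketch {
    // Assumed constants: index 0 holds the status, and 4 means "flagged".
    static final int STATUS_VALUE = 0;
    static final int FLAGGED = 4;

    // Before: the reader has to work out what the list, the index and the 4 mean.
    static List<int[]> getThem(List<int[]> theList) {
        List<int[]> list1 = new ArrayList<>();
        for (int[] x : theList)
            if (x[0] == 4)
                list1.add(x);
        return list1;
    }

    // After: same logic, but the names say what the function is for.
    static List<int[]> getFlaggedCells(List<int[]> gameBoard) {
        List<int[]> flaggedCells = new ArrayList<>();
        for (int[] cell : gameBoard)
            if (cell[STATUS_VALUE] == FLAGGED)
                flaggedCells.add(cell);
        return flaggedCells;
    }
}
```

Both methods return the same result for the same input; only the names differ, which is the point made in the talk.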
Info
Channel: UnityCoin
Views: 304,842
Keywords: programming, software, clean code, polite code, shrunk code, programming language, computing, technology, society, ethics, human relations, uncle bob, robert cecil martin, edsger dijkstra, grady booch, future, rules, java, c#, c++, microsoft, functions, declarations, arguments, cycle, kotlin, InteliJ, methodology, agile, scrum, tdd, test driven development, programmer, responsibility, expectations, architecture, design, development, applications, app, structure, web, study, practice, optimization, productivity, purpose
Id: 2a_ytyt9sf8
Length: 66min 0sec (3960 seconds)
Published: Fri Aug 09 2019