Greg Young - "How to get productive in a project in 24h"

Reddit Comments

Some interesting ideas about data mining version control to find and refactor problematic code.

👍 3 👤 u/chromaticburst 📅 Apr 26 2015

Looking at the git history of the file to find problem code is pretty clever

👍 1 👤 u/ForgotMyPassword17 📅 Apr 27 2015
Captions
I'm very unlucky. I have the worst session of the day, because I'm going to sit here and talk to you about software and visualizing things and looking at it from another layer, how to get better understanding, and what you guys are going to be thinking about is beer. How many of you are going to the after party? Everyone's thinking about the after party; no one's that interested in the talk. But I'll do my best to try to keep you entertained for at least an hour.

How many of you have walked into a new piece of code before that you've never seen? You show up first day of work at a new project: crap. How many of you have spent at least two or three months trying to ramp up on a codebase? I am in an unfortunate position. Very often I come into a client and I will be there for a grand total of three days. How do I tell them something valuable about their software in three days? What we're going to look at is things that I do in order to help me gain understanding of that project within the first three days. We're gonna go even more in depth: how can I walk into a source base and in two hours tell you where your problem areas are? There's ways of doing this, and they're very valuable to understand.

How many of you use source control? We tell our business people over and over again about how valuable their data is and all the things they can learn from inside of their data. There's real business value inside of data. How many of you data mine your source control? Why not? We tell the business people about this data that they're recovering and how valuable it is, yet we have all this data that we don't use ourselves. The first thing that we can go look at in order to really get to know a source code base is the source control. How many of you have seen this screen before? A whole list of commits. Why do you commit? You changed something. Why did you change something? It was a new feature, it was a bug. Well, there's a couple reasons why we do it. Now, if I were
to think about what you should see in source control, and I'm not very artistic (I'm slightly autistic, but not very artistic), this is what we should see on a healthy piece of code. If I were to graph the number of check-ins or commits over a period of time on this file, this is healthy: at the beginning we get a big spike, then we're done, and we see it flattens out and we don't get any more commits, and then maybe we see a commit here or there. This is healthy. Think about it: you write code, you get it to the point that it works, and then you don't touch it. You should not be continuously changing code.

What I tend to find in source control is not something that looks like this. I tend to find something that looks like that. What this is telling me is that you have a bug hive. Either you have tons of bugs in this code and you're constantly going back and fixing it, or this code is very, very coupled: you're changing other, unrelated things, but they cause you to have to make a change inside of this code. How hard would it be to take, I don't know, LINQ, which has an adapter to Git, and produce this chart for me? I've written it. I can tell you it's about 19 lines of code. It's not hard to go through and get this kind of chart out, and this I can run across your source base very quickly, and I can look to see if I'm seeing this or if I'm seeing that. Places that I find like that, those are very interesting places. I'm going to go in and I'm going to look at that code. Places that look like the first one, I'm not going to bother looking at that code. It's not your problem. Your problem is when you get this.

There's lots of these kinds of tools that we can use to get in and look at our source control. And this Mac is terrible, it keeps going out of fullscreen. How many of you have heard of Gource? What we're looking at here is a visualization of the commit history of jQuery, and hopefully the Wi-Fi won't go down. We notice patterns when we look at this. When you watch this, you are watching the
history, over the course of months, sometimes years, of how this software came to be. You see patterns when you start watching it. You should see a whole bunch of things come into it, and that's lovely. There, what we see is the committers coming in and working in different areas. This is a very simple visualization that we can use to get a better idea about the history of the software, and you generally don't want to see ten people all hitting on the same thing. You should see it come out and then come back. You'll see it expand and contract. Think about how you write code: you go through and you write something, and then you realize that there was something that you could have abstracted. You had the same thing written ten times, and you come back and make something that generalizes it. You should see expanding and contracting.

These are all ways of identifying and visualizing things about our source code. The big problem when I come into a source code base is that I don't know it. I don't know all about the history of the code, I don't know where the problems are, and there's 1.5 million lines of code. What we need is better ways of finding the interesting places. What you're really interested in, if I come to your company, is that on the first day I can tell you where your pain is. I can say, this is where it hurts right now. These are tools that allow me to do that, and we're going to go through more of these kinds of tools, but the easiest way, and the first thing that I look at, is the source control. I data mine source control, and if I see this, that is a place where you have pain, and it's a place where you feel pain. Do you know how I know that? Because you're committing. You're telling me that you're actually changing this stuff and that it hurts.

But this is not the only way of visualizing code from higher levels. How many of you have heard of tools like Sonar or NDepend? They're wonderful tools, but there's some problems with them: they give you too much information. If you
want to watch an architect's head explode, give him a copy of NDepend. It will tell him everything that he never cared about about his system. Finding the real information inside of these tools, finding the important stuff, is the hard part.

How many of you have heard of afferent and efferent coupling? Those are really big fancy words. We're going to go in and we're going to define what they are, but it's always been amazing to me that we as developers in university don't learn about afferent and efferent coupling. How many of you have heard that you want to create software that is low coupling and high cohesion? Have any of you stopped to wonder what kind of cohesion? Because there's like eight different kinds. No one ever talks about that; they just want it to be highly cohesive. We're going to talk about some of the different kinds of cohesion as well.

This is one of the screens from NDepend. That's a lot of information that it gives you. The trick is trying to get at the right information, and that's hard. So what we're going to do is start learning about some of these kinds of code metrics, what they mean, and how to interpret the results. Now, there is no code metric that I can give you that will say this absolutely means that this software is good, or this absolutely means the software is bad. What I can give you is, I can tell you this is an interesting place to go look; there's probably something going on here. And when you start realizing what these metrics mean, you can look at code from a very, very high level and drill down into the places that are interesting.

So let's start with afferent and efferent coupling. They're very big fancy words for what are basically extraordinarily simple concepts. Afferent coupling is the number of people that couple to me. Efferent coupling is the number of people I couple to. It could be a whole lot easier if they called it coupling in and coupling out, wouldn't it, instead of having words like afferent and efferent that make it sound like
some PhD paper. It's a really, really simple idea. Here we have a method A which calls B and C. A has an efferent coupling of two; B has an afferent coupling of one. It's a really easy idea, and it doesn't just apply to methods. It can apply to classes and dependencies. What we're talking about is what you depend on versus what depends on you. We can do it at a package level, we can do it at a namespace level: what things depend on this namespace, what namespaces does this namespace depend on? It's a really, really easy idea. If we were to flip it around, now we have B and C that call A. So A has an afferent coupling of two; B and C each have an efferent coupling of one, which couples to A. This is a very simple idea. Stay away from the big fancy words; just remember it's coupling in and coupling out. If you understand that, you can get a huge amount of information out of your system.

What if you had a method that had an afferent coupling of 475? You probably don't want to change that code. What if I had a method with an efferent coupling of 900? It probably says refactor me. Not positive, but it probably says to refactor me. Tools like NDepend or Sonar can tell you what the afferent and efferent couplings of things actually are. What's really cool with NDepend is it actually has a SQL-like language as well, where you can write queries: select from my code any methods that have an afferent coupling greater than 20. It's a really cool way of being able to look at your code.

Now, it's not quite so simple when we talk about couplings, because there's different kinds of couplings. Does it make a difference to you whether I couple to an interface or a concrete class? If I'm coupled to an interface, can you change the implementation? Sure. If I'm coupled to a concrete class? Yeah, that'll be fun. So we need to look at what we call the abstractness of our coupling. The abstractness of coupling is, again, a very complicated term with a really simple idea behind it: it's the number of abstract things that I couple to
over the number of concrete things I couple to. Really simple metric: what percentage of the things that I end up going to are abstract versus what percentage are concrete? And here we can see that A couples to IB and IC. By the way, how many of you prefix interfaces with I? I-whatever. I Can Has Cheezburger. This has always boggled my mind in the .NET community, that for some reason Hungarian notation is bad but I Can Has Cheezburger is good. Well, of course I Can Has Cheezburger is good, because they're very cute cats.

When we talk about coupling, we need to remember the abstractness of the coupling: are you coupled to an interface, or are you coupled to something concrete? Again, these are just simple counts, and we can get them out of these kinds of tools. If I understand what they mean, it tells me a whole lot of information about your software. What I'm looking for is places that have high coupling. If I find you have high coupling, and I find that you're changing the code in source control a lot, hmm, that's an interesting place to go look, isn't it? Let's look at the code and figure out why it is like that. Is there some history about this? This is one of the main tools I use to get right into the interesting places inside of your source code. The main problem we have getting into somebody else's code is finding the interesting stuff. It's finding where the pain is. These kinds of tools are what will allow you to be productive on that first day, because you can immediately drill right in on the most painful part of the software.

There's other ways of viewing source code from a higher level. We're going to talk more about it, but this is an example of it. What we're seeing here is the couplings inside of source code as a graph. This is telling me about the affected graph add connections method. We can see that affected graph merge calls it, and then there's three tests that call into merge. This is, again, another way of looking at our code. That's what we're focused on: how do I get higher
level viewpoints of code to help me find interesting areas and to drill down into them? You could be looking at code coverage, you could be looking at afferent and efferent couplings; just remember, if you look at them, to consider abstractness. There's all of these different kinds of ways of looking at code.

How many of you have seen this before? This is a report from NDepend. So we're gonna go through really quick and look at some of the kinds of things that it can show us. How many of you have heard of the dependency matrix before? What this is talking about is who has dependencies on what. What you don't want to see is what's known as circular dependencies, things that are dependent on each other. That's bad. If you bring up this graph and you have circular dependencies, what you actually get is black: they will show up in the middle and they'll be black. What you ideally want to see is kind of what we're seeing here. You see the blue at the bottom and then the green going all the way up the other side. That's showing that we don't have a huge amount of interrelated dependencies. This is actually a good thing. This is doing it at a package level: we're looking at how the packages depend on each other. As I said with afferent and efferent couplings, we can do this on any level: namespaces, classes, methods. It's the same thing, just applied at different levels of granularity.

Let's jump back and go in really quick and look at some of our application metrics. We can see that this is for db4objects. It's got thirty-one thousand lines of code, we can see the number of IL instructions, and our code coverage is actually very bad, it's only 39%. This is giving us a high-level viewpoint, but we can start getting into a lot more detail, and this is one of the most valuable things that NDepend will actually give you. I mentioned before that you can come in and you can write SQL-like queries. Well, what they've done is they've written a
whole bunch of these SQL-like queries for you. So, for instance, it may be looking for methods that are too complex; that's the second one there. What that's actually using, and we're going to talk about it in a few minutes, is cyclomatic complexity. Or it may be, what about methods that are too highly coupled? I'm very interested in any methods that, as an example, may have an efferent coupling greater than 20, so show me a list of them. These are just pre-done ones; you can write your own as well. What if you had a class that had 245 fields in it? Chances are that's not quite right. What about a method that had 25 variables?

All of these are smells. They're things where, if I see this, it doesn't necessarily mean it's bad. I'll give you a perfect example with cyclomatic complexity. I wrote a method that had a cyclomatic complexity greater than 2,000. It was a state machine of gotos. It was very, very complex if you took the metric, but it was a state machine. There's actually a couple of methods inside of the .NET Framework that are like this, too. Remember that when we talk about these things: you don't want to say, I'm going to optimize to get none of these errors, because a lot of times they can be false positives. You have to take the number and use it as a tool to help you go and find areas. Do not take it as, if you have a cyclomatic complexity greater than 20, you are a crap coder. That's not true. Sometimes you want to break it. It's not always bad.

And it's very important to keep this in mind, because one of the biggest failures I find when teams start using this stuff is they try to optimize to say, we're not going to get any lists, nothing will be in our list, because obviously the people that built the metric tool know much more about our software than we do. Keep in mind that these are pieces of guidance, and they are not necessarily always accurate. You use them to identify places. You do not use them as a rule. You do not set up rules like, let's say, gated check-ins. That's my favorite one: you
cannot check in code with a cyclomatic complexity greater than 25. I actually know teams that have done this. Don't go down that road. Use things as a tool, as a heuristic towards finding interesting places in code, because you guys are all developers: you can go read the code, go talk with that developer, go ask them why they did this, why they were putting in that code.

Going along with this, how many of you use continuous integration? For me, what's really interesting about code metrics is not necessarily a static view like what we're seeing here. What I'm interested in is deltas. Deltas give you a whole big piece of information. I want to know about how things are changing. Are you going through and adding more coupling to the code, or are you removing coupling from the code? If I look at the code that you're changing, are you adding complexity or removing complexity? These are the kinds of things I really like to look at, and it's very easy to set up a tool like this to run inside of your continuous integration server. When you do that, it'll generate deltas from one run to the next, and because they're also tied to check-ins, that means you can actually look at which check-ins caused which deltas. If I were to go a little bit further with the idea that our data is valuable, what could I start telling you about your team members by looking at all of their check-ins in big groups, and looking at the deltas in the metrics over large numbers of check-ins? There's a huge amount of information here. The real trick is not so much getting the information; it's understanding what's important and what is not important inside the information. It gives us too much information.

But let's take a quick example here. Let's try methods with too many overloads. That's always a fun one, right? It will tell us what the query is, and one of my favorite things about all these rules they've built into NDepend is they show you the actual SQL-like language that they're writing in order to represent the rule. So here
we can see that we'll get a warning if count is greater than zero, where the number of overloads is greater than six and it's not an operator. Think about that for a minute. That's a huge piece of information. How many of you have written a method that had more than six overloads? Come on, you can admit it. I've done it. If I were to come into your code and run something like this really quick, is that likely going to be an interesting spot that I could go fix? If I went and fixed that, would that likely improve or not improve the quality of your code base? Would that be a good commit that you would like to see from me on my first day? What about when we've got a method that's very complex, and I jump in really quick and refactor it down and put in less complex code? These are the kinds of things that can be done because we've identified valuable areas of code to jump into. We're finding where people are having pain.

Now, there's all sorts of other things that we can start looking at. For instance, we can start looking at how assemblies are dependent upon each other. We can look at how namespaces are dependent upon each other, classes, methods: how are things intertwined? We looked, for instance, at the dependency matrix, which showed us a matrix of our dependencies. I can tell you, a lot of systems you go into, you'll bring it up and it'll just come up black. Everything is dependent on everything, and you know what the problem is immediately. A couple years ago they actually had a really funny report, one of these for the .NET Framework itself. When you brought it up: black. They had methods with cyclomatic complexities, which we'll talk about in a minute, greater than 500.

Again, all we're doing here is identifying interesting areas. None of these metrics will say this is good or bad code. What they do is give us ways of looking at the code that are different. It's a higher-level view. I can look in this one report at 1 million lines of code, and I can start getting
some understanding about how this code is intertwined. I can start seeing where problem areas are. Again, this is just more data, and our data is valuable. How many of you have told your business before that your data is valuable? These are all just different ways of looking at what we're doing.

So let's just jump back. Now I have to go into another tool. How many of you have heard of Mighty Moose? Mighty Moose is a continuous testing tool. When you save, or when you type, it can figure out which unit tests need to be run, and it does it in the background. It's actually a fairly interesting tool. It was very fun to write, I have to say. But that's not its main value. Its main value is that because I can run your tests, and because I'm running them, I can start getting information about them. I'm running your build, so I know when your code changes. I have all the information about all your code. I can start giving you higher-level viewpoints of your code. We mentioned this one to start with: this one is a graph that's showing how pieces of code are related to tests. How many of you have sat in a piece of code before and wondered how many tests, and which tests, were covering this code? It's a fairly common thing to wonder. I'm going into the code for the first time; does it have any tests?

All of this is information that I need. If I walk into your code for the first time, I've never been in any of your code, I open up the file, I could start making changes, right? Except I'm afraid. I'm scared to change this code, because I don't understand all the code around this code. Again, this kind of data can start helping us become comfortable with changing code. If I'm not comfortable, then doing a very small change is going to take me a long time, because I'm going to go look all around this code to the point where I feel I understand it and I'm not breaking something. That's where my time gets lost. My time is not so much just doing the change; it's understanding everything around my change and making sure I'm not breaking
anything. What tools like Mighty Moose can give you is this confidence, and we're gonna go through various ways that you can start getting this confidence.

Now, we talked before about efferent and afferent coupling. Mighty Moose uses afferent and efferent coupling. You can see here how we actually have some couplings inside of this code. What it's doing is using those couplings to build up this graph. This graph is showing you afferent and efferent couplings, nothing more. It goes in at an IL level and looks at what things you call, and then puts together a graph that shows you how things are related to each other.

Now, it's very interesting: after having used it for quite some time, I started finding patterns in the graphs, and some teams I've worked with have even named patterns inside of graphs. As an example, they have a teapot. If you see a graph, you don't even need to read the nodes. If I see a graph and it looks like a teapot, I know the refactor I have to go do: you cut off the nozzle. Healthy code, when I see it like this, will look like a spaceship with a big engine. Unhealthy code looks very different. And if people go and use it, sometimes you'll get into really coupled code. My favorite feature suggestion that we ever had was: well, when I bring up the graph in this code, the graph has like 800 nodes in it, and I can't read the nodes because they come up too little in the graph, so it makes me zoom in. That's not a problem with the graphing system.

Just bringing up a graph, without even reading what the nodes actually say on them, I can tell you a lot about your code. Would you consider that to be reasonably healthy code, looking at the graph? What if we were to try a different graph? We'll shoot ahead. What if you saw that graph? It's actually not that bad. I've seen much, much worse than this one. This one does kind of look like a spaceship, but you'll notice that it does not have big engines. Again, these are just ways of visualizing and understanding our code. Anything that you can do
to get a new way of looking at a problem will help you with that problem.

So let's just shoot back now. Here we've got some couplings going out in our code. We've got one to nodes.ContainsKey, another one to the constructor of NodeConnection, and another one to connections.Add. But there's also something else interesting going on in this code. You guys have heard that you want low coupling and high cohesion, right? That's what code is all about. Well, I find it odd when I talk to developers and nobody actually knows what cohesion means. "Well, of course it's cohesive, it makes sense." Well, okay, but there's lots of different ways of dealing with cohesion. There's data cohesion, there's functional cohesion, there's procedural cohesion. Which cohesion do you want to use?

One metric I love looking at, you can see here. Here I've got two fields, nodes and connections, and I have this method here. This method accesses nodes and accesses connections, so we could say that from a data perspective this method is 100% cohesive: it accesses all the state. I could put another method in here that accessed neither nodes nor connections, and we would say that that was 100% non-data-cohesive, because it accesses none of the state. How many of you have set a goal that when you write an object, every method should touch every single piece of state? That doesn't sound like fun, does it? Because normally what happens is you end up with five methods, and maybe we've got ten fields inside of here, and each of the methods is accessing, let's say, five or six of the different fields inside of our object. You generally don't want to be 100% cohesive, and you don't want to be 100% uncohesive; you want to be somewhere in the middle. It's a good metric to look at.

How many of you have heard of the single responsibility principle? I've always had fun with the single responsibility principle, because I've actually had someone tell me that if I follow the single responsibility principle, what I should build is an anemic domain
model with no behavior in it, and then put services all across the top, because that's single responsibility: one service for every behavior. But they're leaving out one of the core concepts of object orientation. What is the responsibility of an object? If we say that we're single responsibility, that falls into what is the responsibility of our object. The responsibility of an object is to encapsulate state and provide behavior. If we follow that, all of a sudden this concept of data cohesion becomes very important. How do I figure out whether or not a method should go into this object? Well, I can start looking at the state and how it's accessing the state inside of that object. How many of you have put a method onto an object that accessed none of the state in the object? Could that be a code smell, that that method should not go there?

We can use a lot of these kinds of metrics when we're coding as well, and most of them are extraordinarily simple. I can explain most of them in less than five minutes to a junior. They have fancy names associated with them, academia's gotten all into them, but they're not difficult concepts. It's more that they have difficult names and they're not approachable. When you actually understand them, you can get right into this.

Now, how many of you would see this code and think that we should refactor it? How many tests does this code need? I'll give you a hint: it's more than six. If I have an "and" there, that's probably gonna need two tests, right? How could I figure out how many tests this thing is actually going to need if I want to cover it? I could count the conditions. How many of you have heard the big fancy name cyclomatic complexity? Okay, let's try it in a different way: we count the branches. Fancy, fancy term, simple idea. What we do is count the number of branches that exist within a given piece of code. That's it. Have an if statement? Well, that adds to your cyclomatic complexity. What would happen if I saw a method, and I'm not going to say cyclomatic complexity,
that had 47 branches, 47 possible paths through this method? How many of you think there's gonna be 47 tests there? If I knew that there were methods with 47 branches through them, 47 paths, would that be an interesting place to go refactor? If I use a tool like Sonar or NDepend, it'll actually tell me that. It'll tell me what the cyclomatic complexity of all my methods is. Again, this is another way of drilling into the really interesting pieces of code. Do not take this and say, we have a gated check-in: if anyone ever puts in cyclomatic complexity greater than 25, we will reject their check-in, because we are the source code nazis. You don't want to do that. You use it to find interesting places and to gain understanding from a high level.

Now, another thing that we threw into Mighty Moose, and it's very odd, because it's the most valuable thing for me and it came as a byproduct. Well, there were actually two bits that were byproducts. The first one was the graphs that I put up. Do you know why those were put in? That was my debugging tool for working with the coupling and trying to figure out which tests to run. It was just a little visualization that I could use to see what I actually thought was coupled to this code. And after a while I started realizing I was using them all the time, not just to look at the coupling graphs internally: I was bringing them up to understand code.

There's another one of these things that came in. We talked earlier about how, if I come into your code and I've never been there before, I'm scared. Not because your code is so god-awful. I'm scared because I don't understand the context of the code. I don't understand what's going on around it. What these little buttons here will tell me is, they'll have a number. That number is how many tests cover this code. And then we do a risk analysis of your code: that little button turns from green to yellow to red. And there's actually a hidden feature inside of there as well, which is it'll turn into a dragon. I won't
tell you what you have to do to get your code to make a dragon, but if you fix a dragon, it will also reward you. If I see green, I'm comfortable. Now, green does not mean that your code is awesome. It means I've got a tool that's doing a risk analysis around this code for me. It's not green is good code, red is bad code. Green is code where, if you make changes, we feel confident that the tests will likely find things if you break them.

How many of you have heard of code coverage before? Code coverage is absolutely useless. It is nothing but a false sense of security. How many of you have written a test and forgotten to put in an assert? Does that test still cover stuff? But there's no assert, so what is the value of this test? How many of you have put in an assert that was absolutely meaningless compared to what your test was supposed to do? I know, I know, not you guys, but juniors do it. And then I get back my code coverage report that tells me I've got 99.4% code coverage. Yes, great software! Except none of my tests have asserts inside of them. Code coverage is almost meaningless.

What's really interesting is not how many bits are actually covered by a test. What's more interesting is when I look at tests from the perspective of: I've got some code, I've got some tests around the outside. If I were to bring up this graph, the yellow ones are the tests, and the green one is the node that I was searching for. Do you think those tests are likely to assert off something related to the thing that's under test? What if there were 19 methods in between? In other words, we had 19 things between us and the test. Do you think it's likely that the test is accidentally covering us? Do you think it's likely that it will assert off anything to do with us at that point? Probably not. So what the risk metrics are looking at is how far away these tests are from you. They're looking at your cyclomatic complexity. I know, another big fancy word; we'll just call it the number of possible
paths through your code. So I know that there are 19 tests. I don't know the quality of the tests; I can't possibly know that. We'll talk in a few minutes about an interesting way of trying to determine that. They are a call path of three away. Do you think this is risky code to change, or non-risky? I didn't add one part: the distance. Was the distance in a test assembly or was it in code? What was the complexity of the things in between? All of these come into play. If I have a method between you and me that has a complexity of 5,000, you're probably accidentally covering me. In other words, each path along the way is a multiplier. Again, these are just ways of visualizing and understanding and trying to quantify the risk associated with code. How many of you have been scared to make a change before? The little green dot does not tell you that your code is awesome. It's just a quantification of our perceived risk around what you're doing. If I go in, not all code will be green. Sometimes it'll turn yellow or red or, as we mentioned earlier, dragons. This one is giving us a yellow 5. That means there are 5 tests that cover this, yet we still think it's risky. I bet if I actually looked at code coverage, almost all of this method would have code coverage, but we still think it's risky for you to make a change. What this is, is a signal to you: look around before changing code inside of this. I want you to look around at what's going on in this; you're probably going to have a problem. More research needed. It doesn't mean crap code. It's just an indicator to tell you when to go look for more information. Now, we looked at this graph earlier. I mentioned you'll find patterns inside of these graphs. I've been working really hard to automate this process, but you see that point of inflection there? Hmm. So I've got all these tests down here that are not covering that code up at the top, and they're all connected through one node. Maybe I should refactor to put an interface at that point of inflection, at
which point all that bottom part of the graph would go away. You can find huge amounts of information when you look at code from higher levels. These are the kinds of patterns I'm discussing that you start seeing when you deal with things visually. One thing I've also noticed: when I have a team who is constantly looking at these graphs, they produce less complexity in the graphs than teams that do not have graphs. Why? Because they're constantly looking at it. If I give you something that you constantly look at, you tend not to make it so that there are 900 nodes inside the graph. Why? Because you start using them. When you bring up a graph, you can arrow through it and hit enter and it brings you to that place in code. Think about when you're writing code: where do you normally navigate to? Some arbitrary place inside of your software, or something related to what you were just working on? Almost always the place you want to navigate to is something that's inside the graph. When you start using it all the time and you're bringing the graph up and down, let's say, 500 times per day, you tend not to put in lots of complexity and make the graph really complicated. If you send me a bug request that the graph doesn't look good for your software, it will be marked "not fixed". That is not a problem with the graph. Now, how many of you have seen something like this before? We talked about really complicated fancy names. This is when we start talking about cyclomatic complexity again. Now let's talk a little bit about cyclomatic complexity and tests. How many tests should there be for a piece of code? Enough, yes. Should there be more than, less than, or equal to the number of possible paths through that code? Less isn't going to work very well. That's basically guaranteeing there are paths that you are not actually testing, assuming you're only testing one thing per test. I know a lot of people like to test 19 things inside of one test, which makes it very complicated to try to
figure out what that test actually means. If I have tests and I know how many possible branches there are through your code, the only other question that I need to answer is: what is the probability that this thing is actually testing this other thing? Is it accidental, or is it intended to be testing this? These are very simple ideas, but they can be very powerful. It's a different way of looking inside of our code. What we really want is to remove our fear. This method's yellow. It has six tests. Do you think it needs more or less than six tests? And I won't even get into what that code actually does. I'll give you a hint if anyone wants to read it later: it's actually going through and walking back in IL to fields and delegates. It's not fun code. But I'm getting a yellow here. Would you be afraid to change that code? Would it be beneficial for me to give you a report of all of the yellows in your software, and all of the reds, and all of the dragons? This is how I come in and can be productive on the first day inside of your source base. I'm using tools that let me look at your software in different ways. I'm removing my fear. When I come into a new piece of software, what stops me from being productive is fear. I'm afraid of what I may break. How many of you have sat for two hours looking around before you changed one line of code? Fear is what prevents us from being productive. Changing the one line of code isn't hard; you can do it very quickly. But what does that affect? All of the things we're looking at are ways for me to gain a better understanding of what's connected together, what this may affect, what it may break, what my risk is when I'm working in this area. All of these kinds of tools can be extraordinarily valuable. But if there's one thing I want to stress very hard for you guys: do not take these as "your code is always bad if it has a cyclomatic complexity greater than 20; only an idiot would write that code". What you do here is use these as guides to help
you find interesting problems in the code. Something that has an efferent coupling, or "couple out", of 23 is an interesting area of code to go look at. It doesn't mean it's wrong. It's interesting. All of these kinds of metrics allow us to find interesting areas and to get into them as developers and research them more. So we've gone through quite a few ways of getting more data about our code. We are data mining our source control. And remember, when we data mine our source control, we want to see it go up and then go down. If it goes up and stays up, you either have a bug hive or this is heavily coupled code and other ancillary changes are requiring changes here. We went through and started looking at afferent and efferent coupling, which are well-known code metrics about how dependencies are interacting. We can use tools like NDepend or Sonar to get this kind of information out. How many of you are Java developers? There is also XDepend, which will do this for Java, and I've heard rumors that there's a new version coming out that's going to be for some other platforms. I can't go into details as to which platforms they will be, but if you happen to be doing other kinds of development, it will be coming soon. And we've also looked at how we can apply some of this in order to calculate things like the risk of change to a given area of code. All of these, again, are just data points about our code. It's another way of looking at our code, and the more ways that you can look at things, especially when you get higher-level perspectives of your code, the more quickly you can find where problem areas are, the more quickly you can identify areas that may have problems. But again, all of these are heuristics. None of them are completely correct all of the time. A lot of it depends on the style of your code. A lot of it depends on what kind of code you're actually working with at that time. If I showed you a domain object that had 25 parameters to its constructor, you'd be really worried,
wouldn't you? What if it was a message? That's not nearly as bad, is it? And the reason I put them in the constructor is because I wanted them to be read-only. There are all sorts of things where it might be really bad in one place and perfectly okay in another. Keep this in mind: do not use these metrics as a word from God. They are to be interpreted. Understanding this allows you to get a higher-level perspective of your code, and having that higher-level perspective is what really allows you to be productive in very short periods of time with new codebases you've never seen before. The reason it allows you to be productive is that it allows you to remove your fear. You have a better understanding of how things are connected, and your fear about changing something and breaking something else is removed. Now, with that, I think we have about three minutes for questions. Does anybody have any questions?

Actually, this is quite funny. When we were first building out the tool, our main goal was to only run the unit tests that needed to be run based on the code that changed. We worked on that use case for about a year, and all of the things like the little risk metrics and the graphs, most of those were internal debugging tools, and they ended up becoming the real value in the software. There's another little guy that'll pop up. We call him Gary. If your test becomes slow, the icon will turn into a snail and he'll spit out a message saying you just made the test slow. All of these kinds of things came about organically, but my personal favorite is the graphs, because the graphs were actually my personal debugging tool and they ended up becoming one of the main values of the software.

We will continue supporting it. We at this point don't have a... So one of the things we're looking at doing is, when I have all these metrics, in particular for me what's really interesting is the rate of change. Like, as you're typing and I do your build and I get the metrics from the line of code you
just wrote: did you add or remove complexity? I want to focus a lot more on giving you real-time feedback about the quality of changes that you're making, maybe with some guidance, like "I saw what you just did and I think you should put an interface in". For me, that's really going to be the future of a lot of our development tools. Not Clippy so much, maybe a little bit less patronizing than Clippy, but the general idea that you have a pair programmer built into your IDE who's watching what you're doing. And he's a normal pair programmer, so he's actually giving you a lot of critical feedback. He doesn't tell you when things are good; he tells you when everything's bad.

We could. As of now, there's no good way of dealing with that. You would have to write something that's basically an adapter for every single one of them that brought them back up into something else. JavaScript in particular is very, very difficult to do this kind of stuff with. How many of you have heard the dynamic versus static language debate? What's really interesting for me about the debate is that everyone compares, let's say, Java to Smalltalk or Ruby, and they say, look how much more productive I can be in Ruby. No one ever talks about the fact that you put all of this metadata into your code that describes how your code actually works. No one talks about all the tools that use that data. When you have a static language, I can tell you things like what you're actually coupled to at runtime. Can I do that in Ruby or JavaScript? Well, yes, on any given run, but that doesn't guarantee that that's all it does.

This also leads us into a fun problem. How many of you have tried to quantify how good your tests are? That's a hard problem, isn't it? So we've been playing with a way of doing this. What if I were to start randomly changing the IL of your code? There are a couple of tools out, I think in Java, I haven't seen them in .NET yet, that will actually do this. They will go: you were adding
one, it'll add five. Does a test break? It will basically go through and start manipulating your code to try to figure out if your tests are actually catching the random failures it puts into your code. And the way that you figure out if your tests are any good is if they break. It's very similar: I have been building a database lately, and one of the things I have them do is use lots and lots of randomized testing. The question I would always get from them is, "So we built these randomized tests, but how do we know if they're any good?" There's a really simple answer to this: how many bugs did they find? If you wrote randomized tests and they don't find bugs, it means that your code is not correct inside the tests. You will almost always find at least one bug doing randomized testing, something you missed, some weird edge case. There are some great tools, in Haskell in particular. Has anyone heard of QuickCheck? It does randomized testing associated back to your code. And this is something I think you're going to find as a trend over the next, let's say, three or five years: a lot more randomized testing, especially for edge cases. Any more questions?

Yes, it actually pops out in the tooltip and says, "In the old days they used to put dragons on the unknown regions of the map as warnings to sailors. This is your warning." Any other questions?

As of now, yes, but it's a pain. You can export them as PNGs, so you would bring up a graph, export as PNG, bring up a graph, export as PNG. One of the areas we've looked at going into was basically the ability to do a batch run where you could print out all of the graphs for, let's say, your entire source base. Any other questions? Well, I guess as the last speaker today, I will thank you all for coming out and I will no longer keep you from getting your beer. Thank you for having me out. [Applause]
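Editor's sketch (not from the talk): the idea described in the Q&A, changing "adding one" into "adding five" and checking whether any test breaks, is usually called mutation testing. The talk describes doing this on compiled IL; the Python below instead rewrites a source-level syntax tree, purely as an illustration. The names `increment`, `weak_test`, `strong_test`, and `mutant_killed` are hypothetical and do not come from the speaker's tool.

```python
import ast

# Code under test, kept as source text so it can be mutated (hypothetical example).
SOURCE = """
def increment(x):
    return x + 1
"""


class OneToFive(ast.NodeTransformer):
    """Replace every integer literal 1 with 5, mimicking 'adding one -> adding five'."""

    def visit_Constant(self, node):
        is_int = isinstance(node.value, int) and not isinstance(node.value, bool)
        if is_int and node.value == 1:
            return ast.copy_location(ast.Constant(value=5), node)
        return node


def load_increment(source, mutate=False):
    """Compile the source (optionally mutated) and return its `increment` function."""
    tree = ast.parse(source)
    if mutate:
        tree = ast.fix_missing_locations(OneToFive().visit(tree))
    namespace = {}
    exec(compile(tree, "<mutant>" if mutate else "<original>", "exec"), namespace)
    return namespace["increment"]


def weak_test(increment):
    # Executes the code (so it counts toward coverage) but asserts nothing.
    increment(1)


def strong_test(increment):
    # Asserts on the result, so a changed constant is detected.
    assert increment(1) == 2


def fails(test, func):
    """Return True if the test raises AssertionError when run against func."""
    try:
        test(func)
    except AssertionError:
        return True
    return False


def mutant_killed(test, source=SOURCE):
    """A test 'kills' the mutant if it passes on the original and fails on the mutant."""
    assert not fails(test, load_increment(source)), "test must pass on original code"
    return fails(test, load_increment(source, mutate=True))
```

Here `mutant_killed(strong_test)` returns `True`, while `mutant_killed(weak_test)` returns `False` even though both tests execute the same line. That is the gap between coverage and test quality that the talk points at.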
Info
Channel: Dev Day
Views: 25,883
Rating: 4.915916 out of 5
Keywords: GregYoung, devday, abb
Id: KaLROwp-VDY
Length: 59min 28sec (3568 seconds)
Published: Wed Dec 12 2012