Programming with GUTs by Kevlin Henney

Good afternoon. I've had a talk with this title for a few years, but it keeps evolving, so this is actually quite different from a number of the versions I've done before. And, just as a bit of fun, because everybody gets exposed to the same languages all the time, I've been going through a phase recently of choosing the languages that people aren't using at a particular conference. Michael just showed you a whole bunch of Ruby. I don't know whether you're using Ruby or not, but I'm going to use Python. And everybody always makes a point about certain things being very now, or integrated with current thinking, so I'm also going to use bits of C, just to show you that it's not. I don't want to restrict your mindset to thinking that the techniques I'm talking about here are tied to a particular language or, indeed, a particular framework. Most of the examples aren't even going to use a framework, because you don't need one.

There is a myth that you need a framework in order to test. You don't. A framework will increase the reach, the expressiveness and the ease of testing, but it will not change what it is possible to test. People get too wrapped up in frameworks. A framework will give you a particular mindset, which is great when that mindset matches the problem. But most people are using frameworks inspired by ideas from the 1990s, or from over 10 years ago, that don't fit their current problems, and they're stuck with one way of testing and one mindset of testing. So we need to take a step back from that and ask: what do we want?

I'll get to explaining what GUTs are. Actually, I can explain now: good unit tests. But I'll get to explaining where that phrase came from. Is this relevant? No, not really, I don't think. Is this relevant? We aren't entirely sure. Is this relevant? Yes, it is. This is sometimes known as the Feynman problem-solving algorithm, named for Richard Feynman, the Nobel physicist, and it's kind of how a lot of
people try to develop software, and we've kind of made an art form of it. You know: write down the problem, think real hard, write down the solution. In fact, people have gone so far as to expand this into a project management style known as the waterfall approach: analysis, design, implementation. However, this clearly is not the complete story, because otherwise you end up with stuff like this.

I just want to point out that, in failure, software always loses its encapsulation. You're sitting there saying, "I've got state purity, I've got immutability, I've got privacy of my data items", and then you've got an error. Suddenly your beautiful piece of encapsulation has dropped on the floor and cracked into pieces, and you can look at where the cracks are, and that tells you how something was built. It doesn't just tell you the language; it tells you the care. "You cannot pay by PayPal for orders over zero pounds." I think this is not going to be very useful.

I have a whole library of software failures. I used to just take photographs and screenshots, but people now send them to me, and I've got hundreds of these pictures. The point is that these are clearly things that are relatively easy to test. These are not the big things; these are the very simple things. These are what one project manager once described to me as "stupid little programmer bugs". They were not being rude about programmers. They were just saying we could have cleared it up with a simple test case, and yet our system testers, who are doing this manually and are actually end users trying to experience the system, are being held up by stupid programmer mistakes that we could have mopped up very easily.

So clearly there's a little bit more to this process than meets the eye: check it. Unless, of course, you are a Nobel physicist, in which case feel free to opt out of that one. OK, so we're going to check it. And clearly this is an
approach that does not really expand well. It does not scale well. We're not even quite sure that it works anyway, but it certainly doesn't scale well. So we take a different philosophy, best described by the Pragmatic Programmers: test early, test often, test automatically. If testing is important enough, why leave it till late, and why do it just once? Why make the assumption that you've got a big debt of tests to write? Debt metaphors have become very popular. I don't think we should apply them only to technical debt in the code, relating to code quality and questions of architecture. I think there's a general idea here about leaving actions like writing tests to later in a development cycle. And when I say cycle, I'm not just talking about some big waterfall thing. I'm also talking about Scrum, the rhythm of Scrum, the regular cycles. "We will write the tests at the end of the sprint", or, worst case, "we'll have a testing sprint", which is just wrong. Don't build up a debt. Just get in there early, do this often, and then do it automatically. Sometimes, when it comes to the bigger stuff, we're not entirely sure what we want to test. That's fine. Feel free to experiment, but once you understand it, capture it and automate it.

But how often, and how early? Well, there is a fairly common mantra in test-driven development, the test-first approach. In fact, one of the vendors out there even has little stickers. I just picked up a couple of stickers for my boys. Their bedroom doors are like a wall of software development tools of the last decade, decade and a half. They put on the stickers I bring back from conferences, so I've got stuff going back 10 years on their doors now. I think maybe we should give the doors to a museum at some point: this is advertising over the last decade. But one of them is "red, green, refactor". OK, so it's
kind of the very standard test-first cycle, the way people are often taught. Red: write a failing test for a new feature. Green: write enough code to pass the test. Refactor: refine code and tests.

Now, there are some issues with the way this is normally taught and explained, and I've had a problem with it for a long time. This is obviously the standard way of teaching people; that's why it ends up on a sticker as the standard piece of merchandising. And a lot of people are not entirely sure why they would want to write a failing test first. I mean, that's not a big challenge, is it? "Write a test that fails? Yeah, I can do that. Big deal. Why am I doing that?" There's not a strong motivation. We've focused on an outcome that is not actually meaningful. The focus here is not on intention, it's on events, and these events are, quite frankly, irrelevant. Red and green: minor, not very interesting.

But even when people get into this, sometimes they oversimplify and get into the red-green ritual. The first trick is getting people to do the red-green thing, to just try it out for themselves, because most people will dismiss or criticize this without ever having really tried it. But when you get them to try it, they're so busy focusing on red and green that they kind of forget the third bit. You can only hold a couple of things in your head at once. "Yeah, that's it, you're done." They dumb it down to red, green, and then it collapses. "Right, I tried this test-first programming of yours. It's really stupid. It gave me really bad code. It's a dumb idea." "All right, so what did you do?" "I wrote a test that failed, I wrote enough code to pass it, and I wrote another test that failed." OK, and at what point did you look at your tests to see if they needed reworking? Could you spot trends in the code and what was emerging? Did it give you pause for thought? Did you react to that
thought? No, I just did test first, quite literally. I wrote a test first, then I did something, and then I wrote another test first. That's it. OK, so you forgot to bring your brain to the keyboard. You are supposed to do that. I know it doesn't say that up there, but you are supposed to do that.

Now, does anybody actually advocate this? Well, the problem is that when people are trying things out, they've got so many ideas in their heads that it's very difficult to keep them all running at once. We often give people sound bites and helpful advice that sometimes are not that helpful; sometimes it's too easy to read the wrong thing. Uncle Bob has these three laws of TDD. You are not allowed to write any production code unless it is to make a failing unit test pass. You are not allowed to write any more of a unit test than is sufficient to fail. You are not allowed to write any more production code than is sufficient to pass the one failing unit test. There you go: red, green, red, green, red, green. There is no hint of bringing your brain to the party.

I have a very old-school idea about what a law should be. A law should either be something that somebody is going to throw you in jail for if you transgress it, and I think that's a bit serious for a bit of code, or it's a fundamental principle of the universe, and I don't think this quite qualifies. So I'm a little hesitant. I understand what Uncle Bob is saying here. The problem is that when people read this literally, they sometimes end up with a very curious view.

So, in keeping with the theme that I've been running all week, we're going to tackle that great problem, FizzBuzz. It should be trivial for any would-be computer programmer. Let's do this out of the box. I'm going to write my first test case. I'm just going to write assertions; I'm not going to worry about structure. Assert the FizzBuzz of 1 is 1. OK, just enough code to pass.
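As a rough sketch of that first step, a bare assertion and just enough Python to satisfy it might look like the following. The slide code is not captured in the captions, so the function name `fizzbuzz` and its hard-coded body are assumptions made for illustration.

```python
def fizzbuzz(n):
    # Assumed shape: the talk's actual slide code is not in the captions.
    # Deliberately the least code that satisfies the single assertion below.
    return 1


# A bare assertion, with no test framework involved.
assert fizzbuzz(1) == 1
```

Nothing here needs a framework: a failing `assert` raises `AssertionError` and stops the script, which is all the feedback this step requires.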
Passes. Green. We're good to go. Brilliant. Two? Ah, yep, change the code. Three? Oh, Fizz: anything divisible by three is Fizz. And four is four. Anything divisible by five is Buzz. You see, this is really classy code. We get all the way up to 15, which is divisible by both three and five, and that's FizzBuzz. You've got a job for life if you write code like this. And hang on, what was the problem spec? All the way to 100. Done. See, my tests are thorough. I have got 100% coverage for the relevant domain of that function. And it's not just statement coverage, oh no, I've got 100% branch coverage as well. And as far as the state space is concerned, I've got 100% of that. This is brilliant. Tick all the boxes, ship it. And sometimes people say we want as much test code as we have production code. Done. This is ticking all of the right boxes, except there's a little bit of you that's thinking that's not really quite right.

OK, let's deal with that. There you go, done. So now it does something reasonable outside the range: it just returns None in this particular case. That's what I said it would do. And you're looking at that thinking, "I'm sure there's a better way of doing this." You think, "I know, because we're all about functional programming these days, what I'm going to do is have a dictionary of lambdas." There we go, that's a massive improvement. And then your colleague points out that it's just a table lookup, which is slightly disappointing. But it turns out that table lookup is a surprisingly powerful and often overlooked technique. In fact, if you're feeling really smart, you might want to write a code generator to produce this. This is how you create enterprise code, by the way. You look at this and say, "I'll create a code generator to create this, and this will solve our problems." No, it'll just multiply them by a hundred. So you're sitting there
thinking, "OK, well, clearly this is why FizzBuzz was invented: so that everybody will be embarrassed about actual real programming." Let's take another look at this. Let's try this test-first thing again. Let's put that back in: refine the code and tests. Sometimes people think it's just about refining the code. I run TDD courses and workshops on unit testing, and one of the things I'm guaranteed is that people will forget to look at their tests. What you end up with is a story that starts at the top of the file: this is how we were thinking about testing. Thirty minutes in, our test style has changed. Thirty minutes later, our test style has changed again. In the space of something like an hour and a half to two hours, on a toy problem, you can end up with a set of tests that is already inconsistent. Let those people do this for two years and you'll be amazed at what you can get, and it's not good. The point is that you're not just supposed to look at the code; you're supposed to look at the tests. Your tests are a form of education: you're learning about the problem.

But again, there is a problem of dogma. Sometimes people get too wrapped up in the mechanics. A chap came up to me at a conference a couple of years ago, and he was a little bit concerned. He said, "Kevlin, when it comes to test-driven development, what do you do if you have nothing to do on the third step?" I said, "Do you mean red, green, refactor?" He said, "Yes." I was about to answer him, but he clearly didn't want an answer; he wanted to tell me. He said, "Sometimes I get to the refactor step and there is nothing to do." He seemed worried about this. And he said, "So what I do is I make a little change that has no effect, and then the next time round I change it back." I kind of looked at him. I didn't do this in front of him, but in my head that was a pure facepalm moment. The point is that what we've done is follow a ritual just as blindly
as if we had not done a refactor. We have followed a ritual without consideration. The whole point of any technique, any pattern of practice (oh, I was wrong, the second slide is relevant: I use the word patterns), is that a pattern of practice is a technique that is understood to work, with an understanding of why it works and why it might not work, in what context it works and in what context it does not. The problem is that we sometimes get so hung up on the mechanics of something that we forget we're supposed to use these as channels for thinking instead of replacements for thinking. This guy obviously interpreted it as "you will refactor", whatever happens. He took it as a command, whereas actually it's an invitation. It's kind of an open question. It's like when you go to the bar, as I'm sure some of you did last night (I certainly did), and you turn to your colleague: "Drink?" It's a question. It's an invitation. "Oh no, I'm fine, I'm good." "Oh, yes please, I'll have a..." I had a tequila slammer last night, so thank you very much Amanda lasha for that one. So the idea is that it's a question: do we have anything to refactor? Is there anything to do here? If yes, please go ahead; otherwise, move along.

But I don't think this is particularly helpful. I have a bit of a problem with this as a way of thinking about it, although it's true; it's not a false structure. Let's try a different one. I'm going to borrow the Deming cycle: plan, do, study, act, sometimes rendered as plan-do-check-act. This is, if you like, the cycle at the heart of any agile or lean way of thinking, or it should be. If you're not doing this, you'll find that there's something wrong with your overall approach, and you will have the consequences of that. Plan is basically a statement: here's what we're going to do. Then we're going to do it. Then we're going to take a step
back. There's a pause. This is the whole idea, and it's important, because sometimes people look at certain techniques and say, "This technique is going to slow me down." People ask me, "Is it going to slow me down?", and what they're hoping I'll say is no. But I say yes, which disappoints them. The whole point is to slow you down. Of course it slows you down over the horizon of minutes and hours. The idea is that your acceleration is over periods of weeks to months. The problem is that we're not very good at perceiving time or understanding time. We normally live in the present, but the benefits are not in the minutes; they are in the days, the weeks and so on. So there is this idea of slowing down: I've done something, now reflect on it. This is the role of a retrospective, for example, if we take a sprint cycle. And don't just think about it and talk about it: what are you going to do about it? It's not always obvious what to do, but have a go, and then go around again.

So what we've done here is take the big picture. This is a kind of general, big-picture process, and I'm going to change it into the four Rs, so to speak. Write: write a test. The test is a plan for what I'm going to do over the next few minutes. This test demonstrates the usage and captures my intention: here is what I propose to build. It's a scope management mechanism: I'm not going to do more than this, because I just want to focus on this. I've got a big idea about the bigger picture, but at the moment I'm going to focus on just what's in front of me. Now I'm going to make that work. Now I'm going to take a step back and reflect on it. Are there any trends, anything I'm spotting that is interesting, in a good way or in a bad way? And therefore, do I need to react and refactor? So the idea here is that the PDSA cycle works at multiple levels of
scale, from the minutes all the way through to the months.

In a blog post that Michael wrote a few years ago, seven years ago, he made this observation. Now, I've talked about mechanics, but I don't want to linger on this idea of test-driven development, just the intention: no matter how we do it, in whatever sequence we do stuff, reflection and reasoning are integral to the act of testing. It's not merely "check your code's OK"; that really misses the point. There's this idea that when you write unit tests, TDD style or after you've developed, you scrutinize, you think, and often you prevent problems without even encountering a test failure. The very act of thinking about how you're going to test your code, or knowing that you're going to test your code, or actually writing the test, will change the approach that you take to your code. It will naturally slow you down in the short term, but it will make you think about why it is there and what it is there for.

Now, the problem is that this idea of what we want from the tests, as opposed to the mechanics, has got a little bit confused. Alistair Cockburn, in the same year, 2008, made this observation: very many people say "TDD" when they really mean "I have good unit tests", and this is where the phrase comes from: "I have GUTs". Hence the idea of programming with GUTs. What he was trying to highlight is something that Ron Jeffries had observed: he tried for years to explain what this was, but we never got a catchphrase for it. Now, Ron Jeffries is one of the original extreme programmers, and he observed that we don't have a catchphrase for it, and that now TDD has been watered down to mean GUTs. I encounter a lot of people who say, "Yeah, we're doing TDD", and then, "Well, I'm not really doing TDD." It's fair to say they're not doing test-driven development in the way we might think of it from a test-first programming point of view, but they're certainly doing something
good, and their tests are great. They're doing good stuff there. The problem is we don't have a proper word for that, but people are noticing they're doing something qualitatively different. So let's focus on what makes a test good.

First, let's pick an example. I'm going to pick C, because C is a lot of fun: it turns out you can do all kinds of stuff with it, like build the internet, and you can also crash the internet using C. Buffer overruns are still in the top 20 security breaches that you get, and this is all down to the third letter of the alphabet. It is all down to C, because it basically has unchecked memory writes. I can run off the end of an array and quite happily overwrite things. It's all good fun. So we clearly don't want that to happen.

The other thing is that I'm going to take a very simple problem. I want to take an array of integers, so I'm going to point into memory at something to write. I'm not going to change it; that's why it's const. I'm going to point into memory and say, here's the number of elements that I want to write, because obviously with C you have to tell it how many elements there are in an array; it's stupid enough not to know, for good historical reasons. And then we're going to write to an output. We're basically going to write a string, a sequence of characters, starting at output but no longer than length. It would be really good not to overrun that buffer either. So the idea is that we're going to take integers and write them out with comma separation. Now, comma separation is a trivial task. It's so trivial that every now and then I come across a website that gets it wrong: they get the commas in the wrong place, they have a spare comma on the end, or something like that. This is trivial with a decent scripting language: you just go join and you're done. But we're going to
do this raw. Let's find out how. We've got to describe what it is that we're going to do. I've got a function here, and I've got four arguments. I need to account for what happens when I pass in null, what happens if I ask for no elements, and what happens if I don't have enough output space. That's an interesting question, because it turns out there are multiple possibilities. So here's an implementation in C (C99, don't worry about that), and here's an implementation in C++ that has exactly the same linkage and exactly the same behavior.

Now, I've just made a very strong claim there. How do I know it's exactly the same? Because I thought really hard about it? Because I closed my eyes and it feels good? A lot of programmers do that. "How are you doing?" "Pretty good." "And your code?" "Yep, we're kind of done." "Does it work?" "Yeah, I believe so." Belief: it's lovely stuff. "I believe it does." Now you need to turn belief into knowledge. We do this by doing something empirical. The whole point of plan-do-study-act, the Deming or Shewhart cycle, is that it is an empirical model. So let us find out.

We've got a function, therefore, logically, in my head, I'm going to write a test function: one function, one test. So it looks like this. And by the way, just notice there's a dot-dot-dot there. That's important: it goes on for a bit. It is as boring as it looks. It is just as unreadable, and trust me, if you can read C, it doesn't make any difference: you will fall asleep by the fifth line. It's a good trick if your kids can read code and struggle to sleep. I mean, this might check that it's OK, and this is merely checking. There's a big difference in my mind between "we're checking it's OK" and communicating meaning, and the biggest problem that we have is that we have not communicated meaning. If you like, all of our software development
practices are centered on many ideas. One of them is: can we get it to run? And the other is: do we know what it is that's running? Do we know what we mean? Are we having the same conversation? What are our concepts? Whether it's the seven ineffective habits that I talked about last year or this idea here, the recovery of meaning, the biggest challenge is not "does it work?" but "what do you mean by 'it works'?" That's the challenge. What are we expecting here? It's not entirely obvious.

So somebody may come along and say, "You know, Kevlin, the problem with your test code is you have no comments." That's obviously it. There we go, I've put them in. OK, who reads comments? Let me think: not the compiler, and not the programmer. So there are your two audiences, and neither of them is interested. But you think, "Well, maybe there's something here. I'm still lacking structure." We are driven by visual cues. We care about visual spacing and visual reasoning. So let's do that. This also creates a logical separation, and it's a separation scope-wise as well: it avoids any state change from one test case dripping into the next one. So this is really logical. I've now got code blocks with comments. Do you know, I'm sure somebody invented something for that. Oh yes: functions. So we factor it out, and now what we've got is a series of named functions. Each of these functions captures and describes the intention of what the body is trying to show. It's about recovery of meaning. It's not about checking it's OK; it's about telling you, "this is what we mean by OK."

So we've gone from this to a recognition that there's this. If you ever end up with "I've got a function, now I've got a test", you also have a problem. You never want this. Unless your function is hard-coded to return one value, you want multiple test cases against a function. And every now and then I get somebody saying, "Oh, well, I could put them all in one function", as if
somebody is paying tax on test functions. If you are paying tax on test functions, that's an absolutely fine way to solve that problem, but I don't think anybody's in that situation. OK, let's move on a little bit. Let's go back and open up one of these cases. Here I've actually made a decision: "value to insufficient output". This is the big question: what happens when I've got too little output space to write the full number? So I've got the number 1234, but I've only got space to write three characters plus a null. Then what am I going to do? Am I going to overrun? Well, I think we're doing this precisely so that we don't overrun. Am I going to write nothing at all, in other words take a more transactional approach, where you either write everything or you write nothing? So if I don't have enough space to write 1234, I'm just going to have an empty string. Or do I truncate it and just write 123? I'll leave it to you to decide what you'd like, but the point is that what I need to do is tell the reader: here's what I have decided, or my colleagues and I have decided, or the customer specifically requested. And I need to put that in there. It's not "is it right?"; it's "what do you mean by right?" "Value to insufficient output writes truncated value." Just reading that: oh, it's truncating it, OK, fine, so I'm expecting that. We write 42 in this particular case. I've got two characters to write into. Sadly, one of those needs to be for the null, so 42 won't fit. So I should only write the digit 4; I should only write one character out.

Now, some of you may look at that and say, "You know what, there's another problem here: I've just seen that you're using two assertions." Has anybody come across that guideline? It keeps doing the rounds: the guideline that says you should not have more than one assertion in a test case. Anybody come
across that? Yeah, it's surprisingly popular. It's completely wrong, but it is fairly popular. I've dug it up on a blog here. The reason I've chosen this blog is not because it does anything wrong, but because of one of the responses to it. "One of the things that Roy Osherove warns against is multiple asserts in unit tests." Somebody else noted, "Proper unit tests should fail for exactly one reason. That's why you should be using one assert per unit test." No, that's wrong, and it's fundamentally wrong, and we can demonstrate exactly why.

Just imagine we've got some code under test. Here's my travel itinerary for the week: I started in Bristol, went on to Utrecht, and have now ended up in Vilnius. This is what I'm expecting, so I can go ahead and write a single assertion that the itinerary I've got from the code under test is exactly what I expected. No problem: one assertion. Let's break that down. That assertion is equivalent to: assert that the resulting array is not null; assert that the resulting array has three elements in it; assert that the zeroth element, the first element and the second element are Bristol, Utrecht and Vilnius; and assert that none of this throws. That's six assertions. Oh crap, we're going to have to break out the tests. No. The point is that this passes under all the circumstances in which the previous slide passes, and it fails under all the circumstances in which the previous one fails. They are functionally equivalent. If you write your tests in a style so brittle that the smallest change of framework means you have to rewrite your tests, because you're following a foolish rule that says "only have one assert", then you've got brittle tests. You don't have GUTs; you have BUTs: brittle unit tests, or bad unit tests.

The point is that we're focusing on the syntax. What I was trying to say in the original one is that I am writing a single character: I'm writing the character 4. Now, I've got two ways of looking at that. I can
split that out into two test cases: one that says I've written one character, and another one that says I've also written out the digit 4. That makes it look like they're two coincidentally unrelated things. They are the same thing. If you want to communicate to a reader that things are related, we have a really simple technique for that: we put them together. If you want to show somebody that things are unrelated, you put them apart. It's a very simple idea, and following a rule blindly, as we've discovered, is never a good idea.

I've had this discussion with a number of people, and they've been absolutely steadfast. They said, "No, no, no, we abide by the one-assert rule." I said, "Do you use a mocking framework?" "Yes." "Do you have more than one expectation?" "Yes." "An expectation is an assertion phrased for the future." "Well, but it's not spelt assert." Not every framework uses the word assert. It doesn't matter how it's spelt: it is a logical assertion, or rather it's a syntactic assertion. And this is the point: Roy doesn't recommend this at all. It's a misunderstood thing. "My guideline is usually that you test one logical concept per test. You can have multiple asserts on the same object; they will usually be the same concept being tested." One is a good starting point. You should be deeply suspicious of zero. One is a good and common result. Sometimes, though, what you need to express may take more than one syntactic assertion, even though there's one logical idea.

Now, this is not news. Uncle Bob already talks about this in 97 Things Every Programmer Should Know. The Japanese edition is beautiful, but, translated back into English for your convenience, he talks about the SRP, the single responsibility principle. Cohesion is a principle that applies to test cases as well as other forms of code. This principle is often known as the single responsibility principle, or SRP. In short, it says that a subsystem, module, class, or even a function (or a test) should not have more
than one reason to change. That's the goal here. It's a core principle of cohesion. Now, going back to the test code, we can now see that these are part of the same thing, and it's mostly because of the way that C is. Had I written this in a different language, this would have reduced to a single assertion, because strings are a proper part of most languages. OK, but in C I want to make an extra point about this one: the result has to be the bytes written, and I want to know that that correlates with the body. Now, if you look closely there's a little bit of spacing here, but let's make this more explicit. There's a three-part play here, a three-act play, sometimes known as the three A's: arrange, act, assert. We're going to set stuff up, we're going to do the thing we're interested in, and then we're going to start saying things about it. Now, it turns out that sometimes, when you're dealing with something that is just a simple predicate, a simple function, these merge into one. That's fine, but I want to pull out these three aspects; at the very least, there are three aspects we can find. If you're doing behavior-driven development, this is directly equivalent to given-when-then. I do prefer the BDD way of thinking about it; I find that's a little more useful. As Jason Gorman noted, there's nothing new under the sun. In fact, that seems to be the theme of this conference. Given-when-then is what we call a Hoare triple. Tony Hoare made the observation when trying to formalize the consequences of statements of code. We have a piece of logic: we basically have some statement, some logical proposition, before, a statement of the truth; we have the statement of interest; and then we have a logical proposition following that that must be true. So, given that P is true, when we execute S, then Q must be true for this code to be correct. Given-when-then is identical. Now, given we're talking about logic, this is one of the books I learned logic from, and you can tell it's quite old just
by the font usage. People just don't use interesting fonts anymore. The world is kind of: either you end up with beautiful, elegant fonts, which I love dearly, or Comic Sans. This is very much of its time. I look at this and it says 1970s all over it. It's gorgeous. And Newton-Smith makes the observation: "Propositions are vehicles for stating how things are or might be." It's a proposition, it's a truth. And grammatically, "only indicative sentences," ordinary sentences in essence, "which it makes sense to think of as being true or being false are capable of expressing propositions." Now, what does that mean in practice? Well, if we go back and take all the test names and take out the underscores, we see that these are all regular statements. "Value to null output writes nothing." If that's true, then we've succeeded; if that is not met, then we have failed. OK, it's very simple: it's a proposition. Let's take that to the next step. The next step is that, having recognized that we don't have a one-to-one relationship between tests and functions, we also don't have a one-to-one relationship between tests and methods. And this is a very popular problem. You go to Eclipse and you say, "Oh, please generate me some test stubs," and it'll go ahead and give you something like this. Probably the least useful form of code generation ever. I never want to have this kind of thing. In order to test how a method behaves, you need to have an object and you need to interact with it; you need to ask questions of the object, which basically means that you will end up with a very different approach. It also generalizes further: we can see it in terms of classes. We have a very common thing where people say, "Oh yeah, I've got a one-to-one arrangement." In fact, you can see in a lot of directories we have Foo and we have FooTests. Now we have a one-to-one correspondence between tests and classes, and we see this all over the place. That's not how you want to do it. This is how you
want to do it. Here is a usage: I would like to use this particular piece of code in a particular way, and it happens to involve this class and that class. I want to use this piece of code for a different objective; it involves this class but not the other class, and so on. The idea here is that you are cutting across the code. So this also leads to the question, of course, of coverage. Now, sometimes I have had pushback on this. When I've said to people, "Don't structure your tests according to a one-to-one relationship with classes," they say, "But that's how our build tool works." Right, so you've just told me the problem. What are you going to do about the solution? People think that that's the end of the conversation: "That's how our build tool works," or "Everybody understands that." That's great. You've now described the problem, not the solution. Now figure out a solution. Other questions pop up, because people are always asking about the amount of this kind of stuff: what about our coverage? Martin Fowler observed: "From time to time I hear people asking what value of test coverage (also called code coverage) they should aim for, or stating their coverage levels with pride." In fact, something that Martin doesn't mention here, and one of the things that really annoys me, is that people use the term code coverage and they never tell you what form of coverage. There is a default that everybody uses. The default is statement coverage, and the reason people quote this is, one, it's easy to get a high number, and two, it's easy to get that number. The minute you head off into branch coverage, path coverage and all these other forms of coverage, it gets a bit trickier. But when people quote high levels, they are normally quoting statement coverage. In other words: my tests touch this statement at least once. That's all you're saying, which is better than zero times, but it's not a great way of thinking about it. You need to be specific. When somebody says code coverage, you ask what
kind of code coverage, and they kind of look at you blankly, and then you say, "Right, maybe before you quote the number at me you should learn what code coverage is." That's a great conversation stopper. "Such statements miss the point." The same article quotes Brian Marick: "I expect a high level of coverage. Sometimes managers require one. There's a subtle difference." Now, one of the things I do as a bit of a hobby, and it's Friday today, is this Facebook page, Word Friday, where I put up unusual words or things about language. One of the ones I put up a few years back was Goodhart's law, which fits this problem that we find with coverage perfectly: once a metric becomes a target, it loses its meaning as a measure. If I say, "Look, we've got 90 percent statement coverage," that's a statement about statement coverage. I think I'm using the word statement too often there. If somebody comes up to me and says, "We need to get that to 95 percent," ah, now we have a problem, because now we're pushing on an observation, on a metric. It's named after an economist, Charles Goodhart, and his exact quote was: "Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes." The minute you start saying things like "we need to reach this and we're going to do everything we can to reach that", and even worse when you attach money to it, and I've encountered that in a couple of companies, "when we reach this the team gets a bonus", whoa, OK, your metric is now meaningless. It has lost any value it ever had. So we need to be very, very careful. It doesn't mean that you don't care about it. You do care about it. The question is what you do with that knowledge. So let's go back into some code fragments and code examples, and observe that Leo Tolstoy, apart from writing very, very long books, was in fact a programmer. He was highly aware of
the fact that when a piece of code works, its happy-day scenario is well defined, but how many ways can it fail? Each unhappy family is unhappy in its own way. There are many ways something can fail. The problem is that when it comes to testing, people have a very simple block attitude. So here is a very simple set of tests to deal with an ISBN type. The International Standard Book Number is a code that you'll find on the back of any book; you can go to Amazon and see these codes. It has a rule of well-formedness, which basically means that if you pass in an invalid ISBN it should get rejected with an exception, the malformed exception. Now, it's all nicely structured here. I've got strings for tests: instead of using function names I'm now using strings, and I've got a proposition, so I'm using the term proposition. This is just an ad hoc C++11 testing framework; C++11 supports lambdas and this kind of stuff, and it fits on a slide, if you care. The real problem here is not that I'm testing; it's that, again, I've lost the meaning. Where's the intention? Let's look at that. That is one of the most common ways that people express "here is where I'm testing the validation code": I've got a test called test validation. And you kind of point out, what does that mean? I don't know what you mean by validation. What are the rules of well-formedness for an ISBN? You can't tell by looking at the example. There are four examples there. Can you tell what's wrong with each one? You might be able to guess a couple, but it's not immediately obvious. And you ask somebody, "There's test validation: what does that actually mean? You haven't told me what your intention is." "I know what my intention is: I want to test that validation works." Every now and then you'll find, and it's guaranteed in virtually every test suite, you'll end up with something like "test that blah blah blah works". It's like, OK, unless you're actually going to reasonably have a
test that says "test that something doesn't work", this is a meaningless statement. So let's try something else. Let's try breaking it up. Again, we're not being taxed for these things. ISBNs with more than 13 digits are malformed. ISBNs with fewer than 13 digits are malformed. ISBNs with non-digits are malformed (by the way, I've simplified the rules a little bit here). ISBNs with an incorrect check digit are malformed: an ISBN has a check digit at the end, calculated from the previous 12 digits. So now we've learned something about ISBNs, and that's exactly how it should be: you should be able to read the tests and learn the domain. In fact, I've been very, very careful to be utterly consistent. I've used the language "malformed" here because the exception is called malformed, or I named the exception malformed because I use the language malformed. Don't use the word "illegal" here if you're going to use the word "malformed" there, or vice versa. Your opportunity here is to communicate and use the language of the domain with some kind of consistency. So there is this very simple idea that we get into, as noted by Nat Pryce and Steve Freeman, authors of the book Growing Object-Oriented Software, Guided by Tests. In the blurb for a workshop they did a few years back, "Are your tests really driving your development?", they made this point: "Tests that are not written with their role as specifications in mind can be very confusing to read. The difficulty in understanding what they are testing can greatly reduce the velocity at which a codebase can be changed." Quite simply, because you're looking at the code and you say, "I don't know what I'm expecting. I mean, the test will fail if I get something wrong; it's nice to have the test, but I don't know why." Now, what do you do with a test when you're not sure of its meaning? I'm busy going along and I make some changes and some tests break, and I go and look at the tests and I think, well, I don't know
what this test is trying to tell me other than red, which is not very helpful. So you look at it, and you look at the code, and you think, "You know what, I think the code's all right. OK, what are we going to do with this test case? I know, I'll put ignore on it, or I'll comment it out." And that's it; that's the end of the life of that test case. It may sit around like a zombie in your source control system for a few years, but that's its fate. Nobody's going to go back in there and say, "Oh, I'm going to review this and see if we can bring it back to life, resurrect it." No one's going to do that. So let's have a look at one of my favorite problems. It's a nice small problem: a piece of C#, something that classifies when a year is a leap year. My experience tells me that when you ask software developers, roughly between 5 and 10 percent of people know the full rule for the Gregorian leap year. Just as a side note: don't do your own date handling unless you are a total date nerd. You'd be surprised how weird date and time handling can get. But we're just going to focus on one seemingly simple problem. As I said, the number of people that actually know this is typically one in ten, one in twenty amongst software developers; amongst the larger population it's even lower. So the goal here is not to test that it is correct. The goal here is to communicate to the reader what we mean by correct. What do you mean by leap year? Leap year is a concept; tell me about it. So the first thing we learn is that there are four kinds of year, not two (leap years and non-leap years). There are years that are not divisible by four; they have a particular classification. There are years divisible by four but not by 100; those are interesting. There are years divisible by 100 but not by 400, and years divisible by 400. There are four categories. OK, you're going to run out of
reading ability if you keep using camel case here. It's common in a number of languages to use camel case, but quite frankly, when you start writing out propositions you're going to end up with sentences. So I'm going to use underscores, even if that's against the language conventions, because it's a test convention; it's not a production code convention. OK, so we've got that. Let's do something else: let's follow a BDD habit. A BDD habit is often to put the word "should" in there. Years not divisible by four should not be leap years. Years divisible by four but not by 100 should be leap years. So this year was not divisible by four, and there was no 29th of February. Next year is divisible by four but it's not a multiple of a hundred, and it will be a leap year. The problem is this word "should". I'm a writer, and I dislike noisy words. In fact, I'm not the only one. Strunk and White wrote kind of the classic book on English writing style of the 20th century. "Make definite assertions." Guess what: it wasn't just Tolstoy that wrote about tests. It turns out Strunk and White wrote about test cases as well. "Make definite assertions. Avoid tame, colorless, hesitating, non-committal language." "Should": what does that mean? "Years not divisible by four should not be leap years." In English, "should" means, yeah, but it might be. I mean, I should finish in 10 minutes, but I might overrun by a minute. "Oh yeah, the leap year code, sometimes it misclassifies a leap year, but it's OK, it should be OK most of the time." No. "Should" is the language of hesitation. It's a failure to communicate. You use words like "should" to build up a sentence to hide what it means. Get direct. Just say what it is: years not divisible by four are not leap years. And if somebody says, "Oh, you know what, actually we've got a contradiction to that rule," fine, because they've
observed it. You've made a statement and they've said that's not a true statement. It's a proposition; this is logic. And one of the nice consequences of naming your tests like this, apart from the fact that you don't overuse the word "should", is that your test names are a bit shorter and your English is more direct. And also, when your tests run, the names are a listing of the properties of your code. All the things that are green are true of your code, and if they're red, then that is the thing that is not true. In other words, you can take the name and just negate it. "Years divisible by 100 but not by 400", like 1900, "are not leap years." Well, apparently they are, according to this. In other words, the problem is exactly the opposite of the name. It's really easy. I don't know why I've got the problem, but I can tell you what the problem is. OK, so let's look at some other test naming styles. Sometimes people really go to town with the BDD: "We're going to structure all of our tests using given-when-then." And you run out of right-hand margin. I mean, did you hate your English teacher so much? Did you really, really hate them? Because this just shows a complete disrespect for human communication and the person that taught you. This is just bad English: "given a year, when it is not divisible by four, then it is not a leap year." "Omit needless words": Strunk and White are back there. Ah, this is better: "given a year not divisible by four, then it is not a leap year." That's better because there is no "when". There's no action; it's just a truth. I'm not doing anything. It just is, or is not. The point here is that when we use the language more directly... "When a sentence is made stronger, it usually becomes shorter." Brevity is a byproduct of vigor. You're trying to say exactly what you mean, which is a very hard thing to do, but that's what we're trying to get out of this. Now, we might wish to break things up and use some name
structuring. So here, using NUnit, I'm taking advantage of test fixtures, test fixtures that are not aligned one-to-one with a class or a method or anything. I'm just using them because I can get hierarchy, and hierarchical structuring and grouping is a very natural thing to programmers. But it seems to be something that people are relatively reluctant to do in traditional test cases. Some of the new test frameworks are a bit better at supporting this reasoning and this presentation, but it's something that you really want to value, because what you get is, you know: leap year spec; given a year not divisible by four, then it is not a leap year; given a year divisible by four but not by 100, then it is a leap year. But there's a problem with this. Although it's kind of all there, this is not actually how people think about the leap year problem. It's not a given and then there's a result. The way that people typically start with classifiers is that things are or are not something. So the English is back to front. When is a year a leap year? It's not "given something, then it's a leap year". A year is a leap year when? So we open the other way, which looks like this: a year is a leap year if it is divisible by four but not by 100, or a year is a leap year if it is divisible by 400; a year is not a leap year if it is not. In other words, the English is more direct, but we've inverted it: we put the classification up front. In other cases you will find that you want to use the previous approach. In other words, you need to be sensitive to the problem that you're working with. So don't get stuck in a fixed pattern of naming. Now, sometimes a fixed naming pattern can help. I've had people ask me about this: "We're naming things using a given-when-then style. Is that a good thing or a bad thing?" I don't know; I can't tell you. Let me ask you a question: are your test names messy and chaotic? If the answer
is yes, then having a fixed approach is a really good idea, because it will bring order to chaos. On the other hand, if you've been using a fixed naming style and you're finding it's getting verbose and doesn't fit all the things you want to talk about, then now's the time to move on. You've now learned enough to know what to express. Now, something else that's worth noting here: if you were paying very close attention, you will have seen that there were parentheses at the end, and rather than being empty they are filled. What's going on here? Well, I'm actually driving this using some data. A decent testing framework like NUnit is pretty decent on this one. JUnit sucks, which is why there's a JUnit Lambda project to upgrade it. But here I can actually choose specific examples: I can choose 2012, 1984 and 4 as examples of years divisible by 4 but not by 100. I can also use a very simple generator: a range from year 400 up to and including 2400 in steps of 400. So I've got a very simple approach there to dealing with this stuff. Now, sometimes when people start to get general with their tests, they say, "Oh, OK, I see what you're doing here, but I think I could put all of these in one test case, and I'm going to call it 'a year is either a leap year or not'." Well, I think that pretty much covers all the possibilities for a boolean. "And then I'm going to have a function called 'is leap year is correct'," which is just another way of saying "is leap year works", "and I'm going to cover 10,000 years, so I'm going to get better coverage than your code, Kevlin." Oh sure, that's great. So what does your test case look like? I need to put this up here because I need to tell people when they look at the PDF: no, don't do this. So that's obvious really, isn't it? You're looking at it like, oh well, that's it, there's only one problem. But let's factor out the logic there. Let's just get rid of that logic. Let's call it leap year expectation, and there
is my leap year expectation. There's only one problem: it's identical to the implementation. Well done, you've come full circle. This is the problem people sometimes get when they over-generalize stuff: they end up having to repeat all the logic on the inside, and therefore they will probably copy and paste that logic, and you will have successfully asserted that true is true. OK, so there is a different way of looking at this, a different way of reasoning about it. We've looked at the idea that you want to express intention. We've looked at the idea that you can have an example; that's the CSV code. But we've also looked at the idea that you can parameterize with generated or explicit values, and we can take that a step further. You'll notice the naming there moves us towards it; in fact, all naming in some way reflects a property of the code. But we can generalize this further, as you can see in a number of functional testing frameworks, to the idea of: let's describe the properties of the thing as a whole. The danger of that is you are more likely to get it wrong. The benefit of that is sometimes it will give you the thoroughness that you desire, and it will also allow you a different approach to the code, one that doesn't work when you build up one example at a time. So again, you need to sense your way, you need to feel your way. What kind of problem is this? What kind of testing approach would most benefit this code? So there's not one rule for everything, one rule to ring them all. So let's go back and look at FizzBuzz. Here's an implementation in Python. It works, but it's a bit boring: it modifies and accumulates state in a string called result, and yeah, whatever. I prefer this one, because this one fits in a tweet and there's enough space for a hashtag, so you can make rude remarks about that tweet. I like this one because you can't do it in any language other than Python
pretty much. It's totally gratuitous, and if you understand this, you understand Python; if you don't understand Python, then you won't understand this. I'm not going to explain it, but let me just say that it takes some great liberties with the idea of what a boolean is, but that lies at the heart of Python, so it's very Pythonic in one sense. OK, so how do I know that both of these are equivalent? I've given you two things and I've said, yeah, they're the same. The range is specified; it's a closed problem, 1 to 100, and here I can actually generate all the values. What I'm going to do is have a list of all the results from 1 to 100. We saw that earlier, if you want to refer back to the earlier slide; remember, you typed it out one at a time. But here I'm just going to take FizzBuzz and generate that. Here are the actual results, and then there's a set of truths that I want to assert about this. There must be a number of things that are true of these hundred elements, and I want all of these to be true. In other words, that's kind of a universal quantifier over all of the things I'm going to say. Every result is Fizz, Buzz, FizzBuzz or a decimal string. That's got to be true, so we're not allowing anything else: "hello world" is not permitted in this set, and non-strings are not permitted in this set. Every decimal result corresponds to its ordinal position: if you find a number, say 13, then it should be in the 13th position, counting from 1. Every third result contains Fizz: the third one should be Fizz, and the 15th one should contain Fizz because it's FizzBuzz. Every fifth result contains Buzz. Every 15th result is FizzBuzz. No decimal result is divisible by 3: if you see a number, it should not be divisible by 3; if you see a number, it should not be divisible by 5. The ordinal position of every Fizz result is divisible by 3, the ordinal position of every Buzz result is divisible by
5, and the ordinal position of every FizzBuzz result is divisible by 15. These are the properties that must necessarily be true of those results, and then we can just turn it into Python. There we go, bang. If you want to keep the strings around, you can put this in a dictionary, or you can do a couple of other things. But the key idea here, and hopefully what's been demonstrated, is that we've kind of shifted in styles, but I've explored each of these and they all have something in common. All of the test styles that I've looked at satisfy an idea of a good unit test. And what am I calling a good unit test? I'm saying a good unit test is one where we express the intention of code usage with respect to data, though we might do that differently. We express the intention using names and grouping and nesting. We have a code usage example that realizes the intent; it's the code version of what the name is trying to say. And we might illustrate that with one example or many, and those examples might be explicit, as in the first example, where I just used a single example, or in the leap year, where I used 2012, 1984 and 4; those are three explicit examples. Or alternatively I might generate them, and there's a whole load of things that we can talk about with generation. But basically a good unit test has these properties: what is it that we're trying to show, with respect to what code, with respect to the values that we can pass through it. And most testing frameworks don't make these three aspects visible; they normally concentrate on two. The weakness of that last Python example was that it had code usage with respect to data, but it lacked intention, although I sketched that out on the slide before. That's what would make that a good unit test rather than just one that is thorough. Thank you very much for your time.
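The slide code is not part of the captions, so what follows is only a sketch of how the FizzBuzz properties listed near the end of the talk might look in Python. The `fizzbuzz` function is a plain stand-in for the code under test, and the names `results`, `WORDS` and `properties` are assumptions made for illustration, not the speaker's own. Keeping each property in a dictionary under its sentence follows the remark about keeping the strings around, so the intention stays visible next to the code usage and the generated data.

```python
def fizzbuzz(n):
    """Plain stand-in for the code under test (not the speaker's version)."""
    if n % 15 == 0:
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)


# Ordinal positions count from 1, so pair each result with its position.
results = [(position, fizzbuzz(position)) for position in range(1, 101)]

WORDS = {"Fizz", "Buzz", "FizzBuzz"}

# Each key states the intention as a proposition; each value checks that
# proposition against all one hundred generated results.
properties = {
    "Every result is Fizz, Buzz, FizzBuzz or a decimal string":
        all(isinstance(r, str) and (r in WORDS or r.isdecimal())
            for _, r in results),
    "Every decimal result corresponds to its ordinal position":
        all(int(r) == p for p, r in results if r.isdecimal()),
    "Every third result contains Fizz":
        all("Fizz" in r for p, r in results if p % 3 == 0),
    "Every fifth result contains Buzz":
        all("Buzz" in r for p, r in results if p % 5 == 0),
    "Every fifteenth result is FizzBuzz":
        all(r == "FizzBuzz" for p, r in results if p % 15 == 0),
    "No decimal result is divisible by 3":
        all(int(r) % 3 != 0 for _, r in results if r.isdecimal()),
    "No decimal result is divisible by 5":
        all(int(r) % 5 != 0 for _, r in results if r.isdecimal()),
    "The ordinal position of every Fizz result is divisible by 3":
        all(p % 3 == 0 for p, r in results if r == "Fizz"),
    "The ordinal position of every Buzz result is divisible by 5":
        all(p % 5 == 0 for p, r in results if r == "Buzz"),
    "The ordinal position of every FizzBuzz result is divisible by 15":
        all(p % 15 == 0 for p, r in results if r == "FizzBuzz"),
}

# Any proposition that does not hold is reported by its sentence.
failed = [name for name, holds in properties.items() if not holds]
assert not failed, failed
```

If a property fails, the assertion message is the list of sentences that are not true of the code, which matches the earlier point that a red test name can be read as its own negation.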
Info
Channel: Build Stuff
Views: 18,175
Rating: 4.9094338 out of 5
Keywords: sotfware, build stuff, developer, conference, lithuania, vilnius, programmer
Id: azoucC_fwzw
Length: 58min 50sec (3530 seconds)
Published: Mon May 02 2016