Improving your Test Driven Development in 45 minutes - Jakub Nabrdalik

But the important thing from the perspective of this talk is that I have used test-driven development since 2005, and I think I've made all the possible mistakes. At least I haven't found anybody who had more problems than I did. Maybe I'm not that smart, but because of that I have some experience that I would like to share with you to help you out.

I may not be that interesting, but whenever somebody gives you good advice, you have to understand what context they have, because if you take good advice and apply it in the wrong context, it will hurt you a lot. So to give you a little information about my context: I work for a company which has about 700 microservices, more than 600 engineers and 10,000 servers. I work as a team leader and as a developer as well, and most of this is on Spring, Java and the JVM stack, so these are the samples that I have. I see the problems that I've met and solved throughout all the different teams inside the company. I also work at Bottega, giving workshops and teaching people how to solve their problems. I travel from one company to another and see the same patterns repeat again and again, so this talk will try to summarize and solve this for you.

First of all, I have to point out that there is a very good talk about the philosophical aspects of different approaches to test-driven development, by Sandro Mancuso. If you want the philosophical approach, and to understand deeply the different ways you can approach the problem, you should watch that talk. My talk is going to be a little different, because my assumption here is that you have tried, or you use, test-driven development, and you were promised this beautiful landscape, a beautiful, joyful journey through the test-driven
experience, right? But you feel like something was thrown in your face, and you're not really sure what all those people are so happy about, because it's painful. So what's going on? What I will try to do is show you the most basic, the most popular problems, and solutions for them, to get you up to speed within only 45 minutes. That's my limit.

So, the typical problems that people have when using test-driven development. First of all, some people try to test classes and methods, using unit testing for every single class and every single method. This is quite popular, and a lot of people who are starting with test-driven development think that this is what test-driven development is. Then they find out that there is a problem with it, and the problem is this. This is a class diagram, a beautiful class diagram of my production system. If you try to test every single class, you will have to mock all the collaborators of that class, and you will have to record the behavior of all these collaborators inside your test, for every single test and every single class. That means you will write a shitload of code, and it will take time.

It also means that whenever you try to add a new feature or change the behavior of an existing feature, there is a large chance that you will break the compatibility of the API of a class, because, hey, this is just a method, I'm adding another parameter, I'm changing something there. When you do that, a lot of unrelated tests will break, because they had that behavior recorded inside the mocks. Of course I made that mistake in my life. What happens is that you change a few classes and then 500 tests, completely out of the blue, not connected with what you do, break. You fix those 500 tests, you
introduce some other refactoring, and then another 2,000 tests break. So if you test classes and methods, this is way too low a level to be able to refactor anything or make any changes. Another problem with that approach is that even though you have 100 percent test coverage, your code will not work in production. Why? Because you have unit tested every single class on its own, not all of them together, not how they work and interact.

That makes a lot of people go in the opposite direction. They say: all right, if unit testing classes and methods is that painful, should we rather just trust the integration tests? Because if this works in an integration test, there is a high probability that it will work in production. So they go into testing only with integration tests, and it's way too slow to do test-driven development. For an empty Spring application, right out of start.spring.io, it takes 3.5 seconds on my machine to boot up and fire the test. For a microservice with a lot of dependencies, embedded ones and so on, it takes 22 seconds to start. I would not be able to do test-driven development if I had to wait more than a second. Why? Because anything longer than a second breaks your flow of thought.

People who do this end up in a situation where the whole application test suite runs for 45 minutes, and it's such a horrible experience that they do not fire those tests anymore and wait for Jenkins, or whatever continuous integration server they have, to do it.

So is there a solution to that? Because too low is bad and too high is too slow. Of course there is: there is this whole thing called modules, and you can
find a middle ground there, where you test every single module of your system using as many unit tests as you want, and then test the integration path with just enough testing.

Why do modules make sense, and why does testing modules make sense? Because the architecture of your system does not change that often. If you add a new feature, there is not a large chance that it will change the architecture. If you are lucky, the API of your modules will change once a month, probably, and if you are very lucky, then maybe once in half a year you will change the whole architecture. And it's OK if once in half a year my tests break and I have to change the mocks and refactor.

So what is a module? A module encapsulates its data and does not allow access to that data other than through its public API. A module has clearly defined collaborators and APIs, which makes it very simple to test. A module has all the layers, so this is vertical slicing, and it's very much like a microservice. In fact, in my team we tend to start with another module inside a microservice, and when the module grows we take it out and put it into a new microservice of its own. It's much faster this way, and not all modules grow that big. Modules are usually connected with a bounded context, which means that they have their own definitions of terms.

So what I will show you today is how to test your modules as black boxes using only unit tests, as many as you want, because these run in milliseconds. And if you need integration tests, then you should test only the crucial parts, the parts that bring the money. Why? Because if something breaks in production but it doesn't affect any money of your company or your client, well, let's say
that we broke a thank-you message, for example, the message shown after you bought something. People do not care about that message, they do not care about that screen at all. If it breaks, so what? I don't want to be woken up in the middle of the night just to fix it. I can come back during the day and work on it. So for that I would rather use monitoring. But the crucial part, where you have the money, you cannot break, because otherwise you will be out of business. This is also very natural, and I will show you how it works.

Let's work with modules based on a very simple example: an application where you are just browsing and renting movies, like in the 90s. Let's take the first module. When you start implementing something, after you have the architecture diagram, of course, because you have to think about architecture anyway, you have to talk with something, with some class; you have to talk with a module. I will call this entry point a facade. So for a film module, for example, I will have a FilmFacade. And I cannot create the FilmFacade by just calling the constructor. Why? Because a module is a bunch of classes. It may be quite big, it could be 50 classes altogether. If something is big and not that easy to configure, then you shouldn't put all of that into the constructor. So if there is a lot of logic in this, I will create another class, here a FilmConfiguration, that will give me a fully configured module for my test.

I'm doing unit testing here, so I write some DTOs, for example, and then I write the tests themselves. During the testing, of course, I discover a lot of things, because that's how test-driven development works. You discover, for example, that to be able to sell something in this module you have to be
able to add a movie to the catalog, otherwise you cannot take it out. So I write that, I discover more, I write more tests. Now I want, for example, to list all the movies, and I see that taking out the whole database is a stupid idea, so maybe I'll need paging, and so on. I add all the corner cases for everything else, and I end up with a bunch of classes to do that for me. In those classes I have Film, which is a domain object, some kind of a creator, a film type, the FilmConfiguration and the FilmFacade, which you have already seen; this is the entry point I was talking about.

To be able to put a movie inside my module and take it out later on, I need some kind of a database. But I cannot touch the I/O. The reason is that it's many times slower than working in memory, and if you touch the I/O you basically have an integration test, because you're integrating with the I/O, and it's slow. What I can do is implement the whole I/O part in memory, for example by writing an in-memory film repository. But how hard is it to implement a database in memory? It turns out that all you need is a hash map. That's pretty much it, and this is a full implementation of a database. It may not be the best database out there, it will definitely break if you put in more data than you have memory, and it has all kinds of issues, but for testing purposes it's perfect. It's even perfect if I want to run it on my machine and show it during a demo, or just play with the application on my laptop.

So I do that, and then the most important part: I have the FilmConfiguration with the method filmFacade that will give me the whole module configured, and I have all the domain code, the logic, to test my idea, whether this whole module makes sense. I'm working
really fast, because so far I have no Spring, no frameworks, no nothing. It's just plain Java, and anybody can work with plain Java and have a very fast experience, or Kotlin, or whatever. I have verified everything I wanted, I have all the corner cases covered, and this is really fast.

Now let's add the I/O. I want to add another test, and this time it's an integration test. This one is a horrible one; I'll show you how to write it better in a moment. In the integration test I do not test every single feature of the application. I add two movies to the module, for example, and then find one of them, inside the same test. This is usually thought of as an anti-pattern, because you have several things tested in a single test, but I do it here because I already have every single method, every single interaction point with the module, tested in the unit tests, and here I can focus on the flow. This verifies the integration part of the application, the part that brings the money. And by having just one test, I'm saving the time I would otherwise have to spend on, for example, making an insert into the database. I need the data only once, and then I can verify different steps on it. So I'm cheating a little bit just to gain speed.

After the integration test, of course, I have the I/O, which in Spring is very simple: I have the controller and the interface for a repository, which will be implemented by Spring Data, for example, and that's pretty much it. But now my configuration looks a little different, and here is the trick. I have added the @Configuration annotation from Spring, and I have the filmFacade method here. This is the method that is going to be called in the integration test, and also in production, and this part is building
the whole module. However, it needs the I/O: it needs the implementations of the classes that talk with the I/O. And I have another method which creates the whole module again, but this time with no parameters whatsoever, and this allows me to create the module with the in-memory versions of the I/O classes. The important thing is that this method should always call the other one. Why? Because if you build the same module in unit tests and integration tests in different ways, then you actually have two different modules, and you cannot be sure that it will work in production. Here the only thing I need to replace is the I/O, because it's too slow for me to use. And the clue here is not to let the I/O out during the unit testing part.

Why is that important? Let's say that we were to use this first method in the unit tests, passing in the film repository. We could just give it a mock, for example from Spock, or we could create the in-memory film repository in the test and pass it to the film facade. But the problem is, and I have seen how it works in practice with people, that if the developer has access to the internal state of the module, just by having a reference to the object that keeps the state, the developer will verify the state instead of the behavior of the module. We will not see a test that talks with the module to put a film in and then checks that we can find it. Instead we will see the developer insert it directly into the repository, or verify after inserting whether the thing is directly in the repository. This breaks the encapsulation. If you do this, then you no longer have the modules as black boxes, you're looking into the implementation part, and if you do this, then whenever the
implementation changes, your tests will break. So don't do that: keep the I/O out of reach and don't let the developer use it directly.

So if I'm replacing every single I/O part with my own in-memory implementation, because it's so easy, are mocks good for anything, or stubs? They are good when my module needs to talk with another module, because then I can test every single module independently using mocks, and this will allow me to take modules out into different microservices, shuffle them around, or replace them with some off-the-shelf product, for example. So if I have an article module that talks with two other modules, then I can give it mocks, and this gives me behavioral verification: I'm testing just the behavior of the module, each module alone. And this is basically behavior-driven development. It's not going through the UI, it's testing the behavior of your systems, and in this case the modules are the systems. If you want to go even further, with events, and you had plenty of talks on events yesterday and today, you should do that, because then the only thing that you will pass to the module will be some kind of an event publisher or event bus, and that's pretty much it. So that's the first trick: have all the tests on the unit testing level, per module.

The second trick I want to talk about is the problem of too much information. When you have too much information, it doesn't help; it's the same situation as when you don't have enough information. Let's have a look at a test that creates a problem. This is a real test from a real production system, and it's a good test in the sense that it works and it verifies the correct thing, but it's not very easy to look at. What happens here? If you look at the labels, and this is the Spock framework, by the way, then the test name says it should
find images by category. First: given an image in category X. OK, that's wonderfully easy, we have an image in category X. But then you look at the code and you see Rambo, X. Oh, that's interesting, why is it Rambo? Does it break if I put Commando there? Will they fight each other? What happens with that? Then you see some other code, and then there is an image in category Y, then a block of code, and then the third one, which is an image in categories X and Y. OK, so I have three images, but there is a block of code that stops me from seeing that, and I need to filter it out. And then: when we fetch by category X, then only the images from category X and from X and Y will be visible.

So can we do something with this? This is too much information. Without losing any information that is important for the test, we can refactor it to this: there are three images, one in category X, one in Y and one in X and Y, and you see that in the code; and when you fetch by category X, then you get the ones from X and from X and Y. This is the same information without any clutter, without anything that is not important for the test, and it is much easier on the eyes, much easier to understand and reason about.

The important distinction here is between implicit and explicit information. What should be explicit, what should be put inside your test, is the information that is important to understand the context and the requirements of the feature. Everything else should be implicit: if it's not important whether it's Rambo or Commando, hide it. For every single line of your test you should ask yourself whether it is crucial for understanding the requirement or not.

To give you a few examples: if I need a film from a catalog, I can just take the first of the persisted films, because I do not care which one it is. I just need one of them, that's it. If I need a new
release from a catalog, I can say: from the persisted films, find the one with film type new release. That's all the information that I need to show, instead of creating it with a constructor and so on. And if I need something more complex than that, for example an unpublished article, which is hard to get, then I move that out into another method whose name explains what it does, without showing all the gory details, which are unimportant from the perspective of this test.

To be able to do this for every single module, we found out that we need to prepare testing data, basically the inputs, and sometimes even the outputs, because the outputs of one module are quite often the inputs of another. So for every single module we create classes with sample data, to be able to do very fast testing and to modify the data as we need. How does it look? For example, if I need a new article in my system, I call a static method, sampleNewArticle, and this gives me an article with all the good information in it. But if I need an article with an illegal title, basically an empty title, I call sampleNewArticle giving it a title of null, without giving any other information, because everything else is already correct. I do the same thing for the other properties of this object and for the other objects underneath, and so I create an API with sample data that I can reuse to point out in the test just the important part.

How does it work underneath? Underneath I have an ugly method that takes a map of properties and adds it to the map of all the good data, which means overriding whatever I have defined there, and then I use the builder to do it in a type-safe way and create the object itself. I use a map there because it's a very
good thing, especially if you're using Groovy, because it allows me to create my objects, and I can also serialize it to JSON, for example, and use the same data in unit tests, in integration tests, and also for verification of the outcome if I want to. So after creating all the sample data, it's very easy to create new test cases for the unit tests.

However, there is one more thing. If you're writing integration tests, talking over HTTP and so on, you have another problem: every time you want to write another interaction with the system, you have to think about which protocol it was, what the URL is, what the parameters are. I want the system to do something for me, but I don't really remember what I should call. What usually happens is that people go to the controller, for example, or to one of the other classes, and look at what the strings are there.

Instead of that I have another proposal for you, and it works great for us: encapsulate every single common interaction in the integration tests inside, for example, a trait. It could be a static method as well, but I'm using traits, which in Groovy are just like interfaces with state. I group those traits by the names of the endpoints: I have "operating on article endpoint", "operating on article actions endpoint" and so on, so that when I open an integration test, I know exactly what it talks with. That makes it very simple to reason about. Then, when I want to post a new article, I don't have to think about what URL I'm posting to. I just call a method with a meaningful domain name, like postNewArticle, to create the article, giving it the minimum information that I need. If I want to update an existing article, I do the same. If I want to preview an article, I do exactly the same thing. So I'm hiding all the gory details. Underneath that I have my trait, which will have
all the methods that hide what the URL is, what the protocol is, what the content type is and so on. Underneath that I have another trait which even allows me to hide whether this is an asynchronous or a synchronous call, allows me to get the information out, and so on.

The important thing is this: when you're writing a system and you're writing an external API for it, like HTTP, for example, you need to make important decisions about what the protocol is, what the URL is, what the schema is, what the payload is, everything. But you need to make them only once, and you should make them only once, and not make the developer, which is yourself the next day, think about it again, because you don't remember it anymore. If you hide all of that underneath those traits, in methods, that will allow you to easily write more tests or explore a situation.

Here, for example, is an exploratory test which goes through the whole lifecycle of an article in an integration test, just to see what happens if the user clicks so-and-so. You will see that it is a sequence of calls: post new article, update existing article, and so on. When you see how easy it is after you have those details hidden away, you will be able to write as many different scenarios as you want, to see what happens in production when you have a bug report, for example, and this will be very simple.

OK, so now we have the testing data, we are testing only modules, and we have the gory details of the integration part hidden away, so we don't have to think about them again; once is enough. But there is one more technique that I would like to show you, which is very powerful. This technique comes from asking: what is the best way to explain to
somebody a new requirement? If you think about it, most developers will say: if I need to explain to you how the system should work, the best way is to give me a whiteboard. I will talk with you and I will draw some pictures. Can we do the same thing in our tests? It turns out we can.

This is another example from production. We were writing a test for what happens when you add a new category to a tree of categories, and the first test that we wrote looked like this. Again, this is a correct test, it tests what it needs to test, but it's a lot of code. The question is, could we make it easier on the eyes? Maybe add empty spaces somewhere? But if you were trying to explain to me what happens when you add a new category to a tree of categories, what would you do? Most likely you would draw a picture of the tree, and then you'd show me where the category comes in, or you'd draw another picture of the tree with the category in it. That's pretty much it.

Can we do that in tests? Of course we can. We start by creating the tree of categories. Here you have a tree of categories, where you see that the root is A and everything else is underneath. Then, when we want to add a new category, let's say a category G under C at position one, what I write is C plus G at one, which is the minimum possible information to explain what the test does. This creates only the DTO for the modification. Then, when I interact with my module to make it happen, the outcome can also be represented this way, and it's quite obvious what's going on: we have added G under C at position one. So that's very simple on the eyes, very easy to understand.

Now, what on earth is this? If you're using Groovy, then it's
just a simple thing, because we have to create some kind of a class which represents a node of our tree. As you can see, I'm not declaring all those things, A, B, C, D, E, F and so on, because it's not important inside the test. It is a very simple thing: it has an ID and a list of children, which are also category nodes. So it's a very simple construct. Then we have this part, which is calling a method without a name, and in Groovy that is a method named call. That's pretty much it, and we have a nice representation of a tree.

So what about this part, where we add something? This is also very simple, because it is just overriding the plus operator. I use the plus operator because it's very natural here, but you could use a method as well. Once again I create a static inner class, CategoryNodeAtPosition, which is just a category node and a position, and I override the operator there, plus an at method, and that's pretty much it. As you can see, this is a very simple DSL, with very simple classes, that allows you to express your requirements and the whole feature much better.

So what happens when we move a category? Because this mechanism works best if you reuse it. If I want to move a category, then again I draw a tree of those categories, and then I say: OK, I need to modify the tree by moving category B under F. And of course I get category B under F afterwards. This works better and better the more tests you have, and because of all the tests that I have for the module, and there are plenty of them, it pays off very well. And this is just the right shift operator in Groovy, which is just a method called rightShift. That's pretty much it.

OK, so what's another example of that? Because you can say: OK, that's great, you've got categories, but can I do it without Groovy, for example without operator overriding?
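
For the tree example itself, one possible answer is a plain Java sketch in which named methods stand in for the overloaded operators. This is an illustration under stated assumptions, not the code from the talk: `CategoryTreeSketch`, `Node`, `node`, `add`, `move` and `draw` are all hypothetical names, and `draw` is an added helper that renders the tree as indented text so a test can compare "pictures".

```java
import java.util.ArrayList;
import java.util.List;

public class CategoryTreeSketch {

    /** Hypothetical test-only node: an id plus an ordered list of children. */
    static final class Node {
        final String id;
        final List<Node> children = new ArrayList<>();

        Node(String id, Node... kids) {
            this.id = id;
            this.children.addAll(List.of(kids));
        }

        /** Stands in for the plus operator: insert a new child under parentId at the given index. */
        Node add(String childId, String parentId, int position) {
            require(find(parentId), parentId).children.add(position, new Node(childId));
            return this;
        }

        /** Stands in for the right-shift operator: move an existing subtree under newParentId.
         *  Sketch only: it does not guard against moving a node under its own descendant. */
        Node move(String movedId, String newParentId) {
            Node moved = require(find(movedId), movedId);
            Node oldParent = require(parentOf(movedId), "parent of " + movedId);
            Node newParent = require(find(newParentId), newParentId);
            oldParent.children.remove(moved);
            newParent.children.add(moved);
            return this;
        }

        /** Depth-first search for a node by id; returns null when it is absent. */
        Node find(String wanted) {
            if (id.equals(wanted)) return this;
            for (Node child : children) {
                Node hit = child.find(wanted);
                if (hit != null) return hit;
            }
            return null;
        }

        /** Finds the node whose direct children include the wanted id; null for the root or a miss. */
        Node parentOf(String wanted) {
            for (Node child : children) {
                if (child.id.equals(wanted)) return this;
                Node hit = child.parentOf(wanted);
                if (hit != null) return hit;
            }
            return null;
        }

        /** Renders one node per line, indented two spaces per level of depth. */
        String draw() {
            StringBuilder out = new StringBuilder();
            draw(out, 0);
            return out.toString();
        }

        private void draw(StringBuilder out, int depth) {
            out.append("  ".repeat(depth)).append(id).append('\n');
            for (Node child : children) child.draw(out, depth + 1);
        }

        private static Node require(Node node, String what) {
            if (node == null) throw new IllegalArgumentException("no such node: " + what);
            return node;
        }
    }

    /** Short factory so that nested calls read like a drawing of the tree. */
    static Node node(String id, Node... kids) {
        return new Node(id, kids);
    }

    public static void main(String[] args) {
        Node tree = node("A", node("B"), node("C", node("D")), node("F"));
        tree.add("G", "C", 1).move("B", "F");
        System.out.print(tree.draw());
    }
}
```

A test built on this could compare `tree.draw()` with an expected multi-line string, which keeps the picture of the tree visible in the test at the cost of slightly noisier syntax than the operator version.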
Sure. Let's say that we want to move articles when categories move. At Allegro, for example, articles need to follow their categories, because otherwise, with eventual consistency, you will not be able to find the articles anymore. So I'm declaring here six categories, which are not really categories; these are paths to categories. You can see the tree here: the root level is C1, and then you have different categories underneath. I also declare all the articles in those categories, so I have six articles as well. Then, what happens when a category moves with its article? I need to say that I want to move a category, so I reuse the paths: I say that this one is my source, and my destination is the path which goes through C1, then B2, then C2. And when I call it, I can verify, without drawing a tree this time, that the current path for my article, which is in category three, is now C1, B2, C2, C3. So again, by creating very simple classes and a very simple DSL, I'm simplifying how the behavior and the requirement are expressed in the test.

And you may say: OK, Jakub, this is wonderful, but we are not working with trees, so go away, this doesn't work for us. But think about how many places you can apply this pattern of using the whiteboard. For another example, if you were to create a system for car sharing and you needed to find the closest car to your position, what would you do on a whiteboard? I would draw a map. Do I want to draw a map in ASCII art inside the test? That might be a little bit of an overkill. So what else can we do? We can do this: let's say that our position is (0, 0), for simplification, and let's declare four cars, one with position 2 and 2, the
second, B, with 1 and 1, and so on. You can clearly see which one is the closest to the position 0, 0. Then let's talk with the system and tell it that, okay, these cars were detected, they are registered, but only two of them are available. And it's obvious right away that what you want to get, if you search for cars from the position 0, 0, will be B and A, in this order, because of how close they are. So you don't have to draw a map, but you create something yourself, just a representation of the logic inside your module, right?

This whiteboard approach is very simple. When you are writing a new module, think about whether you can represent the same thing in the test with something different, by creating a small DSL. You can think about it like this: okay, if I needed to explain that to you using a whiteboard, what would I draw? It simplifies the requirements a lot; it's really worth it. And the point here is very simple: you have to remove every single object that clutters the image, so that the image is crystal clear and you send the message across without any other clutter, right? It's exactly like in this picture, where you see a man attacking a river with an axe, which is obvious because there is no other object that would draw your eye away, right? So this is really the essence of working with good unit tests, good tests, right?

If you follow that, then you may think: okay, so now we are writing domain-specific languages for each module, because each module can have a different domain-specific language, right? What if we don't need to do that? Maybe we should just go as high as writing all the tests in plain English, for example, right? And there are actually two different types of testing frameworks. One of them is like Cucumber and JBehave, which is writing a test in plain English and then writing
some glue code underneath, right? And the others are like Spock and JUnit, where you mix the explanation for the test, which is the comments, together with the code, right? The first one is the very high abstraction level, which is: okay, this is my perfect DSL, only English; whatever I talk about with the client, I will use it there. So there are some benefits of using the frameworks with plain text, but the main benefit is that the business is supposed to actually work with those tests. Because this is plain English, business should be able to write them, to modify them, and to do exploratory testing, which is verifying things themselves without asking you for anything.

Now, from my experience, this never happens. Okay? This never happens. Why? Business does not actually want to write anything; they would rather call us and explain to us what needs to be done. Why is it so? Well, the reason is that business people are not trained in logical thinking. Okay, thank you. You do not need proper logical thinking and strict logic to do business. The real world is messy, it's not binary, so basically you can scream a lot and somehow things will get faster. Try screaming at the computer: nothing gets faster. If you break the logic in the real world, things will still happen. If you break the logic, the compiler will never forgive you. It's that simple. So we are trained in logical thinking and in strict logic, right? Business people are not. They tend to get confused and angry very fast if they have to work with those files, even though it looks like plain English, just because there is a compiler underneath. So they prefer us as an interface: they explain to us what they want, and we implement that in the code, and this works best. So the main advantages of writing the tests in plain English, I have never seen them in real life; I have never seen them used this way in real life,
because business doesn't want to use them. And there is also the disadvantage of actually writing all the glue code; it's a little bit harder. So with Spock, where the description is together with the code, it's much faster from my perspective to write and maintain the tests, because you can see all those things together, and the outcomes of those tests are just as readable as the feature files in plain English, if you use labels. So if you're using Spock, I recommend a plug-in called Spock Reports, which creates beautiful reports of the tests that you can send to the client and say: hey, is this how the system should work? Why not? Business cannot modify them, but that's okay. We don't want the business to mess with the code at all, and business doesn't want to do that either.

So if you think about which one is better, the question here is actually about the level of abstraction you want to have. By writing a DSL for a module, you want a level of abstraction that is high enough to help you focus on the important information, so that you see everything that is important and nothing beneath, but low enough at the same time to help you work with it, because you have to write the code, right? If it was plain English, and you have some other interpreter underneath, and then there is a compiler underneath that, then you have two levels of abstraction before you actually touch the code. With those frameworks it's much easier to do this, right?

So now we know how to write all the unit tests: have the modules, have the sample data, have the interactions, and write proper tests that look very nice. But still, sometimes your modules and your systems will get slow because of how many interactions with the I/O you have. For example, I have a module that needs to talk with Elasticsearch, S3, AdWords and Kafka. That's a lot of I/O, right? So I can test all of that, but even my
integration tests, even if I do the happy paths only, the part that brings you money (I will not test all the corner cases in the integration tests, because I have them covered in the unit tests), still have a problem of running way too slow. Okay, and the question is: what can we do about it?

Well, there is actually a very simple trick that I can show you. First of all, you have to think about which of those I/Os change together with your domain code the most, right? For example, does my AdWords client change a lot when I do refactoring in my code? Not really; this is pretty much set, right? Do the events that I send to Kafka change when I work with my domain code? Not so much, because these are the events that are consumed by all the different parts of my company, which means any system can touch them, and I use Avro as well, so I would need to publish a new version of the schema. So I try not to change that too often, because other systems would have to implement something to be able to consume the new version, or just ignore the new version anyway, right? So these two do not change so much. Same with S3. And even Elasticsearch does not care that much about my objects: you can throw everything at it and it will just index it, and the only problem with it is whether you want to search the same way as you did at the very beginning.

The thing that changes a lot when I change the code in my module is actually MongoDB, right? Because I need to migrate the data, I need to think about what happens underneath with the I/O. What that means is that I can have all the integration tests with MongoDB, because this changes a lot and I need to fire them often, but I can do something else with the rest: I can treat all the other endpoints and all the other mechanisms as libraries. So I have an API for AdWords and I have an API for Elasticsearch. I can
move my whole API, the whole code that talks with this and does the integration, into another JAR. And I have it completely tested out there with only integration tests, because why would you need any unit tests for integration with Elasticsearch? Does it even make sense to have unit tests for testing Elasticsearch? You want to test whether this thing works or not. So I move all the integration tests out there and do only integration testing inside that JAR, and then I can use this JAR in my project as a library. Whenever you take a library from the internet, you're not testing the library. Or maybe you are testing the library just to find out how it works, or if it works at all, especially if it's a Node.js library. But after that you don't have to think about it, because you assume that there are tests and all the corner cases are covered, right? So you can do the same with your own code, moving everything into different JARs.

And then of course I have an in-memory version of my Elasticsearch, right? Because it's very simple: basically Elasticsearch does the same job as any database, so again you need a hash map and you're pretty much set, and so on for every single library. The point here is to move it outside of your everyday build, because you don't want to pay the price, every single time you build or test the application, for having Elasticsearch somewhere on the classpath. That's pretty much it. So that's a very simple thing that you can use to help you remove those slow tests.

And then one person in Rostov actually asked me: you know, that's all great and cool, but our tests still run very slowly. And I will show you how tests run for me. So here you have a production microservice, one of the microservices. Here you can see there are about two modules in this microservice. I have my microservice tests: 478 tests running in under five seconds,
right, which is about the average I usually get. Okay, but if I want to have integration tests, even without Elasticsearch and everything else, I will have a situation like this: these are my 133 tests running in 50 seconds. Now, I'm okay with waiting one second for a single test when I work with it, and I'm okay with waiting up to 10 seconds for the whole suite of tests. But if I have to wait 50 seconds every time I fire those tests, that's not good enough for me. So what I do is use the integration tests for testing only the crucial parts that bring me money, after I've had all the unit tests done. So I fire this guy only once in a while, right? And this allows me to go fast with development.

Okay, so the guy I mentioned asked me: that's all beautiful, but you know what, we are using test-driven development, writing a lot of unit tests, and our unit tests are slow even though we do not touch the I/O. And I said: wow, how is that possible? How many tests do you have? And he said: well, ten thousand. I was like: you have built a system that has got ten thousand tests? You shouldn't have done that. You should have built several systems that have far fewer tests than that, because what is the chance that you actually have to work with all the code base at once? Zero. You cannot fit it into your head. This is why microservices were invented, or rediscovered, to be honest, right? Because you cannot use all that code at once anyway, so why put it all together? That's pretty much it.

And your goal is to keep the unit tests under 30 seconds; actually, I would say under 10 seconds. Because it turns out, and there is research from 1993 by Jakob Nielsen about the importance of time, that everything that takes longer than a second breaks your flow of thought, right? So if you are a programmer and you are in the flow, waiting more than one second is a flaw;
if you wait more than ten seconds, you lose interest. What that means is that you're on Facebook now, or you're on Twitter, or maybe you went for a coffee or something else, right? So if you have to wait more than 10 seconds for the whole suite, that's a problem. And if you have to wait more than 30 seconds for the whole suite of unit tests to run, you will most likely not fire that suite anyway, because, well, 30 seconds is a lot of time; I can have like three Facebooks open. With integration tests it's also pretty bad, because there was a gentleman from Sabre, I think in 2010, showing us the research they did there: if you have integration tests running for more than three minutes on the developer's machine, the developers tend not to fire those tests anymore and just throw it to Jenkins and say: yeah, well, continuous integration will fire this and tell us what happens. And if you have integration tests that take four to five minutes, I would never fire them on my laptop, because what else would I do during that time, right? In four to five minutes we can have a meeting, which is a waste of time. So remember that you have a budget for tests, and you have to be careful with this budget, and this budget is very, very small. For all the tests together, the integration and the unit tests, three minutes is the absolute maximum. Okay? Aim for that.

So, to summarize the things I've shown you. Focus on testing modules, instead of classes, which is too low, or doing integration tests on the whole system, which is too high and too slow. Basically, test the behavior, that's the black box, and not the implementation. So don't let the implementation leak out, and do not touch the repository in your tests. Prepare sample data for every single module and reuse it as often as possible; your experience will be much better with that. Hide the API
for integration under meaningful names, meaningful from the domain perspective, right? Build a small DSL whenever you think that you can express something better. Think about how you would do it if you had a whiteboard: what would you draw? And maybe you can build a DSL for every single module, because each module can have a different DSL, since it's usually a bounded context, right? It can even have a completely different architecture as well. So think about explaining this at the whiteboard. Extract all the slow integration tests that you have with the I/O and move them to JARs; create libraries for that. See which things you do not need to change that often but actually pay the price for on every single build, because it slows you down, and if it's something that you can move away, move it away. Work with the 30-second maximum budget for unit tests, or maybe 10 seconds, which I would prefer; in 30 seconds you can easily have 3,000 unit tests pass, no worries, right? And work with a three-minute budget for integration tests from the start.

And remember that tests are the specifications, they are the requirements, so you should really pay attention to how you write them, to express the requirements and to be able to reason about what should happen inside the system. It will also help you communicate with your client. And expect this job of refining requirements and creating the scenarios to be very, very hard, which I talked about yesterday. So if you haven't been at my talk yesterday, you can watch it on YouTube, I suppose.

So that's pretty much it. I would like to thank you for being here. I would like to thank my team for embracing all these ideas, and actually using them and improving them all together. Guys, you are awesome, right? If you need this presentation later on, it's already on the web, and if you need a job, you can contact me as well. Thank you. Questions? Do we have
any questions? There is a question.

Hello. Okay, you prefer to use the in-memory repository. What about testing the real repository? Do you think it's needed to test the real repository, with a database, the implementation of the real repository?

Yes. So I do test that, because these are the integration tests, right? And I'll show you how it works. Here I have the in-memory repository, right, and I have the unit tests. So far everything is unit tested. But then I write integration tests, and here I talk over HTTP, and I start implementing this, and I talk with the HTTP endpoint and with the real database that I have in memory. In my case this was MongoDB, for example, or maybe H2 if I use an SQL or relational database, right? And this is pretty much it. I don't need to test the repository separately, because the flow goes through it and all the SQL and all the queries will be fired here, right?

Okay, thank you. So what about getting to the point that your DSL is so complex that you actually would need to test the DSL? Or, in a less clean approach, I have seen tests that have this plumbing method that goes on for like 300 lines, and you can read the test, but do you really believe you know what's happening in the plumbing method?

Okay, I'm not getting the point here. What is the problem, something very large? A long-running transaction, or...?

No, I mean, once you develop that DSL, it gets more and more complex. So what if bugs start appearing in your DSL instead of, you know...

That's a very good question. What if we created the DSL and it's so hard that... how do you test your tests, basically, right? The answer is: you don't. And the reason why you don't is that the chance for you to screw it up twice, in the unit test, or basically in the DSL, and again in the production code, is very, very low. So you can just not worry about
it. And there is another reason for that: you still need monitoring, right? Test-driven development is no reason to just throw away your monitoring; you still need monitoring on production. But how often do I have a problem where I've made a mistake in the test and I also made the same mistake in the production code? Maybe that happens once a month, once in two months; I don't remember exactly, because I don't remember such cases, they are so rare. So I wouldn't worry about that. But if your DSL is growing very, very large, the question is: maybe your modules are too big.

I have a question about those in-memory implementations of I/O-related stuff. What if you want to use some more sophisticated features of your backing database, so that it's not feasible to implement them just with a trivial hash map?

So this is a very interesting thing, because actually this is what I expected. I expected it to not be so simple to implement, for example, an in-memory version of a database, because I have all those queries which are very, very strange, right? It turned out not to be true, and not to be a problem at all. And the reason why it turned out not to be a problem is that you are in memory only, in memory and CPU. So if you need to find something, if you need to create a hard query, for example, or create a graph, say you are using PostgreSQL and you need the graph part, or you need a JSON store, or you are doing anything fancy, or maybe just going with pointers and streaming the data, right, all those things are very simple in plain Java if you do not care about the size. And in tests you should not have a lot of data. Why should you not have a lot of data? Because every single object that you create will cost you something. So you need only the data that you have to have to actually verify your
assumptions. That's pretty much it. So it turned out that whenever we need something fancy, and maybe I don't have a fancy example here, all we need to do is stream, filter, map and so on, and it's done, right? And you only have ten or so elements in the in-memory database. So I haven't had the experience of requiring a specific behavior that I couldn't model in plain Java. It's always the other way around: in plain Java it's very, very simple, but if you want to have that on a big data set, you need a real database. That's it.

So, one comment to that. From my experience, what comes in handy in such situations are contract tests, where you have the same test suite for multiple implementations, so you can test your fake implementation and your real implementation with the same suite of tests. They share the specs. For instance, in JUnit it is possible to have an abstract test class which specifies the test cases, and then the implementations do all the setup required, so one implementation can run with a Spring runner and the other in memory.

Yeah, so I haven't shown you this here, but actually I use the same approach. When I write the module and I have everything unit tested, right, I write those acceptance specifications, or tests, there as well, and then I copy and paste them to the integration tests and I just write the integration code to interact with the system, but it's basically the same test. We found out that it's better for us. At the very beginning we thought: okay, if we have the acceptance specifications in the integration tests, maybe we don't need them in the unit test part, and we just need to add the corner cases there. But that's not true, because I want to have them as unit tests, and the unit test comes first; this is the first thing I produce. So we turned the situation the other way around. So all my tests are actually run, as you said, against the in-memory version and the real
database at some point. These are two separate tests for me, but they test exactly the same behavior. How do I keep them in sync? I don't. And I don't keep them in sync because they never break separately: if it breaks in the unit test code, or basically in the pure Java domain, then it will break in the integration test as well, and if something breaks only in the integration tests, then you know exactly that this is an integration problem, because for example your query is wrong, or your integration part with the I/O is wrong, right? And this is what you actually want to have.

More questions?

I have a question about testing with in-memory databases. We encountered some problems there. For example, we were using PostgreSQL, but for the tests we used H2 in memory. What about keeping the schema? Because, for example, you can have more complex things done in Postgres which H2 doesn't support. Do you keep two versions of the schema, or what do you do?

So first of all, when I say in-memory, I do not mean the situation where my in-memory database is H2. I treat that as I/O, to be honest, right? So when I talk about an in-memory version of my I/O, I'm talking about just a plain Java class, right? It takes time for H2 to actually get up, and this is way too slow for me. But testing H2 versus, for example, PostgreSQL: right now I don't have that situation, but before I had it. And sometimes, if you are very fancy with your database, and you're using Postgres because PostgreSQL is beautiful and it has got so many features, then actually what you should do is stop. Okay, there are many approaches for testing, and one of them is to have your database start up in memory, and it's beautiful because you control the whole thing, so it goes together with the code. Whenever you're working with the code and you want to, for example, see how another branch works, you just move to that branch
and you have a proper schema for that branch. But this is not the only way. The other way is to have a single database inside the company, all right, for your project. This is a testing database which is usually immutable: you do not modify the data there, and it's very simple, because you just do a rollback after each transaction, right, and you test against that. Why would I write an integration test against H2 if the integration that I really want to verify is with PostgreSQL, right? The only reason why I would use H2 is that I trust H2 to behave exactly the same as PostgreSQL in simple cases, which is the case for me. But if I had something more advanced, I would throw H2 away. Why would I need it, if I'm going to use advanced features from PostgreSQL?

And if working with PostgreSQL is a problem for you inside the integration tests, because the PostgreSQL is somewhere out there and it may have a different schema (although Flyway and Liquibase will help you a lot with that, you cannot change the schema just for one test, and if, for example, two people are working with one database and one of them changes the schema, it also changes for the other guy), then you may think about whether you want to take this single feature of PostgreSQL that you're using, let's say the object store or whatever else, treat it as a library and hide it underneath. In your integration tests use only the in-memory version, one that works on H2 or just streams the data, for example, and filters and maps everything, but have an external JAR for PostgreSQL where you will have all the integration tests against PostgreSQL. Then you can even slow things down and use Docker, fire up the real PostgreSQL, feed it with data, and have a very, very slow test, but you will not touch that on every single build, so it doesn't matter that much.

Okay, if that could be the last question, then,
because we have to clear out the room. If you just want to talk to Jakub, you can just come up to him. Thank you. [Applause]
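The in-memory repository and the shared-spec idea from the questions above could look roughly like this in plain Java. This is only a sketch under assumptions: Article, ArticleRepository, InMemoryArticleRepository and ArticleRepositoryContract are hypothetical names that do not come from the talk, and a real-database implementation would be handed to the same verify method from a separate integration test.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical domain object: an article stored under a category path. */
final class Article {
    final String id;
    final String categoryPath;

    Article(String id, String categoryPath) {
        this.id = id;
        this.categoryPath = categoryPath;
    }
}

/** Hypothetical port that the domain module talks to. */
interface ArticleRepository {
    void save(Article article);
    Optional<Article> findById(String id);
    List<Article> findByPathPrefix(String prefix);
}

/** In-memory version: a hash map plus stream/filter instead of a real query. */
final class InMemoryArticleRepository implements ArticleRepository {
    private final Map<String, Article> store = new HashMap<>();

    @Override
    public void save(Article article) {
        store.put(article.id, article);
    }

    @Override
    public Optional<Article> findById(String id) {
        return Optional.ofNullable(store.get(id));
    }

    @Override
    public List<Article> findByPathPrefix(String prefix) {
        List<Article> result = new ArrayList<>();
        store.values().stream()
                .filter(a -> a.categoryPath.startsWith(prefix))
                .sorted(Comparator.comparing((Article a) -> a.id))
                .forEach(result::add);
        return result;
    }
}

/** Shared spec: every implementation handed in must pass the same checks. */
final class ArticleRepositoryContract {
    static void verify(Supplier<? extends ArticleRepository> freshRepository) {
        ArticleRepository repo = freshRepository.get();
        repo.save(new Article("a1", "C1/B2"));
        repo.save(new Article("a2", "C1/B2/C2"));
        repo.save(new Article("a3", "C1/B1"));

        check(repo.findById("a2").isPresent(), "saved article should be found by id");
        check(!repo.findById("missing").isPresent(), "unknown id should give an empty result");

        List<String> ids = new ArrayList<>();
        for (Article a : repo.findByPathPrefix("C1/B2")) {
            ids.add(a.id);
        }
        check(ids.equals(Arrays.asList("a1", "a2")), "prefix search should return a1 and a2, got " + ids);
    }

    private static void check(boolean ok, String message) {
        if (!ok) {
            throw new AssertionError(message);
        }
    }
}
```

A unit test would call `ArticleRepositoryContract.verify(InMemoryArticleRepository::new)`, while an integration test in the separate JAR would pass a supplier for the database-backed implementation, so both run exactly the same expectations.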
Info
Channel: Devoxx
Views: 30,396
Rating: 4.9120135 out of 5
Keywords: DVXPL18
Id: 2vEoL3Irgiw
Length: 59min 59sec (3599 seconds)
Published: Sun Jul 15 2018