>> HEVERY: So today, we're going to talk about
how to write untestable code because we're all so good at it. How do you write hard to test code? >> [INDISTINCT] analytic. >> HEVERY: Some analytic, okay. >> Non-deterministic. >> HEVERY: Non-deterministic, exactly. Okay. That's good. So, it is something interesting, right? When I ask this question in an interview,
most people have a really hard time answering me what--how exactly they would go about writing
hard to test code, even though the code they write is really hard to test. So, we are intrinsically good at this, even
though we don't--ourselves are not--we don't know how exactly we're good at these things. It's kind of like a spider weaving a web,
you know. It just knows how to weave a web and you can
ask him, "How did you do it?" And the spider says, "I don't know. It just kind of works." So, this is what people normally say. Like, make things private, use the final keyword,
have long methods as you pointed out, do stuff monolithically, that kind
of goes with long methods, et cetera. Non-determinism, that's a good one too that
I don't have. But here are the real issues of
unit testing. One is mixing the new operator with
your business logic. I'll get to why exactly that's a problem
in a second. Looking for things, and we do this in our
code all the time, you know. Doing work in a constructor, that makes it
so that it's really hard to instantiate things inside of your test. Having a global state, which is essentially
where all of the uncertainty comes from. Singletons, which are just another name for
global state. And static methods, which are essentially procedural
programming. And one thing that you can think about is,
suppose somebody gives you purely procedural code, how would you test it? And it turns out that I have no idea how to
test purely procedural code. Because in order to test something, I need
to isolate something. And in order to isolate something, I need
to have some kind of a seam. And the seam in the object-oriented world is polymorphism
coming into play, something that I don't have in procedural code. Yes? >> So, I've--maybe I don't understand. Why exactly are static methods hard to test? I guess, I'm [INDISTINCT]
>> HEVERY: Because you don't have a seam. >> [INDISTINCT] two to three years [INDISTINCT]
>> HEVERY: So, here's the kind of the problem. I don't want to get into it too much because
there's a couple of slides later where we're going to cover this. But the basic issue is this: if you have
a leaf method such as Math.abs, piece of cake to test, right? Because it's a leaf, it doesn't call anybody
else. But if you have a method that is way up in
a call hierarchy and you're trying to invoke that method and you want to prevent that method
from calling--I don't know--a database or something like that, there is no way for me
to prevent that call from happening because all the methods are static and there's nothing
for me to override. So, yes, in the simplest case, when you have
a simple leaf method like an absolute value or such a thing, piece of cake. But when it has to be a more complicated program,
the answer is no. So, the worst thing is trying to test a main
method. You're trying to test your application from
the main method? Good luck. Chances are, you cannot do it. So, we don't really know how to do it. The other thing is deep inheritance hierarchies
because it's essentially the same problem. I cannot divorce myself from the inheritance
in a--in a--at runtime, right? At runtime, I would like to build--instantiate
a small portion of my application. And if the test I want to--if the class I
want to test is X and X inherits from A, B, C, D, E, F, G, then whether I like it or not,
I'm testing all the other classes as well. And so inheritance--deep inheritance hierarchies
is something that makes it really hard. So, notice this whole list. Most people actually cannot answer the question
of what makes code hard to test even though we do this all the time. So, we kind of want to talk about this a little
further. Yeah, and the last line I forgot, our good old
favorite, too many conditionals, which is the if statement. But anyways, let's get down into this further
on. This is a really long list. So, here's a thing, what can I tell you about
writing tests? It turns out, nothing. Like, I cannot teach you anything. There's no magic to writing tests, absolutely
not. There are a couple of frameworks--I mean, there are
a couple of tools like the EasyMock framework and stuff like that. But for the most part, there is no secret
knowledge I have about testing, none whatsoever. What I do know a lot about is how to write
testable code, and that kind of is the core of the problem, which is most people assume
that I'll write code and I'll throw it over the wall and here comes my test engineer and
he'll write some tests, except at that point, it's too late because the code is already written
in such a way that the test engineer cannot write a good test. It's too late. And it makes it kind of worse because the
place where the mistake is made, which is writing the code, and the person who feels
the pain of hard to test code are not the same people. And as a result, it's really hard to kind
of affect change in an organization like that. And so, we're real kind of guilty of this. >> Is it done? >> HEVERY: Yes. >> But for unit test, it should be ideally
the same people, right? So, the people who will do that... >> HEVERY: Yes. Exactly. So, you'll do the test... >> So, are you talking about other tests as
well or...? >> HEVERY: No, we--this is kind of an introduction. So, absolutely, we want to get everybody to
unit testing level. And in the unit testing, it has to be the
same person. I'm just trying to point out how the other
kinds of test don't really work. So, we're going to talk about it as a unit
test in a second. So, just hold on one second. So--glad to know that you were ahead of the
curve, though. It's good. So, what can I tell you about writing testable
code. Well, things that I can--we can talk about
what Good OO is and how it helps testing and we're going to delve into this thing a little
bit, and we can also talk about something we call dependency injection. Sometimes I feel like with dependency injection,
I'm selling snake oil because it fixes so many things. But it does actually work. Of course, there's Test Driven Development,
which as you point out, and we want to take the unit testing folks together. You want to make sure that the person who
writes the test and the person who writes the code is going to be the same person, and
you want to go definitely into the unit testing route, which we'll get to in a second. So, here's the thing. There is absolutely no secret to writing tests,
none whatsoever. The only secrets are about writing testable
code, and that's kind of what we want to talk about. And it sounds like you're already ahead of
the game and you already know that the answer is unit testing; but most people
aren't actually that far along; most people are still stuck on the premise
that there is just some secret sauce to testing, which there isn't. So, how do we--I like to think about unit
testing. Imagine you want to test a car and somebody
says to you, "Please test the car for me and make sure the car works." Everybody who's new to testing will immediately
say, "I know. I'm going to build a framework in the context
of a car. I'm going to build something where the car
can sit on top and I'm going to build some machine that will pretend to be a driver and
will turn the steering wheel and push the brake and the gas and play with the knobs,
and that's how I'm going to test that the car works." This is what basically a scenario test is. And a test like that is actually pretty cool
because it does actually prove that the car works. The problem with it is the execution, and
that is, these tests are horribly slow because, let's say you want to prove that the car--all-wheel-drive
system works correctly. Well now, you have to get the car into where
there is ice. So now, what are you going to do, drive it
into a refrigerator? And then, you want to make sure that the car
can--it doesn't overheat at, you know, really hot temperatures in a desert. Now, what are you going to do, drive it in
the oven, right? Even if you can do that, even if you can build
all these frameworks, these things are going to be really, really slow and they're going
to be flaky. There's so many things that can possibly go
wrong. You're testing the whole system end to end. Like, maybe the oven's broken, maybe it's
not the fault of the car. So, the problem with scenario or large scale
test is that they're flaky. They're slow and then flaky. And it's not uncommon for you to have to take
several hours to execute all of your scenario based test, not very useful. So then, you say, "Well, maybe we can do something
better." So, we kind of mentioned this that you discover
basically when your tests are slow and you discover that tests are flaky, right? We kind of covered this. So, this is the kind of the first stage of
unit testing. The people who have discovered, "Hey, test,
good." So, we can automate this thing. Here's the good thing. You have that really high confidence when
things work that the thing actually worked, that the car actually worked, right? Whereas, if something goes wrong, you're not
really sure if the problem is with the car or if it is because the designer has moved a knob,
you know, one inch to the left, and all of a sudden, the framework can't grab anything. You're just really not sure what exactly went
wrong. And a lot of these things are just flaky,
and you don't really know why because there's so many variables coming into play. So, it's really hard to reproduce failures. So, suppose it's flaky and it fails and
you're like, "Okay, let's do it again." And all of a sudden it works this time and you really
have no idea how and why and so on. So, I'm just pointing out how troublesome
that is. So then, you kind of think about it and you
say, "Well, maybe instead of testing the whole car, I can break the car down into parts and
test them individually. So, maybe instead of pretending to be a driver
where I turn on and off the radio, maybe I can take the radio out in isolation and hook
it up to maybe an oscilloscope for the output of the radio and remove the knobs. And instead of the knobs, put some kind of
analog-to-digital converter that directly controls the knobs. And now, I can test the radio in isolation
independent of the car and I can bake it, cook it and do all kinds of things where the--to
make sure that the radio works just as we planned." We're going to do the same exact thing for
the engine, for the transmission, and for any other component in the car that's large
scale component. And so, what you discover is things get a
lot better. Again, when you--when the things are green,
you're pretty sure that the thing works. When it's red, you're also pretty sure that
things aren't--something went wrong because you took so many variables out and you only
have these large scale systems, that you're pretty sure that things are just--something's
broken. The thing--the problem with that is, suppose
the radio doesn't come on, like, good luck figuring out which part of the radio is broken. The engine doesn't start, good luck figuring
out what exactly is wrong with the engine. Like, it's much better than having the whole
car and pretending to be testing that, but it's not quite what we want, right? So, we call this medium level test or functional
test because you take a single functionality and try to test that in isolation. From a software point of view, this is kind
of like taking your app, and instead of talking to the outside servers, like the authentication
server, you're going to replace them with an in-memory fake LDAP or something like that, which auto-authenticates
and so on. So, you basically focus down on individual
pieces and you test them in isolation. So then, you say to yourself, "Wait a minute,
if going from large scale testing to medium scale testing, we got better at this; maybe
we can go even further and go down to individual components." And in the world of software, that's individual
classes. So, instead of testing that the engine works
as a whole, maybe I can basically have individual test that verify that the piston is [INDISTINCT]
of the correct shape, that the oil is present, that the sparkplug has the correct clearance
and so on and so forth. And I'm just individually testing all these
pieces. And I know that if all of those pieces are
correct, then I am very, very confident that the engine will actually start. And if I discover a case where the engine
doesn't start, I can always go back and figure out what was the root cause, and I got to
test for that root cause. So, it turns out that these kinds of tests
are great because they're super fast, right? This is our unit test. From a software point of view, this is where
you're testing individual classes. They're really good because you have really
high confidence. The tests are really fast. We went down from several hours to run and
verify that the product works down to seconds, literally seconds. And now, you can do crazy stuff. You can say to yourself, "Maybe I can hook
up my save button. So, every time I save the code, it just runs
all the tests because--they're a couple of seconds, what's the problem?" So, imagine writing code where you just code
along and say, "Yeah, okay. I think I'm ready to save." Ctrl+S and you know immediately if you broke
something or not. It's a nice world to be in, right? And then, really--the other nice thing about
it is, if the test fails, it directly points to the cause, right? If the sparkplug clearance is not good, you
know exactly what needs to be replaced, like there's no question about this, right? So, if a function that is supposed to be doing
sorting fails and it doesn't sort properly, like you know exactly where to go to look
for the error. Like, in most of the cases, you don't even
need a debugger to figure this out. So, this is the promised land of unit testing. So, as I've said, most people when you first
tell them, "Write me an automated framework for testing," they'll immediately think, "Oh,
I got to pretend to be a user and I got to write some kind of a framework." And we call it a scenario based testing. And there's so many problems with that that
I think your effort is better spent on unit testing. If you had unit testing and you have nothing
else, you are way better off than if you have just scenario testing. Now, of course, you're better off if you have
unit testing and functional testing and a little bit of scenario testing. But for the most part, you want to have unit
testing. Now, when they build a car in the factory,
here's something fun that happens. They put the car together, they have individual
tests that prove that pieces work and they have one final test. And the final test is they take the key, they
shove it in the ignition, they turn it and they drive it to the parking lot. If that works, that means a lot of things. That means that the battery got hooked up
and it's charged, right? It means that the steering wheel's hooked
up and there's gas in the engine and so on and so forth. There's a whole long list of things that kind
of means that it kind of works. Now notice, we didn't prove that all these
things work under all conditions. All we're proving to ourselves is that they
got hooked up properly together. And that's the purpose of a scenario test. You just want to kind of make sure that things
got hooked up properly together, and you have separate unit test to prove that all the pieces
work. And you kind of have functional test to kind
of prove that individual related pieces work. Like the radio works in isolation, the engines
work in isolation, the transmission works in isolation. So therefore, I just want to make sure it's
hooked up together, and I'm pretty confident the whole system is going to work as well. So, all of these different levels, as I
say, are important because they all have different probabilities of finding a
bug, and the bugs are of different kinds. As I said, unit testing is all about "it doesn't
do the right thing," whereas the other extreme is "it isn't hooked
up properly." And then everything in the middle is kind
of, you know--the medium test, they kind of test a little bit of both, but again, we just
want to have these kind of extremes. We don't want to test everything at once
because it becomes hard to test. It turns out that there's a way to
code so that you separate out the hooking up problem from the functional problem. And that way of coding is actually called
dependency injection. We'll look in a second at why that is important. We kind of already touched on this but I'm
just going to cover this again. And that is, you really want to have a large
number of unit tests. Typically, the number of unit tests is going
to be semi equivalent to--or rather, the number of lines of unit test code is going
to be approximately equivalent to the number of lines of production code. That's pretty much, you know, give or take,
you know, in the same ballpark, which also pretty much translates to roughly the
same number of test cases as methods you have. But that does not imply that you actually
want to have one test case per function. You just have approximately the same number of
test cases. You want to have a lot smaller set of functional
tests that kind of test that the individual subcomponents kind of work together properly. There, you're starting to get more into the
business of, "Is it hooked up properly? When I pass this object to this object, does
the other object expect to get it in the correct state?" That's kind of what you're testing over there. And then the scenario test purely is a test
in the form of, "Do the pieces kind of talk to each other the way we expect? Can a server come up in isolation, kind of
a thing?" We really don't go into the details of replicating
things. I'm going to skip to where it is and I'm going
to come down to here. So, unit testing. We decided in here that unit testing is a
good idea. So, you have a test driver, JUnit; and
you have a class under test. And you apply some stimulus to the class under
test, you call some methods on it, and then you assert that something expected happened. Piece of cake? Easy? So, why are we having this discussion? Why is this so hard? What's the problem with this model right here? Yes? >> [INDISTINCT] has dependencies? >> HEVERY: Things often have dependencies,
exactly. So, the really--reality of it is that the
class under test usually has these other classes that it depends on. And guess what, those things depend on other
classes. So, I do something benign. Like, I say "new class X." And then, the constructor of it, it goes off
and starts constructing other classes. And those classes in their constructors go
off and construct other classes, and so on and on and on and on. So, we have the same problem as we had with
procedural programming as you pointed out at the beginning. If it's a leaf class, yes, it's a piece of
cake to test. Nobody really has to explain to me how exactly
I test array.sort. Piece of cake. It's a leaf. But how do I test--I don't know--the login
page? Totally different end of things. So, in order to test this thing, we really--we
really need something we call a seam. We basically need to be able to take a knife
and then kind of cut all the dependencies. And the seam is important because it allows
us to divert the execution of the code. This is why procedural programming is problematic
or rather static methods are problematic, right? Because if you call another static method,
there's nothing I can do in a test to prevent that call from happening. Now, I'm sure you can come up with a simple
case. Like, "But I'm just calling Math.abs,
therefore it's okay." But usually, when you have some static [INDISTINCT]
method, people keep adding stuff to it. And so, what started off as a benign method
call which was non-intensive and non-interceptable ends up being this complicated beast that
all of a sudden is non-interceptable and all of a sudden that's not so good. So, I take the extreme point of view and I
simply say, in my code, I don't want to have any kind of statics whatsoever. It turns out that in most cases, when I see
static calls, I usually look at them and I say, "Yes, this actually belongs on this class
over here and, you know, there's something wrong about the whole decomposition of the project." For example, let's take the extreme example
of Math.abs. I firmly believe that the five should be able
to say 5.abs(). Why do I have to say absolute value and pass in
the five? It should be just compiler sugar that
does all the magic underneath it. And I believe in languages like Ruby, you
can do that. That doesn't imply that the five has to be
an object. It just means that the compiler knows how
to convert all these things. So, we need to have a seam. So--great. So, we have a seam, and what the seam allows us
to do is to replace these dependencies with friendlies. Now, when I say a friendly, I don't necessarily
mean a mock. It could be the real class that I've already
tested somewhere else and I already know that it's going to do the right thing, therefore
I'm perfectly happy to instantiate the real thing. But I trust it. I know when the test fails, the problem isn't
over there because I have other tests to prove that that stuff works. It could be the real thing. It could be a stub that does nothing. Like for example with a logging framework, I'll
just throw in a stub so that you don't bother logging anything because it's not relevant
for the purpose of the test. It could be a mock which returns in each some
collaboration or it could be a simulator kind of a thing that kind of simulates the thing
which is kind of like a smarter mock. The point here is not what you put over there. The point here is that from a testing point of
view, I have a choice to place anything I want over there, and that requires a special
way of writing code. If you in your code just simply call the
new operator on a class, well, there's nothing I can do, right? Even if you
create an interface for these things, if you instantiate the implementation
of the interface, there is nothing I can do from a testing point of view. So, it is really, really important to have
these seams. And how exactly we place seams in some--or
inside of our code, that is something that is--that most of us are not experts at. It's not something we have learned in school. It's not something that we have learned through
hacking. It's not something you even needed because,
unless you were writing test already, why would you be placing seams everywhere? So, how exactly do we need the seams? So, let's back up and then keep the seam in
the back of our mind and let's talk about something else. In most of our classes that we have, we have
Object Graph Construction and Lookup, together with Business Logic. Business Logic is the if statements and the
loops and the stuff that actually does work, and the Object Graph Construction Lookup is
basically your new operators when you're constructing the Object Graph, and it's also your let-me-go-and-find-what-I-need
code. Usually, in terms of let me go talk to the
context objects so that I can find my property so that I can use the property to open a file
and read the parameters that I need, which makes it impossible for me to ever give you
the fake parameter in a test or at least makes it really, really miserably hard. So, that's what I mean by object construction
and lookup, and then the really good stuff happens where most of our bugs are, which is in the
if statements and the loops, et cetera. In most code I have seen, those two pieces
are together, and it's probably the code that you write as well. But it turns out that those two responsibilities
need to be separated. You are either in the business of constructing
things, building object graphs and constructing the application with all the instances of
the classes or you are in the business of being those things that got constructed and
doing the actual work. If you separate these two things out, it turns
out testing is trivial. Not trivial but really, really easy. So, let's look at how that works out. The little bubble on the arrow represents
where the new operator is located. If that little bubble started off inside of
the blue class under the test, I could have never, ever controlled the construction of
that class. But because I have migrated the new operator
into my test, now the test has the responsibility of constructing the object graph, and then
I take those objects and I pass them through the constructor of the class under test, and
then class under test then collaborates with the things that I've passed in. Now, this gives me a choice in a testing world;
because now, I am free to instantiate a subset of the application that I want to test. I don't really have to instantiate the whole
thing, I simply instantiate the stuff that I really, really care about. And I have a choice in terms of how I set
things up. If I choose to instantiate the real thing
then maybe I can, you know, configure it in the correct way. So, if I'm testing a cache, I'm going to instantiate
the cache that has a cache size of one so that I'll get misses all the time, right? Whereas in production, I know I'm going to
have a cache size of 10,000 but I don't want to do that for testing purposes because--gosh,
it's going to take forever to cause misses to happen, right? So, the point is, you want to make sure that
your code is kind of devoid of the new operator. Because the new operator is static and it causes
direct binding. And you want to basically say, "I need these
objects to collaborate." In your constructor, you say, "Hey, I need
the file cache. Please provide it to me." It is not my responsibility to go and read
some property file on the hard disc in order to figure out how to instantiate a file cache
and then configure it in some specific way, nor is it my responsibility to do Cache.getInstance()
and look into the global state variable, which is the instance variable of the cache and
get a hold of the cache that way either. I simply say, "I need a cache in my constructor." And one will be provided for me. In a testing world--you know, test world will
provide you with a--some kind of a small size cache, which we can go and test. And in production, you'll be provided with
the real thing. So, the new operator separation is important
because that allows us to do sub-classing. When we can do sub-classing, we can take advantage
of polymorphism. And polymorphism is what the seam is. Does that make sense? Now, show of hands. How many of you guys actually do this in your
code? You do this? Excellent. So, you know about dependency injection. So, let me ask the question again. How do you write hard to test code? The real crux of the problem is, if you want
to make code hard to test, you're going to mix your object creation code with business
logic. The moment you do that, you mix those two
pieces together, you cannot instantiate anything in isolation. And when you cannot instantiate anything in
isolation, the only thing you can possibly instantiate is these humongous chunks of application,
which pretty much cause you to instantiate the whole thing. And now, we're back to square one, which is
scenario based testing. We kind of already decided it's
not a good idea. Wow, I went through this really fast today. So, the takeaway is this: we
really, really want to be able to test in the form of a unit test, not as a scenario based
test. And in order for us to be able to write unit
tests, we need to separate the object instantiation responsibility from the actual responsibility
of doing work. You are either a factory, which is responsible
for creating some object graph or you are the--part of the objects that are doing some
work. You don't want to mix the two. All right. I kind of rushed these slides. I'm not sure why. So, we have covered them all in an amazing half
an hour. So, I'm going to turn to you guys for questions. And if you'd be so kind as to grab the microphone,
that would be awesome. And maybe even come closer so you're not so
far away. >> I was thinking I should mention something
which is--I was thinking I should mention something which is that whether dependency
injection is really needed depends on what language you program in. >> HEVERY: Okay. >> So, if you program in Perl or Ruby, you
can often find a seam through the language being dynamic? >> HEVERY: Yes. You are probably referring to monkey patching,
right? >> Yes. >> HEVERY: Okay. We haven't talked about the idea of global
variables. But basically, global variables make your
code really, really, really hard to test because things are unpredictable, the order of the
test matters and so on and so forth. And it turns out that with monkey patching, the code
is in global space, and monkey patching the code oftentimes results in you changing
global state, which means if you go and monkey patch a class and then another test runs
and you didn't properly clean up after yourself, then the next test will fail. So, even though many languages like Ruby,
JavaScript, et cetera allow you to do monkey patching--if you do monkey patching
on a class instance, you can probably get away with it. If you do it on a class level, it usually
is a bad idea and it's no different than having global state. And there's a separate talk we do just on
all the evils of global state. So, even if you have that, I think dependency
injection is still a good idea. And I think--Paul--Dave, are you going to
say something? >> Yeah, I disagree. >> HEVERY: You disagree? You think monkey patching is a great idea? >> Yes. >> HEVERY: Okay. Do you want to give your point of view? >> Yes. What you said is exactly right. The problem is, if you don't clean up properly
after yourself and the next test runs and then [INDISTINCT]
>> HEVERY: Uh-hmm. >> So, if you have a good framework that will
ensure that things are cleaned up, then you don't have that problem. >> HEVERY: Okay. >> And genuinely, it's a lot less [INDISTINCT]
framework than a DI framework. >> HEVERY: Different opinions. Of course, Dave is the Ruby guy who's the
highly dynamic [INDISTINCT] I'm the static type of guy. >> Yes. So, the RSpec mock framework does exactly that. It cleans up after itself if you stub out methods
to return instances and so forth. >> HEVERY: How does it know which things you
modified? >> Because you modify through the framework. >> HEVERY: Okay. >> And it keeps track of everything. >> HEVERY: So, it keeps track whatever comes
[INDISTINCT] >> And restores everything when it's done. >> HEVERY: Other questions? So, in Java, you have dependency injection
and you have nice fancy tools like Guice, which are what we call automatic dependency injection
frameworks. People are oftentimes confused and say, "I'm
in C++. C++ doesn't have Guice, therefore dependency
injection is not for me." It turns out that dependency injection is
the practice of asking for things in the constructor. Dependency injection and the automatic frameworks
like Guice are two independent things. And so, you can have one without the other. And you can perfectly well use dependency
injection in C++ manually, where you have to write the factories yourself. Great. Well, thank you guys for coming. See you next week.
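
[Editor's sketch, not part of the talk.] The closing point, that dependency injection is just asking for collaborators in the constructor and that the wiring can live in a hand-written factory, might look roughly like the following in Java. This is a minimal sketch under stated assumptions: every name here (Cache, LruCache, Loader, DocumentService, ApplicationFactory, CountingLoader) is hypothetical and chosen for illustration, and the production loader is a placeholder. It also borrows the talk's cache example: the test wires in a cache of size one so misses happen immediately, while the factory wires in a size of 10,000.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ManualDiSketch {

    /** The seam: business logic depends on this interface, not on a concrete cache. */
    interface Cache {
        String get(String key);
        void put(String key, String value);
    }

    /** A small LRU cache; its capacity is decided by whoever constructs it. */
    static class LruCache implements Cache {
        private final Map<String, String> entries;

        LruCache(final int capacity) {
            // accessOrder=true keeps entries in least-recently-used order.
            this.entries = new LinkedHashMap<String, String>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                    return size() > capacity;
                }
            };
        }

        @Override public String get(String key) { return entries.get(key); }
        @Override public void put(String key, String value) { entries.put(key, value); }
    }

    /** Hypothetical slow backing store, also behind an interface. */
    interface Loader {
        String load(String key);
    }

    /** Business logic only: it asks for collaborators and never calls new on them. */
    static class DocumentService {
        private final Cache cache;
        private final Loader loader;

        DocumentService(Cache cache, Loader loader) {
            this.cache = cache;
            this.loader = loader;
        }

        String read(String name) {
            String cached = cache.get(name);
            if (cached != null) {
                return cached;
            }
            String loaded = loader.load(name);
            cache.put(name, loaded);
            return loaded;
        }
    }

    /** Hand-written factory: the one place where production wiring happens. */
    static class ApplicationFactory {
        DocumentService createDocumentService() {
            // Placeholder loader; a real one would read from disk or a server.
            Loader placeholder = key -> "contents of " + key;
            return new DocumentService(new LruCache(10_000), placeholder);
        }
    }

    /** Test double that counts how often the backing store is hit. */
    static class CountingLoader implements Loader {
        int calls = 0;

        @Override
        public String load(String key) {
            calls++;
            return "fake:" + key;
        }
    }

    public static void main(String[] args) {
        // The "test" owns the new operators, so it can choose a cache of size one.
        CountingLoader loader = new CountingLoader();
        DocumentService service = new DocumentService(new LruCache(1), loader);

        service.read("a"); // miss: loads "a"
        service.read("a"); // hit: served from the cache
        service.read("b"); // miss: loads "b" and evicts "a"
        service.read("a"); // miss again, because "a" was evicted

        System.out.println("loader calls: " + loader.calls);
    }
}
```

Because DocumentService never constructs its own collaborators, the same class runs unchanged against the production wiring from ApplicationFactory and against the tiny cache and counting loader above; no injection framework is involved.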