(presenter)
Hello, this is machete-mode debugging with Ned Batchelder. Ned has been programming Python
since 1999. He is the maintainer of coverage.py,
and he works for edx.org. Please give a warm welcome
to Ned Batchelder. [applause] (Ned Batchelder)
Hi everyone, thank you. Oops, hi everyone, thank you. As Jesse mentioned,
my name is Ned Batchelder. You can find me on Twitter
or IRC or GitHub as nedbat. If you want to follow along online, the slides for this talk
and another version of the text are at that bit.ly URL. A very important announcement:
today at 7 o'clock, we're going to be having
a juggling Open Space, and I welcome any of you
to come and juggle with us. [applause] OK, machete-mode debugging. Whoops, let's get that clickable. OK, so I've been programming, as Jesse mentioned, in Python,
for a very long time. Not as long as some,
but a very long time, and it can be a bit chaotic. I love Python
for its dynamic nature. One of the things
that fascinates me about Python is that in its deepest structure,
it's really very unstructured and you can build lots of different structures
for your program out of it. But through conventions
and agreement, we tend to build programs
in a way similar to more strict languages which lets us build very large systems
that work together. We can reason about our code. But in its nature,
Python can be chaotic. It has dynamic typing, which means that names
can take on values of different types
at different times and sometimes unpredictably. There's no access control
on your objects, no protected, private, or final, which means that things can change
from far away in your program that you didn't expect. All of the objects are on the heap. There is no stack allocation, so things can live for much longer
than you expect. Fundamentally, nothing is off limits
in Python, right? We get questions about, "How can I ensure
that someone doesn't do blah?" And generally Python doesn't do,
"You can't do," very well, right? Whatever you want to do in Python,
you can do. This can cause problems. If you build large systems,
you'll get yourself into trouble where you have to debug situations
because the chaos got a little farther out of hand
than you had intended. So, let's use that to our advantage. The chaos got us into this mess. We can use the chaos
to get us out of this mess. That's the fundamental thesis
of this talk, which is: Python is very dynamic but we can --
in a karate-like move, we can use that against our opponent
(our own program), and we can take the upper hand
by using that chaos to get the information we need to get ourselves out
of a sticky situation. The bulk of this talk
is going to be a discussion of actual problems
from a real project. And I won't tell you
what project it is, because I want you
to like Open edX. [laughter] The point is that these are actual problems
that happened at work. I wrote up blog posts about them. People tended to like
those blog posts, and so I've collected together
the experiences, here in this talk. The other thing is,
I really want to emphasize this. The things I’m going to show you,
you shouldn't use in real code. Most of the code
that I’m going to show you is meant to be in your code base
for about 10 minutes. You write this code,
you get the information you need out of it, you fix the problem, and then you get rid of that awful thing
that I’m about to show you. If I hear that any of you
are using any of this in production later, I’m going to feel really, really bad, and I’m going to
not like you personally anymore. [laughter] So, don't do it. All right, Case 1. We've got four cases to cover. Case 1: Double importing. The problem was
that we had modules in our system that were being imported
more than once. And if you know
about importing modules, you know that
one of the fundamental ideas is that when you import a module
you always get the same object no matter how many times you import it,
but that wasn't the case for us. And the classes in those modules
were then defined twice, which means
we had two classes floating around which had the same code
and the same name. And usually that's not a problem,
although sometimes it is, but modern -- recent versions of Django
actually complain about this. They will detect
that this is happening and head off the eventual problems
by complaining about it and preventing your program
from running. And when we upgraded our code
from Django 1.4 to Django 1.8, we started to see those complaints
and we had to fix them. Now, how can it be that modules
are imported more than once? So, here's a quick refresher. Oh by the way,
as I’m going through this, what I’m going to show you is
what the problems were, what mechanisms in Python
made those problems possible, and then what mechanisms helped us
debug those problems and fix them. So, I’m hoping along the way,
in addition to showing you versions of code that you're
not supposed to use in production, that you'll come away
with a deeper understanding of some of the mechanisms
underlying Python that got us into the mess
and got us out. So, here's a quick refresher
of how modules work. When you import a module,
you ask for a module name. The first thing that happens is there's a dictionary in the sys module
called sys.modules which has, as its keys, the names of all the modules
that have been imported and as its values,
the actual module objects. So, when you import a module,
the first thing that happens is it looks in that dictionary
to see if the module has already been imported. And if it has been,
it just returns it. These two lines of code are what make it
so that when you import modules, first of all,
it goes very fast the second time, and you get the same object back. And if it's not found, then for every directory
in the thing called sys.path, which is a list of directory names, it looks to see if it can make
a file name in that directory from the module name that exists,
and if it does, then it's going to actually
execute that file to get an object that's going to stuff the object
back in the sys.modules under the key and return it to you. That's how import works
the first time. And if, after going through all those loops,
it doesn't find anything, it raises an import error. So, this is a wildly simplified version
of how importing modules actually works, but this is good enough to get you
about 10 years into your career with Python. So, this is pretty much -- I mean, nothing
bad about Brett and all the good work he’s done, but this is enough for you to understand
how importing works. So, with all this machinery in place, how did we have modules
being imported more than once, right? We've somehow broken this fundamental promise
that Python gives us. And how are we gonna find it,
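The refresher he just gave boils down to a few lines. Here's a rough sketch of the lookup-then-execute flow — my reconstruction, not his slide — ignoring packages, finders, and bytecode caching:

```python
import os
import sys
import types

def simple_import(name):
    # Already imported? Hand back the cached module object.
    if name in sys.modules:
        return sys.modules[name]
    # Otherwise, look for name.py in each directory on sys.path.
    for dirname in sys.path:
        filename = os.path.join(dirname, name + ".py")
        if os.path.exists(filename):
            module = types.ModuleType(name)
            sys.modules[name] = module      # cache it under the requested name...
            with open(filename) as f:
                code = f.read()
            exec(code, module.__dict__)     # ...then execute the file's code.
            return module
    raise ImportError("No module named " + name)
```

The two lines that matter for this case are the `sys.modules` check and the `sys.modules[name]` assignment: the cache key is the dotted name you asked for, not the file on disk.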
most importantly? So, this is the code that I actually
put into an actual file of Python. And it's a little bit dense
for you to read right now, but the idea is to get across
a couple of points. One is, I actually went to the models file
that Django was complaining about, and I actually put real code
right into the top of the module. Right? What you're not supposed to do. And the code I put in was gonna use a module
in the standard library called inspect. Inspect is a really useful tool for understanding
how your program is structured. It can tell you about the contents
of modules and classes and methods. In this case
what we’re going to use is -- we're going to use a function in inspect
called stack which gives you a list of tuples, every tuple representing
one call frame in your stack, showing you who called you, and who called
them, and who called them and so on. And in those tuples are information
about the file name, the function name,
and the line number so that you can essentially
create a traceback of your current position. So, here, what I did is,
right there in the module, when it gets imported, I’m going to open a file,
and I’m going to append to it. And what I’m going to append to it
is that I’m importing the file. And then for all of those objects
in the stack, I’m going to write out a nicely formatted
line, and I’m going to write that line. And then I actually
have the models, right? Because I've just dumped this code
straight into a file that has nothing to do
with what I’m trying to find out, right? I’m not -- this isn't a file
about stack traces. It’s a file about Django models. But like I said,
I’m doing things the wrong way, because I just need
to get the information I need. And when I ran it,
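The snippet he describes might have looked roughly like this (a reconstruction from the talk; the file path and the formatting are my guesses):

```python
# Temporarily pasted at the top of the suspect models.py -- delete after use!
import inspect

with open("/tmp/my_information.txt", "a") as f:
    f.write("Importing models.py\n")
    for frame_info in inspect.stack():
        # Each tuple-like entry carries the frame's file, line, and function.
        filename, lineno, function = frame_info[1], frame_info[2], frame_info[3]
        f.write("    {}:{} in {}\n".format(filename, lineno, function))
```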
I got results like this. It told me that it was
importing first/models.py and that was being imported
from this place, and it also told me that it was
importing that file again and that it was coming
from this place. And so, now I had the two locations
where the file was being imported. And both of these locations were importing
it and somehow executing the file. And when I looked at those locations,
I could see what the problem was. One of them said
"import thing.apps.first.models," and the other place said
"import first.models." And the reason that's a problem is
because in our directory tree -- I've got a map
of the directory tree here, twice, and the stars are the directories
that are on sys.path. So, the first import
in the project directory found a thing directory, with an apps directory,
with a first directory, with the models.py so it could import
thing.apps.first.models which put thing.apps.first.models
into sys.modules. The second import, because apps
was also on the system path, could find a first directory
with a models in it. And because the keys are different,
thing.apps.first.models versus first.models, the uniqueness check
didn't kick in, right? So, I had that little bit of code
that printed out a stack trace that told me exactly what I needed to know:
where are the two modules being imported? From there, I can get the clues
that I needed to fix it. But the reason, by the way, that sys.path is like this
is because in our code we literally have sys.path.append to append extra directories
onto the system path. And this is one of the reasons you shouldn't go around appending things
onto system path, right? So, that's Case 1 solved, and by the way,
the solution to the double import: the best solution, frankly, would be
to get rid of the sys.path.append. I’m looking forward to that
in the future. That's going to be awesome. The way we actually fixed it was to at least
make all the imports have the same form, so that everyone
who was talking about the module talked about it in the same way
and the uniqueness check would work. So, what have we learned
from Case 1? First, we learned
that import really runs code. Now, if you're coming from another language,
perhaps with static typing, you may think of an import as being, "There are classes and functions
to find somewhere. Go and find those definitions,
and let me use them." And in a way that's true,
but the way Python does that is, it really executes all the code
in that .py file. Now if you happen
to write your .py file to have nothing but imports,
and class, and def statements, then all that's going to happen
when you execute the code is to define classes and functions. But if you put in a "with" statement
and "print" statements, and if you put in global mutation statements,
they're all going to run. Importing doesn't have a special mode
where it just looks for definitions. All it does is
it executes all the code. And we used that
to our advantage in this case because we wanted to print out a stack trace
when we imported the code, right? The file was being imported twice,
we wanted to get two stack traces. It worked great
to just dump the stack trace at the top level of the module
as part of the import. But you shouldn't do that
in real code because it makes it very difficult
to reason about the code because you have code
that's executed one time when you import it but not all the other times
that you import it. So, don't put code
at the top level of the module, but understand that
that's how Python does imports. The second lesson we learned
about machete-mode debugging is we just hardcoded
a bunch of stuff in there, right? I just said, "with open/temp/,"
you know, "my information.txt." You'd never put that in real code. But the code is only going to live
for 10 minutes, who cares? Just write straight to the file,
and be done with it. In this case, wrong is OK, because we just need
to get the information. And in terms of a positive lesson,
don't append to sys.path, right? Don't fiddle with your system path to try to make your imports
convenient or something. Choose a disciplined way to do it. Keep everything straight,
and you won't run into this kind of chaos. Case 2: Finding temp file creators The problem was that we had tests --
that's not the problem. [laughter] The problem was that we had tests
that would make temp files like this using tempfile.mkdtemp --
in this case a temp directory. And some of them would add a cleanup
so that the temp directory would be sure to get cleaned up
at the end of the test, but some tests would make a temp directory
and didn't clean it up. And so you’d run
your whole test suite and you end up with 20 temp files
and directories left behind, which isn't really a problem,
but you know my OCD kicks in, "That seems kind of messy;
we should clean that up." But how do we find them, right? There's lots of tests. I think in our test suite
we have about 8,000 tests. I’m not gonna be able
to grep the whole test suite and find the places where it gets created
but not cleaned up. Sometimes the cleanup is far away, sometimes it's a helper function
that’s called from lots of places. It's just too hard. That's another underlying current here,
which is -- other languages have
really great static analysis tools, and that's something that Python
has difficulty with because of its dynamic nature. So, we'll just skip
the static analysis. And notice here I’m upgrading grep
to static analysis, which sounds fancy... [laughter] But that's fundamentally what it is. It's a tool for looking at your source code
without running it and trying to understand it. That's what static analysis
is about. What I’m doing here
is all dynamic analysis. Let's put something in the program
that when you run it will tell you
what you need to know. So, the temp files aren't getting cleaned
up, and there's too many to eyeball. What I wanted to do is -- I wanted to put some information
in the file itself, right? After all, the whole problem here is that there's something left behind
when something goes wrong. What if I could just use
that thing left behind to give me
the information I need, right? Unfortunately, I can't write
into the temp file itself. The contents of the file
are important to the test. They’ll fail if I just start writing
random junk into it. But the interesting thing about temp files
is that no one cares what they're called. So, we're going to put the information
into the file name. And the way we’re going to do that is
we're going to monkeypatch the standard library. So, monkeypatching is a technique
where you write a function and you stuff it in place
of some preexisting function. So, in this case,
we're going to import tempfile. We're going to write a function
called "my sneaky function," and we're just going to assign it
to tempfile.mkdtemp. And what that means is
that the unsuspecting product code is going to import tempfile
and call tempfile.mkdtemp, but now that's referring
to "my function." So, when the product code
tries to make a temp directory it's actually going to be calling
"my function." This is called monkeypatching. And the key idea from Python
that makes this possible is that any name
can be reassigned. It feels a little bit weird,
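As a minimal sketch of the pattern (the replacement function and its return value are made up for illustration):

```python
import tempfile

def my_sneaky_mkdtemp(*args, **kwargs):
    # Stands in for the real function; real debugging code would do
    # something useful here, like logging who called us.
    print("mkdtemp was called!")
    return "/tmp/not-really-a-temp-dir"

# Reassigning the attribute is all it takes: any code that later calls
# tempfile.mkdtemp() -- even code that imported tempfile long ago --
# now reaches our function instead, because every import shares the
# one cached module object in sys.modules.
tempfile.mkdtemp = my_sneaky_mkdtemp
```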
you know? The standard library's this thing
that’s been handed down to us on engraved tablets, right? It's the foundation
upon which we build our programs. It’s something
we’ve come to count on. But it's just a Python module
with attributes like anything else, and they can all be reassigned, so, we can just go ahead
and reassign it when we want to. Of all the things I’m telling you
not to do in production, definitely don't do this one [laughter] Now, what are we
supposed to monkeypatch? Well here's where
we can just read the source, right? Tempfile.py is a file on your disk
in the standard library. You can go and find it
and you can open it in your editor and you can read it, right? If you look in the temp file module, there are actually a half dozen or so
different functions for making temporary things
in different ways. We had some directories
and some files, so we actually needed
to deal with a number of those. And by the way,
we only wanted to tweak the file names. There's a bunch of machinery
in creating temp files that we didn't want
to interfere with. We just wanted
to give them new names. It turns out that there is
a helper function inside tempfile called get_candidate_names. And the way
the temporary functions work is, they use get_candidate_names to produce a series of those
classic tempfile junky randomy things and then they use those names
to find a file that doesn't exist yet and then they go ahead and make their file,
and so this is perfect. Get_candidate_names
solves both of our problems. It's used by all
of the temporary-making things and it's the only place
the name comes from. So, if we monkeypatch get_candidate_names,
it will do exactly what we want. But the other trick with monkeypatching is
that you have to do it before the function
gets called, right? If the function gets called
before you monkeypatch then your code is way too late. It's not gonna work. What we'd like to have is
a feature in Python that says, "Before you run the program, "run this little piece of code
so I can monkeypatch first." Python doesn't have
a switch like that. Perl has a switch that says,
"Use this prologue before the main program." Python doesn't have that feature,
but it has a thing called .pth files. Now, .pth files are essentially symbolic links
in your site packages directory. And you can go and look;
you probably have a few of them. And they do this very odd thing which is
when Python starts up, it finds all the .pth files,
and it looks at every line in the .pth file. And literally, if the line starts
with "import (space)" it executes the line. [audience chuckles] I’m not, I didn't -- OK look,
I’m showing you lots of weird code. I didn't write this, OK? [laughter] This is really in there
and every time you run Python, this is happening. And if it doesn't start with "import"
then it just appends the line to sys.path. So, this is how sys.path gets really, really
long and points to all of your imported modules. So, if you create, sorry --
if you create a 000.pth file in your site packages directory
that just imports "first thing," then you can write a first_thing.py and it will run before any other code
in your Python process. And what we're going to do here
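Concretely, the .pth file is one line (the names here are hypothetical; the 000 prefix just makes it sort first in site-packages):

```
# 000_first_thing.pth, placed in site-packages:
import first_thing
```

Any line starting with "import " is executed at startup; every other non-comment line is appended to sys.path.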
in first_thing.py, again, is, I’m going to use inspect.stack
to get information. First, I’m going to save
off the original value of get_candidate_names because I actually like
that randomy stuff. That's still important to keep, so I’m going to keep that function
as real_get_candidate_names. And here again, Python’s functions being
first-class objects lets us just hold that function
with a new name, and we can use it later. Then I’m going to make my own
get_candidate_names and I’m gonna take inspect.stack
and join it together in such a way that I get a really long string
that's still kind of readable so that I can see
who's been calling me. And then I’m going to get the actual randomness
from real -- from get_candidate_names and I’m going to yield
my own sequence, right? And then I’m going to do
the real monkeypatch. Again, I know this code
is really dense. It's all online,
you can go and study it later. But I’m trying to get the point across
that we’re monkeypatching the standard library, and as a result, we get tempfile names
that now look like this. And in this file, you can see that case.py
line 53 called case.py line 78, which called
test_import_export line 289. So, I can go into test_import_export.py line
289 and see there's a mkdtemp right there. And that's where
it's not getting cleaned up. So, I can fix that line
and then go on to the next one where test_video 143
is calling tempfile line 455 and etc., etc., etc. So, what did we learn? One, this is often overlooked. Forget monkeypatching for a second. You can just go and read the standard library,
and sometimes that's all you need, right? The very fact
that Python is open source -- and forget the contribution,
and the license, and all that stuff. The source is on your system. You don't even have to go
to hg.cpython.org to dig it up. The standard library is all on your disk
as Python source code and you can read it
to figure out what it does. It's also patchable,
so we can go in there and affect its behavior where we need to
to get information. And for this kind of debugging,
you should use whatever you can. Whatever you can touch and change,
use it, it's fine. That code is only going to live
for 10 minutes. You only have to feel
really, really bad about yourself for 10 minutes and then you'll have the solution
and everyone will think you're a hero and you don't have to explain to them
how dirty your hands got in the process [laughter] And by the way,
do use addCleanup. So, if you're using the unittest library
and you're used to setups and teardowns, addCleanup is a much nicer way to clean up the behavior
of your setup function than a teardown is, so look into that. OK, Case 3. Who is changing sys.path? The problem we had was that sys.path had an extra directory in it
that we didn't expect, and in this case,
it actually caused a problem because of some naming collisions where
when we tried to import a certain block.py it was finding the wrong one,
and we couldn't understand why that was. And again,
grep couldn't find sys.path. And here, of course, I mean,
as you remember from Case 1, we were doing some really ugly things
to sys.path. My first thought is,
"Well I guess there was "some more sys.path shenanigans in there
that we should look for." But no, it wasn't our fault
this time. We weren't doing a sys.path append. So, we needed to find
who was adding that directory to sys.path. So, we figured it had to be
in third-party code, right? Because we can grep all
of our own code. Now, we're not going to go and grep all
of the third-party code, right? Open edX has a requirements.txt suite
that includes about a hundred packages, including NumPy, and SciPy, and SimPy, and you're not going to go and grep
all that code, so, you need dynamic analysis
to get at it. What we wanted
was a data breakpoint. It would be really awesome
if we could go into pdb and say, "Not break when you get to this line
in this file, "but break whenever that piece of data
changes in a certain way," right? What we wanted to know was:
when does sys.path get a new entry at element 0
that ends with /lib? That's what we wanted to know. Who is adding that thing
to sys.path? Pdb doesn't have that as a feature. There's no way to implement that
directly in the debugger, so we write a trace function. Trace functions -- if you haven't
encountered them before, CPython has
a very simple-sounding feature which is that you can write a function
and you can register it with the interpreter, and it will call your function for every line of your program
that gets executed. And this is actually
how debuggers are implemented, and profile tools,
and coverage.py. The way a lot of these
dynamic analysis tools understand the running of your program is
that they write a trace function and then CPython calls them over and over
again for every line of your program that gets executed. This makes it go very slow but you're only going to need it
for a little while. Here is an example
of a trace function. In fact, this is
the entire trace function that I wrote. So, a trace function gets the frame
that you're running in, it gets an event
which is "call" or "return" or "line," and it gets an arg which, in this case,
isn’t interesting to us. In fact, none of the arguments
are interesting to us, because we don't care where in the program
we are and we don't care what's happening. What we want to know is -- if the first element of sys.path
ends with /lib, we want to stop right there
and see what's going on. To make the trace function work,
you call sys.settrace and you give it
your trace function, and from then on
it gets called on every line. Now, what we did here --
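Reconstructed from the description, the whole trace function really is about this much code (the /lib condition is the one from the talk):

```python
import sys
import pdb

def trace_path(frame, event, arg):
    # We ignore all three arguments: we don't care where we are,
    # only whether sys.path has sprouted the suspect entry.
    if sys.path[0].endswith("/lib"):
        pdb.set_trace()     # break into the debugger on the spot
    return trace_path       # keep tracing inside nested calls

sys.settrace(trace_path)
```

Returning the function itself matters: the global trace function fires on each function call, and its return value is used as the local trace function for the lines inside that call.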
if you've seen this before, pdb.set_trace -- that's the horribly-named
API for getting pdb to break, right? It should be called break_into_debugger,
but it's called set_trace because literally this is
where pdb sets its trace function as the trace function, right? This is a great example of an API
being named for the internal concerns rather than for the external use, but this isn't a talk
about API usability. And I apologize that pdb.set_trace
has an underscore and sys.settrace does not. Again, see some other talk
about API usability. [audience chuckles] But in this case, the trace function
is incredibly simple, right? In fact, what I’m doing here
is using what sounds like a really, really advanced feature,
a trace function, but the amount of code and the complexity
of code I had to write to use it was much simpler
than the previous examples I've shown. And frankly when I wrote it, I wasn't quite sure:
Am I allowed to call pdb.set_trace while I’m actually inside a trace function
that is already being invoked by CPython? I figured there was about a 50-50 chance
that this just wouldn't work at all, right? But it took me about a minute
to write that function, so what have I got to lose? And in fact it worked great. I ran this,
and it broke into the debugger and it was "nose,"
the test runner. So, nose has a helpful feature
where if it sees that you have a directory called lib, it figures you probably want
to import from it, and it adds it
to the sys.path for you. Luckily, it also has a switch
where you can just say, "Don't do that," and we set the switch,
and the problem was fixed. So, here's a trace function. It's a very advanced feature,
but sometimes, it's exactly what you need. So, what did we learn from this? One, it's not just your code,
right? It's a classic beginner mistake
to think that it's a compiler bug or, you know,
the standard library has a bug. Sometimes, it is other tools
that do have bugs, right? You have to be open
to that possibility. And because of Cases 1 through 2 or 3,
whatever we’re up to here, you know, I was very willing to believe
that it was our own code that was at fault, but it wasn't, and we needed to figure out a way
to get at the behavior of these other third-party tools. Again, dynamic analysis
is very, very powerful. This was an expensive thing to do,
run an 8,000-test test suite with a Python implementation
of a trace function. You can imagine
how much slower it would run. Luckily it was very early on
in that test suite that it hit that breakpoint, because it was the test runner
setting it. But even if it took eight hours, that's probably faster
than finding it some other way. And sometimes, you have to
use big hammers, right? This, frankly, is kind of overkill
to find that, but it was actually less time on my part
and more time on the computer’s part, and it worked out really well. All right, Case 4:
Why is random different? The problem: so Open edX
presents problems to students, and we have
a massive number of students. What we wanted to do is we wanted
to present problems that were randomized so the problem I saw was different
than the problem you saw. But we wanted them
to be repeatable so that the next time I came back
to look at a problem, I’d see the same problem
I'd seen before. And so we do that
by seeding the random number generation with a seed
that's particular to the student. So, each student has a seed,
we seed the random number generator, and then when it comes time
to run the problem code that's going to present the problem, when the random number is generated,
it comes out predictably. So, what I've shown here
is the problem code generating a random number from 1 to 1,000,
and it should be 420. The problem we had was that
the first time that code ran, it came out different --
it came out as 284. And then the second, third, fourth, all the rest of the times,
it came out as 420. So, there's something weird
about how the random number seed was being used to produce
the random number sequence. And the fact that we had that first time
different than the other times made us think, maybe it's
about that import thing, right? Remember, code gets run on import and then not the next time
you import it, right? Different the first time
than times 2 through n. So, how were we gonna find it? Well we're gonna monkeypatch again
but we're going to use a new technique. And this is
one of my favorite techniques. Well, this looks like, maybe,
an esoteric thing. No, it's actually
just 1 divided by 0. This is a really easy piece of code
you can drop into anywhere. It generates an exception
because you're not allowed to divide by 0. It's really fast to type
because it's only three characters, and this is an exception that your real code
probably never generates, right? So, if you put this code
in the middle of anywhere and then you see an actual ZeroDivisionError
come out on your console, it's that code
that's making it happen. So, it's really easy to spot,
right? This has got to be my favorite
three-character Python expression. And I’d be glad to hear other candidates
for great three-character Python expressions. I don't think
you're going to be able to top 1/0. So, what we're gonna do is
we’re going to monkeypatch again. We're gonna monkeypatch "random"
with a booby trap, right? We're going to import "random," and we’re going to say
random.random = lambda: 1/0, right? And now, notice how reckless
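Fleshed out slightly, with a try/except added here only so the traceback can be captured and shown — the talk's version was left uncaught on purpose, and the caller is a stand-in I made up:

```python
import random
import traceback

# The booby trap: anyone who calls random.random() now blows up,
# and the ZeroDivisionError traceback names the caller's file and line.
random.random = lambda: 1/0

def unsuspecting_third_party(chooser=None):
    # Simulates the package code that consumed a random number.
    chooser = chooser or random.random()
    return chooser

try:
    unsuspecting_third_party()
except ZeroDivisionError:
    culprit = traceback.format_exc()

print(culprit)  # the last frames point straight at the caller
```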
we’re being here. We don't care what the arguments are, we're not trying to reproduce the behavior,
we're not returning anything. It's just an exception...
but it worked great. So, we've got
a booby-trapped random and what actually happened is,
we got a ZeroDivisionError, and we could see that
in one of our third party packages was a default value for a function
of random.random, right? There was actually a class
for this package for its tests. And one of the arguments
to the dunder init was random.random. And remember that all the code in your modules
is executed when it's imported, and when you define a function, the default values are evaluated so
the value can be stored with the function. And so it was actually calling
random.random once during import but only the first time. So, that was taking one of the numbers
out of the sequence which put us off by one number which is why
we got a different number the first time than all the other times, right? And I see some of you
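The default-argument behavior at the heart of this is easy to demonstrate (a made-up miniature of the bug; the seed value is arbitrary):

```python
import random

random.seed(625)                        # seed for one particular student
def present_problem(r=random.random()): # default evaluated NOW, at def time
    # By the time anyone calls this, one number is already gone
    # from the seeded sequence.
    return random.randint(1, 1000)

shifted = present_problem()

# Re-seed and draw without the sneaky default: a different sequence position.
random.seed(625)
unshifted = random.randint(1, 1000)
print(shifted, unshifted)
```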
scrunching up your eyebrows, like, "Why would someone do that?" Just for an extra bonus, they never
actually used that default value -- [laughter] -- because the only places
this function was ever called actually supplied
their own value for that. So, it was kind of
a comedy of errors. The good news is
we reported the bug, and they were very, very, uh,
understanding, and fixed it. So, what’d we learn here? One: exceptions are a really good way
to get information, right? The great thing about exceptions is
that if no one catches it, it will come all the way
back up to the top, right? So, you can have an exception
way deep down in your program, and unless it's something that might get caught somewhere else
like AttributeError -- ZeroDivisionError is
very unlikely to be caught unless you have an "except Exception" someplace,
or, God forbid, an "except:" someplace. But it's very likely that it will come
all the way out to the top of your program. So, exceptions are a good way to get information. And you can actually put information in the exception, right? You can put a string in your exception, and it doesn't have to be a hard-coded string. Whatever value, deep down there, you want to see, format it into the message and let the exception bring it all the way up to the top.
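A sketch of that technique; deep_inside and the widget value are made-up names, not from the talk:

```python
def deep_inside(widget):
    # Whatever value you want to see, format it into the message
    # and let the exception carry it all the way back up.
    raise RuntimeError(f"deep_inside saw widget={widget!r}")

try:
    deep_inside("flange-37")
except RuntimeError as exc:
    message = str(exc)

print(message)
```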
And don't be afraid to blow things up, right? This monkeypatch was horrible; the program wasn't gonna run, right? But I didn't care. I just needed to find out where the random.random was, and it told me that. And sometimes you get lucky,
because, of course, there's an obvious flaw in this monkeypatch, which is: maybe it wasn't the first value that was going wrong. Maybe three values were being taken and every one of those was fine, but then an extra, fourth value was being pulled off the sequence. And if that had been true, I wouldn't have found out anything interesting, because the booby trap would blow up on the very first call anyway. Well, then I'd have
to try something else, right? In this case, what I did is I tried the simplest thing I could think of. Maybe it'll work,
maybe it won't. It worked great,
now we can move on. If it hadn't worked, well then I’d have to come up with
a different way to see maybe more of random. It’d be a trickier monkeypatch,
but I could still get in there and see where
all the randoms were going. But sometimes you get lucky
and it works out that way. So, don't over-engineer these things,
just hack away at it, right? That's what machete mode’s all about. You're in the jungle,
you need to get out. You’re not planning a whole paved road
with road signs and traffic lights and everything. You just use the machete
to cut your way straight through. Now, the real problem here was that we were sharing global state, right? There was one global random-number sequence that we were using and that this package was using for its random numbers, right? The real solution was that we started creating our own random object to get our own random numbers from. Shared mutable state is a very, very difficult thing, because it means that code anywhere in your program could be fiddling with it, and it's very hard to reason at that kind of distance. So, do use your own random object. And do suspect third-party code.
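As code, the fix is essentially one line: make yourself a private random.Random instance. A minimal sketch showing the isolation:

```python
import random

# Our own generator, independent of the module-level shared one.
my_random = random.Random(1234)
sequence_a = [my_random.random() for _ in range(3)]

# Meanwhile, third-party code is free to reseed and consume
# the global generator...
random.seed(99)
random.random()

# ...without disturbing our own sequence at all.
my_random = random.Random(1234)
sequence_b = [my_random.random() for _ in range(3)]

assert sequence_a == sequence_b
```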
Again, you know, this was kind of a messed-up piece of code that we got from a big, well-known project that we trusted to do a lot of other stuff, and it was in kind of a weird part of their code. By the way, the other weird thing is that just importing their main code was importing their test helpers, which is where this code was. You know, people get sloppy, it's all right; we're all in this together. But you have to be prepared for that kind of thing to happen. All right, the big lessons
from the whole talk. One: break conventions
to get what you need, right? This code doesn't even
have to be checked in, right? It's all on your machine. You can use
the full dynamic nature of Python to get the information you need,
but only for debugging, right? So, the nefarious among you
may be jotting down notes about how you're going to do that thing
on that server somewhere. And I don't know who you are,
so I can’t take any blame, but I really recommend
you don't do that or I’ll be here next year
debugging what you put on your server. And again, dynamic analysis is something that Python's introspectability and malleability really lend themselves to, so use it. And understand the mechanisms that underlie Python, right? If you understand how import really works, or what .pth files are, or the global
and shared state of random, it will help you reason about
the problems that you're seeing and get you the answers sooner. Any questions? Thank you. [applause] Do we have time for questions? Jesse, do we have time
for questions? He's got no mic. (Jesse)
Kind of. I think we can take just one or two. (Ned Batchelder)
One or two. (audience member)
So, Ned, you told us -- great talk, by the way, Ned. You told us when you have
a bad third-party library, you submitted a patch, or you told them
what their problem was, but in the meantime,
between the submission of the patch, how do you fix the problem? Do you actually patch the code
and run it locally yourself or do you change your own code? (Ned Batchelder)
No, in this case, it's the second bullet from the bottom:
we used our own random object to avoid the global mutable state
completely. (audience member)
OK, so all right, so, you changed your code. (Ned Batchelder)
In this case, we had an option. It could have been worse, and it could have been
that we would have to fork the project and not an aggressive fork,
a fork in the GitHub sense, and have our own copy of the code. And we've had to do that in a few places,
too, just to keep things working. (audience member)
OK, thank you. (Ned Batchelder)
Sure, thanks. I don't know. Are we still -- (audience member)
Thanks for the talk, Ned. So, we've seen the answers
to these debugging situations. Can you talk a bit
about the thought process of, kind of, coming up with these? Like, were these your first suggestions
and they just kind of worked out, or did you have a few
that kind of didn't work out? How did you come up with
these pretty, kind of, clever -- (Ned Batchelder)
That's a good question. I'm not sure
I've got any good answers for how to come up with these ideas
other than to think outside the box and understand that it's all possible
and you can -- you can play around
with that malleability. You can break outside of, sort of,
the strict style of coding and treat it more like
the touchable thing that Python is. I don't know how else to say it
than that. I think we have to go,
unfortunately, but thank you for coming. I'd be glad to talk about it
with anyone else, outside. (audience member)
Thank you. (presenter)
Thank you very much.