Welcome. Please help me welcome
Scott Sanderson and Joe Jevnik for their presentation,
Playing with Python Bytecode. [applause] Hey everybody. Thank you all
for coming out today. I know you were all really excited
to see Scott and Joe come here and talk about
Playing with Python Bytecode. Unfortunately I have
a little bit of bad news today. Scott and Joe
actually couldn't make it. They just texted us saying they were
involved in a freak scheduling accident. But right before they texted us,
they also sent us their outline. So I know everyone here was really
excited to learn about bytecode, so I'm going to try to do my best
to give the talk in their place. So their outline said
that they were going to talk about CPython's internal code
representation. They were going to demonstrate
some techniques for creating code objects
from scratch, whatever that means. And they were going to show
some techniques for manipulating and inspecting code objects
and making code from new code. So I guess if this is going to be
a talk, you know, about manipulating functions
and bytecode at runtime, we probably need some functions
and bytecode to manipulate. So maybe we want to start with,
like, a "def add(a, b):". And just very simple function. We'll "return a + b". And we'll just call that
and make sure that works. All right. So far so good. So I guess, you know,
we've got a function here and we want to somehow
get at its bytecode. Everything that's sort of
secret or interesting in CPython starts with a double underscore,
so maybe we can try to find that here. So we've got
__annotations__, __call__, __class__, __closure__... I don't see anything about bytecode
but we got a __code__. If I was going to put
a code object on a function, __code__ is probably
where I would put it. Right, so we've got a code object. So this is code
at some memory address. It was created by file
"<ipython-input-six-gibberish>", but we really want --
we want bytecode here. So maybe we need to dig
a little bit further. So, dot -- we've got a whole bunch
of attributes here. So we've got co_argcount. So that's 2, so I guess co_argcount
is just the number of arguments to this function. How about -- let's see,
so we've got co_cellvars, co_consts. Yeah, well, maybe we'll try to
figure out that together here. So co_consts on this I guess
is just the constant -- or the tuple containing None. So maybe we'll figure out
what that means a little bit later. Bytecode...hmm...bytecode... All right, well, there's a co_code here. And if we look at that,
this is a byte string. So this is co_code. It's a bytes. My guess is this is probably
the bytecode. So I guess the bytecode for add is |\x00\x00|\x01\x00\x17S. I think that makes sense
to everyone, right? All right, sweet. OK, well, so this is
a non-printing string, right? There's all these characters
we can't see. So probably a better way
to understand this is to just look at the raw integers
in that bytes object. So, you know, if we do "list(add.__code__.co_code)", there's definitely a little bit
more structure here, right. So I've got
124, 0, 0, 124, 1, 0, 23, 83. So, you know, there's definitely
kind of a repeating pattern here. So maybe the first 124
corresponds to the first variable, 124, 1 corresponds
to the second variable, and then 23 and 83
definitely means something. I was really thinking
this might be a little bit easier. You know, I've got an idea.
We're here. We're at PyCon. We're surrounded by some of the best,
most knowledgeable programmers who know about Python around. Surely there's someone
here in the audience who has worked with bytecode,
who understands bytecode, who maybe could come up and help me,
you know, teach everyone interactively about how bytecode is supposed to work,
so is there anyone here who knows about bytecode
or has worked with bytecode? (presenter 2)
Well, I'm actually a PSF-certified bytecode expert. [laughter, applause] (presenter 1)
Well, ladies and gentlemen, we have a PSF-certified
bytecode expert here among us. Come up on stage. Can we get
a microphone for him? (presenter 2)
No need. I brought my own. (presenter 1)
Wait, you brought your own microphone to somebody else's -- (presenter 2)
Let's get back on track here. You had the right idea
with that code object, but you're not going to get very far
looking at it that way. Luckily, Python provides a module
to help look at this. Why don't you try "import dis"? (presenter 1)
OK, "import this"? All right, "The Zen of Python,
by Tim Peters." (presenter 2)
No, no, "import dis" with a D. It's the disassembly module. (presenter 1)
Ah, OK. "Import dis". All right, I've imported dis.
What do I do with dis? (presenter 2)
We're going to call dis.dis(add). (presenter 1)
All right, dis.dis(add). All right, well,
that's definitely way better than just that, you know,
list of integers. Maybe I'll put that back up there
so we can see the difference. All right, well, can you tell us
a little bit more about what this table means? (presenter 2)
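[Transcriber's note: the disassembly itself isn't captured in the captions; on Python 3.5, dis.dis(add) prints roughly the following table, which the next few exchanges walk through.]

    2           0 LOAD_FAST                0 (a)
                3 LOAD_FAST                1 (b)
                6 BINARY_ADD
                7 RETURN_VALUE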
Sure. So, while we have 8 bytes in our bytecode, we actually
only have four instructions. We have a LOAD_FAST, another LOAD_FAST,
a BINARY_ADD, and a RETURN_VALUE. So the 124, 0, 0 represents
that first LOAD_FAST. The 124, 1, 0 represents
the second LOAD_FAST. Then the 23 and 83 are the BINARY_ADD
and RETURN_VALUE respectively. (presenter 1)
OK, so 124, 0, 0 actually means LOAD_FAST. Why does LOAD_FAST take up
three bytes in the bytecode when BINARY_ADD and RETURN_VALUE
only take up one? (presenter 2)
LOAD_FAST says to load a local variable, but it needs to know
which local variable to load. So the second two bytes there
are the argument which says which local variable
we will be loading. The 124 is the opcode, and then
0, 0 says "load local variable 0." Then we have 124, 1, 0,
which says "load local variable 1." (presenter 1)
Wait, we're loading 1 and 0? I thought we wanted to load A and B. (presenter 2)
Ah, the argument is encoded as a 16-bit little endian integer, which is an index into an array
of local variables. So dis helps us out on the right by showing the numeric value
of that argument, but that actually means
we're going to load local variable A. Then on the second line we will see
that the numeric value of the argument is 1, but that
actually means "load local variable B." (presenter 1)
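[Transcriber's note: a minimal sketch of decoding that argument by hand on Python 3.5; the variable names here are just for illustration.]

    code = add.__code__
    opcode = code.co_code[0]                           # 124, i.e. dis.opmap['LOAD_FAST']
    arg = int.from_bytes(code.co_code[1:3], 'little')  # 0, a 16-bit little endian integer
    code.co_varnames[arg]                              # 'a' -- the local variable being loaded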
OK, so 124, 0, 0 represents LOAD_FAST of 0,
but that actually means LOAD_FAST of A. Where exactly are we loading
A and B to here? (presenter 2)
Load instructions load variables to a shared stack so that they may be
manipulated by other instructions later. As we can see, the BINARY_ADD does not
have an argument in the bytecode because it will just pop
the top two values off the stack, add them together, and then push
the result back onto the stack. (presenter 1)
OK, let me make sure I understand here. At the start of this function
we're gonna have an empty stack. We're going to do a LOAD_FAST of 0,
which pushes A onto the stack. Then we're going to do a LOAD_FAST of 1,
which pushes B onto the stack. We're going to do a BINARY_ADD,
which will pop both values off the stack, add them together, and push
the results back onto the stack. And then finally, we're going to
execute a return value instruction which will pop the top value
off the stack and return it
to the calling stack frame? (presenter 2)
Exactly. (presenter 1)
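[Transcriber's note: that walkthrough, written out as a trace for a concrete call like add(2, 3).]

    # LOAD_FAST    0 (a)    stack: [2]
    # LOAD_FAST    1 (b)    stack: [2, 3]
    # BINARY_ADD            stack: [5]
    # RETURN_VALUE          returns 5 to the calling frame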
OK, I think I understand the right-hand side of this table.
How about this set of integers running down next to the
instruction names? What do those mean? (presenter 2)
Those are the bytecode offsets where those instructions appear. So of course, the first instruction
starts at index 0. However, the second instruction
starts at index 3, because indices 1 and 2 are occupied
by the arguments to our LOAD_FAST. (presenter 1)
OK, and then BINARY_ADD is at index 6 because indices 4 and 5 hold the
arguments to the second LOAD_FAST. OK, I think I understand that. What is this "2" up in the top
left-hand corner here? (presenter 2)
That's the line number in our source code where these instructions appear. This would be a little easier if we tried
a function with more than one line. (presenter 1)
OK, well how about just def add_with_assign,
and we'll still do A and B. But then we'll do X = A + B,
and then we'll return X. And then we'll do
dis.dis(add_with_assign). And, OK, so what this says is the first four instructions
of our code object correspond to the second line of our cell,
where we're doing X = A + B. And the last two instructions
of the code object correspond to the third line of our cell,
where we're doing a return X. (presenter 2)
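[Transcriber's note: the disassembly being described looks roughly like this on Python 3.5; note the 2 and 3 in the left-hand column, which are the source line numbers.]

    2           0 LOAD_FAST                0 (a)
                3 LOAD_FAST                1 (b)
                6 BINARY_ADD
                7 STORE_FAST               2 (x)

    3          10 LOAD_FAST                2 (x)
               13 RETURN_VALUE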
Yeah, I think you're getting the hang of this. Why don't we try a function
that's a little more difficult? (presenter 1)
OK, maybe like an absolute value function? So if we do def abs,
take a single argument X, and then we'll say,
if X is greater than 0, we'll return X. Oops, return X. Else, we'll return negative X. And then we'll do dis.dis(abs). It's got a nice ring to it. OK, I think I've got this one. So at the start of this function,
we're going to do a LOAD_FAST of 0, but that actually means
LOAD_FAST of X. Then we're going to do
a LOAD_CONST of 1. So I guess that means there are
different kinds of load instructions. We're going to do a LOAD_CONST of 1,
but that actually means "load the constant value 0." Then we're going to do
a COMPARE_OP of 4, which means..."greater than"? How does CPython know
that 4 means "greater than"? (presenter 2)
Well, not all arguments are just some index into some array. Here, the argument
is actually an enum representing which comparison
we want to perform. So there are entries in this enum
for all the comparison operators like greater than,
less than, or equals. (presenter 1)
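[Transcriber's note: that enum is exposed as dis.cmp_op, which is a quick way to check what a COMPARE_OP argument means.]

    import dis
    dis.cmp_op[4]    # '>', so COMPARE_OP 4 means "greater than"
    dis.cmp_op[0]    # '<'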
OK, and then after that, we're -- (presenter 2)
How about I take the next instruction? It's a little difficult. The POP_JUMP_IF_FALSE does exactly
what it says it will do. It pops the top value off the stack. If it's true,
it continues execution like normal. However, if it's false,
it will jump to the bytecode offset specified in this argument. (presenter 1)
OK, so if the result of COMPARE_OP is truthy, we're just going to continue
executing to the LOAD_FAST at index 12, but if it's falsy, we're going to jump
to the instruction at index 16. (presenter 2)
If you see those arrows, that's dis's hint to the non-experts
that that instruction is a jump target. (presenter 1)
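[Transcriber's note: the abs disassembly being walked through looks roughly like this on Python 3.5; ">>" marks the jump target, and the last two instructions are the dead code discussed next.]

    2           0 LOAD_FAST                0 (x)
                3 LOAD_CONST               1 (0)
                6 COMPARE_OP               4 (>)
                9 POP_JUMP_IF_FALSE       16

    3          12 LOAD_FAST                0 (x)
               15 RETURN_VALUE

    5     >>   16 LOAD_FAST                0 (x)
               19 UNARY_NEGATIVE
               20 RETURN_VALUE
               21 LOAD_CONST               0 (None)
               24 RETURN_VALUE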
OK. Let me make sure
I can walk through this one more time. So if X is greater than 0, we're going to fall through
to these two instructions, execute a LOAD_FAST and RETURN_VALUE. If it's not greater than 0,
then we're going to jump ahead and do a LOAD_FAST, then do a UNARY_NEGATIVE to negate X
and then return that value. It looks like there are two instructions
at the bottom of this function that can never be hit.
Why are those even here? (presenter 2)
You're right. Those instructions
are actually dead code. CPython has a pretty simple
code generation algorithm, and one of the rules is that if a function
doesn't end in a return statement, an implicit "LOAD_CONST of None,
RETURN_VALUE" is added. So while it may appear that
this function ends in a return because both branches end in a return,
from CPython's perspective, this function actually ends
in an "if" statement. So those dummy instructions
will be added even though
they can never be executed. (presenter 1)
That seems kind of wasteful, don't you think? (presenter 2)
Well, it's only four bytes, which is half a pointer.
It's not really worth the added complexity
to the compiler to remove them. (presenter 1)
OK, but say we really cared about those four bytes. Is there
some way that we could remove them? (presenter 2)
Well, you don't have to use the compiler to create a code object. You can just create one
like any other object in Python. (presenter 1)
OK, well, let's write our own abs function. (presenter 2)
Hold on there, killer. How about I start you off
with a function a little more your speed;
maybe addone? (presenter 1)
Yeah, I feel like we could have done something a little more complicated
than addone, but fine. OK, so if we're going to try
to write addone, we probably should write
the Python equivalent so we know what we're trying to do. So I guess we're going to have
addone(X). And this will just return X + 1. Addone(5) is 6. All right. OK, so you just said
that I can create a code object just like any other Python object. And any other Python object
I construct by calling its type. So where do I find the type
for a code object? (presenter 2)
Ah, that would be the types module. (presenter 1)
OK, I guess that makes sense. And what am I importing
from the types module? (presenter 2)
CodeType. It's the type of code. (presenter 1)
All right. "from types import CodeType". And let's see what the docs say
about code types. So we'll do
"print(CodeType.__doc__)". All right. OK, we've got
a billion arguments here. Argcount, kwonlyargcount... All right, "Create a code object.
Not for the faint of heart." All right, well fortunately for us
we've got a bytecode expert here to guide us through this,
so I guess we'd better get started. Well, all right, so we'll do
"my_code = CodeType" of -- well, argcount, that's just 1.
We've only got one argument. Kwonlyargcount, I guess that's
"keyword only argument," so that's probably just 0. Nlocals. We've only got
one local variable in this function, which is X,
so that's probably just 1. Stacksize. I'm going to throw this
over to you, Mr. Bytecode Expert. What is stacksize? (presenter 2)
Stacksize tells Python how much space to allocate
for the value stack. So we need enough slots to hold
the maximum number of elements that will ever appear on the stack
at any given time. (presenter 1)
OK, well, the largest the stack is ever going to be in this function
is right before we execute the BINARY_ADD, when we've got both X and 1
on the stack. So the stack size here should be 2. All right, next up we've got
the flags argument. What are the flags? (presenter 2)
Flags is a bit mask representing a set of various options
this code object could have. There's a lot of these, so I went ahead
and prepared some material ahead of time. (presenter 1)
Wait, you prepared -- (presenter 2)
Could you be so kind as to hit the down arrow on the keyboard? (presenter 1)
What -- how did you even get these here? (presenter 2)
Let's get back on track here. The first flag here is CO_OPTIMIZED. This says that certain
optimizations can be made when executing this code object. In practice,
this means that this code object comes from a function
and not a class body or a module. The next flag is CO_NEWLOCALS. This says that a new
locals dictionary should be created every time we execute
this code object. Again, this just means that it's
a function and not a class or a module. (presenter 1)
OK, I'm guessing that CO_VARARGS means we take *args
in our function, and CO_VARKEYWORDS says
that we take **kwargs? (presenter 2)
Exactly. The next flag we care about
is CO_NOFREE. CO_NOFREE says this code object
does not share any variables with any other code objects
through a closure. (presenter 1)
All right, and then last up here
and CO_ITERABLE_COROUTINE. What's the difference between
a coroutine and an iterable coroutine? (presenter 2)
These flags were added in Python 3.5 to support async def
and the types.coroutine decorator. So CO_COROUTINE is set when a function
is declared with async def, while CO_ITERABLE_COROUTINE is set when
a function is an old-style coroutine decorated with types.coroutine. (presenter 1)
All right, well, there were definitely a lot of those flags. I guess we should try to get back
to that original function. And there's more flags, OK. (presenter 2)
These are the flags enabled when you do a
from __future__ import statement. For example,
from __future__ import division. (presenter 1)
OK, I think I've seen ABSOLUTE_IMPORT, WITH_STATEMENT, PRINT_FUNCTION,
UNICODE_LITERALS... What's CO_FUTURE_BARRY_AS_BDFL? (presenter 2)
Ah, that says that the user has enabled
enhanced inequality syntax. (presenter 1)
Naturally. Okay, well I -- all right, I've got to imagine
that's the last of the flags. So can we just get back
to that function that I was running? Why did you reformat all of this? (presenter 2)
You'll see, I've selected the flags that we need here: CO_OPTIMIZED,
CO_NEWLOCALS, and CO_NOFREE. (presenter 1)
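[Transcriber's note: those flag constants live in the inspect module, so the mask can be built like this instead of memorizing the number; the sanity check is just for illustration.]

    import inspect
    flags = inspect.CO_OPTIMIZED | inspect.CO_NEWLOCALS | inspect.CO_NOFREE
    flags                                          # 67
    (lambda x: x + 1).__code__.co_flags == flags   # True for a simple function like this one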
I am so changing all of my passwords when this is over. OK, so we've done argcount,
kwonlyargcount, nlocals, stacksize, flags. Next up is codestring, and I don't see
anything else about bytecode, so I'm guessing this is our main event.
So we're going to want a "bytes," and then probably it's easiest
to just write these as integers. So we need the actual op codes
for our bytecode. So what do we need here? (presenter 2)
Ah, we will need 124, 0, 0, 100, 0, 0, 23, 83. [laughter] (presenter 1)
Care to explain any of that for the rest of us here? (presenter 2)
Yes, what this function needs to do is load X onto the stack,
then load 1 onto the stack, perform a BINARY_ADD
to add them together, and then a RETURN_VALUE
to return this value to the caller. So we start with 124, which is the opcode
for the LOAD_FAST instruction. We only have one local variable,
so we can store it at index 0. Next, we will emit 100, which is the
opcode for the LOAD_CONST instruction. We only have one constant,
so we can store that in index 0. Finally, we have 23 and 83, which are the opcodes for BINARY_ADD
and RETURN_VALUE, as we saw earlier. (presenter 1)
OK, I guess that's not that bad. All right, next up
we've got constants. Well, you just said we're only
going to have one constant, so this is just the tuple
containing 1 for our addone. Now we've got names and varnames. What's the difference between
a name and a varname? (presenter 2)
Names is a tuple containing the names of any global variables or attributes
that will be referenced in this function. Since we don't have any,
we can just use an empty tuple. (presenter 1)
All right, one empty tuple coming right up. (presenter 2)
The varnames are the names of all of the local variables
of this function. So this can just be the tuple
containing the string X. (presenter 1)
All right, got a tuple containing X. (presenter 2)
The next four arguments don't really mean much if we create
a code object from scratch like this. So the filename is the name
of the source file where this function came from. We don't have one
so you can pick your favorite string. Next is the name,
which is the name of this code object. This should just be addone. Then we have the firstlineno,
which is the first line in the source file
where this code object appears. This can just be
your favorite integer. Finally we have the lnotab,
which stands for the line number table. This is a mapping
from bytecode offsets to line offsets in the file. We don't have any lines, so this
should just be an empty bytes object. (presenter 1)
All right, I can do that. All right, and then last but not least,
we've got the freevars and cellvars. (presenter 2)
These are the names of any variables
code objects through a closure. Because we don't share any variables
and we set CO_NOFREE, these both better be empty. (presenter 1)
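[Transcriber's note: pulling the arguments just discussed into one place, this is a sketch of the full call as it would look on Python 3.5; the filename and firstlineno values are arbitrary, as noted above.]

    from types import CodeType

    my_code = CodeType(
        1,                 # argcount: one argument, x
        0,                 # kwonlyargcount
        1,                 # nlocals: just x
        2,                 # stacksize: x and 1 on the stack at once
        67,                # flags: CO_OPTIMIZED | CO_NEWLOCALS | CO_NOFREE
        bytes([124, 0, 0,  # LOAD_FAST    0 (x)
               100, 0, 0,  # LOAD_CONST   0 (1)
               23,         # BINARY_ADD
               83]),       # RETURN_VALUE
        (1,),              # constants
        (),                # names
        ('x',),            # varnames
        '<handmade>',      # filename: pick your favorite string
        'addone',          # name
        1,                 # firstlineno: pick your favorite integer
        b'',               # lnotab
        (),                # freevars
        (),                # cellvars
    )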
All right, two more empty tuples. All right. So that's all the arguments. So I guess if we make --
if we call this thing, then we should have ourselves
some executable code. All right, that didn't crash,
so I guess we should try it, right? My_code(5), and this should give us --
hey, what gives? I thought you said you were
some kind of bytecode expert. (presenter 2)
We don't normally call code objects, do we? No, we work with
function objects. (presenter 1)
OK, well, I bet you I know what you're going to say next
is that I can make a function object just like any other type in Python,
which means I need to call its type, which means I need to get
FunctionType. So "from types import FunctionType". All right, well,
that worked well enough. Let's see what the docs say
about FunctionType. FunctionType.__doc__. This is way easier
than the code object. So this says, "Create a function object
from a code object and a dictionary." Well, I've got a code object
and I know how to make a dictionary, so I think we've got this one. All right, so I can do
my_addone = FunctionType(my_code, {}), with my code object and an empty dict. All right, that didn't crash. All right, moment of truth:
my_addone(5)... gives me 6! [applause] All right, I guess we don't even need
the CPython compiler anymore. But I suppose we should see
if we generated the same thing. So let's do dis.dis(addone), and then we'll just print
a separator to -- the addone here is CPython's version and then I'll do dis.dis(my_addone)
so we can see the difference. All right, well, LOAD_FAST, LOAD_CONST,
BINARY_ADD, RETURN_VALUE. LOAD_FAST, LOAD_CONST,
BINARY_ADD, RETURN_VALUE. Other than those nonsense line numbers,
I think we've got exactly the same thing. (presenter 2)
Well, not quite. You'll notice ours has
a LOAD_CONST of 0 but the compiler gave us
a LOAD_CONST of 1. (presenter 1)
Hey yeah, that's kind of interesting. Why is CPython's version
doing a LOAD_CONST of 1? All right, well we'll do
print(my_addone.__code__.co_consts). That's just our tuple containing 1.
That's what we should expect. What did CPython generate? print(addone.__code__.co_consts). We got None -- why is None
in the co_consts from CPython? Nothing uses None here. (presenter 2)
That's just a quirk of the compiler. None will always be at index 0
for the co_consts. (presenter 1)
Wait, so are you saying that our handcrafted
artisanal organic bytecode is actually more sleek and optimized
than what CPython generates? (presenter 2)
In a way that does not matter at all. [laughter] (presenter 1)
I don't know. That None, that could be the difference.
OK, we -- wait a second. So if CPython is just looking up
values out of this constants tuple, does that mean I can just
switch that out on a function and change its behavior? What happens if I do
my_addone.__code__.co_consts = (2,)? Can I change my_addone
into my_add -- oh, man. (presenter 2)
Luckily, Python blocks these kinds of shenanigans. If you were able to execute that line, anyone who had a reference to my_addone
just got a reference to my_addtwo. (presenter 1)
My_addtwo sounds great. I don't know
why you wouldn't want that. Actually, I guess I can imagine
some scenarios where you might want, you know, numbers to stay the same. OK, so if mutating code objects
in place is a bad idea, does that mean there's no way for us
to take a code object and turn it into something else? (presenter 2)
Well, we can't mutate it in place, but we can always just
make a new code object by copying all the attributes
off our old one and changing any parameters. (presenter 1)
OK, so you're saying that what we need is like a function that performs
a functional update on a function. All right, I think I can write that. (presenter 2)
I also went ahead and wrote this one for you. It's a little complicated, you know.
Save some time. (presenter 1)
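[Transcriber's note: the update function itself isn't shown in the captions; this is a minimal sketch along the lines described next, assuming Python 3.5's CodeType argument order.]

    from types import CodeType, FunctionType

    # co_* attributes in the positional order CodeType expects on Python 3.5
    _CODE_ARGS = ('argcount', 'kwonlyargcount', 'nlocals', 'stacksize',
                  'flags', 'code', 'consts', 'names', 'varnames',
                  'filename', 'name', 'firstlineno', 'lnotab',
                  'freevars', 'cellvars')

    def update(f, **kwds):
        """Return a copy of f whose code object has the given co_* attributes replaced."""
        old = f.__code__
        new_code = CodeType(*(
            kwds.get('co_' + name, getattr(old, 'co_' + name))
            for name in _CODE_ARGS
        ))
        return FunctionType(new_code, f.__globals__, f.__name__,
                            f.__defaults__, f.__closure__)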
All right, so what you're saying this function does is
it takes a function f and **kwds, and what it does is
grab the code off of the old f and then constructs
a new code object by copying all the attributes
from the old code object but overriding any attributes
that were passed into this, and then wrapping that up
in a new FunctionType with all the other attributes
of the function copy. So that means that I should be
able to do update of -- Let me make sure I executed that. All right, you're saying
that this means I should be able to do update(my_addone,
co_consts=(2,)). And this should give me a new function
that instead of adding 1 adds 2. So this will be my_addtwo. All right, still didn't crash. My_addtwo(5). Bam, we get 7! [applause] I guess we're all well on our way
to becoming bytecode experts too. (presenter 2)
You know, that's cute and all, but you'll only get so far
updating the metadata. The real meat is in that bytecode. (presenter 1)
OK, well, co_code is just another attribute
of the code object, right? If I can update co_consts,
I can just as well update co_code. (presenter 2)
Now you're cooking with gas! Why don't we write a function
that updates all the 23s with 20 in that co_code? (presenter 1)
Wait, 23s and 20s? (presenter 2)
Oh, BINARY_ADD and BINARY_MULTIPLY. (presenter 1)
OK, so you're saying that what we should write is like
def add_to_mul that's going to take a function
and then it will grab its __code__. So we'll do
old = f.__code__.co_code. And then we want to do
is take this bytes object and replace all the 23 bytes with 20s,
which should replace all the binary add instructions
with binary multiplies. So we'll do new = old.replace, and we'll do bytes([23])
and then bytes([20]). And then finally I'm going to
take that bytes object and wrap it back up
in a new function. So I'm going to
return update(f, co_code=new). And then if I do
add_to_mul(my_addtwo), we didn't change the name,
but if I do add_to_mul(my_addtwo), then I should get my_multwo. And if I call my_multwo -- move that up a little bit --
of 5, I get 10. [applause] (presenter 2)
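[Transcriber's note: the add_to_mul function as dictated above, collected into one place; 23 and 20 are the BINARY_ADD and BINARY_MULTIPLY opcodes on Python 3.5, and update is the helper sketched earlier.]

    def add_to_mul(f):
        """Replace every 23 byte with 20, i.e. BINARY_ADD with BINARY_MULTIPLY."""
        old = f.__code__.co_code
        new = old.replace(bytes([23]), bytes([20]))
        return update(f, co_code=new)

    my_multwo = add_to_mul(my_addtwo)
    my_multwo(5)    # -> 10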
See, bytecode hacking isn't so hard when you know how everything works. [laughter] (presenter 1)
You know, I think there's actually a bug in this generation algorithm
you gave me. (presenter 2)
No, I don't write bugs. How could there be a bug? All we did was replace the binary adds
with binary multiplies. (presenter 1)
Well, no, we replaced all the 23s with 20s, and you told me not a moment ago
that not all the instructions in the byte -- er, not all the bytes
in the bytecode are instructions. Some of them are arguments. (presenter 2)
Yeah, but I mean, 23 means add, like -- 23 is never going to be
an argument... ...right? (presenter 1)
Well, what if we had a function that had 23 local variables? (presenter 2)
No one's going to write a function with 23 local variables. (presenter 1)
Well, now that you mention it, I actually have a function -- [laughter and applause] I actually have a function right here
that takes 26 local variables. So this is my get_x function. And, you know, you can pass it
all the alphabet and it returns X. And X, in case you haven't noticed,
is the 24th letter of the alphabet, which puts it at index 23 in the local variables. So if I do get_x(*ascii_lowercase),
which is just all the lower case letters, then I get x. And I'm not doing any addition
or multiplication or any fancy math stuff here,
I'm just returning a value here. So add_to_mul should just be
a no-op on get_x, right? (presenter 2)
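[Transcriber's note: the get_x function shown on screen isn't in the captions; this is one way to write the 26-argument function being described.]

    from string import ascii_lowercase

    def get_x(a, b, c, d, e, f, g, h, i, j, k, l, m,
              n, o, p, q, r, s, t, u, v, w, x, y, z):
        return x

    get_x(*ascii_lowercase)    # -> 'x'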
Yeah... [laughter] (presenter 1)
But add_to_mul is going to replace all the 23s with 20s, which means instead of loading
the local variable at index 23, I'm going to load
the local variable index 20. And that means
that add_to_mul(get_x) is going to turn it into get_u. [laughter, applause] And you know, now that I think about it,
this actually could have been a lot worse. I mean, at least there was
a local variable at index 20 for us to load here, right?
I mean, what would have happened if we had, you know,
swapped out the index entirely so that it wasn't even
a valid value in range? Like, say we did update(my_addone), and we just did
co_consts = an empty tuple. That would turn my_addone into
some sort of, like... baddone. And if I do baddone(5)... Ooh. [laughter, applause] (presenter 2)
I think you segfaulted the interpreter there. Oh, you know,
now that I think about it, I think this bug also manifests
in those jumps. You know, we were just going to jump
to the bytecode offset in that argument, but if that wasn't a valid instruction
or out of range entirely, like, who knows what would happen? (presenter 1)
Yeah, and now that I think about jumps a little bit more, those jumps
are just, you know, going to some particular offset
into the bytecode, which means we could never insert
or delete any instructions. That would shift all those jump offsets,
and then none of the jumps would work, right? (presenter 2)
Yeah, we would need some way to recalculate
all those jump offsets. (presenter 1)
That seems like a lot of work. (presenter 2)
Yeah. (presenter 1)
Hmm, this bytecode hacking thing feels harder than I thought. (presenter 2)
Didn't you say earlier that Joe and Scott had worked on
a library to help with some of this? (presenter 1)
Oh yeah, codetransformer. I actually downloaded it
right before the talk. I was thinking, you know,
maybe if we got far enough we could look at it a little bit. Maybe they've got some ideas
for how to solve some of these problems. So let's do from codetransformer. -- all right, what do we got here?
Code, CodeTransformer -- I guess that makes sense,
there's a CodeTransformer class -- decompiler, display, instructions,
option, patterns, tests -- they've got tests --
they've got transformers, import -- Joe and Scott imported --
er, implemented add2mul. I guess great minds
really do think alike. All right, well let's see
what's in add2mul here. Add2mul. Oh, it's a module. So I guess we should probably go
try to look at the source for this. OK, well, add2mul. So add2mul is "a transformer
that replaces BINARY_ADD instructions "with BINARY_MULTIPLY instructions." And it looks like
what's happening here is we're doing from codetransformer,
import CodeTransformer, and pattern. And then
from codetransformer.instructions, we're importing BINARY_ADD
and BINARY_MULTIPLY. So there must be some sort of
instruction objects being used here. That seems a lot nicer
than just memorizing 23 and 20 all over the place, right? (presenter 2)
Maybe. (presenter 1)
OK, and then we're going to make a CodeTransformer class. And we've got a method
decorated with a pattern. So it looks like
what's happening here is we're specifying patterns
of instructions to match, and then we write generators that yield
replacements for those instructions. So here what's happening is we're saying
"Match the pattern BINARY_ADD," and whenever we see that pattern, just yield a BINARY_MULTIPLY
as a replacement for it. (presenter 2)
What's that steal method do? (presenter 1)
Well, let's see. Through the magic of Emacs,
we can find out. Steal says
"Steal the jump index off of 'instr'. "This makes anything
that would have jumped to 'instr' "jump to this Instruction instead." So I guess this is some technique
for dealing with some of those jump resolution issues
that we thought about a moment ago. (presenter 2)
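[Transcriber's note: a sketch of the add2mul transformer as it is described here, reconstructed from the narration rather than copied from the library source.]

    from codetransformer import CodeTransformer, pattern
    from codetransformer.instructions import BINARY_ADD, BINARY_MULTIPLY

    class add2mul(CodeTransformer):
        """A transformer that replaces BINARY_ADD instructions
        with BINARY_MULTIPLY instructions."""
        @pattern(BINARY_ADD)
        def _add2mul(self, instr):
            # steal() makes anything that would have jumped to the old
            # instruction jump to the replacement instead.
            yield BINARY_MULTIPLY().steal(instr)

    # Used as a decorator or called on a function, e.g. add2mul()(my_addtwo).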
Yeah, that sounds a lot nicer. (presenter 1)
Yeah. (presenter 2)
You know, we built some interesting tools today,
but they weren't particularly useful. Maybe there are some useful tools
built into CodeTransformer itself. (presenter 1)
Well, let's see what we've got here. So we've got transformers import. We've got asconstants,
bytearray_literals, decimal_literals, frozenset, interpolated_strings. How about ordereddict_literarls?
That sounds kind of interesting. So I think I actually read
in the documentation that these are supposed to be used
as decorators. So if I do @ordereddict_literals,
and then I do def make_dict, let's say take A, B, C, and then we'll do
return 'A' mapped to A, and 'B' maps to B,
and 'C' maps to C. All right, and then if I call this,
do make_dict(1, 2, 3)... hey, look at that. I get an ordered dict
instead of a regular dict. That's pretty neat. All right, let's see if we can find
one more to go through here. Whoops. We've got haskell_strs...
all right, interpolated_strings. I think this is for those of us
who were too impatient to wait for the new f-strings
feature in Python 3.6. So if I do @interpolated_strings
and I do def inter, say we'll take an A and a B,
and we'll just return a bytes. That returns A and B. And if you're used to
something like Ruby or languages that do
string interpolation, I think what should happen here is,
if I call that with 1, 2, it just magically gets
interpolated into my string. [laughter] So, I mean, I guess there --
maybe there are some, you know, non-insane uses for this. But I actually think that's
just about all the time we have here. So to recap a little bit, you know,
I know you guys were all really excited to see Joe and Scott come out
and talk about bytecode here today. But I hope, you know,
thank you to my bytecode expert friend for coming up on short notice
with no planning whatsoever. [laughter] And just to recap a little bit
what we talked about today, we looked at CPython's
internal code representation. We saw some techniques for constructing
code objects from scratch. We looked at various ways
that we could swap out attributes of code objects
to change their behavior. But we also saw a lot of the dangers
of, you know, playing God with the CPython compiler
for ourselves. And maybe at the end here
we saw a few techniques for trying to mitigate
some of those dangers. But yeah, again, I want to thank you all
for coming out here. I hope you all have a great PyCon. [applause] So in case you guys haven't
figured it out by now, I'm Scott. (presenter 2)
I'm Joe. (Scott Sanderson)
We wrote a library called codetransformer that's for doing this kind of
bytecode manipulation. We think it's kind of a silly
and whimsical talk -- er, topic, so we wanted to do a sort of
silly and whimsical talk that went with that theme. In real life when we're not
doing things like this, we work at a company
called Quantopian that builds tools for people
who do algorithmic trading in Python. We do not use any of the techniques
that you just saw there to trade other people's money. (Joe Jevnik)
Or anywhere else on the platform, for that matter. (Scott Sanderson)
You can find us both on GitHub. I'm github.com/ssanderson. (Joe Jevnik)
And I'm a barcode. (Scott Sanderson)
What Joe means by that is he's github.com slash ten lowercase L's. [laughter] And on Twitter you can again find me
at the reasonable name of @ssanderson. (Joe Jevnik)
And I'm @__qualname__. (Scott Sanderson)
So if we've got time to do questions, we'd be happy to take questions
in whatever time we've got left. (host)
Absolutely. Please raise your hand if you'd like to ask a question. (Scott Sanderson)
We've got one down in the front here. (audience member)
Hi, thanks for the awesome talk. Well, I have two questions. The first is, you said that this is
only CPython-specific. Is this the Python's model
of [indistinct]? (Scott Sanderson)
No, so all of the bytecode stuff we saw here
is very specific to CPython and even down to minor versions
of CPython. So the bytecode format
is not in any way, like, standardized or guaranteed to be stable
across versions. So bytecode from Python 3.42 to 3.43
would not even necessarily be the same. (audience member)
I mean, I understand that the bytecode is not the same, but doesn't Jython
for example have the same code model? (Scott Sanderson)
But Jython's generating, like, JVM bytecode. It's a totally different
format with totally different semantics. So like, I don't think there even is
a dis module in Jython or pypy for example,
or if it does, it's just for compat. (audience member)
All right, and my second question is, can you jump into the middle
of the instructions? (Scott Sanderson)
Can you jump into the middle of an instruction? (audience member)
Right. Like, if your instruction is 3 bytes, can you jump into the second byte? (Joe Jevnik)
The interpreter does no validation on the bytecode at run time. So it assumes that the compiler
has generated sane code. You can feed anything you want there
and it will just run. But there's no guarantee that will do
anything other than segfault. (Scott Sanderson)
Yeah, so you can, but almost certainly, terrible terrible things
will happen to your machine. (Joe Jevnik)
And just so people know, all of these examples were done
on Python 3.5. Like, when we say there are changes
between versions, like, 3.5 added new instructions. On 3.6 -- the head of the default branch right now -- there's wordcode, which changes
the size of all of these instructions. So this is truly
an implementation detail. [audience member inaudible,
speaking far from microphone] (host)
Pardon me. Please raise your hand and have the microphone come to you
so we can get that in the recording. Thanks. This young lady had a question. (audience member)
The LOAD_FAST instruction, it looks like it ended
and it was actually 3 bytes. What -- then the last one was 0. So I was just wondering
what that signified. (Scott Sanderson)
You're asking about -- (Joe Jevnik)
It's the 124, like 1, 0 or 0, 0? (audience member)
Yeah. (Joe Jevnik)
So the 124 is the opcode which says this instruction is a LOAD_FAST
and it's going to load a local variable. And when it sees the 124
in the interpreter loop, it will then read the next two bytes
as a short, a little endian integer. And it will use that to index
into the fast locals, which is an array on the frame,
and use that to load a local variable out. So it's 1, 0
because it's little endian and the least significant byte
is first. (Scott Sanderson)
Do we want to get that question on microphone? (audience member)
So you're saying you're working on Python 3.5, 3.6, was it? Your interpreted --
interpolated strings, you used a bytes object
to return that. But the result we saw
was a Unicode string. (Scott and Joe)
Yes. (audience member)
Does it actually decode? (Scott Sanderson)
Yeah, so I sort of glossed over this, actually. You notice
that when I decorated that, I called it instead of just
decorating directly. So one of the things that that
interpolated strings decorator does is say which --
if I can get back -- there we go. It takes keyword arguments to say
which kind of string literals that should apply this to,
so by default it will -- basically it lets you use bytes --
like, byte strings as f-strings, rather than as Unicode strings,
and you can pass a flag that says "also do this for Unicode strings"
or "don't use it for bytes." But it's really just using that
as a signifier, and under the hood this is
essentially getting rewritten into something like the literal
.decode('utf-8').format(**locals()). (Joe Jevnik)
So we just use bytes because it's less common to see bytes literals
than Unicode literals. So that's more of just a marker that says we're going to do
something special with it. (host)
Any more questions? (Scott Sanderson)
We've got one here. (audience member)
How many times did you guys go through this awesome talk and practice
your dry run? (Scott Sanderson)
We've probably rehearsed it, start to finish, 15, 20 times now. [applause] (audience member)
So I'm just wondering if you're familiar with kind of broader
practical applications of this, like using it to write
directly to hardware, or, you know, places where
eliminating a few lines here and there
actually could be useful, and if you know of anybody
that's done that type of stuff. (Joe Jevnik)
In terms of practical applications, there aren't many because this is
truly implementation-specific, and like, no guarantees
that any of this will work. I have seen some interesting things
with a project called pyrasite, which lets you hook into a running
Python process and inject code. And someone had told me
they were working on a project, I think it works totally, where they
could hook a function at runtime to add break points so that they could
attach a break point to a running server and then you'd hit a route and then
it would hit their break point. And then when they were done, they could
just strip those break points out and there was no trace
that they were there. (Scott Sanderson)
That's sort of -- we talk about, like, everyone who had an addone
suddenly got an addtwo. For something like "I want to inject
a debugger in there," it's actually a case where you might
truly want to suddenly globally change everyone's behavior. Another example of something
that's kind of in the same vein although much more extreme than this
is there's a project in the Numerical community
from Continuum Analytics called Numba. And so here what we were doing
was taking bytecode and sort of rewriting it
into different bytecode. What Numba does is take your bytecode
and just throw it in the trash and replace it with LLVM
intermediate representation so that you can -- it takes your Python
and tries to transliterate it into a much more low-level
machine-specific language. And it's, like, numpy-aware
and does some other fancy things. So that's sort of like
the logical extreme of this if you're trying to do it
for something like performance. (host)
Other questions? (audience member)
So if I was reading the example with a simple function
and the LOAD_FAST, it seems to mean
that I can't have a function with more than 64k local variables.
Is that right? (Joe Jevnik)
There's actually a pseudo-instruction called EXTENDED_ARG. And it has an opcode
followed by a two-byte argument. And then that would precede
an instruction that also took a short
as its argument, and it would merge those together
to get a wider argument. So I believe those will be emitted
for a LOAD_FAST. I'm not positive.
I haven't tried that. I'm sure codetransformer has a test
somewhere that does that, but -- (Scott Sanderson)
If you want to see some examples of, like, truly horrific code
or strange code, look at the codetransformer
tests we did. [laughter] (host)
Other questions? Well, please help me in thanking
these guys for a great talk. [applause]