Scott Sanderson, Joe Jevnik - Playing with Python Bytecode - PyCon 2016

Welcome. Please help me welcome Scott Sanderson and Joe Jevnik for their presentation, Playing with Python Bytecode. [applause] Hey everybody. Thank you all for coming out today. I know you were all really excited to see Scott and Joe come here and talk about Playing with Python Bytecode. Unfortunately I have a little bit of bad news today. Scott and Joe actually couldn't make it. They just texted us saying they were involved in a freak scheduling accident. But right before they texted us, they also sent us their outline. So I know everyone here was really excited to learn about bytecode, so I'm going to try to do my best to give the talk in their place. So their outline said that they were going to talk about CPython's internal code representation. They were going to demonstrate some techniques for creating code objects from scratch, whatever that means. And they were going to show some techniques for manipulating and inspecting code objects and making new code from old code. So I guess if this is going to be a talk, you know, about manipulating functions and bytecode at runtime, we probably need some functions and bytecode to manipulate. So maybe we want to start with, like, a "def add(a, b):". Just a very simple function. We'll "return a + b". And we'll just call that and make sure that works. All right. So far so good. So I guess, you know, we've got a function here and we want to somehow get at its bytecode. Everything that's sort of secret or interesting in CPython starts with a double underscore, so maybe we can try to find that here. So we've got __annotations__, __call__, __class__, __closure__... I don't see anything about bytecode but we've got a __code__. If I was going to put a code object on a function, __code__ is probably where I would put it. Right, so we've got a code object. So this is code at some memory address. It was created by file "<ipython-input-six-gibberish>", but we really want -- we want bytecode here. 
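The exploration above is easy to reproduce in any Python session. A minimal sketch (the attribute names are stable across CPython versions, even though the talk itself targets Python 3.5):

```python
def add(a, b):
    return a + b

# Every function object carries its compiled code on __code__.
code = add.__code__

print(type(code))        # the built-in 'code' type
print(code.co_argcount)  # 2 -- the number of positional arguments to add
```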
So maybe we need to dig a little bit further. So, dot -- we've got a whole bunch of attributes here. So we've got co_argcount. So that's 2, so I guess co_argcount is just the number of arguments to this function. How about -- let's see, so we've got co_cellvars, co_consts. Yeah, well, maybe we'll try to figure out that together here. So co_consts on this I guess is just the constant -- or the tuple containing None. So maybe we'll figure out what that means a little bit later. Bytecode...hmm...bytecode... All right, well, there's a co_code here. And if we look at that, this is a byte string. So this is co_code. It's a bytes. My guess is this is probably the bytecode. So I guess the bytecode for add is |\x00\x00|\x01\x00\x17S. I think that makes sense to everyone, right? All right, sweet. OK, well, so this is a non-printing string, right? There's all these characters we can't see. So probably a better way to understand this is to just look at the raw integers in that bytes object. So, you know, if we do "list(add.__code__.co_code)", there's definitely a little bit more structure here, right. So I've got 124, 0, 0, 124, 1, 0, 23, 83. So, you know, there's definitely kind of a repeating pattern here. So maybe the first 124 corresponds to the first variable, 124, 1 corresponds to the second variable, and then 23 and 83 definitely mean something. I was really hoping this might be a little bit easier. You know, I've got an idea. We're here. We're at PyCon. We're surrounded by some of the best, most knowledgeable Python programmers around. Surely there's someone here in the audience who has worked with bytecode, who understands bytecode, who maybe could come up and help me, you know, teach everyone interactively about how bytecode is supposed to work, so is there anyone here who knows about bytecode or has worked with bytecode? (presenter 2) Well, I'm actually a PSF-certified bytecode expert. 
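The raw bytes can be listed the same way as in the talk; note that the particular integers (124, 0, 0, 124, 1, 0, 23, 83) are specific to Python 3.5's three-byte instruction format, so a different version will print different values:

```python
def add(a, b):
    return a + b

code = add.__code__

# co_code is an ordinary bytes object holding the raw bytecode.
raw = list(code.co_code)
print(raw)  # e.g. [124, 0, 0, 124, 1, 0, 23, 83] on Python 3.5

# co_consts is the tuple of constants the code object can reference.
print(code.co_consts)
```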
[laughter, applause] (presenter 1) Well, ladies and gentlemen, we have a PSF-certified bytecode expert here among us. Come up on stage. Can we get a microphone for him? (presenter 2) No need. I brought my own. (presenter 1) Wait, you brought your own microphone to somebody else's -- (presenter 2) Let's get back on track here. You had the right idea with that code object, but you're not going to get very far looking at it that way. Luckily, Python provides a module to help look at this. Why don't you try "import dis"? (presenter 1) OK, "import this"? All right, "The Zen of Python, by Tim Peters." (presenter 2) No, no, "import dis" with a D. It's the disassembly module. (presenter 1) Ah, OK. "import dis". All right, I've imported dis. What do I do with dis? (presenter 2) We're going to call dis.dis(add). (presenter 1) All right, dis.dis(add). All right, well, that's definitely way better than just that, you know, list of integers. Maybe I'll put that back up there so we can see the difference. All right, well, can you tell us a little bit more about what this table means? (presenter 2) Sure. So, while we have 8 bytes in our bytecode, we actually only have four instructions. We have a LOAD_FAST, another LOAD_FAST, a BINARY_ADD, and a RETURN_VALUE. So the 124, 0, 0 represents that first LOAD_FAST. The 124, 1, 0 represents the second LOAD_FAST. Then the 23 and 83 are the BINARY_ADD and RETURN_VALUE respectively. (presenter 1) OK, so 124, 0, 0 actually means LOAD_FAST. Why does LOAD_FAST take up three bytes in the bytecode when BINARY_ADD and RETURN_VALUE only take up one? (presenter 2) LOAD_FAST says to load a local variable, but it needs to know which local variable to load. So the second two bytes there are the argument which says which local variable we will be loading. The 124 is the opcode, and then 0, 0 says "load local variable 0." Then we have 124, 1, 0, which says "load local variable 1." (presenter 1) Wait, we're loading 1 and 0? 
I thought we wanted to load A and B. (presenter 2) Ah, the argument is encoded as a 16-bit little endian integer, which is an index into an array of local variables. So dis helps us out on the right by showing the numeric value of that argument, but that actually means we're going to load local variable A. Then on the second line we will see that the numeric value of the argument is 1, but that actually means "load local variable B." (presenter 1) OK, so 124, 0, 0 represents LOAD_FAST of 0, but that actually means LOAD_FAST of A. Where exactly are we loading A and B to here? (presenter 2) Load instructions load variables to a shared stack so that they may be manipulated by other instructions later. As we can see, the BINARY_ADD does not have an argument in the bytecode because it will just pop the top two values off the stack, add them together, and then push the result back onto the stack. (presenter 1) OK, let me make sure I understand here. At the start of this function we're gonna have an empty stack. We're going to do a LOAD_FAST of 0, which pushes A onto the stack. Then we're going to do a LOAD_FAST of 1, which pushes B onto the stack. We're going to do a BINARY_ADD, which will pop both values off the stack, add them together, and push the results back onto the stack. And then finally, we're going to execute a return value instruction which will pop the top value off the stack and return it to the calling stack frame? (presenter 2) Exactly. (presenter 1) OK, I think I understand the right-hand side of this table. How about this set of integers running down next to the instruction names? What do those mean? (presenter 2) Those are the bytecode offsets where those instructions appear. So of course, the first instruction starts at index 0. However, the second instruction starts at index 3, because indices 1 and 2 are occupied by the arguments to our LOAD_FAST. 
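That walkthrough of the value stack can be mimicked with a plain Python list; this is a toy illustration of the four instructions, not CPython's actual evaluation loop:

```python
def simulate_add(a, b):
    """Hand-simulate LOAD_FAST 0, LOAD_FAST 1, BINARY_ADD, RETURN_VALUE."""
    fastlocals = [a, b]  # local variables live in an indexed array
    stack = []

    stack.append(fastlocals[0])  # LOAD_FAST 0: push local variable a
    stack.append(fastlocals[1])  # LOAD_FAST 1: push local variable b

    rhs = stack.pop()            # BINARY_ADD: pop the top two values...
    lhs = stack.pop()
    stack.append(lhs + rhs)      # ...and push their sum back on the stack

    return stack.pop()           # RETURN_VALUE: pop and return to the caller

print(simulate_add(2, 3))  # 5
```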
(presenter 1) OK, and then BINARY_ADD is at index 6 because indices 4 and 5 hold the arguments to the second LOAD_FAST. OK, I think I understand that. What is this "2" up in the top left-hand corner here? (presenter 2) That's the line number in our source code where these instructions appear. This would be a little easier if we tried a function with more than one line. (presenter 1) OK, well how about just def add_with_assign, and we'll still do A and B. But then we'll do X = A + B, and then we'll return X. And then we'll do dis.dis(add_with_assign). And, OK, so what this says is the first four instructions of our code object correspond to the second line of our cell, where we're doing X = A + B. And the last two instructions of the code object correspond to the third line of our cell, where we're doing a return X. (presenter 2) Yeah, I think you're getting the hang of this. Why don't we try a function that's a little more difficult? (presenter 1) OK, maybe like an absolute value function? So if we do def abs, take a single argument X, and then we'll say, if X is greater than 0, we'll return X. Oops, return X. Else, we'll return negative X. And then we'll do dis.dis(abs). It's got a nice ring to it. OK, I think I've got this one. So at the start of this function, we're going to do a LOAD_FAST of 0, but that actually means LOAD_FAST of X. Then we're going to do a LOAD_CONST of 1. So I guess that means there are different kinds of load instructions. We're going to do a LOAD_CONST of 1, but that actually means "load the constant value 0." Then we're going to do a COMPARE_OP of 4, which means..."greater than"? How does CPython know that 4 means "greater than"? (presenter 2) Well, not all arguments are just some index into some array. Here, the argument is actually an enum representing which comparison we want to perform. So there are entries in this enum for all the comparison operators like greater than, less than, or equals. 
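The comparison enum the expert describes is exposed as dis.cmp_op. On Python 3.5 the tuple also contained entries like 'in' and 'is'; on recent versions it is shorter, but index 4 is "greater than" either way:

```python
import dis

# COMPARE_OP's argument is an index into this tuple of operator names.
print(dis.cmp_op)
print(dis.cmp_op[4])  # '>' -- the 4 we saw as COMPARE_OP's argument
```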
(presenter 1) OK, and then after that, we're -- (presenter 2) How about I take the next instruction? It's a little difficult. The POP_JUMP_IF_FALSE does exactly what it says it will do. It pops the top value off the stack. If it's true, it continues execution like normal. However, if it's false, it will jump to the bytecode offset specified in its argument. (presenter 1) OK, so if the result of COMPARE_OP is truthy, we're just going to continue executing to the LOAD_FAST at index 12, but if it's falsy, we're going to jump to the instruction at index 16. (presenter 2) If you see those arrows, that's dis's hint to the non-experts that that instruction is a jump target. (presenter 1) OK. Let me make sure I can walk through this one more time. So if X is greater than 0, we're going to fall through to these two instructions, execute a LOAD_FAST and RETURN_VALUE. If it's less than or equal to 0, then we're going to do a LOAD_FAST, then do a UNARY_NEGATIVE to negate X and then return that value. It looks like there's two instructions at the bottom of this function that can never be hit. Why are those even here? (presenter 2) You're right. Those instructions are actually dead code. CPython has a pretty simple code generation algorithm, and one of the rules is that if a function doesn't end in a return statement, an implicit "LOAD_CONST of None, RETURN_VALUE" is added. So while it may appear that this function ends in a return because both branches end in a return, from CPython's perspective this function actually ends in an "if" statement. So those dummy instructions will be added even though they can never be executed. (presenter 1) That seems kind of wasteful, don't you think? (presenter 2) Well, it's only four bytes, which is half a pointer. It's not really worth the added complexity to the compiler to remove them. (presenter 1) OK, but say we really cared about those four bytes. Is there some way that we could remove them? 
(presenter 2) Well, you don't have to use the compiler to create a code object. You can just create one like any other object in Python. (presenter 1) OK, well, let's write our own abs function. (presenter 2) Hold on there, killer. How about I start you off with a function a little more your speed; maybe addone? (presenter 1) Yeah, I feel like we could have done something a little more complicated than addone, but fine. OK, so if we're going to try to write addone, we probably should write the Python equivalent so we know what we're trying to do. So I guess we're going to have addone(X). And this will just return X + 1. Addone(5) is 6. All right. OK, so you just said that I can create a code object just like any other Python object. And any other Python object I construct by calling its type. So where do I find the type for a code object? (presenter 2) Ah, that would be the types module. (presenter 1) OK, I guess that makes sense. And what am I importing from the types module? (presenter 2) CodeType. It's the type of code. (presenter 1) All right. "from types import CodeType". And let's see what the docs say about code types. So we'll do "print(CodeType.__doc__)". All right. OK, we've got a billion arguments here. Argcount, kwonlyargcount... All right, "Create a code object. Not for the faint of heart." All right, well fortunately for us we've got a bytecode expert here to guide us through this, so I guess we'd better get started. Well, all right, so we'll do "my_code = CodeType" of -- well, argcount, that's just 1. We've only got one argument. Kwonlyargcount, I guess that's "keyword only argument," so that's probably just 0. Nlocals. We've only got one local variable in this function, which is X, so that's probably just 1. Stacksize. I'm going to throw this over to you, Mr. Bytecode Expert. What is stacksize? (presenter 2) Stacksize tells Python how much space to allocate for variables on the stack. 
So we need enough slots to hold the maximum number of elements that will ever appear on the stack at any given time. (presenter 1) OK, well, the largest the stack is ever going to be in this function is right before we execute the BINARY_ADD, when we've got both X and 1 on the stack. So the stack size here should be 2. All right, next up we've got the flags argument. What are the flags? (presenter 2) Flags is a bit mask representing a set of various options this code object could have. There's a lot of these, so I went ahead and prepared some material ahead of time. (presenter 1) Wait, you prepared -- (presenter 2) Could you be so kind as to hit the down arrow on the keyboard? (presenter 1) What -- how did you even get these here? (presenter 2) Let's get back on track here. The first flag here is CO_OPTIMIZED. This says that certain optimizations can be made when executing this code object. In practice, this means that this code object comes from a function and not a class body or a module. The next flag is CO_NEWLOCALS. This says that a new locals dictionary should be created every time we execute this code object. Again, this just means that it's a function and not a class or a module. (presenter 1) OK, I'm guessing that CO_VARARGS means we take *args in our function, and CO_VARKEYWORDS says that we take **kwargs? (presenter 2) Exactly. The next flag we care about is CO_NOFREE. CO_NOFREE says this code object does not share any variables with any other code objects through a closure. (presenter 1) All right, and then last up here we've got CO_COROUTINE and CO_ITERABLE_COROUTINE. What's the difference between a coroutine and an iterable coroutine? (presenter 2) These flags were added in Python 3.5 to support async def and the types.coroutine decorator. So CO_COROUTINE is set when a function is declared with async def, but CO_ITERABLE_COROUTINE is set when a function is an old-style coroutine decorated with types.coroutine. 
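The flag bits discussed above are importable from the inspect module, which makes it easy to check which ones a given function's code object has set. A small sketch using a few of them:

```python
import inspect

def plain(x):
    return x

def variadic(*args, **kwargs):
    return args, kwargs

# Ordinary functions are compiled with CO_OPTIMIZED and CO_NEWLOCALS set.
assert plain.__code__.co_flags & inspect.CO_OPTIMIZED
assert plain.__code__.co_flags & inspect.CO_NEWLOCALS

# *args sets CO_VARARGS; **kwargs sets CO_VARKEYWORDS.
assert variadic.__code__.co_flags & inspect.CO_VARARGS
assert variadic.__code__.co_flags & inspect.CO_VARKEYWORDS

print(bin(plain.__code__.co_flags))     # the raw bit mask
print(bin(variadic.__code__.co_flags))
```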
(presenter 1) All right, well, there were definitely a lot of those flags. I guess we should try to get back to that original function. And there's more flags, OK. (presenter 2) These are the flags enabled when you do a from __future__ import statement. For example, from __future__ import division. (presenter 1) OK, I think I've seen ABSOLUTE_IMPORT, WITH_STATEMENT, PRINT_FUNCTION, UNICODE_LITERALS... What's CO_FUTURE_BARRY_AS_BDFL? (presenter 2) Ah, that says that the user has enabled enhanced inequality syntax. (presenter 1) Naturally. OK, well I -- all right, I've got to imagine that's the last of the flags. So can we just get back to that function that I was running? Why did you reformat all of this? (presenter 2) You'll see, I've selected the flags that we need here: CO_OPTIMIZED, CO_NEWLOCALS, and CO_NOFREE. (presenter 1) I am so changing all of my passwords when this is over. OK, so we've done argcount, kwonlyargcount, nlocals, stacksize, flags. Next up is codestring, and I don't see anything else about bytecode, so I'm guessing this is our main event. So we're going to want a "bytes," and then probably it's easiest to just write these as integers. So we need the actual opcodes for our bytecode. So what do we need here? (presenter 2) Ah, we will need 124, 0, 0, 100, 0, 0, 23, 83. [laughter] (presenter 1) Care to explain any of that for the rest of us here? (presenter 2) Yes, what this function needs to do is load X onto the stack, then load 1 onto the stack, perform a BINARY_ADD to add them together, and then a RETURN_VALUE to return this value to the caller. So we start with 124, which is the opcode for the LOAD_FAST instruction. We only have one local variable, so we can store it at index 0. Next, we will emit 100, which is the opcode for the LOAD_CONST instruction. We only have one constant, so we can store that at index 0. Finally, we have 23 and 83, which are the opcodes for BINARY_ADD and RETURN_VALUE, as we saw earlier. 
(presenter 1) OK, I guess that's not that bad. All right, next up we've got constants. Well, you just said we're only going to have one constant, so this is just the tuple containing 1 for our addone. Now we've got names and varnames. What's the difference between a name and a varname? (presenter 2) Names is a tuple containing the names of any global variables or attributes that will be referenced in this function. Since we don't have any, we can just use an empty tuple. (presenter 1) All right, one empty tuple coming right up. (presenter 2) The varnames are the names of all of the local variables of this function. So this can just be the tuple containing the string X. (presenter 1) All right, got a tuple containing X. (presenter 2) The next four arguments don't really mean much if we create a code object from scratch like this. So the filename is the name of the source file where this function came from. We don't have one so you can pick your favorite string. Next is the name, which is the name of this code object. This should just be addone. Then we have the firstlineno, which is the first line in the source file where this code object appears. This can just be your favorite integer. Finally we have the lnotab, which stands for the line number table. This is a mapping from bytecode offsets to line offsets in the file. We don't have any lines, so this should just be an empty bytes object. (presenter 1) All right, I can do that. All right, and then last but not least, we've got the freevars and cellvars. (presenter 2) These are the names of any variables that we share with other code objects through a closure. Because we don't share any variables and we set CO_NOFREE, these both better be empty. (presenter 1) All right, two more empty tuples. All right. So that's all the arguments. So I guess if we make -- if we call this thing, then we should have ourselves some executable code. All right, that didn't crash, so I guess we should try it, right? 
My_code(5), and this should give us -- hey, what gives? I thought you said you were some kind of bytecode expert. (presenter 2) We don't normally call code objects, do we? No, we work with function objects. (presenter 1) OK, well, I bet you I know what you're going to say next is that I can make a function object just like any other type in Python, which means I need to call its type, which means I need to get FunctionType. So "from types import FunctionType". All right, well, that worked well enough. Let's see what the docs say about FunctionType. FunctionType.__doc__. This is way easier than the code object. So this says, "Create a function object from a code object and a dictionary." Well, I've got a code object and I know how to make a dictionary, so I think we've got this one. All right, so I can do my_addone = FunctionType(my_code, {}). All right, that didn't crash. All right, moment of truth: my_addone(5)... gives me 6! [applause] All right, I guess we don't even need the CPython compiler anymore. But I suppose we should see if we generated the same thing. So let's do dis.dis(addone), and then we'll just print a separator to -- the addone here is CPython's version and then I'll do dis.dis(my_addone) so we can see the difference. All right, well, LOAD_FAST, LOAD_CONST, BINARY_ADD, RETURN_VALUE. LOAD_FAST, LOAD_CONST, BINARY_ADD, RETURN_VALUE. Other than those nonsense line numbers, I think we've got exactly the same thing. (presenter 2) Well, not quite. You'll notice ours has a LOAD_CONST of 0 but the compiler gave us a LOAD_CONST of 1. (presenter 1) Hey yeah, that's kind of interesting. Why is CPython's version doing a LOAD_CONST of 1? All right, well we'll do print(my_addone.__code__.co_consts). That's just our tuple containing 1. That's what we should expect. What did CPython generate? print(addone.__code__.co_consts). We got None -- why is None in the co_consts from CPython? Nothing uses None here. 
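The exact CodeType argument list changes between CPython versions (3.8, for example, inserted a posonlyargcount parameter), but the FunctionType half of the recipe is stable. A sketch that wraps an existing, compiler-produced code object instead of a hand-assembled one:

```python
from types import FunctionType

def addone(x):
    return x + 1

# FunctionType(code, globals) builds a callable around any code object.
my_addone = FunctionType(addone.__code__, {})

print(my_addone(5))  # 6
```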
(presenter 2) That's just a quirk of the compiler. None will always be at index 0 in co_consts. (presenter 1) Wait, so are you saying that our handcrafted artisanal organic bytecode is actually more sleek and optimized than what CPython generates? (presenter 2) In a way that does not matter at all. [laughter] (presenter 1) I don't know. That None, that could be the difference. OK, we -- wait a second. So if CPython is just looking up values out of this constants tuple, does that mean I can just switch that out on a function and change its behavior? What happens if I do my_addone.__code__.co_consts = (2,)? Can I change my_addone into my_add -- oh, man. (presenter 2) Luckily, Python blocks these kinds of shenanigans. If you were able to execute that line, anyone who had a reference to my_addone just got a reference to my_addtwo. (presenter 1) My_addtwo sounds great. I don't know why you wouldn't want that. Actually, I guess I can imagine some scenarios where you might want, you know, numbers to stay the same. OK, so if mutating code objects in place is a bad idea, does that mean there's no way for us to take a code object and turn it into something else? (presenter 2) Well, we can't mutate it in place, but we can always just make a new code object by copying all the attributes off our old one and changing any parameters. (presenter 1) OK, so you're saying that what we need is like a function that performs a functional update on a function. All right, I think I can write that. (presenter 2) I also went ahead and wrote this one for you. It's a little complicated, you know. Save some time. 
(presenter 1) All right, so what you're saying this function does is it takes a function f and **kwds, and what it does is grab the code off of the old f and then constructs a new code object by copying all the attributes from the old code object but overriding any attributes that were passed into this, and then wrapping that up in a new FunctionType with all the other attributes of the function copied. So that means that I should be able to do update of -- Let me make sure I executed that. All right, you're saying that this means I should be able to do update(my_addone, co_consts=(2,)). And this should give me a new function that instead of adding 1 adds 2. So this will be my_addtwo. All right, still didn't crash. My_addtwo(5). Bam, we get 7! [applause] I guess we're all well on our way to becoming bytecode experts too. (presenter 2) You know, that's cute and all, but you'll only get so far updating the metadata. The real meat is in that bytecode. (presenter 1) OK, well, co_code is just another attribute of the code object, right? If I can update co_consts, I can just as well update co_code. (presenter 2) Now you're cooking with gas! Why don't we write a function that replaces all the 23s with 20s in that co_code? (presenter 1) Wait, 23s and 20s? (presenter 2) Oh, BINARY_ADD and BINARY_MULTIPLY. (presenter 1) OK, so you're saying that what we should write is like def add_to_mul that's going to take a function and then it will grab its __code__. So we'll do old = f.__code__.co_code. And then what we want to do is take this bytes object and replace all the 23 bytes with 20s, which should replace all the BINARY_ADD instructions with BINARY_MULTIPLYs. So we'll do new = old.replace, and we'll do bytes([23]) and then bytes([20]). And then finally I'm going to take that bytes object and wrap it back up in a new function. So I'm going to return update(f, co_code=new). 
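On Python 3.8 and later, code objects grew a .replace() method that performs exactly this kind of functional update, which makes a version-robust version of the update helper short to write. This sketch swaps constants by value rather than hard-coding which index holds the 1:

```python
from types import FunctionType

def addone(x):
    return x + 1

def update(f, **kwargs):
    """Return a new function whose code object has kwargs swapped in."""
    return FunctionType(
        f.__code__.replace(**kwargs),  # code.replace() exists on 3.8+
        f.__globals__,
        f.__name__,
        f.__defaults__,
        f.__closure__,
    )

# Replace the constant 1 with 2, leaving any other constants (like the
# implicit None) untouched.
new_consts = tuple(2 if c == 1 else c for c in addone.__code__.co_consts)
my_addtwo = update(addone, co_consts=new_consts)

print(my_addtwo(5))  # 7
print(addone(5))     # 6 -- the original function is unchanged
```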
And then if I do add_to_mul(my_addtwo), we didn't change the name, but if I do add_to_mul(my_addtwo), then I should get my_multwo. And if I call my_multwo -- move that up a little bit -- of 5, I get 10. [applause] (presenter 2) See, bytecode hacking isn't so hard when you know how everything works. [laughter] (presenter 1) You know, I think there's actually a bug in this generation algorithm you gave me. (presenter 2) No, I don't write bugs. How could there be a bug? All we did was replace the binary adds with binary multiplies. (presenter 1) Well, no, we replaced all the 23s with 20s, and you told me not a moment ago that not all the instructions in the byte -- er, not all the bytes in the bytecode are instructions. Some of them are arguments. (presenter 2) Yeah, but I mean, 23 means add, like -- 23 is never going to be an argument... ...right? (presenter 1) Well, what if we had a function that had 23 local variables? (presenter 2) No one's going to write a function with 23 local variables. (presenter 1) Well, now that you mention it, I actually have a function -- [laughter and applause] I actually have a function right here that takes 26 local variables. So this is my get_x function. And, you know, you can pass it all the alphabet and it returns X. And X, in case you haven't noticed, is the 24th letter of the alphabet, so it sits at index 23. So if I do get_x(*ascii_lowercase), which is just all the lower case letters, then I get x. And I'm not doing any addition or multiplication or any fancy math stuff here, I'm just returning a value here. So add_to_mul should just be a no-op on get_x, right? (presenter 2) Yeah... [laughter] (presenter 1) But add_to_mul is going to replace all the 23s with 20s, which means instead of loading the local variable at index 23, I'm going to load the local variable at index 20. And that means that add_to_mul(get_x) is going to turn it into get_u. [laughter, applause] And you know, now that I think about it, this actually could have been a lot worse. 
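The collision is easy to demonstrate without executing any patched bytecode. On Python 3.5, 23 was BINARY_ADD's opcode, but any function that loads its 24th local variable also contains 23 as a LOAD_FAST argument byte, which is exactly what a blind bytes.replace would clobber. The get_x below is a reconstruction of the function from the talk:

```python
from string import ascii_lowercase

# Build get_x with 26 parameters a..z; x lands at varnames index 23.
source = "def get_x({}):\n    return x".format(", ".join(ascii_lowercase))
namespace = {}
exec(source, namespace)
get_x = namespace["get_x"]

assert get_x.__code__.co_varnames[23] == "x"

# The argument byte 23 shows up in the raw bytecode even though this
# function performs no addition at all.
assert 23 in get_x.__code__.co_code

print(get_x(*ascii_lowercase))  # prints: x
```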
I mean, at least there was a local variable at index 20 for us to load here, right? I mean, what would have happened if we had, you know, swapped out the index entirely so that it wasn't even a valid value in range? Like, say we did update(my_addone), and we just did co_consts = an empty tuple. That would turn my_addone into some sort of, like... baddone. And if I do baddone(5)... Ooh. [laughter, applause] (presenter 2) I think you segfaulted the interpreter there. Oh, you know, now that I think about it, I think this bug also manifests in those jumps. You know, we were just going to jump to the bytecode offset in that argument, but if that wasn't a valid instruction or out of range entirely, like, who knows what would happen? (presenter 1) Yeah, and now that I think about jumps a little bit more, those jumps are just, you know, going to some particular offset into the bytecode, which means we could never insert or delete any instructions, because that would change all those jump offsets and they would never work, right? (presenter 2) Yeah, we would need some way to recalculate all those jump offsets. (presenter 1) That seems like a lot of work. (presenter 2) Yeah. (presenter 1) Hmm, this bytecode hacking thing feels harder than I thought. (presenter 2) Didn't you say earlier that Joe and Scott had worked on a library to help with some of this? (presenter 1) Oh yeah, codetransformer. I actually downloaded it right before the talk. I was thinking, you know, maybe if we got far enough we could look at it a little bit. Maybe they've got some ideas for how to solve some of these problems. So let's do from codetransformer. -- all right, what do we got here? Code, CodeTransformer -- I guess that makes sense, there's a CodeTransformer class -- decompiler, display, instructions, option, patterns, tests -- they've got tests -- they've got transformers, import -- Joe and Scott imported -- er, implemented add2mul. I guess great minds really do think alike. 
All right, well let's see what's in add2mul here. Add2mul. Oh, it's a module. So I guess we should probably go try to look at the source for this. OK, well, add2mul. So add2mul is "a transformer that replaces BINARY_ADD instructions with BINARY_MULTIPLY instructions." And it looks like what's happening here is we're doing from codetransformer import CodeTransformer and pattern. And then from codetransformer.instructions, we're importing BINARY_ADD and BINARY_MULTIPLY. So there must be some sort of instruction objects being used here. That seems a lot nicer than just memorizing 23 and 20 all over the place, right? (presenter 2) Maybe. (presenter 1) OK, and then we're going to make a CodeTransformer class. And we've got a method decorated with a pattern. So it looks like what's happening here is we're specifying patterns of instructions to match, and then we write generators that yield replacements for those instructions. So here what's happening is we're saying "Match the pattern BINARY_ADD," and whenever we see that pattern, just yield a BINARY_MULTIPLY as a replacement for it. (presenter 2) What's that steal method do? (presenter 1) Well, let's see. Through the magic of Emacs, we can find out. Steal says "Steal the jump index off of 'instr'. This makes anything that would have jumped to 'instr' jump to this Instruction instead." So I guess this is some technique for dealing with some of those jump resolution issues that we thought about a moment ago. (presenter 2) Yeah, that sounds a lot nicer. (presenter 1) Yeah. (presenter 2) You know, we built some interesting tools today, but they weren't particularly useful. Maybe there are some useful tools built into CodeTransformer itself. (presenter 1) Well, let's see what we've got here. So we've got transformers import. We've got asconstants, bytearray_literals, decimal_literals, frozenset, interpolated_strings. How about ordereddict_literals? That sounds kind of interesting. 
So I think I actually read in the documentation that these are supposed to be used as decorators. So if I do @ordereddict_literals, and then I do def make_dict, let's say take A, B, C, and then we'll do return 'A' mapped to A, and 'B' maps to B, and 'C' maps to C. All right, and then if I call this, do make_dict(1, 2, 3)... hey, look at that. I get an ordered dict instead of a regular dict. That's pretty neat. All right, let's see if we can find one more to go through here. Whoops. We've got haskell_strs... all right, interpolated_strings. I think this is for those of us who were too impatient to wait for the new f-strings feature in Python 3.6. So if I do @interpolated_strings and I do def inter, say we'll take an A and a B, and we'll just return a bytes. That returns A and B. And if you're used to something like Ruby or languages that do string interpolation, I think what should happen here is, if I call that with 1, 2, it just magically gets interpolated into my string. [laughter] So, I mean, I guess there -- maybe there are some, you know, non-insane uses for this. But I actually think that's just about all the time we have here. So to recap a little bit, you know, I know you guys were all really excited to see Joe and Scott come out and talk about bytecode here today. But I do want to say, you know, thank you to my bytecode expert friend for coming up on short notice with no planning whatsoever. [laughter] And just to recap a little bit what we talked about today, we looked at CPython's internal code representation. We saw some techniques for constructing code objects from scratch. We looked at various ways that we could swap out attributes of code objects to change their behavior. But we also saw a lot of the dangers of, you know, playing God with the CPython compiler for ourselves. And maybe at the end here we saw a few techniques for trying to mitigate some of those dangers. But yeah, again, I want to thank you all for coming out here. 
I hope you all have a great PyCon. [applause] So in case you guys haven't figured it out by now, I'm Scott. (presenter 2) I'm Joe. (Scott Sanderson) We wrote a library called codetransformer that's for doing this kind of bytecode manipulation. We think it's kind of a silly and whimsical talk -- er, topic, so we wanted to do a sort of silly and whimsical talk that went with that theme. In real life when we're not doing things like this, we work at a company called Quantopian that builds tools for people who do algorithmic trading in Python. We do not use any of the techniques that you just saw there to trade other people's money. (Joe Jevnik) Or anywhere else on the platform, for that matter. (Scott Sanderson) You can find us both on GitHub. I'm github.com/ssanderson. (Joe Jevnik) And I'm a barcode. (Scott Sanderson) What Joe means by that is he's github.com slash ten lowercase L's. [laughter] And on Twitter you can again find me at the reasonable name of @ssanderson. (Joe Jevnik) And I'm @__qualname__. (Scott Sanderson) So if we've got time to do questions, we'd be happy to take questions in whatever time we've got left. (host) Absolutely. Please raise your hand if you'd like to ask a question. (Scott Sanderson) We've got one down in the front here. (audience member) Hi, thanks for the awesome talk. Well, I have two questions. The first is, you said that this is only CPython-specific. Is this Python's model of [indistinct]? (Scott Sanderson) No, so all of the bytecode stuff we saw here is very specific to CPython and even down to minor versions of CPython. So the bytecode format is not in any way, like, standardized or guaranteed to be stable across versions. So bytecode from Python 3.4.2 to 3.4.3 would not even necessarily be the same. (audience member) I mean, I understand that the bytecode is not the same, but doesn't Jython for example have the same code model? (Scott Sanderson) But Jython's generating, like, JVM bytecode. 
It's a totally different format with totally different semantics. So like, I don't think there even is a dis module in Jython or PyPy for example, or if it does, it's just for compat. (audience member) All right, and my second question is, can you jump into the middle of the instructions? (Scott Sanderson) Can you jump into the middle of an instruction? (audience member) Right. Like, if your instruction is 3 bytes, can you jump into the second byte? (Joe Jevnik) The interpreter does no validation on the bytecode at run time. It assumes that the compiler has generated sane code. You can feed anything you want there and it will just run. But there's no guarantee that will do anything other than segfault. (Scott Sanderson) Yeah, so you can, but almost certainly, terrible terrible things will happen to your machine. (Joe Jevnik) And just so people know, all of these examples were done on Python 3.5. Like, when we say there are changes between versions, like, 3.5 added new instructions. On 3.6, head of default right now, there's wordcode, which changes the size of all of these instructions. So this is truly an implementation detail. [audience member inaudible, speaking far from microphone] (host) Pardon me. Please raise your hand and have the microphone come to you so we can get that in the recording. Thanks. This young lady had a question. (audience member) The LOAD_FAST instruction, it looks like it ended and it was actually 3 bytes. What -- then the last one was 0. So I was just wondering what that signified. (Scott Sanderson) You're asking about -- (Joe Jevnik) It's the 124, like 1, 0 or 0, 0? (audience member) Yeah. (Joe Jevnik) So the 124 is the opcode which says this instruction is a LOAD_FAST and it's going to load a local variable. And when it sees the 124 in the interpreter loop, it will then read the next two bytes as a short, a little-endian integer. 
And it will use that to index into the fast locals, which is an array stored on the frame, and use that to load a local variable out. So it's 1, 0 because it's little endian and the least significant byte is first. (Scott Sanderson) Do we want to get that question on microphone? (audience member) So you're saying you're working on Python 3.5, 3.6, was it? Your interpreted -- interpolated strings, you used a bytes object to return that. But the result we saw was a Unicode string. (Scott and Joe) Yes. (audience member) Does it actually decode? (Scott Sanderson) Yeah, so I sort of glossed over this, actually. You notice that when I decorated that, I called it instead of just decorating directly. So one of the things that that interpolated_strings decorator does is say which -- if I can get back -- there we go. It takes keyword arguments to say which kind of string literals it should apply this to, so by default it will -- basically it lets you use bytes -- like, byte strings as f-strings, rather than as Unicode strings, and you can pass a flag that says "also do this for Unicode strings" or "don't use it for bytes." But it's really just using that as a signifier, and under the hood this is essentially getting rewritten into something like the literal.decode('utf-8').format(**locals()). (Joe Jevnik) So we just use bytes because it's less common to see bytes literals than Unicode literals. So that's more of just a marker that says we're going to do something special with it. (host) Any more questions? (Scott Sanderson) We've got one here. (audience member) How many times did you guys go through this awesome talk and practice your dry run? (Scott Sanderson) We've probably rehearsed it, start to finish, 15, 20 times now. 
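[Editor's note: the rewrite Scott describes can be written out by hand. This is a sketch of the effect, not the library's actual output: the bytes literal is decoded and then formatted against the local namespace.]

```python
# Hand-written equivalent of what @interpolated_strings turns a decorated
# function's bytes literal into, per the explanation above.
def inter(a, b):
    return b"{a} and {b}".decode("utf-8").format(**locals())

print(inter(1, 2))
```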
[applause] (audience member) So I'm just wondering if you're familiar with kind of broader practical applications of this, like using it to write directly to hardware, or, you know, places where eliminating a few lines here and there actually could be useful, and if you know of anybody that's done that type of stuff. (Joe Jevnik) In terms of practical applications, there aren't many because this is truly implementation-specific, and like, no guarantees that any of this will work. I have seen some interesting things with a project called pyrasite, which lets you hook into a running Python process and inject code. And someone had told me they were working on a project, I think it works totally, where they could hook a function at runtime to add break points so that they could attach a break point to a running server and then you'd hit a route and then it would hit their break point. And then when they were done, they could just strip those break points out and there was no trace that they were there. (Scott Sanderson) That's sort of -- we talked about, like, everyone who had an addone suddenly got an addtwo. For something like "I want to inject a debugger in there," it's actually a case where you might truly want to suddenly globally change everyone's behavior. Another example of something that's kind of in the same vein, although much more extreme than this, is there's a project in the numerical community from Continuum Analytics called Numba. And so here what we were doing was taking bytecode and sort of rewriting it into different bytecode. What Numba does is take your bytecode and just throw it in the trash and replace it with LLVM intermediate representation, so that you can -- it takes your Python and tries to transliterate it into a much more low-level, machine-specific language. And it's, like, numpy-aware and does some other fancy things. So that's sort of like the logical extreme of this if you're trying to do it for something like performance. 
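[Editor's note: the "everyone who had an addone suddenly got an addtwo" point is easy to see in a few lines. addone and addtwo are just illustrative names; this is the in-place mutation that makes runtime hooks like the pyrasite breakpoint trick possible.]

```python
def addone(x):
    return x + 1

alias = addone  # imagine a reference held elsewhere, e.g. a registered callback

def addtwo(x):
    return x + 2

# Assigning __code__ mutates the function object in place, so every
# existing reference to it changes behavior at once.
addone.__code__ = addtwo.__code__
print(alias(10))
```

Stripping the hook out again is the same move in reverse: keep the original code object around and assign it back when you are done.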
(host) Other questions? (audience member) So if I was reading the example with a simple function and the LOAD_FAST, it seems to mean that I can't have a function with more than 64k local variables. Is that right? (Joe Jevnik) There's actually a pseudo-instruction called EXTENDED_ARG. It has an opcode followed by a two-byte argument, and it would precede an instruction that also took a short as its argument, and it would merge those together to get a wider argument. So I believe those will be emitted for a LOAD_FAST. I'm not positive. I haven't tried that. I'm sure codetransformer has a test somewhere that does that, but -- (Scott Sanderson) If you want to see some examples of, like, truly horrific code or strange code, look at the codetransformer tests we did. [laughter] (host) Other questions? Well, please help me in thanking these guys for a great talk. [applause]
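[Editor's note: Joe's "I'm not positive" can be checked empirically. The sketch below assumes a wordcode CPython (3.6+), where arguments are one byte and EXTENDED_ARG supplies the higher bytes; the exact encoding differs from the 3.5 two-byte-short scheme discussed in the talk, but the mechanism is the same.]

```python
import dis

# Generate a function with 300 locals: stores and loads of the variables
# numbered above 255 can't fit their index in a one-byte oparg, so the
# compiler has to prefix those instructions with EXTENDED_ARG.
src = (
    "def big():\n"
    + "".join(f"    v{i} = {i}\n" for i in range(300))
    + "    return v299\n"
)
ns = {}
exec(src, ns)
big = ns["big"]

# On wordcode Pythons every instruction is two bytes, so the even offsets
# of co_code are the opcodes; look for EXTENDED_ARG among them.
print(dis.opmap["EXTENDED_ARG"] in big.__code__.co_code[::2])
```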
Info
Channel: PyCon 2016
Id: mxjv9KqzwjI
Length: 41min 51sec (2511 seconds)
Published: Tue May 31 2016