Python Packaging from Init to Deploy

latest best practices, and even if you're up to date on that, hopefully you'll learn some tools and tricks you can use to make the process easier for you. The slides will be available online. I also have some notes with URLs and links with some more details about some of these things. I'm going to skim over a couple of sections; that'll be online. The URL will be at the end too, so you don't have to worry about capturing it right now.

I'm Dave Forgac. I am a developer at American Greetings. I work mainly on a RESTful web API system. Before that I was a Linux systems administrator, and before that I worked as a support engineer at a web hosting company, and in all those jobs I used Python as much as possible. I'd love to hear your feedback and any questions you have, so please contact me. This contact information will also be at the end.

So about six or seven years ago I was working at a company in Delaware and I used a lot of Python tools to do troubleshooting. Some of the tools were installed from the internet using this new tool at the time called pip, but some of them were things that I developed myself or that my co-workers developed, and we found that the ones that we developed ourselves were kind of difficult to manage. We were really just trading scripts around. We had to remember things like: make sure to tell the person to make it executable, to drop it in the path. And the big thing we'd always forget is to tell people to install the Python dependencies. So usually what would end up happening was someone would take the script, try to run it, and then tell you, "This thing's broken, what's your problem?" So we'd have to go help them fix it. It was a pain. So I really wanted to learn how I could make my tools as easy to use as the ones that we installed from the internet.

So I googled "how do I package Python programs". The problem is there was a lot of contradictory advice. Really, there was no clear way to do this. I mean, there were some guides, but if you looked at the next guide it would say
that the other one was out of date and it was wrong. And the biggest source of confusion was which one of these I should use: distutils, setuptools, or distribute. Again, everyone kind of argued, "This is the way to do it, this is the right one," and so really I didn't have a definitive answer.

So what I ended up doing was finding a popular open source project, copying it, and replacing their code with my code and their settings with my settings. For the most part that worked, so I was able to make a package that I could install and I could have other people install. But it wasn't really clear how that worked, and I didn't really know how to go about doing that without doing a ton of research, and, you know, I had a job to do. So for a couple of years I pretty much just ended up copying that repo over and over again when I was doing packaging, but I really didn't know how it worked.

So fast forward to about 2013. At my current job I had the need to make a package that was going to be shared internally at the company, so I decided to research this again, and I found this time that there were answers. The Python community realized that packaging was a problem and they created a working group called the Python Packaging Authority. They're in charge of pretty much all of the packaging toolchain, so you have virtualenv and pip and setuptools and PyPI, all that stuff, and they oversee that now. And one of the things that they created was the Python Packaging User Guide. This was a fork of the old Hitchhiker's Guide to Packaging, but one of the key things they did was on every single page they have a last-reviewed date, so you have a really clear indication that the thing you're looking at is up to date and it's authoritative. So this is great. The rest of this talk is pretty much what I learned updating my packaging using the new Python Packaging User Guide.

So first we're going to go over some packaging definitions. There is a module. A module is basically any Python code saved
in a file. This is importable, and it's generally used by other modules or the main program that starts up. There's an import package, commonly just called a package. This is really just a namespace, so it's used to organize your code, and in reality it's a directory with an __init__.py in it. That's all it is, but it's used to organize code, and generally it's just called a package.

Point of confusion: there's a distribution package, which isn't the same as that import package, but this is also commonly called a package. This is shareable and installable bundled-up Python code, and that's the thing that you get when you download something from PyPI. There are source formats and there are built formats of these distribution packages. The talk is about making those packages.

So then the specific types. There's a source distribution. This contains the Python source code, and if you have any C extensions, those go in the source distribution, and when it gets installed by a user the code is actually run to install it. And then there's a built distribution. The "bdist" doesn't stand for binary distribution; it's built distribution. What that means is it is code that is bundled up in a way that it is just dropped on the system without anything being run. So when you install something from a built distribution it is literally just copied to the right place in the file system.

There's the old built distribution format called the egg, which setuptools started. That really isn't going to be supported in the future, so you should not be building those unless you have a really good reason, and that's one of the things I'm not going to cover. But just in general, if you're developing Python code and want to share it, don't build an egg. What you want to use is this new built distribution format called a wheel, and I'll go into that in a little bit. Actually, right now. So a wheel is the new built distribution format that the Packaging Authority created and is maintaining, and there are three types.
There's a universal wheel, and that is a file that's installable and usable by Python 2 and 3 and only contains pure Python code, so there's nothing that needs to be compiled and the same file can be used pretty much anywhere. There's a pure Python wheel, where the only difference is that it is just Python code but it's 2- or 3-specific. A lot of times if you have a Python 2 program and you want to update it, you use a tool called 2to3, so you build one copy for Python 2, you run 2to3, and then you build another copy for Python 3. So that's a pure Python wheel. And then there are platform wheels. Currently the only supported platforms are Mac OS X and Windows. It turns out all the meta information you need to ensure Linux compatibility is really difficult to manage, and so it's not supported by the current wheel standard. The distutils mailing list has a lot of active discussion about the new wheel format for Linux, but it's not there yet, so it doesn't exist.

So PyPI: this is the Python Package Index. This is a repository of Python packages; anyone can upload to it. And just to be clear, it is not pronounced "pie-pie". PyPy is the just-in-time implementation of Python. So if you're talking about the package index, it's "pie-P-I"; don't say "pie-pie" unless you're talking about the other thing.

So let's look at what makes up a package. These are the components. You can see there's a number of files in a directory structure. This is what makes up a package, and the only part that's actually your code is that sample directory. So the sample is your Python code and the rest is all the meta information that you need to ship a package. This is from the Packaging User Guide sample project.

So the parts of the package are: your code, obviously. You have code that you want to share and that's why you're doing this. The setup.py: this is the most important part. This has all the meta information about your
package, and it also gives you an executable that you can use to complete your packaging tasks, like building the package, uploading it to PyPI, and all those things. That all runs through the setup.py, so it's really important, and we'll look at that in a second.

The setup.cfg is a config file that sits next to setup.py. Right now it's only used for configuration around that wheel thing I was talking about, but in the future they're going to be moving a lot of the things that are stored in setup.py to this config file, so that it's in a config format rather than in the middle of a program.

MANIFEST.in is a file that has a listing of anything in your package that should be included when it's installed that isn't already Python code or something that's explicitly listed as data elsewhere.

And then you want a README.rst. This is used as the index page if your project is on GitHub, and some people use this as the PyPI listing information as well. You can do that by using it in the setup.py, and I'll show you that. Alternatively, you can use a DESCRIPTION.rst, and I show it here because the sample project does this. They use the README.rst for the GitHub listing and the DESCRIPTION.rst for the PyPI listing, and this allows you to separate those things.

So now we're going to look at the setup.py in detail. You see we're importing setup and find_packages from setuptools, and specifically setuptools. I showed you earlier there was distutils, which is in the standard library, setuptools, and then there's a thing called distribute. Distribute was a fork of setuptools, but it has since been merged back, so you want to use setuptools. There are some really specific exceptions, but we're going to ignore those because for your purposes you don't need to worry about them. So use setuptools.

This has really basic information about your package: the name, the version, a description. These things are listed on PyPI when
you look at the package. And then your author information. You want to include licensing information, so you indicate which license it uses, and these things called trove classifiers. That gives PyPI information about where your package can be used. So if it works on Python 2.7 and 3.3 and 3.4, you should list Python 2 and then 2.7, and 3 and 3.3 and 3.4. That lets people know, and be able to search, based on what's compatible with what they're using. There are keywords; those are just used for finding the package in a search.

And then that find_packages comes from setuptools, and what that will do is find any Python code that is in your directory and include it when you bundle it up. But you exclude the docs and the tests, and the reason for that is you don't want those installed in the built distribution. When you do the built distribution, people don't need the docs and the tests on a production environment, but they would be included if you did a source distribution, so if someone was going to be working on your project those would be included.

install_requires allows you to list any dependencies that you have. So if this project uses requests, you can specify requests. You can also get more specific and specify the individual version, or you can say greater than a specific version, but if you just know you need requests in the basic form you can do that.

So the question was, is that limited to the standard library? And no, it's not. If you use setuptools, or, sorry, if you use pip to install the package, it will look at that and will install anything that's in the dependencies first.

And then package_data: it allows you to ship extra data with the package. So if you have a thing that does address validation, you might have a data file that includes a bunch of zip codes. That would be package data. It's not Python code, but it needs to go along with your package in order for the thing to be used.

Is your question what you would use? So, package data... sorry, the question
was, when would you use package_data as opposed to the manifest? Generally the package data is something that the package specifically uses, so it is a mapping. The package name in this example is sample and the package data is package_data.dat, and that way you know that that data is needed by that package. The MANIFEST.in is generally used for things like a license file or a readme file that you want to make sure get included even though they're not really code.

So package_data: the package_data.dat that you see, that is a path relative to the package. And then there's this thing called data_files, which is used a lot less, but that allows you to specify an absolute path, so if you need to put something outside of the package structure you can use that.

And then there is a scripts directive. Scripts really shouldn't be used unless you know what you're doing, so I'd just say ignore it. It allows you to specify a script that is included that is runnable, but what you should really use is this entry_points console_scripts.

entry_points allows you to specify interfaces for your package. Most commonly it's used for this console_scripts, but it can also be used to dynamically find plugins. So some packages can say that they provide this type of plug-in or this type of entry point, and other programs can know to look for things that have that entry point. That's pretty complex, so unless you're doing plug-in development or making your program accept plugins, you don't need to worry about that.

But specifically this console_scripts is really neat, because it allows you to define a script that will be available in the path, and it maps it to a module and method. With this definition in the example we have hello = pyohio2015:say_hello. What this is going to do is create a really small wrapper executable called hello that you can run on the command line, and it will import pyohio2015 and run the say_hello function. And the nice
thing about that is it is platform independent, so it will do the right thing on Linux or Mac or Windows. On Windows it'll make a little batch file; on Linux it'll make an executable that sits in the path.

So that is just the basic package, the absolute minimum. The packaging guide tries to be really technology agnostic for things outside of packaging, but generally you want your mature, or, you know, just really good package to include these other things too.

Really important is to have a license file. Not just specified in the setup.py, but have a license file. People do GitHub searches based on the license, and they'll exclude things that they can't use [unclear]. So just make it really clear what license your code has, because if you don't specify a license at all, even though it's on GitHub, it's technically not open source. So just have a license.

You want to have tests, and even if your project is just a toy and you don't think it's serious, have a really basic test that just imports something and runs something. That's it. I don't care, just have the most basic of tests. And included with the tests you want this tox.ini. tox is a test runner that will run your tests in multiple environments, so this allows you to specify that you want to run your tests in Python 2.6, Python 2.7, Python 3.3, 3.4, or PyPy. That makes it really easy, and I'll show that in a minute too.

And the same thing as with tests: you want to have documentation. Even just the most basic documentation that says, here's this package, here's what it does, here's the most basic use case. Have that, because if you're sharing it on the internet, someone might actually end up using it, and if they're using it they might want to contribute, and if they want to contribute and update your documentation, it's a lot easier for them to do that if a documentation skeleton already exists. But we're going to do that, and we're going
to make it really easy for you.

You want to use continuous integration tools. What these will do is run your tests every time you update the code. If this is an open source project that you have on GitHub, I'd suggest using Travis CI, and to do that you create this .travis.yml file and you specify, just like in the tox.ini, what environments you want the tests run in.

And then requirements.txt: this lists all the dependencies of your package, but this is for developers. The install_requires that's in the setup.py is for installation; requirements.txt lets other developers know what your package needs. I have a link in those notes that I have online about the difference between those two, and it goes into more detail.

And then a .gitignore: this tells Git what files you don't want to have source controlled. Generally that's things like your build directory and compiled things that you want to ignore.

And then there are some other files that you should have. You should have a history or changes or changelog file, so people know what's happened in your project over time. You should have a contributing file, so that someone who wants to help has a really clear set of instructions for how to help. And you should have an authors file to give people credit. People like having credit for things that they've done for you, so do that; it'll make them feel nice.

So let's finally make a package. But not quite: first we need to install some requirements. We want to install wheel, which allows us to create that new wheel format. We want to install twine, which allows us to upload things to PyPI in a secure manner. And we install tox, which is that test runner.

You may have noticed that there is a lot of boilerplate, and there is this whole complex directory structure that we have to create to make a package, and that's really kind of a pain. But we're Python programmers, right? So we automate this. Except we're lazy Python programmers: we'll let someone
else automate this. There is a project called cookiecutter that allows you to take a template repo; it prompts you for input and then generates an output directory with all those templated files. There are a ton of really good cookiecutter template repos for Python packaging. That was the original intent for this project. Since then it's gone on to be used for other things, so people actually use it for other programming languages, even though cookiecutter itself is written in Python. So what you do is you take a template repo and you make the output. I have one for packaging. What I suggest you do is find one that's close to your needs, fork it, update it for what you need, and then use that as the basis for any packaging that you do.

OK, so now we're going to make a virtualenv. If you're not familiar, this gives you an isolated Python environment, so that anything that you install is only locally there. I use a tool called virtualenvwrapper, which helps manage that a little easier, so I'd suggest looking into that. If not, you can use just virtualenv, or pyvenv if you're on Python 3.4. But I'd suggest looking at virtualenvwrapper. They have really good instructions for setting it up, so I'm not going to go into that.

And then we run cookiecutter. The rest of the details aren't really that important. You just need to see that you run cookiecutter and then a repo URL, and it prompts you for all the variables it needs and then generates an output directory based on that.

So now we're going to go into that directory and do a git init so we can get this thing under source control, and we're going to just commit the thing that was output by cookiecutter, so we have an initial commit. You do need Git configured before you do that, so if you haven't done that, that's a separate thing.

Now we're going to go ahead and add our code. This is just the most basic of things. It's going to be a hello world function, and then say_hello, which prints the output of the
hello world function. And you'll see that I'm using the __future__ import. This is so that my code is Python 2 and 3 compatible, and this allows me to build that universal wheel I was talking about earlier, so that a single package can be installed anywhere and I don't have to worry about incompatibilities. I'd suggest, if you're going to be developing something new, try to do this so that your code works on 2 and 3. It's a lot easier to do from the beginning than porting something that already exists, although that is still possible and worth your time.

We're going to add that really basic test that I was talking about. Like I said, I don't care about testing every single thing, but just make sure you have something. This just imports it, runs that hello world function, and makes sure that it is outputting "hello world". So that's ridiculously simple.

So then we use that tox command I was telling you about to run the tests. Now that tox.ini, we didn't have to create that, because that was created for us by cookiecutter. As long as you are using the same versions of Python for your different packages, you never need to change this. You just run tox, and you see in this case on my computer it failed for 2.6 because I don't have the interpreter installed, but it passed for 2.7. It failed for 3.3 because I don't have the interpreter installed, but it passed for 3.4. So it's going to run for any Python that you have installed, and it'll give you the error for the ones that you don't. This is OK though, and I'll show you why in a minute.

The question was, if you're in a virtual environment, does tox go out of it? And the answer is yes. tox actually creates another virtual environment on the fly and runs the tests in that virtual environment.

So now we're going to go ahead and commit that really basic code and tests that we added, and that's it.

So we're going to talk about online services that you should be using to make this all easier for you. Obviously there's
GitHub. You can use Bitbucket, or, you know, if you have an internal thing at your company you can use that, but in this case we're just going to use GitHub. So we create a repo, and since we used cookiecutter we don't need to include the readme or the license or the .gitignore, because cookiecutter already included those.

And then we're going to add this repo to Travis CI. This is going to give us continuous integration from the very beginning. It's really easy: you log in with your GitHub credentials and it just authenticates you. If you've just created a new repo, you need to click the sync button. That was something that got me; it was kind of hidden at the top and I didn't see my repos and I was a little confused. So hit the sync button, then find the repo in question and just switch it from the grey X to the green check, and immediately it will start running tests based on updates to that repo.

So now we're going to push what we have so far to that new repo we created. GitHub gives you nice instructions for pushing an existing repo. We're going to go ahead and push and then go back to Travis CI, and we will see immediately, well, actually within about a minute, the tests kick off. You can watch this all update live, and it'll show you that your tests passed in every environment you have listed in that tox.ini. So even though I'm not testing locally on my laptop with 2.6 or 3.3, because I didn't feel like installing those, I still have the assurance that it's running under 2.6 and 3.3 and PyPy, because Travis is doing that for me. It runs your tests in every environment that is listed in that tox file.

And now we're going to sign up for Read the Docs. So we go to Read the Docs. That's really similar to signing up for Travis: you link it with your repo URL. And really simply, because we have that cookiecutter that already gives us a documentation skeleton, it will generate really nice-looking documentation
for you. So we have a readme file that gets read in and gives you this home page, we have the contributing file that gets read in and gives you the contributing guidelines, and same thing with the credits and the changelog. All of that is done for you if you start with a good template. So you have absolutely no excuse to not have documentation or tests and continuous integration for your open source projects, because it's all done for you.

So now we're going to go to the package index and create an account. It's just a thing in the upper right-hand corner. You create a new account. You can do this from setup.py, but I'd suggest not doing it, because setup.py has a register command but it does not use SSL, so all your credentials and everything would be going over the wire in plain text. It's better to go sign up online and then save your settings. You save the settings in a .pypirc file. You can leave the password blank, and if you do, you'll be prompted for it when you go to do an upload.

So let's go ahead and build our package. On the command line you run setup.py sdist. This makes the source distribution that I talked about and puts it in a dist directory. And you do setup.py bdist_wheel, and that creates that universal wheel I was talking about that will be usable on 2 and 3. If you're into GPG you can sign the files. This is optional. It gives people assurance that you are really the person who authored it, though, so it's recommended, especially for more mature projects.

You register the package on PyPI, and the easiest way to do this is by uploading the PKG-INFO file that the sdist command generates. You upload that and it fills in all the form fields for you; otherwise you have to type everything in. And then use the twine command to upload, and it's really simple: you just do twine upload with everything that's in that dist directory. That'll include the signature files if you've signed the packages. And when you do that, you'll see your package is available on PyPI
immediately. So now anyone can go ahead and do pip install pyohio2015 and they would get your package.

So now that you've done that once, we want to iterate over this. You can install your package in your virtualenv using develop mode (that's supposed to be setup.py; that's a typo) or pip install -e. What that will do is create a symlink to your source files in the installed packages directory, and this allows you to import the thing, test it, go back and make changes, and not have to reinstall it every time. So you can update your code in the virtualenv and see the results immediately, just after a reload.

So we're going to just make changes to our code. Every time you make changes you want to increment the version in those two files: in setup.py and in your __init__ you should have a version tag. Then we want to commit to the repo, tag the repo with this new version, and push the tags to GitHub. That'll go ahead and kick off a new Travis build, and then you build, sign, and upload again. So you do the same thing: you use sdist, bdist_wheel, and twine to upload.

But that last part, where we had to update the version in two places and then tag, that was a pain. So I use a tool called Versioneer that allows you to dynamically manage the version based on the Git tag. When you're in a development environment, it'll read the current tag that you're at from Git, and if you're not on a tag it'll add the Git hash to it. And then when you build the thing, it outputs the Git tag to a file that gets read. So you don't need to manage the version anymore: it is stored in the distributions, but when you're coding in your development environment it uses the Git tags.

So just a couple of caveats on the talk. I didn't talk about everything; I glossed over a lot in order to get it into this time slot, but there are links for some of it at that URL that I'll show you in a second. Specifically, I didn't talk about any sort of binary
extensions. That's something that's kind of common, to have C extensions. That makes it a little more complex, but that is covered in the user guide. And I can't think of anything else. I think I had something else, but that's OK. So like I said, the talk notes and some links are there, and that's my contact information. I'd love to get your feedback. So thank you.

So we're all advocates of open source, but hypothetically, if we have a closed source project, what does and does not work as far as, like, Travis CI [unclear]?

So at my company, the thing that I did this for was closed source; it's something that's only used internally. We use Jenkins instead of Travis CI to do the continuous integration, but you can definitely google and find information for building Python packages from Jenkins. And we use Atlassian Stash instead of GitHub, but they're really analogous. So it's the same toolchain; you just kind of pull out pieces and put in private versions of the same thing.

Something else I didn't mention, actually: there is a project called devpi, which allows you to run a PyPI internally. It can do a couple of things. It can either mirror the real PyPI, so you have just local cached copies of the packages that are out there, or you can have it hold a repository of just your own stuff. So if you have internal-only packages, you can create a devpi instance and put them there, and then you can use pip install and just base it on that URL instead of the PyPI URL.

Follow-up to that: so then you can still use open source packages within there, so you can have your internal requirements go out and get...?

Yeah, you can do it a couple of ways. devpi actually allows you to do a hierarchy thing, so you can have it first search your internal index and then go to the internet if you want, or you can just give it a list of packages that are allowed to be in the index and do it that way.

One more, just along those same lines: internally we use RPM to distribute
our packages on Linux. Is there a way to tie this process into the RPM distribution?

Yes. So I'm actually working on making RPMs for Python things. The tools aren't all that automated right now. There are tools to help you with it, but there's still a lot of manual stuff you need to do. I'm working on that, like, right now.

So there's a tool called fpm, and it abstracts away all the package creation for you. You can create debs, RPMs, or, you know, anything, just by plugging in directories. It can just take arbitrary directories and create packages out of them. That tool is fpm.

Cool, thanks.

I was just going to mention a tool I had heard of that does Debian packages. I think it's dh-virtualenv.

Yeah, I remember I've seen that, and there's a really good article on that toolchain for doing Python packaging and then getting it [unclear]. Unfortunately we use Red Hat and CentOS, so that didn't work for me, but I'd like to do the same thing, but
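
The setup.py walkthrough in the talk is only described verbally in the captions, so here is a minimal sketch of the keyword arguments it mentions. The package name pyohio2015 and the hello = pyohio2015:say_hello entry point come from the talk; the version, description, author, license, keywords, and data file name are placeholders assumed for illustration. The talk builds the packages list with find_packages from setuptools; a literal list is used here so the sketch runs without setuptools installed.

```python
# Sketch of the arguments a setup.py like the one in the talk might pass to
# setuptools.setup(). Values marked "placeholder" are assumptions.
SETUP_KWARGS = dict(
    name="pyohio2015",
    version="0.1.0",                  # placeholder
    description="Example package",    # placeholder
    author="Your Name",               # placeholder
    license="MIT",                    # placeholder
    # Trove classifiers tell PyPI which Python versions are supported.
    classifiers=[
        "Programming Language :: Python :: 2",
        "Programming Language :: Python :: 2.7",
        "Programming Language :: Python :: 3",
        "Programming Language :: Python :: 3.3",
        "Programming Language :: Python :: 3.4",
    ],
    keywords="sample packaging",      # placeholder
    # The talk uses find_packages(exclude=...) to skip docs and tests;
    # a literal list stands in for it here.
    packages=["pyohio2015"],
    # Dependencies that pip installs before the package itself.
    install_requires=["requests"],
    # Non-Python files shipped inside the package (hypothetical file name).
    package_data={"pyohio2015": ["package_data.dat"]},
    # Creates a "hello" command that calls pyohio2015.say_hello().
    entry_points={"console_scripts": ["hello=pyohio2015:say_hello"]},
)

# In a real setup.py these would be handed to setuptools:
#     from setuptools import setup
#     setup(**SETUP_KWARGS)
print(sorted(SETUP_KWARGS))
```

Keeping the metadata in a plain dict is only a convenience for this sketch; a real setup.py would normally call setup() with the keyword arguments directly.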
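
The example module and the "most basic of tests" from the talk are not visible in the captions. A minimal sketch of what they might look like follows; the function name hello_world and the exact greeting string are assumptions, while say_hello and the __future__ import are mentioned in the talk.

```python
# The __future__ import keeps print() behaving the same on Python 2 and 3,
# which is what allows a single universal wheel to be built.
from __future__ import print_function


def hello_world():
    """Return the greeting (the exact string is an assumption)."""
    return "Hello, world!"


def say_hello():
    """Print the greeting; this is the function the console script calls."""
    print(hello_world())


def test_hello_world():
    """The simplest possible test: call the function and check the result."""
    assert hello_world() == "Hello, world!"


if __name__ == "__main__":
    test_hello_world()
    say_hello()
```

In a real project the test would live in its own file under a tests directory so that tox can discover and run it in each environment.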
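
The talk says a console_scripts entry such as hello = pyohio2015:say_hello becomes a small wrapper that imports the module and runs the function. The sketch below illustrates that mapping only; resolve_console_script is a hypothetical helper written for this note, not the code setuptools actually generates, and it is demonstrated with a standard-library target so it runs anywhere.

```python
import importlib


def resolve_console_script(spec):
    """Split a 'name = module:function' spec and return (name, callable)."""
    name, target = (part.strip() for part in spec.split("=", 1))
    module_name, func_name = target.split(":", 1)
    module = importlib.import_module(module_name)  # import the named module
    return name, getattr(module, func_name)        # look up the function


# Stand-in spec using the standard library instead of pyohio2015.
name, func = resolve_console_script("dumps = json:dumps")
print(name, func([1, 2]))
```

With the package from the talk installed, the same helper applied to "hello = pyohio2015:say_hello" would return the say_hello function, which is what the generated hello command ends up calling.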
Info
Channel: Next Day Video
Views: 123,649
Keywords: pyohio, pyohio_2015, talk, DaveForgac
Id: 4fzAMdLKC5k
Length: 36min 9sec (2169 seconds)
Published: Sat Aug 08 2015