Structuring Your First Python Project

Can I just get a sense of the room: how many people here consider themselves fairly new to Python? Okay, good, that's a good size. All right, so the way this talk is designed, I'm basically going to dump a lot of information on you. Don't worry if you miss something or feel like you need more info on it, because I'm going to have these slides posted, and at the end of the talk there are going to be a whole bunch of places you can go to read more about every topic I talk about here.

All right, so let's get started. First I'm going to go over the directory structure we have. Then I'm going to go over isolating your environment, which is separating your project's dependencies from other projects' dependencies. That doesn't always seem important, but you'll find that the more projects you do, the more you wish you had done it earlier. Next I'm going to talk about installing dependencies. Python is considered "batteries included", but the battery store is quite nice, so I'm going to teach you how to connect to that and pull dependencies from there. Next we'll talk about packaging your project: after you've created something and you want to actually share it with someone, how do you package it in a way that can be installed by others? And finally, releasing to the world, which is uploading your project to that battery store, otherwise known as PyPI, so that other people can pull it down.

So first let's talk a little bit about our project. This is a real project; the code is online, with references at the end. All it really does is call the forecast.io API: you give it a couple of coordinates and it spits out the current weather. It relies on some environment variables for your dev key. But it's a real project: it's got tests, it's got a flow, it's got everything you need, so it's a real-world project you can reference later.

This is the overall structure of the project. We're going to go over every file here, and I'll try to make it interesting as we go along. First thing is the code. Everyone wants to talk
about code first; everyone's really excited about code. I'm not going to say too much about the actual code itself, but one thing I recommend for new projects is having some sort of style checker. What we can see here is me running a tool called flake8 over my code, and it spits out what it sees as style violations in that code. In this case I've got one line in the project that's too long. Most of these recommendations come from a Python Enhancement Process... is it "process"? "Proposal", thank you... called PEP 8. What I will say is that when you pick a style tool and apply it, most of them are going to rely on PEP 8, which you can Google and read about. The biggest thing to remember is that none of those rules are set in stone. Most of them are good ideas; some of them you might want to tweak. Line length is one that almost everyone recommends you bump up a little bit. But the key takeaway is: have a style-checking tool, configure it to your liking, and then stick with that style going forward. Any new contributions need to follow that style. It'll just make for a better project in the end.

The next thing we'll talk about is actually the most important part: the tests. If you have a project that you're sharing out with the world, tests are the only thing that will save you. If you're writing a project by yourself and it gets big, tests are the only thing that's going to save you. So listen to Ned's talk next and, if you're not already writing tests (Ned's giving a talk, yeah), start writing tests. Just to show you what that looks like: you've got the tests directory here, and I have all my tests in one file because this is a small project. This is me running a tool called pytest. It goes through that directory, finds any tests in there, runs them all, and gives you a nice little output. Don't let anyone else merge code into your project without running the tests. Don't release code unless all your tests pass, or at least you understand why they're failing.

Another pretty
important one I'm going to talk about quickly here is the license, up top. A license is essentially what other people can do with your code. Usually you'll see out there either the Apache license, an MIT license, or a GNU license. I'm not going to go into detail about what those are; there's a reference at the end you can use to learn about the different licenses and pick one for your project. The takeaway for new people: pick a license, pick one that works for you, and have it ready before you release anything.

Next let's talk about documentation. For any project that you plan to use for the medium to long term, documentation is critical. For any project you plan to share with other people, documentation is critical. People will walk away from your project if there's no way to understand what it does. At the bare minimum, your project should have a README. In fact, the project up here only has a README. There's a directory labeled docs; that's mostly there just so you know where to put them. If you're going to have a project that's fairly large and needs more involved documentation, you put docs in the docs directory, and I recommend reading about a tool called Sphinx (reference at the end). I'm not going to go into detail about that here.

The big thing for any new project is your README. It needs three things: a description of the project, how users can install the project, and then how to basically use the project. People should be able to pull up your page on GitHub or Bitbucket or GitLab or wherever you host your project, and they should immediately be able to say, "Okay, I know what this thing does, let me try it," and try out a couple of things pretty quickly, straight from reading the README.

One takeaway for Python projects: if you've done projects on GitHub before, you've probably written READMEs using Markdown. If you're doing a Python project, I recommend against this. For Python projects you should use reStructuredText, and the reason for that is that when
you later go to release your project on PyPI, which I'll talk about a little later, they don't render Markdown, and your README is going to look very weird when you get onto the PyPI page for it. But both GitHub and PyPI can render reStructuredText, so read a little about that. It's fairly similar to Markdown in that it's just little things you put around text to render it correctly. It's fairly easy to use for simple things.

Next I'm going to talk about dependencies and isolating your project. Almost any significantly sized project is going to have some external dependency. In fact, if you're doing any project that involves going out to the internet, you're going to have requests as your dependency; I highly recommend it. So we'll go into this in some detail here.

As I said earlier, isolating your project is a way of creating a Python interpreter for your project that is separate from any other projects on your system. The dependencies for project X and the dependencies for project Y each have their own little black box, which has everything you need to run that project. The reason you do this is that if project X requires one version of a package and project Y requires an older version of that package, you don't want that conflict in there. You're going to see weird problems and spend a long time trying to figure out what's going on.

So when I start a project, the first thing I do is make sure virtualenv is installed. In Python 3 it's built in, but for Python 2 you should probably make sure it's installed, so just do pip install virtualenv. Then you create an isolated environment, and that command is very simple: it's virtualenv and then a name for that virtual environment. I use venv; I think that's fairly standard. It doesn't really matter, because that's not going to go into source control (you keep it out of there), so you can call it whatever you want. Then,
once you have this environment, you run the third command down there: source, then the name of your virtual environment, then /bin/activate. What this does is tell your shell that when you type python, it should use your magic black box for that Python, and when you install things, it should install them into that little black box. It's a way of saying, "I'm activating my virtual environment because I'm working on this project; anything I do should go through that environment."

I'm going to go over some commands just for installing dependencies. They use a command called pip, which stands for "pip installs packages", and they're all very basic. Installing and uninstalling are just pip install and the package name, and pip uninstall and the package name. When you need to upgrade, it's pip install -U; the -U is for upgrade. If you want to see what you've already installed, pip freeze is your friend: it'll list all your dependencies and what versions are currently installed. And then finally, if you want to install packages from a file where you've listed all your requirements, you do pip install -r and the file, and it'll go through that file line by line, installing each one of those packages. The format for that file is identical to what you see in pip freeze. Don't worry too much about that.

These are the requirements for my little test project. Like I said, for any project that involves going out to the internet, you're going to want to install requests. Don't use urllib2, don't use urllib... oh my god, don't. Just stick with requests unless you know you want to do something else. In fact, I think the official Python docs tell you to install requests at this point.

You've got a question? I don't want to go too far down that road, because I don't know a ton about it. But if I remember right, and correct me, you're talking about Anaconda, right? I think Anaconda is basically just a distribution package: it's a bunch of things put together for scientific work, right? I think that's all it is. Am I wrong about it being just a virtual environment? Okay, clearly
I should stay out of that one. Conda is the tool... conda is, like, Anaconda's own tool? Okay, cool. I'm going to move away from it, though, because honestly I know vaguely what it is but I clearly haven't used it much.

The short answer from us up here who don't know is that Anaconda is a scientific distribution of Python that includes lots of add-ons that you're going to want if you're doing scientific Python, and it includes a package manager called conda, and we don't know how it works.

All right, awesome. You can still use pip if you're using Anaconda. So, back to my thing: the requirements. For this project I separated my main requirements and my test requirements, because you as a user only need to install requests, but if you're a developer you're going to want to run everything supporting the package. I'm just going to quickly go over these. Coverage is Ned's package, and it basically measures how much of your code has been covered by automated tests. flake8 is my style checker; that's the thing that goes through your code and makes sure it follows a consistent style. mock is a library that allows you to basically fake parts of your code; I think Ned's talk goes over that if we have time. pyflakes is a dependency of flake8, don't worry about it. And pytest is my test runner; it's the thing that finds all the tests and runs them.

Now for the fun part. We have a project, the code's got reasonable style (one line too long), and we have a set of tests; they run, they pass. Let's package this up and share it with people. All you really need to do to package your project is create a little file called setup.py. This is the entire, or almost the entire (except for imports), setup.py for this project. As you can see, it's just calling a function called setup and providing arguments that are essentially metadata for your project: things like the name, the version, who wrote it, what license, and also your requirements go in here. You'll notice this is kind of duplicating what's going on in requirements.txt.
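The slide itself is not in the captions, so the following is only a rough sketch of the kind of metadata a small setup.py passes to setup, under the assumption that setuptools is used. The project name, package name, author, and license values are hypothetical placeholders, not the talk's actual project metadata. The arguments are built in a plain function so the sketch runs without setuptools installed.

```python
# Minimal sketch of the keyword arguments a small setup.py hands to
# setuptools.setup(). Every value is a placeholder chosen for illustration,
# not the metadata of the project shown in the talk.


def build_setup_kwargs():
    """Return the metadata dictionary for a hypothetical weather tool."""
    return {
        "name": "example-weather",          # hypothetical project name
        "version": "0.1.0",
        "author": "Your Name",
        "license": "MIT",                   # assumption: whichever license you picked
        "description": "Print the current weather for a pair of coordinates.",
        "packages": ["example_weather"],    # hypothetical package directory
        # Repeats the single runtime dependency listed in requirements.txt.
        "install_requires": ["requests"],
    }


# A real setup.py would end with these two lines:
#     from setuptools import setup
#     setup(**build_setup_kwargs())
```

Keeping the metadata in one dictionary is only a convenience for this sketch; most small projects pass the same keywords directly in the setup call.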
There are different opinions on how you want to handle it. Since I only have one dependency, I went ahead and just duplicated it here, but as your dependencies grow, you'll want to look into ways of reusing your requirements.txt here. It's easily done, but for a small project it's not a big deal just to repeat it.

The only other thing that's a little interesting is down here: entry points. The entry point is a bit of code that basically says, "When installing this package, make sure there's an executable, and when that executable is called, call this little bit of code," which here is the run method for my program. So remember at the beginning, when I ran forecast.io with those parameters? It's using the entry point in order to do that.

Yep? Do I... I don't think so. You're asking about the packages that Matt is authoring himself, right?

That's what I was saying earlier. In this project they're in both places. In other projects I've done, I've had the file that has this call read that requirements.txt file and load them in. But for the sake of fitting it on a slide and keeping it simple, I've duplicated it. You don't necessarily need to do so.

The setup.py is used when someone installs your package. The requirements.txt is generally used by other developers who want to set up a working environment to work on the package. So they're used for two slightly different things, but, like Matt says, there are ways that you can share them or make use of just one. The simplest thing is the way Matt's got it laid out here.

Yep? Yes. So pip pulls down my project, figures out the requirements, installs those, and then installs this project. Dependency resolution is a whole field of study, so I won't go too much into that.

So now I have defined my setup.py and, as we sort of talked about, it gives me two things. One, I can now, from the directory where that setup.py is, run pip install . (pip install period) to actually install my package into whatever environment I have activated or, if
none is activated, it's installed in my global environment. So right now, at a bare minimum, we can put that thing on a floppy disk, hand it to our friend, and they can install it themselves. I think that's an awesome thing, I love that idea: just pip install period, and then you can run your command. You have a working project. But given that this is the internet, we tend to want to share more digitally.

Before I talk about that, I want to mention one other thing that I don't think a lot of people know about (at least I didn't know about it when I started this talk), called editable installs. Now that you have a setup.py, not only can you install the package, which normally copies the files and puts them into your environment, you can do an editable install, which instead of copying those files links them into your environment. What that distinction means is that when you edit your source files, wherever you have it installed is also updated. This is typically used for debugging or developing new features: you make a change in the code and it's already reflected wherever you have it installed. It's very good for quick edit-debug sessions, so if you're playing around, read a little bit about editable installs.

Now that we've got that aside, let's talk about releasing a project to the world. So this is PyPI. ("PyPy", sorry, is a separate thing.) P-Y-P-I: this is where you release projects so they can be shared. When you run a pip install package command, this is where it's going to find those packages. It's actually comically easy to release your own packages on here; it's actually kind of impressive.

The first thing you do is register for an account. After that (this is the stuff I hate, by the way) you create a .pypirc in your home directory, and you put your credentials in there. I don't love having my credentials in plain text, so I often remake this file basically whenever
I actually deploy. But you put this file in your home directory and you put your credentials in there like that. (That's not a real password, obviously.) Once you've done that, you run your setup.py: python setup.py (this is the file where I have that setup call we looked at earlier), then register, and then whatever you want to call your project. Once you've done that, it'll upload a version of your project to PyPI. If it doesn't already exist, it will create it and register you as the owner of that project. Now you have a package out there that anyone can use, and you are considered the owner. So now you have a package on PyPI. It's awesome, all done, right? This is the real PyPI page, deployed a month ago, for the project we saw earlier. You can go there right now and you'll see it if you Google for it; I don't have the URL up there.

"I think you left out upload." When you register, it actually uploads the version. "Oh, really?" Yeah, I found that out with that project. Cool. But don't worry, we'll get to uploading new versions.

All right. Now that we have our thing on PyPI, anyone in this room could in theory (I say in theory) run this command and install the project. It does require a little setup: you need the environment variables set up with your dev key. But it's out there, done.

Now you have your project up; say you want to make changes. You make bug fixes and need to update your project. All that requires is updating the version in setup.py, because this is what tells PyPI that this is a new version and that there's no conflict. Then you run python setup.py sdist upload. That just says: upload the new package to PyPI and share it with everybody. The sdist stands for source distribution, thank you. And if you're security conscious, you can sign your releases. I often do for an open source project I run; I don't know how common that is, so I won't go into detail.

And that's basically it. We've gone from creating a project
from scratch all the way to releasing it to the world. These next few slides are not really for human consumption; they're references. I'm going to post these slides on the Meetup page, so you can go back to any section where you thought I went too fast and read more about it. And that is all I've got, so let's do some questions. You must have some questions.

Yeah? Pylint. Sorry, I'll tell a little aside: I work on a package called diff-cover. Pylint is a great, great tool, but it's given me a lot of headaches, mostly because it's very advanced. I think flake8 is great for smaller projects; for larger projects, definitely look at Pylint. I've seen that Pylint gives you a lot of things you may want to ignore. It really comes down to how much you want to configure it. They're both fine, but I'd use flake8 for small stuff.

Pylint can definitely be really, really obsessive and annoyingly complainy about your code. There's no way you can just run it out of the box and be happy with the results; there are things you're going to have to turn off. They're trying now to make it more sane out of the box, but for now, the first time you run it you'll be like, "Oh my god, why did I do this? What is this thing that's complaining to me about all these things that are fine?" Missing module docstring... well, you should have a module docstring, but okay.

Yep? Can we talk about the dunder init files? Yeah, sure. That's a good point, I actually missed them. If you look back at our project structure, you'll see these little __init__.py files. They're basically a way of telling the interpreter that this is the beginning of a package. There doesn't need to be anything in there. There are things you can put in there if you want; some people use it to set up the path, for example. For this project I believe they're all blank, and it's just saying "this is a package".

Right. And since we're talking about both the structure of the code and things like setup.py, the word "package" here has two different meanings.
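As a minimal sketch of the import-package sense, the snippet below builds a throwaway directory with an empty __init__.py and one module, then imports it. The names demo_forecast, app, and run are hypothetical stand-ins, not the talk's project; the point is only that the directory becomes importable and that the package object is backed by the __init__.py file.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_demo_package(root):
    """Create root/demo_forecast/ with an empty __init__.py and one module."""
    package_dir = Path(root) / "demo_forecast"   # hypothetical package name
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text("")  # empty, zero bytes
    (package_dir / "app.py").write_text("def run():\n    return 'sunny'\n")
    return package_dir


# Build the package in a temporary directory and make that directory importable.
workdir = tempfile.mkdtemp()
make_demo_package(workdir)
sys.path.insert(0, workdir)
importlib.invalidate_caches()

# Importing the package runs __init__.py; importing the submodule runs app.py.
package = importlib.import_module("demo_forecast")
app = importlib.import_module("demo_forecast.app")

print(Path(package.__file__).name)
print(app.run())
```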
The way Matt just used it, a package is a directory that can be imported. So the fact that there is a dunder init py (__init__.py) under display forecast io means that in the Python interpreter you could say import display forecast io. If that __init__.py were not there, when Python found the directory and it did not have an __init__.py, it would not allow you to import it. That's a common problem. People say, "I set up my tree and I can't import it. Why not?" Well, do you have an __init__.py? That's probably the first reason.

The other use of the word "package" is, for instance, in PyPI, the Python Package Index. That's really an index of kits that you can download and install. This is one package up on PyPI; it happens to have two packages, display forecast io and tests. So the two different meanings of the word "package" are a little unfortunate when you're thinking about how to structure your code and then how to make it deployable, and you have to keep straight which sense of the word you're dealing with.

Anyone else? Yep? Yeah, so I think it comes down to... it's an advanced usage, and I think it comes down to how comfortable you are with these things. Ultimately, if you're making a library for other people to use, you want to make the imports not annoying. from datetime import datetime is terrible, but we're used to it, so that's fine. I would say if you're doing your first project, don't worry too much about it, but it's definitely nice to have. I prefer the second method overall. The way requests does it is fantastic; definitely look at that code and see what they do.

As projects get larger, having code in the __init__.py... so, the __init__.py can be completely empty, literally zero bytes; I believe that's what Matt's are. If you have code in there, it can cause problems as your project gets larger, because, for instance, when you import display forecast io dot app, it's going to execute display forecast io slash dunder
init py (__init__.py), and then it's going to execute app.py, and that can cause circularity problems if you're importing lots of things from lots of places into your __init__.py. But it does make for a nicer import to be able to import things from display forecast io rather than from display forecast io dot app. So it's kind of a trade-off. The important thing is to be aware of the mechanisms that are happening there and then decide which way you want to go, and you'll have enough information to debug a problem if it comes up later.

And just to add to that: think about your users versus internal use. Minimize what you put in there. If you're making a public API, then think about it; for internal use, you almost always don't care.

Yep? Yeah, so, I'll be honest, most of the time when you see setup.py it's for exactly what I've described above. I've seen other uses, including built-in test commands (these are tests that run when you install your package), and you can have the docs hooked into setup.py, saying where the docs are. You can do a whole lot with setup.py. But, and Ned, correct me if I'm wrong about this, I think I've seen more and more people getting away from that: relying on tox, which is a test runner, for all your test-running needs, and having your documentation be separate. These giant setup.py files that you see occasionally, I'm seeing less and less of those going forward, mostly because setup.py is a very deep thing that doesn't change very often, and the world we live in changes a lot. So I prefer having separate things for all of that and just leaving setup.py for metadata about the project.

Can we take one break? All of you who are standing up over there: there are a few empty seats over here, and we can move these bags, and you can sit on this counter too.

Yeah? It's an opinion question: what do you think about RPM versus making a pip package? Which one is better, easier to maintain? Yeah, so I'm working on a package and
I'm having this dilemma about which way to go: RPM versus...

So I like just making my virtual environments wherever they are and keeping it simple in that sense. I've heard good things about things like RPM, but I don't think... is RPM in Python? "I'm saying RPM, an OS-level package." Oh, gee. I have not done anything with OS-level packages. I stick with Python packages, almost always. Unless you have lots of C stuff that needs to be compiled; then you might want to look into the OS packages. But I couldn't go into too much detail.

The disadvantage of an OS package manager is that it only knows about one Python. With virtualenvs, you can have two dozen virtual environments set up on your machine for whatever reason. For instance, on my machine where I maintain coverage.py, I think I have 20 different versions of Python installed, and then I've got virtual environments for the dozen projects at work and for the three different side projects. When people need help with a bug on coverage.py, I spin up a brand-new virtualenv just to run their repro case, because I have no idea what they're going to need. Being able to make all those different virtual environments and keep them all separate and not worry about it is really great and indispensable.

So is it cross-platform, so that people don't need to worry about what operating system they're using? No, you don't want to share virtual environments, right? You share the requirements.txt file and then you rebuild the virtual environment.

It's more like: I want to package my project and share it with the world, so which way should I go, the RPM way versus the pip package way? I'm not talking about testing or developing, but about people who really want to use it.

So almost everyone uses pip to install Python packages. There are some that are OS level, but almost always it's pip. And if you don't have any C extensions, it works very simply. I think the C
extensions complicate things a little bit, because you've got to compile based on architecture. But setup.py will compile your C extensions too. There's a syntax in there where you can say, "These are my Python packages, these are my C extensions," and it'll build them for you.

The good news is that anyone who has RPM has Python installed already. So you give them a Python project, and if they know anything about Python, they'll be able to run it. If they don't know anything about Python, then an RPM or a deb might be a simpler way to get them the package, but then you're tied to that operating system.

I think the dilemma is more about the requirements and the exact versions that are required for a project. Stick with Python packages; that's the easy answer, right? With the OS-level package manager, you'll have to worry about whether the Python things you need have already been packaged by the operating system, so that the operating system dependencies can be resolved. And by the way, usually when you bring this up, it's sort of like Emacs versus Vim: people will have an opinion about which is obviously the right way to do it, and you'll end up in a fight.

Honestly, we had a big fight today about that. Yeah, that's why I just threw this question out there, to see which side people take.

But we're Python people here, so we all agree. Hey.

So basically you just have, more or less, the folder with the __init__.py in there, and then, I think, in the setup.py... I don't know if I have this one set up that way. In here I list the packages by hand, so you might need to list it here. Another thing you can do is... I forget what the function call is... find_packages, so it'll actually go through your code and find all the packages for you. I didn't do that in this case because I only have one package and I wanted to show the full syntax. But yeah, just add another directory with an __init__.py, and make sure it's in the setup.py or that you're using the package discovery mechanism. And you
said you had a question back there, in the green? Yeah. Oh, from the repo directory? Yeah, so there are two ways. Like I said earlier, once you have a setup.py, you can just install from the directory where that code is. So literally, you email me the folder with the code, I can do pip install in that directory, and it'll install fine, usually. The other thing is, if you have it up in a repo on GitHub or Bitbucket, there is a syntax (I don't know it off the top of my head, though I've done it fairly often) that will actually clone the repo down and run the setup.py, and it'll work fine as well. So both of those are options.

pip will understand Git and Mercurial and SVN URLs, including a tag to retrieve from those repos, and it will install from those if you need to. It's a super handy way of running forks of other people's projects.

Yep? As soon as I have one flow, if I've got a flow someone finds useful, I release it. Although, speaking personally, I've released before I have anything. I mean, not on PyPI, but on GitHub, whatever; just get it out there. But as far as releasing to the world: if there's one feature someone might find kind of cool, go ahead.

So I'll tell you, one thing that people often feel about their projects is that they're not ready to be released, or, you know, "It works well, but I have to clean up the code, so I'm going to do that later." I have felt that, and I have talked to people who have felt that, and I have never once heard a person say to me, "I released it too early and someone yelled at me about the way my code looked." It just doesn't happen. So there's really not much reason not to release it. You can't reuse version numbers on PyPI, so if you release a 1.0 and then you decide you want to change something, you have to call it something besides 1.0. But other than that, there's no cost to releasing it, and you might just find that someone's really into it, which will give you the energy and enthusiasm to work on it, or they might want to help with it, or, oh
my, they might write annoying bug reports for you or something. You know, you interact with the community. Yeah, I will say about releasing packages: you'll find that sometimes you'll get bug reports where you're really just like, "No, I don't need this right now." But that's a whole separate talk right there.

Yep? So, this is mildly embarrassing: I'm unfamiliar with that PEP. Any documentation I've done has been through either having a README file or using Sphinx and just having it pull in docstrings. Ned, do you have anything on that?

Yeah, so PEP 257 is the PEP that standardizes the conventions for docstrings. I didn't know that it said you should do that. That sounds crazy. Don't write the same thing twice; there are tools that can help with that, right?

Yeah, I wouldn't do that. I've never done that, and I wouldn't recommend anyone do that.

Yeah, so you mentioned pydoc, which will do that for you; it's a thing you can run at the command line to automatically show you docstrings. Sphinx is the powerful documentation tool. It was actually written to create the python.org Python documentation, and it's now the standard for all Python docs. It has simple directives that let you say, "Pull in all the docstrings from this class here and just put them in the docs," so there's really no need to duplicate information.

So with all PEPs: consider them, but don't follow them like they're law. You'll find a lot of people are frustrated by people who haven't quite figured that out yet and will complain about PEP violations. They're guidelines. They tend to be good ideas, but if it doesn't make sense to you, don't do it.

Question? I mean, honestly, check when they run the code to see whether it's there. If you want to do it somewhere in pip... I've never done this, this is just me talking off the cuff. One thing you could do is in the setup.py, because that file is actually run, so you might be able to get away with printing out something like: make sure you have
these defined or I'll be mad, you know. But usually just validate when you run the code.

Yes, looks like we've got one in the back. Let's do it. Yep, there's one of the tags, I don't have it in this setup.py, obviously, but there's literally a section for tags, and you can say Python, then you say it supports Python 2, and you put all the versions down in there. If you're writing a new project, just a little PSA: give 3 a shot. It's pretty good. Big fan. If you're writing a library that you expect a lot of other people to use, 2.7 is still the most popular, and there are a lot of people using 2.6, which surprises me. I never understood that, but hey. Oh yeah, you just specify those tags in the setup.py, right?

So the tags are called classifiers.

Thank you.

And they're used for Python version, and for classifying the project: oh, this is a developer tool, this is for art, this is for music. But it's important to know that those classifiers are not used by setup.py to prevent you from installing it in a version of Python that it will not support. And it's a tricky thing. I've got an open issue now against coverage.py, which no longer supports Python 3.2, and they want me to put something in there to make it very clear, when you install it, that it will not work. And I'm torn, because I don't want to add extra cruft to my setup.py to check this and check that and check that to see if it's going to work. And how many 3.2 users are there anyway? It's only one guy, so maybe I can just close it as "won't fix". No, but yeah, there's no built-in way to put teeth into a setup.py to ensure that it's being installed on a version of Python that it'll support.

All right, question? Can you not put Python in your install_requires and specify a constraint?

It's not a pip-installable package, so unless it's treating it specially... it'd be cool if they did that as a special case, but I've never heard of that happening.
Doesn't mean it doesn't happen, it's just what I know. Yeah, so it still requires Python packages specifically.

But yeah, by the way, Matt just mentioned that setup.py is a Python file that is actually executed. The one he showed is very declarative, but it is one function call with lots of arguments. You can put arbitrary code in your setup.py to solve problems like this, but you should also be aware that bad guys could put code into a setup.py and then get you to install that package. And that is the reason you should never run pip with sudo, because when you say sudo pip install blah, whoever uploaded blah is now running code on your computer as root, and you should think very, very carefully before you do that.

What I know about that: the newest version of pip actually has some nice security features, if you're worried, so check them out. It lets you make sure that, when you install a package, it matches a specific hash that you have specified. I haven't looked too much into the details about it, but it's really cool from what I've seen, so check that out.

I saw more hands. Oh man, I have mixed feelings about what I'm about to say, but I maintain a package called diff-cover. One thing that diff-cover does is it applies these tools to changes in the code. So often, if you're in an organization that's been running Python for a while, using a tool like flake8 is very demoralizing, because you run it and it's just pages and pages of stuff, and you're never going to fix it all, really. So what you really need to do is work on changes going forward. So say, "Hey, let's try to set up a standard so that any new code that comes in, we apply this code style to it." That way, as the code grows and expands and things get edited, the code gets better as you go, because you're not going to win any friends by saying, "Let's go through the entire code base." diff-cover is a tool that... yeah, so diff-cover is a tool that
helps you with that. But another way you can do it, honestly, is just keep count, and make sure whenever new code goes in that the count is going down rather than going up, just as you go.

More hands? Documentation? All right, I haven't heard this stuff in a while, but it's definitely good stuff. You mentioned Read the Docs. If you're online reading about Python stuff, look up Read the Docs. It's a fantastic project for hosting people's documentation. Help them out if you can; otherwise just use it. Upload your docs there, create a community of documentation. Great.

The great thing, a little-known thing about Read the Docs, is that they own the domain readthedocs.org, but they also have the domain rtfd.org, if I'm right.

All right, anyone else? Any other questions about structuring projects? All right. All right, thank you all. Okay, thank you.
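In the Q&A above, the speakers point out that classifiers are only metadata and that setup.py is an ordinary Python file that gets executed at install time. A minimal sketch of the kind of guard they describe might look like the following. The function name, the minimum version, and the error message are all assumptions made for illustration, not code from the talk or from coverage.py.

```python
import sys

# Hypothetical minimum version, chosen only for illustration.
MINIMUM_PYTHON = (2, 7)

# Classifiers describe the project on PyPI; as noted in the talk,
# they do not stop anyone installing on an unsupported interpreter.
CLASSIFIERS = [
    "Programming Language :: Python :: 2.7",
    "Programming Language :: Python :: 3",
]


def check_python_version(current=None, minimum=MINIMUM_PYTHON):
    """Raise RuntimeError if `current` is older than `minimum`.

    `current` defaults to the running interpreter's (major, minor).
    """
    if current is None:
        current = tuple(sys.version_info[:2])
    if tuple(current[:2]) < tuple(minimum):
        raise RuntimeError(
            "This package needs Python %d.%d or newer, found %d.%d"
            % (minimum[0], minimum[1], current[0], current[1])
        )
    return True


# In a real setup.py this call would sit above the setup(...) call,
# so an unsupported interpreter fails before anything is installed.
check_python_version()
```

This is the "extra cruft" trade-off mentioned in the answer: the check is explicit, but it is one more thing to maintain. Releases of setuptools and pip that came after this talk also accept a python_requires argument to setup(), which expresses the same constraint declaratively.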
Info
Channel: David Baumgold
Views: 38,005
Rating: 4.797802 out of 5
Keywords: python, programming
Id: RKHMnevITF0
Length: 39min 26sec (2366 seconds)
Published: Wed Jan 20 2016