How to Build a Complete Python Package Step-by-Step

Captions
The most common way to package and share your code is by using Python's built-in set of tools. If you package your code and publish it, then others can use it, whether you share binary files or source code, using a package manager that stores your Python package in an online repository like PyPI, which is the official Python repository. But how do you do that? How do you publish a package, and what do terms like wheel or egg-info actually mean? That's what I'll cover in today's video. Make sure you watch this till the end, because at the end I'll share five things that are really important to think about when you want to publish your code.

One thing that's important in any case is to make sure that your code is high quality, so that others can actually use it. How do you determine the quality of your code? Well, I have a free workshop on code diagnosis that you can get at arjan.codes/diagnosis. This is going to help you become better at reviewing code and understanding where the problems are. It's about half an hour and contains lots of practical advice. I review production code to illustrate how it works, so you can apply the same ideas to your own code. So, arjan.codes/diagnosis to get access for free; the link is also in the description of this video.

In this video I'm going to use a simple example application for packaging and publishing on PyPI. What this library does is generate a bunch of different IDs. So, for example, there's something that generates a password of a given length. It doesn't really matter what the code is exactly, but it just checks that some particular types of characters are in the string. I have a function to generate a GUID, a function to generate a credit card number that uses the Luhn checksum test, something that generates a PIN number, and something that generates an object ID. Of course we can add a bunch of extra ID generator functions here, but that's not really the point of this video. The only thing I want to do is show you how you can take a library like this and then package and publish it.
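The captions mention a credit card number generator that relies on the Luhn checksum test but never show its code. As a rough illustration only, here is a minimal sketch of how such a generator could work. The function names `luhn_checksum`, `luhn_check_digit`, and `generate_card_number` are hypothetical and are not taken from the video's library.

```python
import random


def luhn_checksum(number: str) -> int:
    """Return the Luhn sum modulo 10 of a digit string (0 means it passes)."""
    total = 0
    # Walk the digits from right to left and double every second one.
    for index, char in enumerate(reversed(number)):
        digit = int(char)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return total % 10


def luhn_check_digit(partial: str) -> int:
    """Return the digit to append to `partial` so the result passes Luhn."""
    return (10 - luhn_checksum(partial + "0")) % 10


def generate_card_number(length: int = 16) -> str:
    """Build a random digit string of `length` digits ending in a valid check digit."""
    partial = "".join(random.choice("0123456789") for _ in range(length - 1))
    return partial + str(luhn_check_digit(partial))


if __name__ == "__main__":
    print(generate_card_number())
```

The numbers produced this way only satisfy the checksum; they are useful for exercising validation code in tests, which is the use case the video describes.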
Next to the ID generator module there's also a utils module. Normally I'm not a big fan of calling files utils, because it's not really clear what utils is, but in this case it's pretty small. I could maybe even have called this just luhn.py and it would also have been fine. What I've also added is software tests. So here we have test ID generator, and there you see we import the different functions that we want to test, and then I have a unittest test case where I test generating a password, a GUID, and a couple of extra things. These tests are not very complete; you could build out on them and test all kinds of different edge cases, but it's just to show you how to set this up.

Let's also take a quick look at how the project is actually organized. I have a setup file that we'll talk about more in a minute. I have a license that describes how people are allowed to use the code. There's an app folder that contains the ID generator package, and inside that package we have all the files that I just showed you, in the source folder. And then there's a readme file that contains some information about how to use this particular package, so you see some example commands. You can of course expand on this when you publish a package like this. Knowing the structure is obviously important, because that's going to dictate the files and directories that you're going to refer to in the various steps of packaging and publishing.

When is packaging actually useful? Well, especially if you're working on a bigger code base, you're going to run into situations where you have code that you need to reuse in various parts of your project. For example, you might want to have a mechanism for generating IDs both in the front end and in the back end, because in the back end you might use it to generate database IDs, and in the front end you might use it to generate passwords automatically for a user, or PIN numbers, or whatever. You might use it to test your code: for example, if you want to test
a payment method, you want to generate credit card numbers. So there are various areas where you're going to want to have packages that you're going to use, and packages allow you to structure those dependencies, just like you're depending on other third-party packages or libraries. This is also going to make your code easier to manage, because you're forced to split things up into neat packages that are sort of independent of the rest of your code, and that's typically a good thing. You might use it for cloud computing, where you need to rely on these packages to perform certain tasks. You might want to containerize your application, and then you're also going to need some sort of solution to deal with dependencies properly.

In Python there are a bunch of different tools that you can use for packaging and publishing, but the built-in one is called setuptools. This is a standard Python library that builds on distutils and makes it easier to package and publish your code. The thing to know about setuptools is that it's going to use some terms that might look confusing at first. The first is a wheel. What is a wheel exactly? Well, this is basically a packaged version, a binary version, of your code that's ready for publication. It contains basically everything that the Python package manager needs to know in order to install the package; it contains all the metadata. If you want to create this binary distribution, this wheel, then the only thing that you need to type in your folder is python setup.py, so that's going to use the setuptools package, and then you write bdist_wheel, which stands for binary distribution wheel. If I run this, you see that it's actually created a bunch of stuff: it creates a build and a dist folder, and in the dist folder (let me make this a bit larger) we see that we have the wheel. This is actually the binary file containing everything that is needed. Before you can actually do this, you have to make sure that you've
actually installed the wheel package, because otherwise Python is not able to do this job and you're going to get an error. But I already did that, so I don't have to do it now.

The second thing that you're going to see is the sdist file. This is a tar gzipped file that is not a binary file, but contains the source code: this is the source code together with the setup.py file. That's useful because it means that with that source code you're able to rebuild it if necessary. In order to create a source distribution you run python, also setup.py, and then sdist, for source distribution. So now it's created a source distribution file. You see there's a second file here; if we expand this, then we see the source distribution that has now been added.

There's a couple of other things here as well. For example, you see here the build folder, which contains all the built files. So this is now empty, but in lib we see that we have the actual Python files that are going to be part of this build. It copies those over and then prepares them in order to be able to create the binary and source distributions. And finally, there's now another folder called idgenerator.egg-info. As you can see, this contains things like the dependencies, the requires, and the different source files that are going to be part of the app that you're going to distribute. So this is more like the metadata, the additional information that's going to be included with the Python project. egg-info is actually being replaced by wheel, so I'm not entirely certain if this is still being used if you use the wheel package. But it does build it, so maybe it uses that as an intermediate step or something; I'm not completely sure. If you have a better idea of whether egg-info is still being used, let me know in the comments.

A very interesting part of this is of course how you set all of this up in the setup.py file, because that's a file that you're actually going to have to create. So let's take a closer look
at how I set this up. We are importing from setuptools, which is the package that I'm using to package and publish this code. As a first step I'm opening the readme file, so that's this file right here. This file contains a longer description of what this package does, and I'm using that to fill in a long description variable. Then I have a setup function, which comes from setuptools, and what this has is basically all the metadata about this particular package. The name of the package is ID generator. It has a version, so whenever we want to publish an updated version, we bump the version number here. We have a description, and then there are some settings for where the setuptools package can find the source files of this particular app. We have the long description and what type of content it is (well, this is Markdown). We have a URL of this package, an author, and the email of the author. We have a license, so this is an MIT license, and that also matches the license file that we have here, which is going to be published alongside the package.

What else do we have? We have classifiers, which are sort of keywords, and we have a specification of what is required for this particular package. What's nice about including the requires here is that we really have all the necessary information about the package in setup.py. In this case our package requires bson, which we use to generate object IDs, but there are also some extra requires: there's pytest, because we need to test the code, and you have twine, which is the tool that you use to publish the package to the repository. And finally, you can indicate that if you use this package you need Python 3.10 or newer. What's also nice about this is that in this case you don't need a requirements.txt file, because all your dependencies are defined in setup.py.

There's a couple of things you need to be aware of. First, you need to make sure that the app folder and subfolders contain __init__.py files, so
that setuptools actually recognizes this as a package; otherwise it's not going to be able to find the files. Under ID generator you also see that in the __init__.py file we actually import all the different functions at the top level, so that when we use this package we don't have to write from ID generator dot source dot ID generator import and then the function name. We can simply import it from the ID generator package directly, which is obviously a lot easier to use.

By the way, you don't have to use the setup.py route. You can also create a setup.cfg file, a config file where you specify these things. But in this video I've used a .py file because I think it's nice, and it also allows me, for example, to load the long description from the readme file, and this wouldn't be possible with a config file. By the way, if you're enjoying this video so far, give it a like and consider subscribing if you want to keep learning more about Python and software design.

So how do we build and install this? Like I showed you before, we need to use python setup.py. I want to have the binary distribution, so that's the wheel, and we also want to have the source distribution, so I can simply write both of them and setuptools is going to build both, like so. If you want to install the package locally before actually shipping it, which might be good because then you can test it, you can do pip install . and then it's going to install ID generator. You see I've already installed it before, so it's uninstalling it and then installing it again, but the first time you do this it's just going to install ID generator. So now I have the package ready to use locally. It's not yet been published to a repository, but I can use it locally. I have an example file here, run.py, and here you see that I simply import these functions from the ID generator package, which is my own package that I just created, and then I print a bunch of function calls just to see what's
happening. Of course this is not a test, but you can use this to play around with the package and make sure that everything works as expected. When I run this, you see it's going to print a bunch of IDs, passwords, and PIN codes. So locally the package is now working: we've been able to package it up, we've been able to install it locally, and then we can use it in our Python scripts. This might already be enough for you. If you just have a bunch of different packages that contain some common tools that you're using in your scripts, this could be a great way to keep things really simple: just build them, install them locally, and then you can use them everywhere in your projects.

Of course, things are going to be different if you want to publish those projects and you want other people to also use that code, because then you're going to need to publish the packages as well. I'm quite sure that you have encountered this website: this is pypi.org. Here you can browse and search for packages. For example, let's say I want to look up the pandas package. Then you see we have here pandas 1.5.3, which was released last month, and you can find information about this project, where the source code is, et cetera. The thing is that if you install pandas or any other package, it actually retrieves the package from the PyPI repository. It is the official Python repository that contains all of this code. So that means that if you want other people to be able to use your package, you need to upload it, publish it, to PyPI.

How do you do that? Well, first you're going to need an account. There's a register button here, and this allows you to enter a name, email address, and password, and then you create a PyPI account where you can then upload a package. Obviously, if you want to publish your package for the first time, it's going to be a bit dangerous to just upload a bunch of stuff to PyPI, and then you realize you made a mistake, and then there's a
project that's actually wrong, and then you have all kinds of problems. So there is actually a test site as well, and this looks exactly the same as the PyPI website, except it's just used for testing. Before you try to publish anything to PyPI, you can upload it to the test repository just to make sure that everything is working as expected.

How do you upload a package to PyPI or the PyPI test repository? Well, that's why I use the twine tool. I've added it here to setup.py as a development dependency, so it's right here, just like pytest, and it's already installed. Twine basically helps you to do some checks, like making sure that all the relevant information is there, and it allows you to upload the package to the PyPI repository. Before you publish a package you should let twine check that all the necessary files and information are there. You simply do that with twine check, and the files are in the dist folder, so I just want to check everything in this folder. You see it checks the wheel, which passed, and it also checks the source distribution file. So this looks good; it basically means that we're now ready to publish this to the repository.

Publishing your package to the PyPI repository is now really easy. You simply use twine to do this: you write twine upload and then dist/*, and it's going to upload the distribution to PyPI. You have to enter your username and your password, and then it's going to do that. I'm not going to upload it to PyPI, because this is not a real package, and you shouldn't do that either, because you should first upload your package to the test repository to make sure that everything is as you'd expect it to be. You can also do that with twine, and it's also really easy: simply type twine upload, then specify the repository with -r, and this is testpypi, and we want to distribute the same files. Again this works in exactly the same way: you enter your
username and your password, and then it's going to upload the package. Once you've uploaded this, you can see the project on the TestPyPI repository just as it will look when you upload it to the real repository. You can open this, and it's going to contain the project description, there's going to be a home page, and there's also going to be the license. You see that we have the classifiers here that I showed you in the setup.py file before. So in principle everything is now right here, and you even have the install command from the test repository. Once you're happy with this, once you're happy with the description and with everything else, you can simply use the same process to publish it to the actual PyPI repository. It's really easy.

To conclude this video, I'll share five things with you that you should think about if you're planning to publish a package. The first thing you should do is make sure that it actually makes sense to package your code. Is your code set up in modules that are nicely decoupled? Do you use abstractions to make sure that people can easily use it? Did you think about the architecture of your package? Does the way that the different classes, functions, and modules are organized make sense? If your application requires extra data, will that data be part of your package, or is it going to be available somewhere externally? That's something you might have to think about in some cases.

The second thing is that if you publish a package, there's going to be not just metadata, but you might have to do a lot more for people to be able to use the package. You might have to write extensive documentation of how it works. You may even want to create a separate website where you talk about what the package does and how people can use it, write tutorials, maybe even record videos, or anything else that you could do to make the package easier to use. Of course it's up to you how far you go with this, but the more of
those kinds of things you do, the easier it's going to be for people to find and actually use your package.

The third thing that's really important is that you think about a proper software license. If you release your software without any license whatsoever, it's not clear whether people can actually use it, and they will probably not use it, especially if they're part of a company. A company just wants to make sure that the licenses of the packages and tools they use are in order, because otherwise they could get sued or something. So make sure to always license your code properly. There are of course lots and lots of different licenses, and each of these licenses has different implications. There's the MIT license, there's the GNU license, and there's a couple of sites that are useful for this, like TLDRLegal or choosealicense.com. I've put links to those sites in the description so you can take a look; you might find that helpful.

The fourth one is that if it's a package that you're going to be regularly updating, which you should typically do if you want people to keep using it, you need to keep it up to date with the latest developments, obviously. Every time you release a package, you have to bump the version number and do some tasks related to releasing the package, and you might want to think about how you structure that a bit better. There are some packages that can help you with this, for example bumpversion, which does some of the version management for you and makes that a bit easier. But in general you might want to think about how you're going to do this and how you're going to organize it so that it doesn't become a lot of work for you. Using versioning tools is actually a good idea, especially if you want to avoid clashes with version numbers on PyPI.

And finally, it's a good idea to really rely on classifiers. Here I have my setup file again, and you see I've added three of these classifiers. One is about the license, so I've used the MIT
license in this case, then programming language and operating system. You can use more classifiers, and this is going to be helpful for developers that are looking for packages that fit what they need. For example, some developers might only want to use packages that have been released under the MIT license, so they can use this classifier to filter packages and be sure that they're not using anything that they're not allowed to use in their company. You can find all the possible classifiers on the PyPI website, at pypi.org/classifiers. You see an example here: there are classifiers that are related to the development status, the type of environment, whether it uses a particular GPU, and so on. As you can see, there are tons and tons of different classifiers, and it probably makes a lot of sense for you to think about which classifiers are going to fit best with your particular package.

So I hope you enjoyed this video as an introduction to packaging and that it helps you take the steps to publish your code to the world. Of course, another way of publishing your work is by not releasing the code but deploying it to the cloud, so that other people can use your code as a service. If you want to learn more about that, here's a video where I show a very simple setup using Docker that allows you to very easily containerize and deploy a Python application. Thanks for watching and see you next week.
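The captions walk through a setup.py file without showing it. As a recap, the file described might look roughly like the sketch below. This is an approximation pieced together from the spoken description, not the author's actual file: the version, description text, and exact classifier strings are placeholders, and the package name is inferred from the idgenerator.egg-info folder mentioned in the video. The helper names `read_long_description`, `METADATA`, and `run_setup` are hypothetical.

```python
"""Hypothetical setup.py sketch for the ID generator package from the video."""
from pathlib import Path


def read_long_description(path: str = "README.md") -> str:
    """Return the readme text, or an empty string if the file is missing."""
    readme = Path(path)
    return readme.read_text(encoding="utf-8") if readme.exists() else ""


# Placeholder metadata: values not stated in the video are assumptions.
METADATA = {
    "name": "idgenerator",
    "version": "0.0.1",  # bump this for every release
    "description": "Generate passwords, GUIDs, PINs and other IDs",
    "package_dir": {"": "app"},  # source files live under the app folder
    "long_description": read_long_description(),
    "long_description_content_type": "text/markdown",
    "license": "MIT",
    "classifiers": [
        "License :: OSI Approved :: MIT License",
        "Programming Language :: Python :: 3.10",
        "Operating System :: OS Independent",
    ],
    "install_requires": ["bson"],  # used for object IDs
    "extras_require": {"dev": ["pytest", "twine"]},
    "python_requires": ">=3.10",
}


def run_setup() -> None:
    """Hand the metadata to setuptools."""
    from setuptools import find_packages, setup

    setup(packages=find_packages(where="app"), **METADATA)


# In a real setup.py, call run_setup() here at module level. It is left
# uncalled so this sketch can be imported and inspected without a build.
```

With a file along these lines in place, the commands named in the video would be `python setup.py bdist_wheel sdist` to build, `pip install .` to install locally, `twine check dist/*` to validate, and `twine upload -r testpypi dist/*` to publish to the test repository.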
Info
Channel: ArjanCodes
Views: 174,336
Keywords: python package, how to develop a python package, building a complete package, build complete package, build python package, building a complete python package, building and publishing python package, building and publishing package, building python package, how to publish python package, publish python package, python packages, python packages setup.py, python tutorial, python package development, python package setup.py, python package setup, python package upload
Id: 5KEObONUkik
Length: 20min 28sec (1228 seconds)
Published: Fri Mar 03 2023