Tutorial: Publishing your first Python package

Welcome, welcome to this tutorial in Transform 2022. This is the latest edition of our sort of festival of, what do we call it, digital subsurface stuff, and what that really means is it's a sort of content creation festival, basically. We've got a bunch of tutorials happening this week. They're all live; you can tune in on YouTube Live, like you may be doing right now, or they'll be on YouTube for the rest of eternity, so you can catch up with them there too. There are also some meetups happening, and we've got the Software Underground annual general meeting happening this week, so it's a big week. Software Underground is this community of folks who are into this sort of thing: rocks and computers, basically.

So welcome. If this is your first experience with Software Underground, you need to know that I've turned off the chat in YouTube because there's an awesome Slack that you need to hear about: softwareunderground.org/slack, or I think it's actually on the softwareunderground.org homepage as well. You can go in there, if you're not in the Slack already, and create an account, and you're looking for the channels inside that Slack starting with t22. Okay, so t22 general, and then there's a special channel for this tutorial called t22 Tuesday packaging, or something like that.

What else? Oh, thank you very much to Curvenote, curvenote.com, for sponsoring this event, the whole of Transform. They're doing amazing things with scientific communication, basically the written word along with mathematics, plots, images and code, so fully executable documents is that sort of goal. Go and check them out if you're not aware of them already, and thank you again to them.

Let's see. Oh, I guess if you want to find out about the other things going on in Transform, softwareunderground.org/transform will take you to where you can see the schedule. You can also just follow our Twitter, swung.org, and keep in touch with what's going on, but the
Slack is the place; the Slack is the best place.

So welcome. We're going to spend the next, well, more than an hour, less than two hours, talking about packaging, Python packaging: a way to get your code into other developers' hands, really. So we're not talking about creating an application for an end user to use, with menus and things like that. This is a way to get your code to other people who are writing Python code, who might want to use your functions and your classes, or whatever it is you've built, in their projects. Packaging your project is going to make it way, way easier for people to use than, say, emailing them a Python file or a zip file, or pointing them at some scrappy GitHub that you keep for your stuff. You're still going to do those kinds of things, and packaging is not going to solve all those problems, but it's going to give you a really nice channel for putting your stuff in the right place, where pip can find it, basically. So you're going to be able to do pip install your package, and so is anybody else. If you're not doing it already, I think it's going to really transform how you write code, how you maintain your code, how you share it, and potentially even how you think about yourself as a developer. There's a kind of, I don't know, with-great-power-comes-great-responsibility thing going on.

Okay, before I say anything else: I had to change my T-shirt today because, well, basically I've discovered that pretty much every single Software Underground T-shirt has green on it. You know, obviously I don't have a giant screen behind me; that's a green thing. Even this one, which is from the geothermal hackathon that we had to rapidly move online because of COVID, so I never really got to hand these T-shirts out, unfortunately. But actually if I move that way, I guess you can see that you can see
straight through the middle of me, which is kind of cool. Now that I know this, I'll probably make some T-shirts that don't have green, but maybe we'll make some T-shirts that definitely do have strategically placed green, and can do some interesting things online when you're streaming your nonsense tutorials.

Okay, so let's talk about packaging. I have got a few slides. We are going to do lots of coding, but I found as I was going that I'd generated a lot of notes, and I thought I might as well share these with you so that you don't have to write down a lot of notes. I've put them in here, so the URL to this deck is right there. It's a Google Slides, so excuse me if you can't get to that for some reason. Well, I'll tell you what, I'll put the PDF in Slack right now. I didn't think of doing that before. I might have to do it again at the end, because I may add to this as we go. So there it is in the Slack. Now Firefox has done that, so let's go back to this and close that. Okay, so hopefully everyone's got some version of the slides going on now.

There's me: I'm Matt, I'm at Agile. I'm a co-founder of the Software Underground as well, and you'll find me easily in the Slack if you need to get in touch, or you can email me if you don't want me to get in touch, because I'm not very good at email these days.

Okay, I'll let you read that. I should have looked up where this borrowed and butchered quote comes from. This is actually originally a quote about regular expressions, where you've got a text processing problem. But Python packaging has definitely historically, and arguably even today, been a bit of a nightmare, basically. I'm not a good enough programmer to know exactly why, or what about it is so difficult, but it just seems to be a super tricky problem. So don't feel bad when
you start looking around and feeling really unable to figure it out, even with, or perhaps because of, Stack Overflow, even with YouTube, even with videos like, hopefully, this one, and even with your colleagues. You still can't get to the bottom of it because, basically, it's just really hard.

The good news is that it's getting better and better all the time. A few years ago, and this is even from 2018, but definitely if you look back to 2012 to 2016, just maintaining your Python environment was super difficult and unpleasant. It was super easy to break, and you'd end up not being able to run stuff you could run yesterday, and if you're on a Mac you've got Homebrew in there, as this funny comic intimates. There were just all these pieces floating around, and it's really out of this miasma that conda emerged, from the company that's now called Anaconda. Conda did solve a lot of problems and make a lot of things much easier. I was using environments really haphazardly before conda, and some things, like anything that involved GDAL, for example, were really tricky to install before conda. So it changed a lot of things, especially for scientific Python developers. I feel like Python and the core tools in Python are coming back up to the point where maybe we won't need conda for much longer, and actually I don't build my stuff for conda now, because I don't want yet another thing to maintain, and pip seems to do an awesome job. So it's not like this anymore; that's what I wanted to say there.

By the way, I should say I am not a packaging expert. I have built quite a few packages in various different ways over the years, and I'm not a great maintainer either, so part of me is like, well, maybe you shouldn't be learning from me. But I feel like I have tried to find the path of least resistance and least maintenance overhead. I could be totally wrong about that, and so
one of the things I wanted to say was, I'm seeing lots of awesome devs in the Slack chat, for example. I really hope Leo Uieda's not listening to me, because there are tons of really fantastic package maintainers out there in our community. I hope that they will chime in and correct me when I'm talking garbage, suggest alternatives, and point you and me at other tools out there that are going to help you out. But know that they're in the community, so if you do run into trouble, I'm always more than happy to try and help where I can, but the folks who are maintaining these other amazing packages out there, pyGIMLi and SimPEG and Fatiando and Devito and all the rest of it, will also swing in and help you. This is a great community for that.

Okay, so the next thing, I guess: there are some alternatives to packaging that I just wanted to highlight. You probably already share IPython notebooks; do we still call them that? I guess Jupyter notebooks we call them now. I'm always a pretty long way behind the curve, FYI. Emailing around .py files. Sharing code snippets, of course, in places like Software Underground. Using Gist: I really like Gist. I think it's a nice add-on to GitHub for when you've basically just got one or two files to share, and it has a kind of super lightweight version control, so you can go back and see old versions, and you get a URL, of course, for your code, so it's really easy to share that around. So that's a great option for little snippets.

The thing is, you can't really send along dependencies with that. The person has to read the imports at the top and go and install the things that they need, and there's really no guarantee that they're going to be able to run your code. You may or may not have documented what versions of Python you're expecting to run on, things like that. And of course there's also no way for them to install that code, or
that's not true, there are always ways; there's no obvious and easy way for them to install that code, so that they can download your .py file from Gist or whatever and then go over to another project and go, you know, from matts_awesome_package import foo. That's not going to work.

Making functionality available without code, in other words helping other developers by giving them a tool to get stuff done with Python or any other language, could be achieved through something like a web API. So you could stand up a web application that serves requests for such and such. One of the things that was a real epiphany for me: back in, like, 2011, Evan and I were playing with making Android applications, sort of, and we needed to make charts, and Google has, or had, this amazing charts API. You could literally just send a URL that contained your data and your configuration for the chart, the parameters, and it would send back a PNG with your chart. That's really cool. That means you don't have to expose any code to anybody, but it does mean that person needs to be online to use it, of course.

And then I've also got in here freezing your code, which basically means building an exe, or a .pkg or something like that to install on a Mac. I'm including in there, although it's not quite the same thing, building something like a deb file, or a package for, basically, a Linux distro. That's what, say, Dropbox does. Dropbox is written in Python, and you install it and it just looks like a regular application. The thing about that is you can make certain guarantees about the environment, because the application brings the environment with it. I think you can completely hide the code from people when you do that, and there's no reason in general why those desktop application users are going to want to see your code anyway.

Anyway, if you get to the bottom of
that list and you're like, yeah, I still want to build a package, cool. Here's what I'm assuming for today. Sorry, I'm still getting over COVID; it seems like it's moved right down to my chest, where that hole is. That's probably related, I don't know. I wish that was bigger now; I really wish I could see stuff through there. You can even see through the chair.

Okay, so I'm assuming you've got something written in Python already that you want to distribute. Like I've been saying in the chat, if you haven't, see if you can dig one out now while I'm talking. I also provided something in the chat, in that thread; you'll have to scroll up. Give me a minute, let me just pin it so it stands out in yellow. I said you've still got time, well, you haven't anymore, to think about a tiny project, ideally a single notebook or Python file, and then I've given you a completely rubbish function if you really can't think of anything. I'm sure you can; just implement, I don't know, Gardner's equation or something. We're giving you a little hello world function, which you could totally just put in a Python file and use.

I'm not going to talk about including stuff like Cython, like C code that's in your Python project, and that sort of thing. I guess my assumption there is that if you're doing that, you've probably got the chops already to figure out Python packaging. If you haven't, then good on you, and please stick with it, because I want to see your package for sure, but I know nothing about making that sort of stuff happen.

I'm assuming you want it to be installable by pip, and for the purposes of today we're not thinking about conda. I'm assuming that you want to follow the currently recommended best practice, recommended by the Python foundation sort of thing and the Python Packaging Authority, their current best advice. And I'm saying that because the current best advice, as far as I can
tell, as a kind of noob, is not the same as the prevalent advice on the internet. If you want to follow the prevalent advice, go for it. There's tons of it out there, and you can definitely make something work.

I'm assuming that you prefer sort of getting into the nitty-gritty a little bit. You don't need to go all the way with this idea, but I think it's a good way to figure something out and understand what the tricky things are and what various words mean. It's just a good way to get started, going in at the ground floor, even if you then abandon the ground floor and go, you know what, I'm just going to use Poetry, which I'll tell you about later, and not worry too much about all of this other setuptools stuff. I guess the other aspect to that first-principles approach is that this is the stuff that comes with Python. For the most part, we're not going to need to pip install very much to get this done.

You're going to need patience; it's just how it is. You're going to need to google things, but I've got a good trick for Google that has really helped me out over the last few weeks as I've been looking into packaging again.

Let's move that on. Okay, so there are various pieces of bad news, unfortunately. Basically, everything's changing all the time. There was a big change on the 25th of March, exactly a month ago, that is great news, which I'll explain shortly, but I'm not using that feature of Python today. We're going to slightly use it. I know this doesn't make sense; it will make sense. We are going to use a bit of the good stuff that came out on the 25th of March, but when I tried exclusively doing it that way, well, it comes with a notice, which I'll show you, saying this is brand new functionality, we're testing it, it may break, we may change it. It didn't quite do what I wanted it to do. So there are these two
PEPs. PEPs are how the Python community figures out what Python should be and what it's going to include in the future. You've probably heard of PEP 8, which is the sort of de facto standard for laying out Python code and how it should look in your editor. You probably haven't heard of PEP 517 and 518, and maybe you're surprised to hear that the PEPs go up this high, but they keep going, and there are new ones all the time. They're quite interesting to read through because they capture basically all of the salient discussion that happened around any given item. These two PEPs describe the goal for packaging in Python.

My understanding of Poetry, which is a package you can use for packaging, is that it tries to be the future of Python packaging. So in a way Poetry is like a sort of canary in the mine, trying to figure out: if we redesigned everything from scratch and it all looked beautiful, how would it look? I think that's the goal of Poetry. I have not used Poetry, so I don't really know, but just last week Graham Ganssle was telling me to use it, basically.

Okay, so from what I can tell from the videos that I see and the repos that I look at, most people are still using setup.py. You've probably heard of setup.py or seen it around, and for a long, long time it was the way to get stuff done with packaging. But the current advice, and the goal, is to have no setup.py whatsoever, or at least it will be created automatically for you in the background, but you don't really need to know that. The reason for that, from what I can gather, is basically that you just don't want executable code being used to build things, because nefarious things could potentially happen. I suppose that's one consideration: you're executing a Python file in order to install something. But the other thing is that it means that all sorts of dynamic things can
happen, so things can change dynamically at install time, and that makes packaging a headache, and it means different things might happen for different users. Basically, that's just the scenario that they're trying to avoid. So the goal instead is to have a single metadata file, like you've probably seen INI files or YAML files, config files, this sort of thing, and have a completely static, deterministic way to build a project that only contains a bunch of metadata, and there's only one such file.

Okay, so we're not there yet. The goal is for that file to be pyproject.toml. What the heck's TOML? It's another markup language. I don't know where it came from. Somebody did the old xkcd thing of looking at all the standards, and none of them was the right standard, so they made another standard. They looked at JSON, which some people might think would be an obvious choice given the way the web is these days, but they decided, for example, it's not human-editable enough, so they didn't want that. They looked at YAML, but rejected that for other reasons.

Unfortunately, we're not at the point where you can just put everything in pyproject.toml, and personally I find the format in there a little bit weird, and also, the way I've got my editor set up, it doesn't highlight it properly. I'm certain that's something I could easily fix, but it's these ridiculous little things that slow you down, right? So currently you almost certainly also need this other thing called setup.cfg, which is a bit like setup.py used to be, except it's not a Python file. It's one of these configuration files, so you can't run it; it's not executable.

Okay, I apologize for all this. I wanted to give you some background. I guess you can't fast-forward it yet. If you're watching this on YouTube tomorrow or later, you could just skip all this until we get to actually typing some code, but for now you're stuck with me. If it turns out you
can fast-forward it already, that's cool; please tell me what I'm going to say later on.

Okay, so one of my goals for today was to figure out, because I hadn't really done this for quite a while, what's the absolute minimum you need to do to put something on PyPI. PyPI is the Python Package Index, or the Cheese Shop; it used to be called the Cheese Shop. I haven't heard anyone call it that for a while, but anyway, I'm resurrecting Cheese Shop. It's where pip looks for stuff by default: when you go pip install foo, it goes to the Python Package Index, pypi.org, to look there for that package.

So what we're going to do first is the absolute minimum, and then we'll publish something, and then we'll go back and say, okay, now that we know what the minimum is, how do we make this actually a viable, community-friendly, maintainer-friendly, contribution-friendly open source Python package? Now, how far are we going to get into that second half? I'm not totally sure. I guess not very far, because it goes on for infinity, so we might get a little way into it. We've got some things to talk about, potentially, like testing, like documentation, and all of those things are giant subjects that we can't do justice to, but I can point you at things, and hopefully once we've done this you'll have a clearer picture of, oh, okay, that's what I need to pay attention to.

Because, well, let's go to welly on GitHub. This is a package that I maintain. What you may have done in the past is look at a repo that you know exists and is installable, et cetera, and just go, flippin' heck, there are a lot of files there, and only one of these folders actually appears to contain code, so what on earth is going on? What's the other sort of 80% of this repo about? Sorry, I should blow up GitHub a bit. So that's what we're going to start unpicking, and
we'll start with the things that are necessary. So here's what we're going to do. I think we'll just do this, rather than me going through this list, so let's just start. I'll tell you what, I should have put a zero on here, because it's Python. Surely I can start with zero? Format options... oh, I should have looked at this, shouldn't I. Zero, one; that's not what I want. This is ridiculous, why am I spending time on this? I just like the idea. Okay: have some Python code, or a project, or whatever it is. If anyone figures out how to do the bullets and numbering... here we go, restart. Ah, okay, there we go.

All right, so I'll show you what I'm going to start with today. I've got a notebook. It's dynamic time warping, and I've called it dynamic whatever warping, because of course you don't just have to have signals in time, as we all know. You can have something like a well log, or seismic in depth, or whatever, and I'm sure you can think of other bases, like a spatial basis where you've got a signal in 1D or 2D space. I read a bunch of blog posts, this was a couple of years ago, I think at the beginning of COVID, and this awesome Wikipedia page, which has a bunch of pseudocode on it, and here's my function to build a dynamic time warp cost matrix out of two signals. I'm not going to explain what it does. This is a classic notebook. Oh, it looks like I already ran this, I suppose, probably checking everything worked. There we go, is that big enough?

Then I'm going to give it these two signals. I've got a bunch of ones with two threes in it, so it's like one, one, one, three, one, one, one, three, and then I've got a very similar looking signal, but it's kind of compressed, if you like, contracted, because it's just got two ones in between the threes. And the idea with
dynamic time warping is we would like those threes to be correlated, because if you imagine these two signals, they've both got two spikes in them, and the idea is that the spikes correlate with each other. So I want this algorithm to detect that.

Now, the way it works is you start off by computing a cost matrix, which looks like this. We can plot that as a 2D matrix, and that's what it looks like in 2D. What you're seeing here is the two signals, if you like, on the axes, so the longer signal, with the extra ones in it, is on the vertical axis, and the short signal is on the other one. Where there are ones, we're getting this dark color, which means low cost. We're getting a big blob of low cost, and the idea there is that the algorithm doesn't know how to correlate those things, so it has a sort of default strategy for dealing with them. But where those threes are, it's only got one option: there's one square there with low cost. So when we try to navigate a single path, which is what the correlation essentially amounts to, a path through this matrix, it's going to be constrained to be at that point, because it thinks that those points correlate.

Okay, so now we have to build the path, and the way that works is we use a sort of backtracking algorithm. It starts at the end, and then it looks and tries to find the lowest cost option in front of it, and it takes a step, and then it looks for the next lowest cost, and so on. So that's this algorithm, which I built from another bit of pseudocode. I know that it's super loopy and stuff, but I tried a lot of NumPy things in here and the loops were fast. I haven't tested it on giant arrays or anything like that, though.

Anyway, here's the path. What you can see, if I just show you the signals again, whoops, this one, is basically we've got the index numbers of the long signal here and the index numbers of the short signal, and what the path is doing is
it's trying to take steps, and it's saying this point correlates with that point, that point correlates with this point. So what we're interested in is, like, zero, one, two: there's index two in the short signal, and it's saying that corresponds to index three, this one, in the long signal. You see it's constrained there, whereas in between it doesn't really know, so lots of points correlate with these other points. It gets a bit confused, basically. Same thing for the other constrained point, which is this one here, five: zero, one, two, three, four, five, yeah.

Okay, and then we can combine those two things, because basically you always want to do both of them. By the way, I wrote this because there are lots of dynamic time warping packages out there, but many of them only give you the path, and I wanted the cost matrix, so that was what got me into this in the first place. There are others out there that also give you the cost matrix, don't get me wrong. Okay, so let's just put them both in a function which calls both things and passes back both things in the return. Now we get both; the costs, don't really need that, do I. What I can do now is say, okay, well, give me the path as a bunch of coordinates, and then I can plot it onto the cost matrix. You could also make a cute plot where you actually show the correlations, but that's quite a lot of code, or at least it is the way I did it, so I haven't included that. So that's going to be my package, basically.

I'm just going to leave it at that for now. We'll go to step two, or one, or whichever step I'm on now. We are going to make a repo. I suggested the other day that you make a GitHub account, or maybe use GitLab or some other repo management tool; it doesn't really matter. I tend to make the repo first and then put code in it. I find that easier, and
you'll probably see why in a second. So I'm going to click on this New button here, and I'm not going to use a template. We'll talk about templates later if I get a chance, but basically the idea is you may have a kind of empty repo that you use, and GitHub can use that to start off new repos, which is kind of cool. I do have one, actually, that I use sometimes to start new repos, because I know it's got all this stuff that we're going to do today in it already.

Okay, so the owner is going to be me. I'm going to call this... oh, we need a name. So I'm going to go now to PyPI. Go over to pypi.org, because you don't really want to change the name of your project. You definitely can; it's not a big deal, you can search and replace, and there are even tools to help you rename things, but you might as well start off with the name you're going to keep. Just like when you're thinking about a company you need to check if the URL, the domain, exists, when you're starting a Python project you want to check that the name is open for you on PyPI. So if I search for, well, let's search for pandas, you'll notice there are a lot of things with pandas in the name, and some people nefariously put out packages with very similar names, or typos in them, to try and get people to download evil code. But PyPI's done a reasonable job of recognizing that the one you probably want is there, so that's the one you recognize. Then there's all these other things. dtw: yeah, these exist, and even things like pydtw, so that's not available either. But I was calling mine dxw, and I think that was available. Okay, so I can use dxw, and that's what I'm going to go with.

Now, we are going to put stuff on PyPI. Don't worry about doing something banal on PyPI; it's totally fine. There are no rules. You can always delete it as well, so if you put a hello world package on there, it's fine. And if you're having trouble finding
a name, or you know you're going to delete it, just put your name at the end of it or something like that, so that if you forget, you're not taking up a cool name or whatever. Up to you, but just settle on something. It doesn't really matter what it is if you don't think you're going to keep it.

So let's call this dxw: short and memorable. I guess GitHub suggests names; I don't really get their suggestions, I would never use any of them, but anyway. The description is going to be simple dynamic time warping. I think it's always good to fill these things out, because it shows you where they show up, and you're like, oh, that's what that is. It's going to be public. By all means keep it private if you want; there's no correlation between the privacy of your GitHub project and your PyPI thing.

Then I want you to do this: add a README, and you do want to add a .gitignore. You're almost certainly going to want it eventually anyway, and the .gitignore that you get comes with the things that should be in there, so just make sure you use the Python one. Then I would suggest adding a license. If you're putting code on the internet, you should probably choose a license for it. Now, this code that I've used probably contains DNA from those blog posts and from Wikipedia, and goodness knows what they think of code snippets. In fact, it's not code, it's pseudocode. But anyway, I'm going to use the LGPL for it, because that's how the Wikipedia stuff is licensed. The other stuff is a bit of a gray area, potentially, but I don't think my code's that similar.

Anyway, if you're having trouble choosing a license, I have a blog post here, I'll give you this, which basically just tries to spell out what some of the features of the most common open licenses are; it's my view on licenses. There's also this awesome choosealicense.com, which basically just says use MIT or use the GPL, but it does have lots of
information about all the other licenses you can imagine. Basically it's asking whether you want a permissive license, in other words one that has the minimum requirements of the licensees: they really just need to say where they got the code, and carry that license with the code they use. That's probably what you want for most open source stuff most of the time, but it's totally up to you. Or do you want the share-alike component, where if someone changes the code and distributes the changed code, they have to share the changes back to the community and to you as the developer. A lot of companies use this for their code because they want to guarantee that they get the improvements that are out there.

There is another thing, kind of halfway in between these two, called the LGPL, which is what I just used. It has these features, but they only apply to that code base itself. The thing with the GPL is that in principle it spreads to the other code that's being used in the project, so if you use a GPL library it basically turns your whole project into a GPL project, which may or may not be what you want. Anyway, there are long historical debates about that stuff in Software Underground, so just google GPL if you want some entertainment.

I normally use the Apache 2.0 license, because it contains some specific things that try to prevent, or essentially stamp out, poor behavior by patent trolls in a few different ways, and that's why I like it. But it is a much longer license than something like MIT or the short BSD licenses. Anyway, read the post if you want to know more about that stuff.

Then you can click on Create repository, and here we go, here's what we get. We actually get for free... well, not really the .gitignore, that's just for git, but we get two of the things that you need in your project anyway, so that saved us a little bit of hassle. What you can do now is go to this Code button and copy what you need to clone it. So if you use
SSH, this one; or if you're using the HTTPS method, then use that; or you can download the zip file, I guess, if you're not sure what to do or don't know how to clone a repo yet.

OK, so now I can come over here. I'm just where I keep my projects. I'm not in an environment or anything yet; we'll create an environment next. First let's clone: git clone, then that thing that you just copied. And there it is. I suggest also making an environment for your project at this point, so let's do that. I generally use the same name for the environment as I do for the project, so that I don't go crazy trying to remember what goes with what. Let's do Python 3.10 and be up to date, and I know that my thing uses NumPy, so I'm going to add that right away. Anything else that it turns out we need, we'll just add later.

The repo name does not need to match the PyPI name, but I think it'll help you keep things straight in your head if it does. One of the underlying reasons why they're constantly changing the Python packaging universe is that they're trying to become completely agnostic to the tools, so you can build a project without even using Python, you can build projects for different targets, you can integrate conda more easily. The goal is to be totally interoperable. I guess the idea is that the old way of doing things, with distutils and setuptools, was too baked in: make everything in Python, put it on PyPI. So that was one of the things: they're trying to dissociate these.

OK, my thing finished making, so we're going to conda activate. I'm using conda, but you could use venv or whatever you like to make your environment. I'm going to go so far as to say you definitely want environments for developing a package, unless you really know your way around your system. I know there are lots of
fantastic developers that don't use environments, but it's just going to make things much easier for you, basically. conda activate dxw. OK, cool. So now we're in our project, we've got the files that we added in GitHub, and now we need some code, and we need to build the basic framework around our stuff so that we can package it.

So I'm going to go back now to the notebook, and see how much of this we can see at the same time. Just about; I've still got a space for me to vibe around. Anyway, here we go, back to the notebook. I'm going to go File... I don't usually do this; I would usually just copy and paste stuff, so this is probably a mistake. I love departing from my usual workflow in the middle of a live tutorial. So: export as a script, or a .py file, or whatever it says in your client. There's the file; it's in my Downloads. Now I need to put that in my project, so I'm just going to copy it, and I'll move it in a moment. I'm in Linux here, so you may want to use the file explorer if you're not familiar with the command line or you're on Windows; if you're on a Mac you can do the same as me. Downloads, what was it called? Simple... oh, right, it's got the name of the notebook... to here. OK, now I've got my file.

Now I'm actually going to move it, and I've been really debating what to tell you about this next bit, because you've just got to know that there are two ways of doing this. There are two kinds of tree we can have. I think there's something right in here I can use to show you the two kinds. OK, so there are two kinds of layout for a project. There's what's known as a flat layout, apparently also known as ad hoc. A brief survey of packages yesterday left me feeling like this is what most people do. So most people have their project root directory, which in my case is this dxw, and then they have these config files and whatnot that I was just talking about earlier,
which we'll make in a minute, and then they have a folder called whatever their package is called, which for most people is also going to be the name of the root folder, and then they have their files, their actual Python code, in there. This is actually how I do all of my projects as well.

Now, as you know, people on the internet have very strong opinions about things, and one of the things that Python packaging people have really strong opinions about is this layout. A lot of people don't like it. They're mostly uber-programmers, as far as I can tell, which is another way of saying that they don't like it for reasons which might be quite hard to understand but which are probably totally valid. So I thought, you know, I trust these smart people, the geeks, but I don't fully understand the issues. There's a little bit of description about it here if I scroll up.

Let's look at the alternative. What they are advocating is having a directory called src in your project, and that contains the sub-packages, modules, Python files, et cetera. Now, you might be sitting there going, who cares, what difference does it make? Luckily the answer is that now, as far as I can tell, it doesn't make too much difference. If you'd asked me a little while ago it would have been different, because I have tried using src in the past, and I ran into trouble getting my tests to work, getting paths sorted out, and just getting things to work properly. Probably that was because I was confused about things, because I was used to the other way; maybe that was it, I don't know. But I can also tell you that, as of very recently, setuptools, which is the scheme, if you like, that we're using to build our project (and it is itself a Python project), has made it much easier to choose one and have it work.

OK, so in the spirit of doing things for the first time during a live tutorial, I'm going to
use the src layout, because, like I say, I believe the people who say this is how you should do it, and we'll all be on a journey of discovery. But I will tell you how to do both. I'll show you what you'll need to change, because so far I've only discovered one thing. Now, in about 45 minutes I might backpedal rapidly and go the other way because things are completely broken, but until that time I'm going to try the src layout. And yeah, there's only one line that we change, in theory. So let's go with it.

I'm going to make a src directory, and I'm going to make another directory in there that has the name of my package. Oh, by the way, if you already know that your package is going to have lots of sub-packages... what I mean is, if you think about NumPy, that's got numpy.this and numpy.that, but it's also got numpy.fft and numpy.linalg. And if you think about something like scikit-learn, that's all sub-packages, or super-modules, I don't know exactly what to call them. You've got sklearn.linear_model, and that contains all the linear stuff; you've got sklearn.svm, and that contains all the support vector stuff. If you already know that you're probably going to want that, you definitely want to choose the src route, from what I gather. Now, I did go down that road in bruges, with multiple sub-packages and so on, and I still do not totally understand that repo, which is pretty worrying.

Anyway, I'm going to do mkdir src and mkdir src/dxw, and then I'm going to move the simple dxw file into src/dxw, and I'm actually going to rename it at the same time. I know it seems like everything's got the same name. It's like, hang on: the PyPI thing has got that name, and the GitHub repo, the folder this is in, the folder in src, and now I'm going to use the same name for the Python file. You don't have to do this, but if you've only got one module, one Python file, you know,
you're probably just going to call it that. OK, now, tell you what, since we can do this, let's do tree, so you can see the structure of my folder. You can see that it looks more or less like that one, except it hasn't got as many files, or it's got different files. The alternative you can see there. Maybe I won't do it right now, but I'm sure you can imagine for yourself: just move that dxw folder up one, or down one, or however you think about things. I think of that as going down towards the root. So not having a src folder, that would be the other way.

OK, now cd src/dxw. I guess the one thing is you end up with these longer paths; if you don't like that, stick to the other one, I would say. Now I want to edit this file, because I know that it's got a bunch of garbage in it from the notebook. Well, I say garbage... oops, how many times did I hit that? I've turned my Copilot off. I don't know if I can live without it, but I'll try.

OK, so I'm going to get rid of that. Well, you know what, I might keep some of this, because I don't want to lose attributions and things, although that's going to need some reformatting; we can worry about that later. The general advice, I think, is still that it's not a terrible idea to have encoding information at the top of your Python files. If your file is not executable, you don't need that hashbang line with the Python executable that it put in at the beginning, and you don't generally want that in a package anyway, so I would just take that out. There are other ways of making your stuff executable with the package. But you do want authorship information, and you do want a way for people to get in touch with you and so on, so I would put those in. And you want the license. Now, why put that into every file? Well, again, that just seems to be the prevailing advice, and actually I know this from my own experience.
The code is going to get split up. People are going to use individual files, you're going to use individual files; it's not necessarily going to stay a contiguous piece of code. It just makes sense to put your copyright notice where it's very easy for people to comply with your license.

OK, now, I guess I only want the functions from this, so I'm going to delete everything else. I don't want all this other stuff, and I don't want the plotting in there. I don't want this; I want that function. I don't need any of this stuff that runs in the interpreter; don't need this; just that function. Two blank lines between your functions in a module, according to PEP 8. And I can get rid of all of that. Now, we are going to revisit this file, but for now I'm going to say that that is good. I don't like the way it does that. I'm going to save that.

So that's our file. Assuming we were running that and it basically worked (for now, my test was running it in the notebook), we are ready to add these other files that it's talking about over here. I didn't really talk much about this yet, but you will find websites all over the place for advice; here's what has been working for me. Let's start at the beginning. There's a thing called the PyPA, the Python Packaging Authority; rather grandiose name. This points to some useful things, like the Python Packaging User Guide, which is actually pretty good, but I find that it's very adaptable to pretty much any situation, and most of those situations don't apply to most people most of the time. So it's great, it's super comprehensive, and I think if you came to this, especially this bit here, this tutorial on packaging and distributing... well, actually, this document's pretty good, to be honest, and it's more or less what we're going to do. So that's one great, canonical, solid place to get advice
that's up to date, and I really like how they have these "here's what you do on different platforms" and "here's what you do for different approaches" sections, and so on.

Best of all, I find, is setuptools. You can see it's a PyPA project. The setuptools documentation is awesome, I think, and it's super up to date and fully aware of all of this new PEP 517 and 518 business. Where you want to go in here is User guide, and then just go to the setuptools Quickstart. I can give you the URL to that.

Thanks, Martin, for helping out with the questions there. Yeah, absolutely: I'm going to put the tests outside the src directory, but certainly people do other things too.

OK, so let's have a look over here and do what it's suggesting. We're going to go pip install --upgrade setuptools, so that we know we're using the latest version. If you just created this environment, you probably are. Looks like I've upgraded by one point release or so with that. And I can tell you that... well, no, we'll wait till we get to it.

So here we go with the pyproject.toml. We are going to use that pattern, I think, unadulterated. Let me just double-check that's right. Yeah, we can actually start with exactly this. So let's come back over here and go code pyproject.toml, and just paste that in there. And this is going to be called... oh well, it's already got a name. Then I'm going to make a new one; I'll just do that from here.

Let's go back to the advice. Now it wants us to make setup.cfg. Let me just make sure I do this in the way that's compatible with what I've chosen to do. Tell you what, let's start with what they're giving us. What they're saying here is that there are basically three ways of doing this next bit, in which we're passing in metadata about the package, and some stuff like here's where my code is, and here's what I need to be able to run my code. They're giving you setup.cfg, setup.py, and pyproject.toml. The new bit is pyproject.toml: it did not used to be an option to put all this stuff in there, but as of, apparently, the 25th of March, you can potentially put everything in pyproject.toml. I tried that and it didn't work, so I've come back to this method for now, but my expectation is you may be able to do that within the next six months.

OK, so let's change this stuff. The name is dxw. For the version, I would start with zero, I think, but I suppose you could start at one, I don't know. Now, this is the line that's going to need to change. If I save this I'll get some highlighting. This is the line that needs to change, so it's like: use this for the flat layout. But I'm not going to do that, remember, so I'm going to comment that out. And use this for what's called the src layout... what did I do? Oh, in fact, you use nothing for the src layout.

And actually, this thing: obviously you would change this to dxw, but this is the sort of thing that I don't like. I don't like having two places where I need to change something like that. You want to try to avoid that kind of thing, because if you do change the name of your package, or you use this as a template for the next thing, that's going to trip you up. So what you can do, and again this is all relatively new, is replace this with find: (that's find, colon). Let me just uncomment that so you can see exactly what I've written. So if you're using the flat layout, where you don't have a src folder, use this find: and it will automatically scan. Now, how does it get by without naming the package for a src layout? It's automatically scanning. I don't really know why it can't automatically scan the flat layout in the same way; it needs this hint, but the src layout doesn't need this hint. So I'm going to say, for now, this worked for me.

Now, this is super minimal, right? We have literally just copied exactly what we were given in the documentation, except
we've even deleted this line, if you're using the src layout. So with that, I've saved that. There are my files; let's look at the tree again. It looks like this. Is everyone OK? You're a little behind on the vid? OK, Mata, yeah, no worries. Cool.

So let's see what it says in the docs, because I can tell you if we need... yeah, so that's it. Believe it or not, we're actually done. Oh, I guess I could have said, if you try this, that the pyproject.toml approach is marked, in big letters, experimental. And by the way, this actually did work for me when all I was doing was this; I ran into trouble when I added more metadata. So it wasn't quite yet that I fell off the pyproject.toml horse. By all means give that a go, because if that works you're in wonderland: basically you've got one file that builds your project, and that's the dream. But for now I felt like I needed to stick with the config file as well. It's too bad, but like I say, this could be the year when Python packaging becomes amazing.

OK, now we can do python -m build. Don't worry too much about this; what you're doing is saying, run this package as a program; it knows how to run itself, essentially. It's kind of like ipykernel and some other things out there that you might have run before in the same kind of way. pip works like this too: you can do python -m pip. Even though pip's a pip-installable package, it's got a way of running as a program. So let's try it. It's not going to work, because (I'm not sure why) build doesn't come with Python; you need to install it. It did install some things along with it, like pep517 and pyparsing. OK, so now it should work. Let's go back to python -m build... and stuff is happening.

Yeah, good question, Michael, about highlighting the config and the TOML. If I go back to my file there, and see if I can see what's running to do this... I'm sorry, I can't
remember. Maybe... I don't know how old... well, it self-updates, doesn't it?

OK, well, it's finished this thing over here. If you've done a bit of packaging before, it's quite interesting to read this, because you'll find that it's doing quite a bit in the background. It's actually setting up an environment; it's installing your package in this isolated environment in order to build it cleanly. It's looking for all sorts of files: it added the license file by itself, it's making this stuff in egg-info, and it's building up your package, finding all the files, copying the README. It knows about all this stuff, so there's a ton of automation happening here. Super convenient.

In the end it comes down and says it's building an sdist, or it's built the sdist at this point. So it builds the sdist; that's this egg-info business. Sdists are sort of the old way of building Python packages. They essentially contain all the files needed to build the project, and the instructions for building it. The new thing is this wheel, and the wheel is faster to install, because pip can literally just copy it into the person's Python environment; it doesn't have to run anything, so it's much more streamlined. Wheels are the new thing (well, new since about seven years ago), and the only reason that it still builds and gives you the sdist is for backwards compatibility, in case someone's running some super grungy Python environment. You can see at the end we get both: a tarball, so .tar.gz, that's the sdist, and a wheel. If you google something like Python sdist wheel you'll get a bunch of explanation of what's going on there; I don't really follow it all that closely. In fact, we're going to upload both of these things to PyPI.

So, are we ready to do that already? I think we are, basically. In order to do that... oh no, I'll tell you what we can do first, actually, because right now, if I run Python
in my environment, I can't do that, right? It doesn't know anything about that. Oh, I have forgotten something, actually. Hmm, just realizing that I didn't add an init file, so I'm a little bit surprised that it built. Should we do it, or should we just not do it and see what happens? OK, let's see what we can do.

We can do this: both of these files are pip-installable, so you could email this to someone and they could pip install it. I'm going to pip install the wheel for dxw. Oh, I tried to import foo, not dxw. That seems to have worked... I don't know what I called my dtw; that's probably a poor choice of name. Yeah, so I think what's going on there is that it's actually not working, so we need to add the init file. I think in the back of my mind I felt like it couldn't be that simple.

So we're going to make a new file here. Now, you'll read a lot of things saying this file can be empty, blah blah blah, but actually it usually isn't empty, and I've found that normally this is where you tell the package and the import mechanism what you want to be able to see from what you've written. So what I'm going to do is say, oops, from .dxw, which is my file, import dtw. You know what, I'm going to rename that function, because I'm going to get that wrong every single time I type it. And I can decide at this point whether I want a user to be able to easily use these other functions or not; if I don't import them, they're not going to be exposed in the same way. If I do just want everything, I could do this: import star. Now, I know you've probably seen and read people saying don't use import star. I feel like I see this a lot in init files, and if you've got a lot of functions in a file, the last thing you want to do is have to list them all, so I feel like that's reasonable. So I'm going to leave it like that. Now I'm going to save this.
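To see what the init file is doing here, the following is a small self-contained sketch. It builds a throwaway package in a temporary directory rather than touching a real project; the names demo_pkg_example, core, warp and _helper are hypothetical stand-ins, not the tutorial's actual files or functions.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk:
#   <tmp>/demo_pkg_example/__init__.py
#   <tmp>/demo_pkg_example/core.py
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_pkg_example"
pkg_dir.mkdir()

# The module holding the actual code (hypothetical functions).
(pkg_dir / "core.py").write_text(
    "def warp(a, b):\n"
    "    return abs(a - b)\n"
    "\n"
    "\n"
    "def _helper():\n"
    "    return 'private'\n"
)

# The init file decides what is visible at the top level of the package.
# With no __all__ defined in core.py, a star import skips names that
# start with an underscore.
(pkg_dir / "__init__.py").write_text("from .core import *\n")

# Make the temporary directory importable, then import the package.
sys.path.insert(0, str(root))
importlib.invalidate_caches()
demo_pkg = importlib.import_module("demo_pkg_example")

result = demo_pkg.warp(3, 5)                # available at the top level
has_helper = hasattr(demo_pkg, "_helper")   # not re-exported by the star import
print(result, has_helper)
```

Swapping the star import for an explicit line such as from .core import warp would expose only the names listed, which is the choice being weighed up above.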
Now, it has to have this special name: it's underscore-underscore, a.k.a. dunder, init, .py. So __init__.py, with double underscores around the word init. OK, sorry about that; forgetting that was kind of crucial. Now I think we can go through the build again.

While it's doing that, I just want to make sure that you have an account on... let's do test.pypi.org first. So go to this URL and make an account, and then there's another one that's just pypi.org, so drop the test and make an account there as well. I would use the same password on each of them. I guess that's not strictly best practice, but it'll help you not go nuts. Just make sure you know the username and password, because you're going to need them in a minute.

The next thing you can do is actually make a file. Let me just see what it's called... it's called .pypirc, I can never remember it. Now, this may or may not be sketchy, but what you want to do is make a file like this, except scroll down until you can see the PyPI ones. So yeah, OK, this is a safer way of doing it. That's the code you want; I'll put it in Slack. This is using tokens. You can make the tokens from your account once you're in there. Actually, can you make them...? I've always made them for individual... OK, API tokens. So you want to come down to here and go Add API token. So this is once you're logged in: go to manage your account, scroll down to API tokens, add a token, and then give it a name. I don't think it matters; this is just a label saying here's what this token is for. So you're going to do it for dxw, or for your project. And then the scope: you can't use a project scope yet. Actually, you know what, I think it's fine to just make it for all projects, and then your .pypirc file kind of makes sense. So choose Entire account. Obviously just be aware that that token exists. At least you're not putting your password in in plain text, so it's not that bad, but yeah, it's
either that or you're entering your username and password every time, so if you prefer to do that, that's fine too. Hopefully that made sense. If you're just entering your username and password for now, let's do that.

This has built, so let's just check that the thing I did before works now. I'm going to pip install this dxw wheel. OK, so now I can do stuff, and there are our results. Lovely.

This is the current state. Oh yeah, so let's talk about this. You're like, hang on a minute, there's more stuff there; at least, that's my reaction. There is more stuff there because, remember, the build process built this dist folder and put our two assets in it, the sdist tarball and the wheel. We only really strictly need one of those, but we're going to upload both of them to PyPI because that's what you're supposed to do. And it built this thing called dxw.egg-info, which contains a bunch of things that we had nothing to do with. If you use the flat layout, this folder will be at your main project level, not in src. Basically you're just going to ignore that folder. You can delete it if you want, it doesn't matter; it'll be rebuilt next time you build. Just ignore it: it's in the .gitignore, and setuptools knows to ignore it. In fact, there's a whole list of folders which setuptools will ignore and not put in the wheel. So there are lots of things that are automatically being detected in the background, and it's going to ignore this stuff; essentially, just don't worry about it. I think you can tell tree to ignore stuff, if you like the tree view; you can give tree some pattern to ignore things, but I'm not going to do that.

All right, so I declare that we're ready. Now, there are a bunch of ways you could probably upload stuff to PyPI, but by far the easiest one is using twine. Let me just dig something out here, because I can't quite remember how to do that
with TestPyPI... history... OK. So we won't have twine yet, so you're going to pip install twine. Twine's one of these just wonderful packages that works beautifully and saves you... now I've said this, it's going to do something horrible to me, isn't it? It's just one of these things that's so great; it does exactly what it says on the tin.

So it uploads stuff. We're going to go twine upload -r testpypi, and then dist/*. If you've got a .pypirc with a token in it, or with your username and password in it, it should just work; if not, it'll prompt you, I think, for your username and password. Hopefully that worked for you. I can go over to TestPyPI now and look up dxw, and there's my package, released half a minute ago.

Not much to see. We're used to seeing some things down here, like a home page and a description, and those aren't there. And if you noticed, when we did the build there were some warnings: missing required metadata, url; missing metadata, author, author_email or maintainer. I think they should probably just put those in the examples, since they're apparently required, although I guess nothing actually required them in the end. So we're going to add those.

Now, we can actually pip install from... I'll just uninstall my local dxw... we can install it now from TestPyPI. This is one of those things I can never remember how to do. Oh, in fact, I put it in here, because I can never remember how to do it. Here we go, it's this. So the thing is, there are a couple of ways we can tell pip about another repository, because by default it's going to go to PyPI, and we haven't put our stuff on PyPI yet. But if you just tell it, get my stuff from TestPyPI, then it's going to try to download NumPy from there, if it doesn't have
NumPy already, and you don't want that. If you use this extra index URL option, then it'll only use PyPI for the things that are on PyPI, I guess, or something. So now let's try this: pip install --extra-index-url https://test.pypi.org/simple/ and then my package, dxw. Did that use a...? No, it downloaded it, there we go, downloading. I was going to say, did it use a local cache or something weird. But if I go into Python, I can now import dxw. Cool, so we did it.

Let's do twine now for PyPI. To upload to PyPI you don't need the extra -r bit, so we can just go twine upload dist/*, like that. In fact, I feel like you can miss that off... no, you can't. OK, fine. Now, be a bit careful with this star, because as you add versions you may have different builds in dist, so there might be a lot of things in there, some of which are broken, or out of date, or that you didn't care about. So either keep dist clean, or tell twine exactly what you want to upload. All right, so now that should also be on PyPI, which is this one, and there it is.

How are people doing? All good, sounds like. Disable SSL certificate... yeah, if you're on Windows. Michael, did someone help you with that? Because there's a thing you can pip install that I feel like often seems to work. It's called something like python-certifi-win32. I don't know what I'm talking about, by the way, I don't use Windows, but when we have trouble with certificates and HTTPS, that sometimes fixes it when we're teaching and whatnot.

All right, so that was twine. We are in good shape, and in fact, I think if I come back here... yeah, we're done. We're in the profit, rocket stage. Wicked.

Now, what next? Let's go take a look at some things that we skipped over a little bit. The first one I want to look at is more metadata. I'll give you some boilerplate here in a sec. This is going to go in setup.cfg, and it's going to go
underneath here. You saw earlier that it said we had some stuff missing. One of them was url, and for this I'm going to give it the GitHub URL, but use whatever you consider to be the URL. You notice there are no quotes around these things. You can give it author or maintainer, whichever you like, and author_email. Now, some fun stuff: we have a description, simple dynamic time warping, whatever. And then let's do this now, just so I don't forget: we're going to do long_description, because you're almost always going to want to do this. This is actually going to be a file, and it's going to be your README.md, what we have right now. Whoops, I really don't like it when it does that. And you just want to say that the long_description_content_type is Markdown, so that it knows how to process it.

Then there's a bunch of classifiers. I'm just going to give you these; in fact, why don't I add it and then give you all of it. As I understand it, these just help people, pip, crawlers, I don't know, figure things out. There is something in there I haven't talked about yet, and that's Python versions. We're actually going to add something to our options on versions as well, because it may be that you're using some new feature: maybe you're using a walrus operator, or some of the new stuff that's in 3.10, or you're assuming that dictionaries are ordered, or something like that, and you need to constrain the version of Python that you're supporting. Now, the classifier is just metadata; it does not have any effect. What you really need to do is come down here and say python_requires, and I'm going to say greater than or equal to 3.6, because I think I can support anything like that.

Oh, sorry, I just noticed something. This is wrong, actually; I should have spotted that earlier, apologies. Remember, I just copied this from the quickstart. In fact, I need
numpy, right? So sorry about that, I need to change that bit there to the thing that I actually require. So my package currently is essentially broken, because if someone tries to install it into an environment that doesn't have numpy, it's going to automatically install those two things that were in the boilerplate, but it's not going to automatically install numpy. So that was an error, but it'll be fixed in the new version, I guess. So we could bump the version here. In fact, you will have to, because PyPI, I'm pretty sure I'm right in saying, won't let you upload a version on top of another version. So we're going to have to move that version up there to point one when we upload it next time.

Now, one of the things about packaging, and what makes it so tricky, is that Python has no convenient way for a package to know its own version. A package has no introspection for version number like that. Now there is a way, it's fiddly, but there is a way to do it automatically from GitHub tags, and I recommend that you do it that way. In fact, before I forget, let me show you a place of mine where I know that's been implemented in a way that works, I won't say correctly. So just go to Agile's GitHub, and you're looking for a package called snowfake. So you see what I'm talking about, it's kind of horrible, but there's a get_distribution from pkg_resources and blah blah. So you need the bit that goes in __init__.py in your package, and you need a bit that goes in setup.cfg, which is this here, so that version number gets replaced with this special thing. In other words, it's looking in the package for dunder version, which is all this __init__.py is doing: setting up __version__. It's all to set that up. And of course you've probably used this in other people's packages. You want people to be able to go import
dxw, dxw.__version__, and get the version number. This is how you do that, or this is one way in which you can do that. But does it also need anything in here? Yeah, you also need these lines in the toml, and you need this. I warned you, right? It's not nice. But this gets the tags from git. And just to make it all really dandy, at least when I built snowfake back in November or December, you also needed setup.py with just this one thing in the setup function. Super annoying. Now, I think that bit might have gone. I think the new version of setuptools at least knows the name of the package. You'd think it could do that, right, because it's right there in setup.cfg, so it's kind of weird that you can't just use that, but I think that might have changed. So other than that little footnote, that's how you get the version number being done automatically. But honestly, the first little while that you're maintaining your package, I would just remember to bump the version up.

Now, bumping the version reminds me of some other kind of metadata that we want. Well, let's do them in order. Sorry, VS Code. OK, so I think that's all we need in there for now. We're also going to want a new file, the... well, tell you what, before we do that, let's open the README. We haven't done anything with that yet; right now it has basically nothing in it. So, "time, depth, space or whatever warping". And in here it's the usual stuff, right? You're going to put things like installation, which is often what people want to know: "You can install with pip". "Documentation: coming soon", lol. "Contributing: please see..." I'm just going to write it like this for now. Well, you should be able to link to a file. Misspelled. Not totally sure if that path is going to work later. Now obviously I'm just hurriedly filling things in here. Go check out other people's projects. I mean, I mentioned it earlier,
but Leo and Santi's work in Fatiando is amazing, so you can see how they've done all of their stuff. Of course it's all open source. That's where I get my inspiration from, those projects, because they're so nicely done. But there's a very simple kind of README for now.

Now, I just mentioned contributing, so we'd better make that. "Thank you for considering contributing. Here are some things you can do: make issues when you find them." OK, maybe I'll do this. OK, that'll do. So there's that, and we'll call that CONTRIBUTING. Now, GitHub and various other people sort of try to measure the openness and awesomeness of projects, and these are the sorts of files that they look for. There was another one... well, you can go and look at one of my projects, or Leo and Santi's project, to see what other things people include, but they're things like lists of authors, and how to cite a project, especially if you've got a paper about it or you want people to cite it in a certain way.

Oh, what I was going to do was the changelog, so this is an important one. We'll just say "first release, broken, do not use". I guess one thing I would say is... well, no, it's probably a good idea to be somewhat stressed about what you put on PyPI. Check things, run your tests, try not to break things, put your requirements in, and so on. But don't let the stress become so much that you don't publish your package. Someone can always go back to your GitHub and try to figure it out. They can put an issue in and say, hey, by the way, this didn't work, are you missing a requirement, did you accidentally leave boilerplate in from the help page? It's not so important that it should get in the way of getting your stuff out there. And if you don't get it out there, you might never discover some of the errors. I know from my own experience, sitting there sweating about, oh, have I run everything, does this actually
work, am I just about to do something stupid? And, you know, yeah, 20% of the time I am. I push a new version and then go and tweet about it, and five minutes later someone goes, um, actually that's completely garbage, and five minutes later it's fixed and it's like it never happened. So it's all good.

All right, so there's my changelog. I think we've got all the metadata, we've got a README, I've just fixed something important. There's all my new files, and now there's the URL, so that will be on PyPI. So things are getting a little bit more usable. python -m build. Now this should add new files, right, because they'll be called 0.0.1.

Now what's all this 0.0.what's-its-name? This is probably poorly executed semantic versioning. Semantic versioning is quite a rigorous, formalized way to version packages, and I recommend reading at least a little bit about it before you start putting versions on things. You need to understand a bit, I would say, about what it means when you break the API. Are you allowed to break the API with this version? What's a bug fix version? That kind of thing. Most of my packages are in zero point something, so in other words they haven't even reached version one yet. I don't know how closely this matches the formal description, but in my mind that means that when I come across a zero point something release, I can break the API, things can change. Now, you try to deprecate and do it nicely, but fundamentally you're in alpha, essentially, or maybe it's beta, I'm not sure. Do go and read that page, and don't listen to me.

OK, we've got two new versions. I'm going to go twine upload dist/... now I have to be a bit more careful, because there are four files in here. Oh sorry, star. And we can go check it out on here. OK, so here is now my bumped version. There's my project description; remember, that's coming from the README now, because I did that include
README business, and I've got my project link, homepage thing here, which goes back to GitHub. Very nice.

Let's talk a little bit about documentation and testing. I'm not going to have tons of time here, but we'll try to do something just to point you in the right direction, because any amount of attention paid to this stuff will be appreciated both by anyone who tries to use your project and by you. OK, so let's write some quick docs here. I'm going to have to do it really quickly because we're coming up on our time. I'm not going to lecture you on how to write good docs, because I'm not going to write good docs right here.

There are two main formats for docstrings: there's Google style and there's NumPy style. The NumPy style, I'm sure you've seen them online, uses RST basically, this reStructuredText format, which I don't like. The Google style is a bit plainer; it tends to be the one that I use. I recommend choosing something and sticking to it, or using some sort of formatting tool. A lot of people seem to like a tool called Black; I've put some links to that in the rest of that presentation, you can check it out if you want. And I'm comparing this to NumPy style. I guess the other sort of, you could see it as a form of documentation, that a lot of people use these days is type hints. Obviously up to you if you go and implement that kind of thing.

So I'm going to go "Args:"... see, this is where I wish I... oh, I've got to sign back into Copilot. Copilot is awesome for writing docs because it just basically fills them in for you. Copilot has saved me so much time, I would say, since I started using it all the time about six months ago. I wouldn't say I can't live without it, obviously; maybe if I'm writing JavaScript that's true. But that's something like a Google style thing. Now, it's always nice to have examples, and
I'll show you a trick. If you write your examples like this, with the little carets, as if you're at a REPL, you can do something nifty. So let's say s1 equals... I'm just going to use lists, I think that's going to be OK. Use something like the one I did earlier; actually, I can just make it even shorter, can't I? And then I'm going to call this function, get_cost, on s1 and s2, and then, since I can do this in my head, let's assign that to something. I'm going to ask for cost.shape, and that's going to be... well, actually, it doesn't matter if I get it wrong, because we'll all learn something. So let's just say it's going to be (8, 5). I can't remember which way around it's going to go. OK, there is an example in my docs of how you might run something like this.

Now the cool thing is that when you've got that kind of thing, you can use something called doctest. doctest comes with Python. I could go in there and, gosh, can I actually remember how to run it on that file? Maybe. OK, I was right. Let me show you what happens; I can run this with verbose. So this means: Python, run the module called doctest as a program, run it in verbose mode, and run it on this file. And it looks for these lines that start with the greater-than symbols, and it runs them, and then it checks the thing that it got. You see, it says "expecting (8, 5)" and it says "ok". In other words, it passed. If I'd written (5, 8), which is the thing I couldn't remember, and then run this, I would have heard about it, even if I didn't run it in verbose. OK, so there I am running it just like normal, and it's like, oh, I expected (5, 8) but I got (8, 5). So that's cool.

But actually we might want to write other tests, more complicated tests that we don't want to put examples in for. So let's do that, I think I've just got time. I'm going to cd back to where we were, and let's make the tests.
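A minimal sketch of the doctest idea just described. The `get_cost` function here is a stand-in written for illustration: its body (an absolute-difference cost matrix built from plain lists) is an assumption, not the speaker's implementation, and it checks the lengths rather than `cost.shape` because it does not use NumPy. Only the `>>>` convention and the standard-library `doctest` module come from the talk.

```python
import doctest


def get_cost(s1, s2):
    """Build a pairwise cost matrix between two signals.

    Hypothetical stand-in for the function used in the talk.

    Example:
        >>> s1 = [1, 2, 3, 4, 5, 4, 3, 2]
        >>> s2 = [1, 3, 5, 3, 1]
        >>> cost = get_cost(s1, s2)
        >>> (len(cost), len(cost[0]))
        (8, 5)
    """
    # One row per sample of s1, one column per sample of s2.
    return [[abs(a - b) for b in s2] for a in s1]


if __name__ == "__main__":
    # Roughly what `python -m doctest -v thisfile.py` does for this module.
    print(doctest.testmod(verbose=True))
```

Changing the expected line to `(5, 8)` would make doctest report a failure showing both the expected and the actual value, which is the behaviour the speaker demonstrates.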
Someone was asking earlier, Michael's asking, where to put this. And then I'm going to make something in tests called test_dxw.py. This is going to be a Python file, and in here I'm going to write a different kind of test. So everything has to be a function; it doesn't need anything. And let's say I'm going to assert... well, let's do this: path comma... sorry, I can't remember which way around I pass stuff back. So it's path, cost equals... actually, sorry, just checking something, can't remember if I need... yeah, of course, let's do this: import... no, let's do from dxw import my function, dxw. OK, we're going to need that. And then we can do dxw on a signal and an empty signal, and then let's try to assert something. Now, you're probably thinking, well, we haven't handled this, and yeah, that's right. cost.shape... well, actually I discovered something totally weird yesterday, which is that a NumPy array can have a dimension of zero size. Let's try this instead, let's go path... this is, like, empty or something. Sorry, I'm kind of making this up as I go along, in case you can't tell.

So now I'd like to run this test. There's an awesome suite for testing called pytest. It is not built in, so we're going to need to pip install it. And if I run pytest now, just right off the bat, it goes out and tries to find your tests, and it's basically just looking for things called test-something in folders called test-something. You can see that it found my test and it got an exception, so I need to figure that out; basically this sort of broke. So what I might do instead is, for example, say, OK, if... I'd better use len, because I haven't guaranteed that these are NumPy arrays and I'm not going to implement that right now. So if len of s1 is zero, or len of s2 is zero, then raise a ValueError and say something like "signals cannot have zero length". OK.
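A minimal sketch of the empty-signal guard described above, together with a test written as a plain function so that it needs nothing beyond the standard library. `check_signals` is a hypothetical helper name; in the talk the check is added inside the speaker's own function.

```python
def check_signals(s1, s2):
    """Raise ValueError if either signal is empty (hypothetical helper)."""
    # len() works for lists, tuples and NumPy arrays alike, whereas a
    # truthiness test such as `if not s1` is ambiguous for multi-element arrays.
    if len(s1) == 0 or len(s2) == 0:
        raise ValueError("Signals cannot have zero length.")


def test_empty_signal():
    """A test function whose name starts with test_, so pytest would collect it."""
    try:
        check_signals([1, 2, 3], [])
    except ValueError as err:
        assert "zero length" in str(err)
    else:
        raise AssertionError("expected a ValueError for an empty signal")


if __name__ == "__main__":
    test_empty_signal()
    print("ok")
```

Raising a named exception with a clear message gives the test something specific to check for, rather than whatever error happens to surface deeper in the code.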
Because my stuff breaks, basically, when it's empty. Now that's probably a bit more user friendly, but it doesn't fix this, because now it just gets... I was expecting it to get my exception; I'm not sure why it didn't. Now, OK, if this is using my installed dxw... normally I like to run pytest on my code base, not on my installed file, and it was using zero. So I feel like this is a thing with my src directory, and that might be the thing that breaks my favorability of that method. So, just exactly like I predicted; I think I said 45 minutes and it was closer to 55. I'm going to backtrack on that decision. I'm going to uncomment that. Let's see how badly I regret this. I'm going to move src/dxw to here, I'm going to rm -rf src. Let's just check that I did that properly first. All right, this is great. I'm just going to rm -rf the dist... I don't know, I don't need to do that; it was the egg I was worried about. My prediction is that this pytest will work now.

Did I change my config? I did. Hmm. I don't totally... well, no, I've got an __init__. Try one more time. Huh. OK, if anyone can see what I'm doing wrong, maybe you've already told me. "Running pytest with doctest modules", yeah, that's what I was aiming for, Santi, but now I don't know why my pytest doesn't even discover this test. Let's just get rid of that pycache as well. Yeah, this is how I normally do it, so I'm probably just missing something incredibly straightforward, but I can't see what it is. Well, that's gone, of course. So I'm not going to worry too much about that, because I've only got a few minutes. But yeah, what I was heading for was, as Santi's just pointed out in the chat, if you add this --doctest-modules to pytest, pytest will run both the tests that it discovers, if you can get it to discover them. Yeah, I mean, I guess, fair enough, I could install
it, but normally I'm running against my code base. But this will solve the problem, I hope. Oh, I need to rebuild and then install that. So now I have to build and install, which is not very convenient while you're writing code and trying to test. There are other ways of solving that problem too, but that's just my normal workflow, so I'm not totally sure what I've broken. But let me at least show you this utils business. So all "code" does, Mateo, is it opens in VS Code, so just go and open it through your... through a thing.

Now, it's so weird that it's not even seeing that other error, because it should be raising this error. Whoops. I haven't done anything weird there, have I, with my logic? I don't think so. So sorry, I got rather sidetracked by that; that was doing things a bit out of order and so on. But with luck this one will do it for me, and I can just show you that one thing that Santi was trying to explain. So, OK, all right, excellent: it threw my error.

Now I want to show you this. We can handle that. If we import pytest in here, it has a mechanism, if you like, which I guess I will put here, and say with pytest.raises, specifically the ValueError. Oh sorry, that needs to go inside this context. So we don't need to assert anything, we're using that context. If I run this now, that passes. And like Santi was saying, I can also do pytest --doctest-modules, and now it also runs my other test that was in the docs, right, this one, which I changed so that it was wrong. Let's change it back. Now I've got two tests that are being discovered, one by doctest, one by pytest itself, and they're both running and they're both passing.

There's two more things I'm going to squeeze in. One of them is this business here with this option, because actually there's a better place to put this --doctest-modules option,
so it should really go in our configuration, because we want ideally to describe the entire process with just our configuration. So I'm going to put it in setup.cfg, and it goes in its own section. It goes like this: tool:pytest, and I can say addopts and put that --doctest-modules. And now all I have to do is save it. I don't have to rebuild anything... well, actually, with the way my pytest is working I guess I probably do. No, I didn't. So I just ran pytest on its own and it's now picking it up from setup.cfg. So that's cool.

But notice that we've started using more tools. I had to install build, I installed pytest, I may use other tools. So I recommend doing this, which is adding some extras for pip. So this is extras_require, and what I usually do is have a test setup, which says, hey, if you're going to be doing testing, tell pip that and it will also install pytest for you as a dependency. And the other thing I usually do is have a dev build, and I put build in there and pytest, and that's sort of for other developers who might want to know what they need to actually build your project. I guess you could put twine in there as well if you wanted to. And now a person, once I push this, would be able to do this: pip install dxw, square brackets, dev, and they'll get all the dependencies that they need to run developer-type stuff.

This also turns out to be a really convenient way to build continuous integration, because later on you're going to want to install your package locally so that your continuous integration tools can run, and this is a nice way of saying, oh, if I'm building all my docs then I'm going to need Sphinx and blah blah. There's always lots of ways to crack an egg, but I think the concept is you want to put as much stuff in your build files, soon to be, hopefully, just pyproject.toml, and everything's in there, so there aren't these little...
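Pulling together the configuration described so far, here is a sketch of what the relevant parts of `setup.cfg` might look like, wrapped in a short Python script that parses the text with the standard-library `configparser` as a quick syntax check. The URL, author, email and classifier values are placeholders, not the speaker's actual metadata, and only the fields discussed in the talk are shown.

```python
import configparser

# Placeholder values throughout; the keys and section names are the ones
# setuptools and pytest read from setup.cfg.
SETUP_CFG = """
[metadata]
name = dxw
version = 0.0.1
url = https://example.com/dxw
author = A. N. Author
author_email = author@example.com
description = Simple dynamic warping
long_description = file: README.md
long_description_content_type = text/markdown
classifiers =
    Programming Language :: Python :: 3
    Operating System :: OS Independent

[options]
python_requires = >=3.6
install_requires =
    numpy

[options.extras_require]
test =
    pytest
dev =
    build
    pytest
    twine

[tool:pytest]
addopts = --doctest-modules
"""

config = configparser.ConfigParser()
config.read_string(SETUP_CFG)

# Each extra is a whitespace-separated list of requirement names.
extras = {
    name: value.split()
    for name, value in config["options.extras_require"].items()
}

print(config["options"]["python_requires"])
print(extras)
print(config["tool:pytest"]["addopts"])
```

With extras declared this way, `pip install "dxw[dev]"` would pull in the developer tools alongside the package itself; the quotes only stop some shells from interpreting the square brackets.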
The nice thing about having stuff like this addopts here is... because some people put that in, like, a pytest.ini file.

There's another tool that I use a lot called coverage, which I can show you quickly. Actually, I'll do pytest-cov. If we come over here and go pip install pytest-cov, that's installed now, and now if I do... I've got the option right there. So now what it's doing is saying, OK, in the files you're testing you've got 31 statements, or lines, and you've tested half of them, basically, so your test coverage is only 48%. And if I want to know which 48%, I can do a little HTML report with that coverage html, and then let's open that file that it's just made, which is in htmlcov, index. And we get a really cute little report here, just as a static file locally, where I can click on here and it actually shows me my code, and in red is the stuff that I haven't tested. Now, actually, I did test this with the doctest, but I guess I hadn't noticed before that coverage can't tell that. So this is in red because all my pytest is doing is testing that exception. So if I make another one... well, I won't do it because we're out of time, but if I make another test which does run my actual code, and stuff happens and I make a cost matrix and a path, then I'm going to have, I'm guessing, 100% test coverage, because there are no other conditionals in here.

Wow, OK. I'm pretty used to the wheels falling off at some point. I know that's not totally comfortable for everybody; I apologize. I'm actually still a little bit unsure what it was that I broke, because normally pytest works for me like that. But like I say, probably something super simple that I missed. I will tidy this repo up a little bit and I'll push... well, yeah, I haven't pushed for a while, so I'll push it, so you can take a look at it in GitHub if you want to. Back in my presentation-y thing here I've got a few more slides
on, like, the stuff that I showed you: I've linked to the blog post that I showed you, some of the things I dropped into Slack, and then a whole list of things that you might want to look at next. Now, you can go a super long way with beautiful docstrings in your functions and pytest. Those two things will take you miles and miles and miles. You do not need to worry about, I think I'm right in saying, basically anything on here until you get more comfortable with the fundamental tools. But, you know, do poke around once you do get comfortable with it.

Please, please, please publish a package. We need more Python stuff out there, especially for subsurface. Even if it's just doing one thing, or making a cool plot, one bit of analysis, one equation, it's fine. Get it out there, share it on Swung, on Slack, tell everyone about it. You'll get tons of great feedback, I promise. I know that I'm speaking from a fairly privileged position, and so this isn't everybody's experience, but I can say universally, no matter how rubbish I suspect my package is, somebody somewhere appreciates it, or at least says "nice one". And that's lovely. We have a great community, so please jump in, get involved, get it out there. Good luck, come back for help if you need it, and let us know how you get on.

Don't forget to subscribe to our channel in YouTube. Hit the like button if you liked most of what I did and didn't hate the bit where the wheels fell off too badly. Otherwise, I'll see you soon, I hope. Take care, everyone, and I'll awkwardly press my stop broadcasting button.
Info
Channel: Software Underground
Views: 624
Keywords: python, packaging, tutorial, beginner
Id: fnmIHKyNRIA
Length: 121min 36sec (7296 seconds)
Published: Tue Apr 26 2022