what are python wheels? (intermediate - advanced) anthony explains #371

Captions
Hello and welcome to another video. In this one we're going to talk about wheels and what they mean for Python. Let's jump into it.

Okay, so the name "wheel" seems a little bit weird, and admittedly it's kind of a long-running joke that even I don't really understand. The joke comes from PyPI, the Python Package Index, which has the nickname "the cheese shop", and a wheel is like a wheel of cheese from the cheese shop. I don't know, I'm not even going to try to understand what it means. But I'm going to talk to you about what wheels are, what's special about the file name, how they're structured inside, and how you go about installing them.

Before wheels there were two ways to install packages. There's a source distribution, which is either a zip file or a tar.gz or any other compressed archive, and you would install from source, meaning anything that had to be built, you had to build yourself when installing. There was also a pre-built package called an egg. Eggs are actually very similar to wheels in a bunch of ways, but eggs fell out of fashion because they involved weird Python path hackery and didn't set things up on disk in a well-defined way, and wheels provided a much more stable and reasonable way to do that. With the current, modern approach to packaging you can typically install pretty much everything from a wheel and rarely need a source distribution. I say rarely; there are actually quite a few cases where you would use a source distribution, but we're not going to go over that today.

What a wheel represents is a pre-built package that's installable just by unzipping the archive and not running any code. That can be really important for performance, because running code on installation and building stuff from source takes a long time and can slow you down. But it's also useful for security: if you're installing packages from wheels you tend not to be running arbitrary code from the internet, you're really just unzipping files into their destination.

Now, I've downloaded a bunch of wheels ahead of time to show you how they're named. I picked a bunch of different ones to hopefully highlight various things about them, and we'll also go over the structure of what's inside these wheels. I mentioned earlier that source distributions can be zips or tarballs or any sort of compressed archive. Wheels always have the file extension .whl and they are always zips, so these are just fancy zip files that have a different extension and a very particular naming convention.

The naming convention is split up into components, and each component is separated by a dash. The first component is the name, so you can see this first wheel is the astpretty wheel. Note that if there are dashes in the name they get normalized to underscores, so you can see babi_grammars is actually babi-grammars according to pip, but you're always going to see an underscore in place of those dashes in the file name.

The second component is the version number of the library, so in this case I have astpretty==1.7.0, and you can see the version numbers for some of these other ones.

Then there are the last three components of the wheel name. These are usually kept together and referred to as a platform triplet. Other people call them other things; that's what I've heard them referred to as. The first component of this is the Python versions it targets, and there are a couple of different things you'll see here. In this case it's py2.py3. Note that a dot in any of these three platform triplet components means kind of an "or" relationship, so this supports both py2-none-any and py3-none-any. The dot is kind of like a curly-brace expansion in bash, where it can be any of those things. Sometimes these dots get a little bit ridiculous. You can see here this one is manylinux_2_5_x86_64 or manylinux1_x86_64 or manylinux_2_17_x86_64 or manylinux2014_x86_64, so this third component can be any of those things.

So, the first component is the Python version that it targets. You'll see py for pure-Python things. This means it doesn't have any binary artifacts, doesn't have any C extensions, that sort of thing. Typically you'll see either py2 or py3, and very occasionally you will see something like this, where it has a specific minimum version number. Note that this language version tag implies any version after that, so py36 is going to be installable on 3.7, 3.8, 3.9, 3.10 and so on into the future, in the same way that py3 is installable on any Python 3 version. Now, there is a bit of an asterisk there: pip does adhere to major versions here, so if you saw just py2, that wheel would not be compatible with Python 3. In the same way, with py3, well, we don't actually know because Python 4 doesn't exist, but presumably if it followed the same pattern that Python 2 and Python 3 did, the py3 wheel would not be installable in Python 4.
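The naming convention described above can be sketched in a few lines of Python. This is only an illustration, not how pip itself parses names: `parse_wheel_filename` is a hypothetical helper, and it also accepts the optional build tag that the wheel format allows between the version and the triplet, which the video doesn't cover. The second file name in the usage note is made up.

```python
from itertools import product


def parse_wheel_filename(filename):
    """Split a wheel file name into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        # Assumption: an optional build tag sits after the version.
        name, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected number of components: {filename!r}")

    # A dot inside a triplet component means "or", so expand every
    # combination, much like curly-brace expansion in a shell.
    tags = [
        "-".join(combo)
        for combo in product(
            python_tag.split("."),
            abi_tag.split("."),
            platform_tag.split("."),
        )
    ]
    return {"name": name, "version": version, "build": build, "tags": tags}


if __name__ == "__main__":
    info = parse_wheel_filename("astpretty-1.7.0-py2.py3-none-any.whl")
    print(info["name"], info["version"])
    for tag in info["tags"]:
        print(" ", tag)
```

Splitting on dashes is only safe because dashes in the project name are normalized to underscores before the file name is built, which is why the normalization matters.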
You will also pretty frequently see an implementation type here. So that was pure Python with just py. cp here means that it's a CPython-specific wheel. This usually means that it contains some sort of binary artifact: a .so file, a .dll file on Windows, a .pyd file on macOS... oh wait, is that right? I don't know. Either way, a binary extension. You'll see this as the language version, so this is saying at least CPython 3.8. I know I say "at least" here, and we'll get to some fiddly edge cases in a bit. Actually, we can get to them right now, as we talk about the ABI. There are ways that this does not mean the exact version but only the minimum version, and the ABI is actually what determines what the minimum version is. In this case the ABI is exactly the same as the language version, so this is only installable on CPython 3.8.

You'll also see pp, which is for PyPy. I think only CPython and PyPy currently have language tags. I don't think any of the other implementations are allowed to upload wheels at the moment, but presumably you could imagine one for Jython or whatever other platform you're targeting. Generally, when you're looking at these, you can assume that py2 or py3 means it's pure Python, and if you see cp or pp that means it has a non-pure, compiled component inside of it.

Okay, so that's the first component. The second component is the ABI, the application binary interface. For wheels where it doesn't matter, where there are no binary components and so no binary interface, you'll see none as the second tag. This is very common: if you have py2.py3-none or py3-none, this is pretty much what you're always going to see for a pure-Python package as the second chunk of the triplet.

For non-pure things, things with compiled extensions, such as this one that has cp36 as the Python tag, you will see the ABI usually matching that, or having some flags after it. In this case it's cp36m, the m being pymalloc, which was made irrelevant in Python 3.8 because it doesn't actually participate in the application binary interface. Did I do a video on this? I think I might have, maybe not. But this says it has to be installed on something which supports the CPython 3.6m binary interface, the m being pymalloc. You may also see d. I don't think I've ever seen someone distribute a debug wheel, but you might see d in that case.

Or you might see something like abi3. I did another video on abi3, so I will link that in the description, but what abi3 means is that this wheel supports the Python minimal ABI at at least this version. So this is saying at least Python 3.6, but targeting the minimal API, and this basically allows you to install this wheel on any CPython 3.6 or above, assuming that they don't break the ABI, which may happen at some point. We may see an abi4, and then you'd have to have another iteration of this.

PyPy has a bit of a special ABI tag. PyPy currently doesn't support abi3. Actually, that might have changed by now; I haven't researched that topic in a while, so it may already be solved. But PyPy has both a language version that it targets, in this case Python 3.7, and its own internal version number as well. So this 73 here is actually PyPy 7.3, which is a particular release of the 3.7 language. Because PyPy has two separate version numbers, its ABI tag is a little bit different.

Okay, so that's the language version tag as well as the ABI tag. The last component is the platform tag. Again, if you have pure Python without any binary components, you'll typically see any as the tag here. This is saying we don't have any non-pure components and you can install this on any platform: Windows, Linux, macOS, any. But you may see a particular platform, so for instance we have the manylinux1_x86_64 platform here. I did a video on manylinux, so I'll link that in the description, but the TL;DR is that this is a platform tag which says the wheel can be installed on any Linux that has this minimal symbol set. We also see a manylinux2010 here, which is a slightly newer version of that standard, or the newest version of the standard, which has the glibc version number in it. We also have a macOS wheel here, and there are platforms for Windows as well. So the third component is that platform component, and again, if you see a dot in these, the wheel supports many different platforms and the installer can choose which of these to pick.

Okay, so that's the file name. Admittedly the file name is probably the most complicated part of this and probably the most important part to understand. The rest of it is mostly just the structure of these files, so I'm going to use the unzip tool to list the contents and go through the various things that you'll see in here.

Let's start with astpretty, which is kind of the simplest case. If we do unzip -l on it, it's going to show us the contents of this wheel. astpretty is pretty simple: it just has a single Python file that it distributes. Generally the top level of your wheel is what's going to end up inside site-packages when you install it, so when I install astpretty I'm going to see astpretty.py. I'm also going to see this .dist-info directory. The .dist-info directory contains installation metadata: things about the version, the readme, the license, the entry points, which is how it decides how to make console scripts, as well as RECORD, which pip actually modifies later to indicate which files are installed. I believe this also contains hashes for each of these files as well.
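The listing and the hashes in RECORD can also be inspected with Python's `zipfile` module. The following is a minimal sketch, not pip's actual logic. It assumes RECORD rows of the form `path,sha256=<digest>,size`, with the digest URL-safe base64 encoded without padding, and it builds a tiny made-up wheel in memory so the example runs without downloading anything. The helper names `record_digest`, `verify_record` and `build_demo_wheel` are hypothetical.

```python
import base64
import csv
import hashlib
import io
import zipfile


def record_digest(data):
    """Hash bytes the way RECORD rows are assumed to store them."""
    digest = hashlib.sha256(data).digest()
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=")
    return "sha256=" + encoded.decode("ascii")


def verify_record(wheel_file):
    """Return {path: bool} for every RECORD row that carries a hash."""
    results = {}
    with zipfile.ZipFile(wheel_file) as zf:
        record_name = next(
            name for name in zf.namelist()
            if name.endswith(".dist-info/RECORD")
        )
        text = zf.read(record_name).decode("utf-8")
        for row in csv.reader(io.StringIO(text)):
            if not row:
                continue
            path, expected, size = row
            if not expected:
                # RECORD's own row has no hash or size.
                continue
            data = zf.read(path)
            results[path] = (
                record_digest(data) == expected and int(size) == len(data)
            )
    return results


def build_demo_wheel():
    """Create a made-up, minimal wheel-like zip in memory."""
    files = {
        "demo.py": b"print('hello')\n",
        "demo-1.0.dist-info/WHEEL": b"Wheel-Version: 1.0\n",
    }
    rows = [f"{p},{record_digest(d)},{len(d)}" for p, d in files.items()]
    rows.append("demo-1.0.dist-info/RECORD,,")
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for path, data in files.items():
            zf.writestr(path, data)
        zf.writestr("demo-1.0.dist-info/RECORD", "\n".join(rows) + "\n")
    buf.seek(0)
    return buf


if __name__ == "__main__":
    for path, ok in verify_record(build_demo_wheel()).items():
        print("ok " if ok else "BAD", path)
```

To try it on a real download, pass the path of a .whl file to `verify_record` instead of the in-memory demo.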
Actually, let's just unzip that and take a look at it. Oh, the tab completion does work. unzip used to not understand that wheels were zips, so it wouldn't tab complete there. Okay, so this is what we get after unzipping, and this is actually very similar to how pip would install it: pip is going to treat this as a zip, unzip the contents into site-packages, and then move some stuff around, basically.

If we take a look at the .dist-info here, we'll just look at all these files. The first one we see is console_scripts. This is what tells pip to set up a binary called astpretty which calls the main function, so when you type astpretty in the terminal it'll run that command-line tool. Here's the license file. This is the METADATA file that I was talking about before. It gets generated from either setuptools metadata or whatever other packaging utility you're using. I guess fields that don't get filled out show up as UNKNOWN, which is kind of interesting. You can see down here we have the python_requires. I guess this was an old wheel of astpretty which still supported Python 2. You also get the long description down here, which in my packages is just the readme, and you get all the classifiers.

Here's the RECORD file that I was talking about before, which basically lists all the files as well as a checksum, and it also contains the size of each of them. So this is kind of an internal checksum of all the files. RECORD itself of course does not have this, because you can't really know what its checksum is going to be before generating it, so these fields are empty. It doesn't know the size and it doesn't know the checksum, because you can't really have a self-referential checksum.

This is top_level.txt. I think this gets used for uninstalling and reinstalling, but in this case there's only one top-level module. And WHEEL contains metadata about what actually generated this wheel. In this case I used wheel 0.33.4, and you can see these are the two tags that it generated, which is the expansion out from this dot here. We can probably see one with a whole bunch of tags, so let's do that right now: unzip -p on this with */WHEEL. You can see here that those four platform tags got expanded out like this.

Cool, so that's the contents of a normal wheel, and you'll typically see something like that. Let's take a look at some wheels that have some more complicated things. Those are only two of the three components. There is a third component called the data component, and there are three types of data. I've specifically picked a few wheels that have those three types of data. It actually took me a bit to hunt for this one, which we'll look at in a second.

If we do unzip -l on babi_grammars and pipe that to less, you can see this package has a bunch of files, and interestingly this package doesn't provide any Python files. It only installs data. So this is that third component that I was talking about, the .data directory, and it's treated specially by the installer, based on the path segment after the .data directory. .data/data means that these files are going to be installed directly into the prefix of your environment. So if we make a virtualenv and pip install babi-grammars, you'll see inside our venv we get the share directory that contains all of the files listed here, directly at the root of the prefix. It didn't install anything into venv/lib except for the .dist-info directory, the metadata of the package. So that's .data/data. You can see there's a whole bunch of data files. There's also a whole bunch of license files, because this happens to aggregate a bunch of grammars from various different sources, and it also contains the WHEEL metadata that we talked about before.

Let's take a look at greenlet next, which has a slightly different setup. With unzip -l, this has a bunch of stuff that's going to get installed into site-packages. You can see there's a shared object here that provides the compiled C extension, and that shared object is compiled for a particular platform and Python version. There's a bunch of pure-Python stuff, and we also have the .dist-info directory that we talked about before. But we have this .data/headers, and this is a little bit different from .data/data. This is going to get installed inside an include directory. It's going to vary based on the platform, but on Linux, if we pip install greenlet, you'll see that greenlet.h ends up at include/site/python3.8/greenlet/greenlet.h. It's actually kind of weird; I expected it to just end up at the top level, but I guess it doesn't necessarily work the way I expect. But yeah, greenlet.h ends up in this special include directory, and again the installer tool manages this directory specially. Beyond that, all of this ended up in site-packages, so you can see the greenlet directory as well as the greenlet .dist-info.

The last thing is scripts, and awscli happens to have one of these. Again, most of these are just going to get installed in site-packages, they're just normal files, and we have a lot of stuff here. Shift-G to go to the end, and you'll see we have the .dist-info directory we talked about before, but we also have this special scripts directory. This is going to get put into the bin directory on POSIX-like platforms and into the Scripts directory on Windows platforms. So that's the third special case.

So there are basically five ways things get installed. Files at the top level end up in site-packages. Files in .dist-info also go into site-packages. Then there's .data/data, which is installed from the prefix downwards. Then there's .data/scripts, which gets put into the bin directory and also has some shebang normalization. And then there's .data/headers, which goes into the include directory. That's basically all there is to the wheel structure.

Installing them tends to follow two steps, at least with pip. The first is to unzip the files and put them into the right place. The second is to take the entry points (this one does not have entry points, but this one does: unzip -p with */entry_points.txt) and write out the little script wrappers into the bin directory, or .exe files on Windows. So if we look inside venv/bin... did we not install it? We did not. pip install astpretty. If we look at venv/bin/astpretty, this was generated from the console_scripts here, and you can see that I have a script that just imports the entry point, astpretty's main, and then runs it at the end. So that's how a wheel gets installed.

I think the last thing I want to talk about is that there are some wheels that are going to look a little bit weird. Before, I mentioned that py2.py3 typically means that it's pure Python, but you'll see I actually have an exception here. This wheel is py2.py3-none, which typically to me says pure Python, it doesn't contain any Python extension components, and none means that there are no Python-ABI-specific components. But then at the end it says manylinux1, which means that there are binary components in it, but the Python version doesn't matter. What's actually happening here is kind of clever: this is distributing a different binary, but using the Python packaging system. Another one that does this is dumb-init, which is a Docker init system. This one also has py2.py3-none, so it doesn't contain any Python components, but it targets these particular Linux-like platforms. So that's kind of a weird one.

I also picked out this one because it's weird, but we already talked about it earlier: the tag was supposed to indicate that it was Python 3.6 and above, but typically python_requires is used instead of that specific tag nowadays. Do I have any other weird ones? No, I think that's mostly it.

Cool, so that's wheels. I know there's a lot to take in here, but most of the complexity is around just the file name and understanding what each of the components does. Beyond that they're just zip files, they're not really anything magical. Hopefully this was interesting. If there are additional things you would like me to explain, leave a comment below or reach out to me on the various platforms. Thank you all for watching, and I will see you in the next one.
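The five install destinations summarized in the video can be written down as a small classifier. This is a rough sketch of the mapping as described, not an installer: `install_destination` is a hypothetical helper, the destination labels are informal, and the file names in the usage example are made up apart from astpretty.py.

```python
def install_destination(member):
    """Map a path inside a wheel to (destination label, relative path)."""
    parts = member.split("/")
    top = parts[0]
    if top.endswith(".data") and len(parts) > 2:
        kind = parts[1]
        rest = "/".join(parts[2:])
        if kind == "data":
            # Installed from the environment prefix downwards.
            return ("prefix", rest)
        if kind == "scripts":
            # bin on POSIX-like platforms, Scripts on Windows.
            return ("bin-or-Scripts", rest)
        if kind == "headers":
            # The exact include subdirectory varies by platform.
            return ("include", rest)
        # Other keys (for example purelib and platlib) are not covered
        # in the video, so this sketch does not classify them.
        return ("other", rest)
    # Top-level files and the .dist-info directory go to site-packages.
    return ("site-packages", member)


if __name__ == "__main__":
    examples = [
        "astpretty.py",
        "example-1.0.dist-info/RECORD",
        "example-1.0.data/data/share/example/grammar.json",
        "example-1.0.data/scripts/example-tool",
        "example-1.0.data/headers/example.h",
    ]
    for path in examples:
        print(install_destination(path), "<-", path)
```

The second install step, writing console-script wrappers from entry_points.txt, is deliberately left out of this sketch.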
Info
Channel: anthonywritescode
Views: 558
Id: 4L0Jb3Ku81s
Length: 22min 46sec (1366 seconds)
Published: Wed Dec 15 2021