Five Amazing Python Libraries you should be using!

Captions
All right, since my last Python video was so popular (thank you, Reddit!), let's do another one. In this video I'm going to cover five Python libraries that I feel don't get their due. In order, these are: argh, an automatic CLI generator; tqdm, which makes progress bars really, really easy; msgpack, a data serialization protocol very similar to JSON except that it's binary; schedule, which is basically a Python version of cron; and redis-simple-cache, which uses the Redis in-memory data store to easily memoize and cache your Python functions. I've put the timestamps for each of these libraries on screen right now, and they'll also be in the description. So let's go.

The first library that I want to cover today is argh. argh is a wrapper around argparse that is dead simple to use: given some function definition, argh will create a command-line interface for you automatically. Here I have a function called do_the_thing, and it has a few different arguments. The first argument does not have a default value, so it will be interpreted as a required argument. The second one does have a default value, which happens to be an integer, and argh will recognize that. Same with the third one, but it's a bool, and argh will handle it differently than the integer. As you can see, this function is doing some really important things, so it would be good if we could make it available for somebody to call from the command line. To do that with argh, we just import argh, and in the place where your script is supposed to run you put argh.dispatch_command and pass it this function.

Now, if I go to my terminal and run this script, I get this nice usage prompt, similar to the prompt you would get from something that uses argparse. Reading through this, we can see that you can request help. Right now that will only print the same usage with a little bit more detail, but as you'll see in a minute, you can very easily add documentation for your functions as well as for your arguments. As the usage prompt shows, we can supply an optional argument that takes a value (our integer), we can supply an optional argument that does not take a value (our boolean flag), and finally there's the required argument that we have to pass. So we can run the command and pass in only the required argument, and you can see that the required argument gets interpreted as a string (because no type was supplied), the integer argument is an integer, and the boolean is a boolean. If we now try to supply, say, a string to the integer argument, it's not going to work; it throws an error saying it expected an int but got a string. And of course we can set the boolean to true just by using the flag. Note that if I go back to this function, add a docstring, and request help again, the docstring now shows up.

Now, this is all great, but what if you want a little more control over how your args are presented? You can't necessarily represent all of that with just your function definition. To get around this, and to give you about the same level of flexibility you get with argparse, argh also provides a decorator called arg. Using the decorator you can specify options for a specific argument (note that dashes and underscores are interchangeable); you can specify a shorthand for an argument as well as some help text. Also, it's sometimes very nice for a command-line interface to give you the list of options it accepts rather than taking in any input, and that can be done very easily by passing in the choices argument with a list of values.

Now, as you can see, we have two functions in this file. A sister command to dispatch_command is dispatch_commands, which accepts a list of functions, and with it our CLI exposes two pieces of functionality rather than one. If we ask for help on this, we'll see that the way we call the script has changed: we first have to specify the command we want to run, and then we can get the specific usage and help for that function. Looking through the detailed help, we can see the help message for the bool argument showing up, and if we try to pass a positional argument that isn't one of the choices, it throws an error.

So that is argh. Other than it being dead simple to use, one of my favorite things about argh is that it forces you to represent your CLIs as functions. I feel that this, in the long run, leads to better code structure and better APIs. I've seen a lot of people use argparse in a way where they parse a bunch of arguments, take that object, and pass it way too deep into their code as a configuration object, and I feel argparse sort of implicitly encourages that behavior. argh, on the other hand, does not give you the convenience of a global configuration object, but instead gives you the convenience of very easily exposing your functions, and this is one of the reasons why I really, really like argh and prefer it over argparse.
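Pieced together from the walkthrough above, the script looks roughly like this (a minimal sketch: the function name, argument names, and choices are placeholders, and the exact merging of @arg specs with arguments inferred from the signature can vary a little between argh versions):

    import argh
    from argh import arg

    # The positional argument is restricted to a list of choices, and the
    # integer option gets a shorthand flag and a help message.
    @arg("required_arg", choices=["foo", "bar"], help="one of the allowed values")
    @arg("--some-number", "-n", help="an integer option")
    def do_the_thing(required_arg, some_number=3, use_magic=False):
        """Does some really important things."""
        print(type(required_arg), type(some_number), type(use_magic))

    def do_the_other_thing():
        """A second command, exposed via dispatch_commands."""
        print("doing the other thing")

    if __name__ == "__main__":
        # Expose a single function as the whole CLI...
        argh.dispatch_command(do_the_thing)
        # ...or expose several functions as subcommands instead:
        # argh.dispatch_commands([do_the_thing, do_the_other_thing])

Running the script with no arguments or with --help then produces the usage text, the docstring, and the per-argument help described above.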
OK, moving on to the second library: it's called tqdm. tqdm is my go-to progress bar generator. It is very, very easy to use, you can nest loops easily, and you can easily customize your progress bars as well. From the tqdm module, the two functions you're most likely to use are tqdm and trange. Here I have three examples set up. The very first one simply has a for loop that runs for ten iterations and sleeps during each iteration. Note that the tqdm function accepts an iterator; this can be a list or a generator, and functionally it simply passes the output of that iterator through to the loop. If we run this first example, we see that we get this nice progress bar. Fantastic. Now, any time you're using range you don't actually have to write tqdm around range; you can simply use trange, as I'm showing in example number two. In addition, whenever you're using tqdm or trange you can supply a description, and when you have progress bars nested like this, tqdm will use those descriptions to arrange them in the best possible way. So if we run the second example, where we have a double loop with nested progress bars, we get one progress bar for each time the inner loop runs, and at the bottom we always get a progress bar for the outer loop.

So far, in both of our examples, tqdm has had access to the length of the loop. Here I'm showing a use case where you're using tqdm with a while loop, where our counter is not necessarily going to be incremented by one each time. To make use of tqdm here, we can hold on to the output of the tqdm function, in this case storing it in the variable bar. In the "if" branch of this if-else statement I'm also setting a description for the progress bar as well as setting the total argument, and you'll see in a second what that means. Now, inside the while loop we get some random number by which we increment our counter, so all we have to do is update the progress bar inside the loop and tell it how many more iterations have passed (or how many iterations it should consider as having passed), which in this case is the number update_iter. If we run this, you'll see that we get a perfect hundred-iteration loop, but you'll notice that it doesn't actually tell us how much time there is to go or how many iterations are left, because it doesn't know: we ran this without setting the boolean I was showing before. To jog your memory, all that boolean does is optionally set the description and also supply max_iter, the number of iterations this while loop is going to do, to the total argument of the tqdm function. If we set it to true and run the file again, we get a normal progress bar where the total is present, and we also get the description of the progress bar right next to it.

The final thing that I want to cover about tqdm is how you can set this custom description dynamically. In tqdm example four I am running a simple for loop over a hundred items, and just like example three I am holding on to the output of (in this case) trange, though this would be the same for tqdm. Then inside the loop I call the set_description function and set it to any string. In this case the string just says what iteration we're on, which is redundant information, but for the sake of example it works. If you run this file, you'll see that the description changes dynamically as the loop progresses. This can be very useful when you're trying to give some feedback about a running loop; for example, if you're processing files, this could be set to the file name. And that's tqdm. It's short, it's sweet, and once you start using it you can't stop. It's a wonderful little library.
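The four examples described above, pieced together, look roughly like this (a sketch; max_iter, the random step size, and the sleeps are stand-ins for real work):

    import random
    from time import sleep
    from tqdm import tqdm, trange

    # Example 1: wrap any iterable (list, generator, range, ...).
    for _ in tqdm(range(10)):
        sleep(0.1)

    # Example 2: trange(n) is shorthand for tqdm(range(n)); nested bars
    # with descriptions are arranged automatically.
    for _ in trange(3, desc="outer"):
        for _ in trange(5, desc="inner"):
            sleep(0.05)

    # Example 3: a while loop with an unknown step size; keep the bar object
    # and call update() yourself. Supplying total lets tqdm show the ETA.
    max_iter = 100
    bar = tqdm(desc="manual updates", total=max_iter)
    count = 0
    while count < max_iter:
        update_iter = min(random.randint(1, 5), max_iter - count)
        count += update_iter
        bar.update(update_iter)   # advance the bar by that many iterations
        sleep(0.05)
    bar.close()

    # Example 4: change the description dynamically, e.g. to a file name.
    bar = trange(100)
    for i in bar:
        bar.set_description(f"processing item {i}")
        sleep(0.02)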
The third library that I want to cover today is really just the Python client for a serialization protocol called MessagePack. msgpack is essentially what a binary version of JSON would be. It makes it very easy to serialize and deserialize data and save it into either a byte array or a file, and almost every other language has a MessagePack implementation, so it has pretty much the same kind of portability you would expect from something like JSON. However, because msgpack saves data in a binary format, you can get a lot of space savings, especially when the data needs to be transferred over a network.

I have set up two examples of msgpack. In the first example I am simply creating a dictionary of floating-point values, dumping it into a JSON file and reloading it, and also dumping it into a msgpack file and reloading it. Note that because msgpack is a binary format, the file has to be opened in binary mode; then you can simply call the packb function from msgpack and pass it the data structure you're trying to serialize. To load the data back from disk we use the unpackb function (the b here stands for binary): we simply read in the file's contents and pass them to unpackb. Finally, I'm doing something I'm calling a data integrity test; I'm just looking at the type of the keys loaded from both the JSON file and the msgpack file, and I've set it up this way on purpose because the keys in our dictionary are of type integer, not type string.

If we run the first example, it outputs two lines, the first saying class 'str' and the second saying class 'int'. I've done this on purpose, as I said, because JSON requires you to have string keys; so even though the data structure we serialized has integer keys in the dictionary, JSON will forcibly turn them into strings, and when you load them back they're going to be strings and not integers. I've run into this subtle issue multiple times with JSON, and it's simply not a problem with msgpack. Secondly, if we look at the files we saved, the JSON file is 275 kilobytes and the msgpack file is 117 kilobytes. This might not seem like a lot right now, even though it is more than a factor-of-two difference, but as your data gets bigger and as you involve the network, these savings become really, really important. Another thing to consider is that we've been saving double-precision numbers, and when you save them as JSON a lot of your floating-point data actually gets truncated. Since msgpack saves the data as it exists in memory, each double takes the eight bytes that it needs (which is still less than what you get when you save the JSON) and retains as much precision in that floating-point number as possible. And if you don't need all the precision of a double, you can force packb to pack your data as single-precision floats; if we do that, the msgpack file is now a mere 78 kilobytes.

Now, if all of this space saving and ease of use has not convinced you to use msgpack yet, the second example that I'm showing might. In the second example I have set up a list of dictionaries. We could serialize the entire list into a file as one list, but instead let's iterate over the list and write each individual element (which is a dictionary) separately. When we do this, we're essentially appending a new dictionary to the existing file every single time, and you can actually close the file, come back to it at a later time, open it in append mode, and continue writing more data to it. This makes both writing and reading msgpack data in situations like this really flexible and memory efficient. To read the data back, we can use msgpack.Unpacker and pass it a file object; that gives us an iterator which we can use to retrieve the list elements one by one and load them back into memory. At the bottom of this example I'm doing another data integrity test, making sure that the data that was loaded is the same as the data that was saved, and it looks like it is. So yeah, that's msgpack: it's efficient, it's fast, it's really flexible. I hope you check it out.
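Both examples might look something like this (a sketch: the file names and data are made up, and note that recent msgpack-python versions need strict_map_key=False to get non-string keys back when unpacking):

    import json
    import random
    import msgpack

    # Example 1: the same dict dumped as JSON and as msgpack.
    data = {i: random.random() for i in range(10_000)}

    with open("data.json", "w") as f:
        json.dump(data, f)                    # JSON silently turns int keys into strings
    with open("data.msgpack", "wb") as f:     # binary format, so open in binary mode
        f.write(msgpack.packb(data))
        # f.write(msgpack.packb(data, use_single_float=True))  # smaller, single precision

    with open("data.json") as f:
        json_key_type = type(next(iter(json.load(f))))
    with open("data.msgpack", "rb") as f:
        msgpack_key_type = type(next(iter(msgpack.unpackb(f.read(), strict_map_key=False))))
    print(json_key_type)     # <class 'str'>
    print(msgpack_key_type)  # <class 'int'>

    # Example 2: append records one at a time, then stream them back.
    records = [{"id": i, "value": random.random()} for i in range(100)]
    with open("records.msgpack", "ab") as f:  # append mode: keep adding records later
        for record in records:
            f.write(msgpack.packb(record))

    with open("records.msgpack", "rb") as f:
        loaded = list(msgpack.Unpacker(f))    # iterator over the appended records
    print(loaded == records)                  # data integrity check

Because each record is a self-contained msgpack object, the file can be grown in append mode and streamed back without ever holding the whole list in memory.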
The fourth library that I'm going to cover today is called schedule. schedule is essentially a Python version of cron, but instead of being managed by your operating system, it's managed by one instance of the Python interpreter. schedule is really, really straightforward to use, and I've found it to be fairly reliable. You simply import schedule, and here I have two test functions defined; all they do is print something. To set up a schedule for them, I can just say schedule.every, give it some number, then some unit of time (in this case seconds), and then call the do function and pass it a function. If you look at this, you can just sort of read the schedule: every one second, run the first test function; every three seconds, run the second test function. Then, to make sure all of this actually runs, we run a while loop and have schedule check all the pending jobs and run them if their time has come. If I run this test, you'll see that the first test function is called two or maybe three times between calls of the second test function, so that's roughly once every second and once every three seconds. Note that instead of seconds you can use things like days or weeks as well, and you can even specify the specific day of the week you want something to run on, as well as a specific time of day. That about does it for schedule; there's a lot more you can do with it, but these are the basics you would need. I encourage you to check out its website and documentation for more information.
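The setup described above looks roughly like this (a sketch; the two test functions are placeholders that just print something):

    import time
    import schedule

    def test_function():
        print("test function ran")

    def test_function_2():
        print("second test function ran")

    # Reads almost like English: every 1 second / every 3 seconds, call the function.
    schedule.every(1).seconds.do(test_function)
    schedule.every(3).seconds.do(test_function_2)

    # You can also schedule by day of week and time of day, e.g.:
    # schedule.every().monday.at("10:30").do(test_function)

    while True:
        schedule.run_pending()   # run any jobs whose time has come
        time.sleep(0.5)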
OK, the fifth library that I want to cover is very near and dear to my heart, for two reasons: one, I have never met any other person who uses this library besides the colleagues I convinced to start using it, and two, it uses Redis, and Redis is amazing. Obviously, since the library is based on Redis, you have to install Redis first. I've shown a command here for installing the Redis server on Ubuntu; then make sure that the redis-server service is enabled and started. Once that's done, you can use this command to install redis-simple-cache. It does exist on PyPI, but that's an old version that no longer works with Python 3, and it doesn't seem like the maintainer is actively working on the project anymore; however, this fork is updated and works with Python 3. This is really not a library that I recommend you use in production, but it is very useful for data-science kinds of work or just a normal developer workflow.

As the name suggests, this library does caching using Redis. For those who aren't familiar with Redis, it is an in-memory key-value store with very low latencies for data access. To use redis-simple-cache, you can use either the decorator called cache_it or another decorator called cache_it_json; the only difference between them is that cache_it uses pickle to serialize your data, while cache_it_json uses JSON instead. To demonstrate the usefulness of this library, I've set up this function as a proxy for functions that might take a long time to do their computation or to grab a resource from the web; in this case all it does is print that it was actually called and then return the square of the number it was passed. Then, in a loop, I'm calling this function ten times. If we run this, you'll see that the function really is called every single time.

Now let's go ahead and put the cache_it decorator on it, and note that there are two arguments I'm using. The first one is limit: this means that only a thousand unique input values will be cached, and if the function is called with more distinct input values, the old entries will start getting pushed out as new ones are added. This is very important because Redis, as I said, is an in-memory data store, so you have to watch your memory usage. The second parameter, expire, is there to help with cache invalidation: in this case, anything that has been cached for longer than five seconds will automatically be deleted. In my workflows, where I pull a lot of data from databases, I tend to set this to something like a few hours, because I know that after that time the data I have will have gone stale and I should pull it again.
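The pattern described here looks roughly like this (a sketch that assumes the fork keeps the original package's redis_cache module and cache_it decorator signature, and that a Redis server is running locally; slow_square is a stand-in for an expensive function):

    from redis_cache import cache_it

    # Cache at most 1000 distinct inputs, and expire entries after 5 seconds.
    @cache_it(limit=1000, expire=5)
    def slow_square(x):
        print(f"slow_square actually called with {x}")
        return x * x

    for _ in range(10):
        # Only the first call hits the function; the rest come straight from
        # Redis until the 5-second expiry passes.
        print(slow_square(7))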
Now, if we run the test, you'll see that the function is only really called once, and every other time, because the input is the same, we simply get the result from Redis and print it out. If I run this repeatedly, you'll see that we go entire runs without the function ever being called even once, until the expiry time of five seconds is reached, and then the function is called once again to refresh the cache.

And that does it; those are the libraries that I wanted to cover today. I hope this video comes in handy for my fellow Python developers in these trying times. I want to give a special shout-out to my first two patrons, Enrico and Alice: thank you guys, your patronage means a lot. Also, thanks to everyone who watched this video. Please remember to like and subscribe, and consider checking out the other videos on my channel. Until next time, bye!
Info
Channel: Jack of Some
Views: 75,733
Keywords: python tutorial, python, programming, great python libraries, amazing python libraries, python libraries, argh, tqdm, schedule, msgpack, msgpack python, redis, redis cache, redis simple cache, python library, better python, simpler python, simple python
Id: eILeIEE3C8c
Length: 17min 19sec (1039 seconds)
Published: Fri Mar 20 2020