MLFlow Tutorial Part 1: Experiment Tracking

Captions
Hey everyone, welcome to this tutorial about MLflow and experiment tracking for TensorFlow and some other libraries as well. Feel free to follow along; all the code is on GitHub and the link is down below. Just keep in mind that some parts might take a bit too long to run, so it might be best to treat this as an example, follow the syntax, and then apply it to your own project.

Okay, now let's begin. First things first, let's go to the GitHub page of tutorials that I have here; I'm interested in getting these two notebooks. You can clone the repository with git clone followed by the link we just copied. As you can see, I already have it, so on my local machine I have these two notebooks, and I even have them open in Jupyter.

Let's do a quick run-through of the first one. You don't really need to understand everything that's happening here, but the most important parts are: I get the data, do some preprocessing, and then run a bunch of different models, predict on the test dataset, and record the test metric, the root mean squared error (RMSE), in separate variables. So I train a random forest, a CatBoost model, and then variations of the FT-Transformer, a fairly fancy transformer architecture for tabular data (I have a whole blog series about it if you're interested). The pattern is the same each time: some training happens, then predictions for the different variations, and the results get stored in, yeah, just different variables. The main output of this notebook is this printout, where I attempt to compare the different models by looking at their RMSEs.

Why is this bad? Because I have no recollection of what happened before I got these values. Did I run the experiments multiple times? Which parameters did I use? I can try to guess from the parameters specified in the notebook, but nothing guarantees those are the ones that were actually used; maybe I changed the max depth to 50 and saved the file because I wanted to rerun the experiment, but never did. Nothing tells me which parameters are responsible for these specific metrics. This is what experiment tracking is supposed to fix: it should tell us which parameters led to which results, and it can even save model artifacts, maybe the models themselves, so it's really useful.
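To make the problem concrete, here is a minimal sketch of what the "before" notebook described above boils down to. The dataset, variable names, and parameter values are placeholders rather than the notebook's actual code; the point is that the RMSE ends up in a loose variable with no record of how it was produced.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Placeholder data standing in for the preprocessed income dataset.
X, y = make_regression(n_samples=1_000, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

rf = RandomForestRegressor(n_estimators=100, max_depth=20).fit(X_train, y_train)
rf_rmse = np.sqrt(mean_squared_error(y_test, rf.predict(X_test)))

# ...the same pattern repeats for CatBoost and the deep learning models...
print(f"Random Forest test RMSE: {rf_rmse:.3f}")
# Nothing here records which parameters or data version produced rf_rmse.
```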
The second notebook refactors this code to use MLflow experiment tracking, which I'm going to tell you more about. Let's go to the command line. Here I'm using a virtual environment called blog, which is already activated for me, so I'm just going to pip install mlflow if you don't have it yet; I already do, so it's quick. It's quite a friendly package without any weird dependencies, so it should work with your environment as well.

Once it's installed, the first thing I want to start is the MLflow UI, which is really useful when you're tracking experiments. The command is mlflow ui, and we also need to specify which database will be used to store the experiment results and the artifacts, so it's something like mlflow ui --backend-store-uri sqlite:///mlflow.db. This spins up an MLflow server that shows and tracks all of your experiments, which is super neat. Let's copy and paste the link into a browser, and there we have it.

This is the MLflow UI, which is going to be your entry point to the experiments you run. As you can see, it has two tabs, Experiments and Models; we won't touch Models for now. What we can see here is one experiment called Default with no runs; it's created automatically for you. What we want to do is create our own experiment and run, specify its name, and decide which parameters we want to store in it. Let's go back to the code and I'll show you how to do this in the refactored notebook.

As you can see, we import it just like this: import mlflow. One second, let me pick the kernel I use to run it; as I said, I use the virtual environment blog, so I'll use the same kernel here. Now let's import all of this good stuff, which might take some time. There we go, everything is imported, and I'm just setting some parameters.

The first thing to get right is set_tracking_uri. This points at the database we specified before, and as you can see MLflow has already created it for us in the same folder where we run the experiments, so we reference it here. Then set_experiment("income") gives the name of the experiment we want to run. I call it income because this is the income dataset, but you can call it whatever you want. MLflow is smart enough to know that there was no income experiment before, so it creates it for us, which is super nice.

Let's load all the data and preprocess it. The first MLflow-specific thing you see here is the option to autolog. MLflow has integrations with different libraries, including scikit-learn, LightGBM, CatBoost, TensorFlow, and PyTorch, and there are a lot of parameters you might want to track, which autologging can capture for you. In this case we don't want that, so we're not going to autolog; instead we're going to do it manually.

To do it manually, keep in mind that every experiment run lives in a context: when you write with mlflow.start_run(...), you open the MLflow run context, and the first parameter to specify is the run name. The run name for us is going to be rf_baseline; that's the random forest you saw in the previous, less tidy notebook, which just did the training and the predictions without any logging. Now, to do the logging, there are a few things we need. All of the logging starts with mlflow followed by its different methods. The first useful one is set_tag: tags are custom elements of the experiment run that you want to record. Here I set a tag called model_name with the value RandomForest, because my main purpose is to compare different architectures against each other, random forest, CatBoost, and the deep learning models, so I want this tag to be there. That's the first piece of logging.
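As a minimal sketch of the setup just described, assuming the names used in the video (the sqlite:///mlflow.db store, the income experiment, and the rf_baseline run), it might look like this:

```python
import mlflow

# Point the client at the same database the UI was started with
# (mlflow ui --backend-store-uri sqlite:///mlflow.db).
mlflow.set_tracking_uri("sqlite:///mlflow.db")
mlflow.set_experiment("income")        # created automatically if it does not exist yet

# mlflow.sklearn.autolog()             # the integration exists, but here we log manually

with mlflow.start_run(run_name="rf_baseline"):
    mlflow.set_tag("model_name", "RandomForest")   # tag used later to compare architectures
    # parameter, metric, and model logging come next in the walkthrough
```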
Next, log_params lets you log multiple parameters at once if they're stored as a dictionary, so I can log the number of estimators and the maximum depth in one call. Again, this is done so that I remember which parameters led to a certain result; if I don't do this, that information never gets logged, it's lost, and I won't be able to recreate the experiment or communicate it clearly to my colleagues.

So I log the params, I specify the random forest, fit it, make the predictions, and calculate the metric. Then, as you can guess, log_metric logs the metrics you want the run to have; I'm going to call this one test_rmse. Finally there is log_model. As you'll notice, this one is specific to scikit-learn, because there are different ways to save models and many of them differ between libraries, so you usually want to check whether there is a flavour-specific method for the library you're using. In this case it's a scikit-learn model, so I log it with mlflow.sklearn.log_model; the first argument is the random forest itself and the second is just the artifact path, nothing fancy. I'm going to run it, wait until it finishes, and then we can see in the UI what the experiment looks like.

Okay, it has finished running. Let's go to MLflow and refresh. As you can see, the experiment income was created, and here is the run we created, called rf_baseline. There isn't much shown by default, but we can add columns for the metric and for the model name, because that's what I'm interested in: test_rmse is 0.5 and the model name is RF. You can also click on the run and see the parameters we stored, so the max depth was 20 and the number of estimators was 100, along with the metrics, the tags, and the artifacts. Because we chose to save the model, we can actually load it back into our environment later if we want to; MLflow even gives you a few useful snippets showing how to do it. We can explore that a bit later, but for now it looks really good: we didn't need to do much, just specified a few things, and it logged a bunch of stuff for us. I'd say it's a really good workflow to adopt, and it also makes you a bit more consistent in the way you specify parameters and log your experiments.

Let's go back down to the training and try another model: CatBoost. For CatBoost we need to build its specific dataset type, which again isn't that important here. Then we do the same thing: we start a new experiment run, this one called catboost, I set the model_name tag to CatBoost because I want to compare the models, and then I train, predict, test, log the metric, and save the model, the same workflow as before. As you can guess, if we go back to the UI it already knows something new is there, so we refresh, and here is the CatBoost run. Its test RMSE is lower, so we know this model is better; we can even click on it and download it if we want to use it in our next steps.

Awesome. Now the other thing we want to know is how to use this with TensorFlow, and luckily for us the procedure is mostly the same.
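Pulling the scikit-learn run described above into one place, a hedged sketch might look like the following. The placeholder data and the artifact path name are assumptions; the logged parameter values (100 estimators, max depth 20) and the run and metric names come from the video.

```python
import numpy as np
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

mlflow.set_tracking_uri("sqlite:///mlflow.db")
mlflow.set_experiment("income")

# Placeholder split standing in for the preprocessed income data.
X, y = make_regression(n_samples=1_000, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

rf_params = {"n_estimators": 100, "max_depth": 20}

with mlflow.start_run(run_name="rf_baseline"):
    mlflow.set_tag("model_name", "RandomForest")
    mlflow.log_params(rf_params)                       # logs every key/value in the dict

    rf = RandomForestRegressor(**rf_params).fit(X_train, y_train)
    test_rmse = np.sqrt(mean_squared_error(y_test, rf.predict(X_test)))

    mlflow.log_metric("test_rmse", float(test_rmse))
    mlflow.sklearn.log_model(rf, "random_forest")      # scikit-learn flavour; model saved as a run artifact

# The CatBoost run follows the same pattern, ending with mlflow.catboost.log_model(...).
```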
We still want to set tags, we still want to log parameters, and we still want to save the model. The only thing that changes is that you need to be a bit more consistent and a bit more structured with the inputs you provide. So let's start disentangling this bit of code.

First of all, when I run this function I want to specify a name; this is the experiment run name, just like rf_baseline and catboost before, so it's going to be something like mlp_base or mlp_base_2. Then come the MLP parameters, the first set of parameters I want to store. These are the parameters of the architecture itself: the number of neurons in each dense layer, the activation function of the dense layers, and the dropout rate between the dense layers. We end up with five parameters, three of which are the numbers of neurons in the dense layers.

The second set of parameters we want to store is the training parameters, because neural networks are quite sensitive to training settings such as the learning rate, early-stopping rounds, weight decay, and all of that good stuff, so we also want to track and store them so that next time we can recreate this experiment. As you can see, they get passed into the train_mlp function together with the already specified MLP architecture, and they include a learning rate, a weight decay, a number of early-stopping rounds, and a number of epochs. Those are the parameters we want to store, and maybe also the ones we'll want to tune in the next runs.

Now let's see what the function does. We log these parameters as well, so they get logged together with the MLP params; then we set the tag again, because the main purpose is to compare different architectures; then we build the MLP, train it, make predictions on the test set, and evaluate the RMSE on the test data. Then it's the same as we saw previously: log_metric, and then TensorFlow's own log_model method, the same way CatBoost and scikit-learn had theirs, and we save the model.

That's basically it. Doing this required us to think a bit differently: instead of defining a model, training it, and forgetting about it, we modularize the code so that it takes the parameters we want as inputs and produces what we want as outputs, because that makes it so much easier to store those parameters, provided the function is defined well.

Now let's define some of the parameters. The first bit just specifies what the training, validation, and test data are; it's a bit different from all the other models because TensorFlow has its own dataset structure. Then the MLP parameters, the ones I talked about before: the size of the first, second, and third layers, the dropout rate, and the activation. These haven't come out of any hyperparameter tuning, so feel free to adjust them yourself.
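The notebook's actual wrapper isn't shown in the captions, so here is a hedged sketch of what such a function could look like. The function and parameter names (mlp_mlflow_run, layer1_size, early_stopping_patience, and so on) are assumptions, and plain NumPy arrays are used instead of the tf.data datasets mentioned in the video; the structure, though, follows the steps just described: log the two parameter dictionaries, set the tag, build, train, evaluate, and log the metric and model.

```python
import numpy as np
import mlflow
import mlflow.tensorflow
import tensorflow as tf


def build_mlp(mlp_params, n_features):
    """Builds a small dense network from the architecture parameters."""
    layers = [tf.keras.Input(shape=(n_features,))]
    for units in (mlp_params["layer1_size"], mlp_params["layer2_size"], mlp_params["layer3_size"]):
        layers.append(tf.keras.layers.Dense(units, activation=mlp_params["activation"]))
        layers.append(tf.keras.layers.Dropout(mlp_params["dropout_rate"]))
    layers.append(tf.keras.layers.Dense(1))  # single regression output
    return tf.keras.Sequential(layers)


def mlp_mlflow_run(name, mlp_params, train_params,
                   X_train, y_train, X_val, y_val, X_test, y_test):
    """Trains an MLP and logs parameters, the test metric, and the model to MLflow."""
    with mlflow.start_run(run_name=name):
        mlflow.set_tag("model_name", "MLP")
        mlflow.log_params({**mlp_params, **train_params})  # architecture + training params in one call

        model = build_mlp(mlp_params, X_train.shape[1])
        model.compile(
            # AdamW is available in recent TensorFlow releases; swap in Adam if yours lacks it.
            optimizer=tf.keras.optimizers.AdamW(
                learning_rate=train_params["learning_rate"],
                weight_decay=train_params["weight_decay"],
            ),
            loss="mse",
        )
        model.fit(
            X_train, y_train,
            validation_data=(X_val, y_val),
            epochs=train_params["epochs"],
            callbacks=[tf.keras.callbacks.EarlyStopping(
                patience=train_params["early_stopping_patience"],
                restore_best_weights=True,
            )],
            verbose=0,
        )

        test_rmse = float(np.sqrt(np.mean((y_test - model.predict(X_test).ravel()) ** 2)))
        mlflow.log_metric("test_rmse", test_rmse)
        mlflow.tensorflow.log_model(model, "model")  # TensorFlow/Keras flavour (MLflow 2.x API)
    return test_rmse
```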
Then come the training parameters, which are also going to be stored: the learning rate, the weight decay, the early-stopping patience, and the number of epochs. The only thing left to do is to run the function we've defined, give the run the name mlp_base, and pass in all of the parameters we just specified. So let's run it; it's going to take some time to train.

Perfect, the model is saved. As you can guess, we go to the UI and we can see something new, the mlp_base run. We can also see that it performed worse than even the random forest baseline, so maybe we want to change something about it, because we're not happy with the results and we believe MLPs should be performing better on this task. Because we've set things up so nicely, what we end up doing is basically just copy-pasting the call, naming the parameters mlp_params_2 and the run mlp_base_2, so this is going to be the second run. Let's try something simpler; it should be quite a simple task, so let's just make all the layers 64 and maybe reduce the dropout rate. I'm just guessing here, but notice that we only changed a few things. It will keep training until it's done, and if we go to the UI we can see that the second run has started; when it's done, it will save the model and show us the metrics, and then we'll be able to compare both runs and see what is different between them. That's quite a neat thing that MLflow does for us.

Let's wait until it saves the model. It's not really mandatory to save all of the models, because if we save all of the parameters properly we can recreate them, but I find the saving functionality quite neat. Okay, let's refresh this one; there we go, it finished fairly quickly since the model is simpler. The test RMSE got smaller, so that's a good sign, and now we can just select the two runs and compare them against each other. There are quite a few plotting features, but they make more sense when you have many experiments, say 20 or 30; then you end up with pretty nice plots. In this case we just want to look at the table, so let's close the visualizations, close the run details, keep just the parameters, and show only the differences. We can see that the layer 1 and layer 2 sizes differ, the dropout rate differs, the test RMSE is lower, and the model name is the same, which is exactly what we wanted. This way we can consistently evaluate experiments and make sure that we end up with the best model. Quite neat, right?
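Putting the two MLP runs just described into code, a hypothetical invocation might look like this. The specific values are illustrative placeholders rather than the notebook's actual settings; the second run simply shrinks all layers to 64 units and lowers the dropout rate, as in the video.

```python
# Assumes mlp_mlflow_run from the previous sketch and the usual train/val/test arrays.
mlp_params = {"layer1_size": 128, "layer2_size": 64, "layer3_size": 32,
              "dropout_rate": 0.3, "activation": "relu"}
train_params = {"learning_rate": 1e-3, "weight_decay": 1e-4,
                "early_stopping_patience": 10, "epochs": 100}

mlp_mlflow_run("mlp_base", mlp_params, train_params,
               X_train, y_train, X_val, y_val, X_test, y_test)

# Second run: a simpler network, changing only the architecture dictionary.
mlp_params_2 = {**mlp_params, "layer1_size": 64, "layer2_size": 64,
                "layer3_size": 64, "dropout_rate": 0.1}
mlp_mlflow_run("mlp_base_2", mlp_params_2, train_params,
               X_train, y_train, X_val, y_val, X_test, y_test)
```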
Okay, I went back, so we still have all of those runs here. Now let's go to the models that this series was focusing on before: FT-Transformers, feature tokenizer transformers, a really powerful family of architectures, and we want to see how they perform. Again, we first transform the data into the appropriate format, and then we end up doing the same thing we did for the MLPs: we have a separate function for building the model, because we want to save the architecture parameters, and a separate function for training the model, because we want to save the training parameters. Except in this case we also have parameters to skip: parameters that are not that important and that I don't want to store, but that, as you can see, are still required to build the FT-Transformer encoder.

So we have these two functions; let's define them. The third function is going to be the FT-Transformer MLflow run, the same kind of wrapper we wrote for the MLPs before: we specify the name, we specify the model, we set the tag, we log the parameters, we log the metric, and we log the model. As you can see here, I was also trying out autologging, and it is quite a neat thing; I just didn't find it that useful in this case, but you can explore it yourself, or maybe I'll do a tutorial about it next time. Let's run this, and let's actually specify the parameters to train with, the parameters to skip, and the parameters to save. That's a lot of different parameter groups, and all of them (apart from the skipped ones) get saved; to find out more about them, please go to the blog post I wrote about these models. They're super interesting and might be the future of tabular deep learning, who knows. Now let's train the model; it's going to take some time, so you'll probably see a time skip here.

All right, while the final model is training, let's look at the UI. We can see that there are four new runs: linear, periodic, PLE quantile, and PLE target. The target one is not done yet, but the test RMSEs already look pretty decent, especially for the PLE quantile at 0.456, just a bit lower than CatBoost. What we can do now is select three of them and compare them again. Having three models is already more fun; you get some interesting visualizations, first of all of the test RMSE, where our favourite is the PLE one. We still don't have quite enough models to compare them visually, but we can show the differences in parameters instead: the linear variant doesn't take the numerical bins as a feature, while the PLE and periodic variants do, so just by looking at this we can see that PLE indeed delivers better performance here. We can also wait until the last model finishes training to get another set of features and see which one ends up better.

This is pretty much what we wanted to see. If we go back, we have this at-a-glance view of the experiments with the different model names and parameters; in this case I would probably also want to add the embedding type as a tag, just to differentiate between the FT-Transformer variants. Having this view of the experiments is similar to what we had in the old notebook, except now we also have the test RMSE, we have all of the models stored, we can get all of the detailed parameters if we ever want them, and we can see the duration of training. Once it's done training, I'll also show you how to reload these models from the MLflow store. Our final model has finally finished training, so we can go back and see that the PLE target run is now listed at an RMSE of 0.447, which actually makes it the best-performing deep learning model, though you can see that it took significant time to train. So this is basically it for the training and logging.
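One detail from this section that is worth sketching is the "parameters to skip" idea: some arguments are needed to build the encoder but are not worth tracking, so they are filtered out before logging. All names and values below are illustrative assumptions, not the notebook's actual configuration.

```python
import mlflow

architecture_params = {"embedding_type": "ple_quantile", "depth": 3, "heads": 8}
training_params = {"learning_rate": 1e-4, "epochs": 100}
params_to_skip = {"numerical_bins": None, "feature_names": None}  # needed to build the encoder, not logged

all_params = {**architecture_params, **training_params, **params_to_skip}
params_to_log = {k: v for k, v in all_params.items() if k not in params_to_skip}

with mlflow.start_run(run_name="ft_transformer_ple_quantile"):
    mlflow.set_tag("model_name", "FTTransformer")
    mlflow.log_params(params_to_log)
    # ...build, train, evaluate, then log_metric and log_model as in the MLP runs...
```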
Now that all of this is done, what we might want to do is load our best model, because we've been saving all of them, and that's quite easy to do with MLflow. We go to the run's model in the artifacts section, and there you'll see an example of how to load it; TensorFlow models actually have their own loading methods. What we're most interested in is just copy-pasting the run ID, going back to the notebook, pasting it in, and loading the model, as sketched below. This particular model, if you actually decided to train it along with me, is going to take quite a long time to load, just because it's quite a chunky model and my laptop is really bad, so you might want to choose another model to load, or maybe yours will be much faster at it. Once it's loaded, we can use it as a regular TensorFlow model: it has a predict method and everything else, so once it's done predicting we get our outputs, they get stored in a variable, and maybe we can even make a plot. Okay, it's done predicting, we plot it, and there we have it.

Perfect, well, this is it for this tutorial. I hope you enjoyed it and that you learned how to log your experiments. Just to reiterate, there are a few things we can log: we can log the parameters, we can log the metrics, we can log the tags, and we can save the artifacts. The model artifact can also be basically anything; for example, the StandardScaler that we defined at the very beginning of the notebook can also be a nice artifact to store if you want the experiment to be truly repeatable. So this is it for this tutorial. I hope you learned something new about MLflow; I definitely have, it seems like a really great tool, and I'll try to use it more in my next projects, and maybe I'll find new things and share them here. If you liked it, don't forget to give it a thumbs up and to subscribe; I hope to start posting quite regularly, so there will always be content for you to enjoy. Take care, bye.
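For reference, here is a hedged sketch of the reload step described in the transcript. The run ID is a placeholder you copy from the run's artifact page in the UI, "model" must match the artifact path used when logging, and X_test is assumed to be the test data prepared earlier.

```python
import mlflow

run_id = "<run-id-copied-from-the-ui>"
model_uri = f"runs:/{run_id}/model"

# TensorFlow flavour: returns a model with the usual predict method.
reloaded = mlflow.tensorflow.load_model(model_uri)
preds = reloaded.predict(X_test)

# Any logged model can also be loaded generically through the pyfunc flavour.
generic = mlflow.pyfunc.load_model(model_uri)
```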
Info
Channel: Anton T. Ruberts
Views: 15,544
Keywords: MLFlow, Machine Learning, Data Science, Tensorflow, Tutorial, Experiments, Experiment Tracking, MLOps, ML
Id: RnYa3QsXRAc
Length: 27min 3sec (1623 seconds)
Published: Wed Dec 28 2022