Is Plotly The Better Matplotlib?

Captions
[Music] What is going on, guys, welcome back. In today's video we're going to take a look at Plotly, which is a framework for data visualization and graphing. We're going to use it in Python, but as far as I know it's language independent, so it's also available for JavaScript and maybe also for some other languages, but I haven't looked that up, to be honest. We're going to use it in Python and we're going to compare it a little bit to Matplotlib, because I have used Matplotlib in countless tutorials on this channel and we have never used Plotly. So we're going to compare them a little bit. We're not going to use Matplotlib in today's video at all, we're just going to talk about the differences. We're going to focus on Plotly, so we're going to cover the basics here, but in the end I'm going to give you a conclusion for when to use what, and what the pros and cons of the different modules are. But before we do all that, we're going to talk a little bit about the development environment, because today we're going to do something new: we're not going to use PyCharm or a simple editor, we're going to use a Jupyter notebook. Now, for those of you who don't know what a Jupyter notebook is, it's basically an application that is running in the browser, and you can edit individual cells, you can code in individual cells, you can run individual cells. It's, you could say, optimized or very good for data science and machine learning coding, because you can just execute a cell where the model is trained and then use the trained model without having to run the whole script all over again. So you split the code up into cells and you can run them; you're going to see what I'm talking about in a second. What we're going to talk about now is how to get the Jupyter notebook. As far as I know it's installed by default, but if you want to run it by default you need to say python -m notebook. Now, I'm not sure if that works
if you haven't installed Jupyter yet. If you haven't installed it, just say pip install jupyter, or was it jupyter notebook, one of the two, just play around with that. But what I would recommend is that you set up an Anaconda environment. Now, if you want to have a tutorial on how to set up Anaconda, let me know in the comment section down below. I'm not going to do it now, because today's video is focusing on Plotly, not on installing an Anaconda environment. But if you have Anaconda installed, you also have Jupyter Notebook by default. You can just activate an environment that you have created, data science is what I called mine, and in here you can just say jupyter notebook, like that. Then it's going to open up your web browser and Jupyter Notebook, and you can navigate to the desktop, for example, or to your working environment, and create a notebook there. As you can see, it's running on localhost, it's a web app that is hosted, and now I can go to the desktop and say New, Python 3, and in the notebook I can start coding once it's loaded. Now, besides that, we're also going to install some libraries. So as you can see, this is a new Jupyter notebook. What I can do here is say print hello world, for example, and I can run this, and you can see hello world. Then I can have a separate cell here, or let's define a variable up here, let's say a equals 10, and down here I say print a. First I run this cell, then a also exists in this cell, but I can run this independently, so I don't always have to run the whole code all over again. This is what we're going to work with Plotly in. As you can see, I also have Vim bindings installed. You don't need to do that; it's quite messy and quite complicated if you do it for the first time. But the libraries that we're going to need today are Plotly, obviously, so let me write it down: plotly, we're going to need pandas, we're going to need numpy, and
optionally, if you want to work with financial data (I don't think we're going to do it, at least I haven't prepared any financial data applications or examples here), you can install yfinance to get some different data, you can install yahoo-fin, you can install pandas-datareader to get some financial data. We're just going to use a data set from Plotly, but again, if you have some basic experience with pandas, you can swap the data sets out and use something from those finance modules if you want to. One thing that I would recommend is that when you install something in Anaconda, use the conda install command if possible. This is because the pip that you use by default in the Anaconda environment is going to be the global pip. Now, if you don't use Anaconda, if you don't want to spend any time on this installation process and you already have it set up, or you only want to learn about Plotly, then just skip that part. But if you're using conda, this might be important. So whenever you install numpy, pandas, plotly, whatever, use conda install, or at least say conda install pip and then use pip, because otherwise you're not going to use the environment's own pip. Maybe I can show you that real quick. Let's open up a second cmd, and we create a new conda environment, so conda create --name, and we're going to call this test_env, for example. And now I say, once it's done (come on), yes, then activate test_env, and you're going to see that when I say where pip, it's going to say that this is the local pip, it's not the Anaconda pip. Whereas if I deactivate this now and activate data science, which is the one that I have already set up, and now say where pip, you can see that this is actually the Anaconda data science pip, so it's a different pip. So if you want to use pip for installation, make sure you have pip installed by saying conda install pip, and besides that also conda install numpy, pandas and plotly. This is what you're going to need for today's
video. All right, so let's get right into it. We're going to start by importing a couple of things. First of all, we're going to import pandas, and actually we're going to import pandas as pd, we're going to import numpy as np, and we're also going to import plotly.express as px. Oh, I used the Vim binding there, so let's delete that, and let's delete that, and say px. There you go. So let's run this cell, and now, once everything is imported, once you see the number here, we can start with the coding. Let's do something very, very basic to see how Plotly basically works. We're going to say x_data equals np.random.random(50), and we're going to say y_data is the same. Now we want to visualize that using Plotly. What we're going to do is say the figure is going to be px dot, let's do a basic scatter plot for that data, scatter, and we're going to say x equals x_data and y equals y_data, and then fig.show(). This is how you do a basic scatter plot in Plotly. If I run this now, you're going to see in a second that this is a plot, and one thing that you can notice right away is that the plot looks kind of good already without us having to do a lot of, or any, configuration at all. I can just zoom into a certain part here, I can autoscale again, I can do all sorts of things, and it's very easy to configure a very basic plot that looks kind of good already without having to specify too many things in advance, which is not the case for Matplotlib, but we're going to talk about that in the end anyway. So let's go to the next example. Let's load some data from a data set about Europe. We have some data about the European continent, so we're going to say europe_data_2007, so we're going to get some data for the year 2007. We're going to say that this is px.data.gapminder(), and here we're going to query: the year is going to be 2007.
Now, this here is not Plotly content, this is just loading a data set, so you can choose whatever data set you want. We're going to get continent equals Europe. Now, I think we need to put that in quotes here, and we have an error. What is the problem? Oh, of course, we need to use double equal signs, because that's a query. There you go. Once we have that, we can print the data set as well by just saying print europe_data_2007, and you can see that we have the country names, we have the continent, we have the year, in this case obviously only 2007, and we have the values: life expectancy, population, GDP per capita, and then some other information that's not really important for us at the moment. With that data we're just going to play around a little bit and visualize some things. Let's start with some basic data inside of a scatter plot, but we're not going to just have some basic data points, we're going to include some information in the size and in the color of the individual bubbles, you could say. So what we're going to do is say figure is again px.scatter, and we're going to say the x-axis is going to be the life expectancy, where first of all we need to pass the data set, europe_data_2007. The x data is going to be the life expectancy, then the y data is going to be the GDP per capita, and then we can define the size of the individual bubble to be derived from one of these values. So we can say, okay, the size of the bubble, it makes sense if it's the population, so the bubble is going to be bigger if the country has more citizens. Then we're also going to say the color. This is going to be a little bit redundant, but we don't have much more information to work with here. In this case we're going to say the color of the individual bubble is going to be the life expectancy as well. Now, of course, if we had a fourth column here with some data (this one is not really data), if we had an actual column here with, I don't know,
a degree of happiness or something, we could use that. In this case we use the same column for the x value and for the color. So what we do now is say figure.show(), and when we run this you can see that this is already a very good-looking plot. We have the individual bubbles, and you can see that we also have the color scale here, so this means a lower life expectancy, this is a higher life expectancy, and you can see that this bubble here is quite big because we have a large population. Now, one thing we might want to do is also show the country name when we hover, so we can say hover_name is going to be the country, so that we know what country we're talking about. Now if I hover, we can see this is Turkey, this is Poland, this is Germany, this is the United Kingdom, and somewhere below here we have Belgium as well, and Finland. We can see that Norway has a very high GDP per capita, and the country with the highest life expectancy is Iceland, as it seems. Yeah, Iceland and Switzerland. You can see the colors as well, and the size, as I said, is the population. So you can see that just by specifying three parameters we have a very professional-looking plot. We can also zoom in and see, okay, what are the details here, what do we have here? Okay, we have Belgium, we have Finland, we have the United Kingdom. Then I can autoscale again, and this works with one line of code. In Matplotlib this would be way more difficult. Now let's look at other types of plots as well. We're going to start a new cell, and in here we're going to load not the europe_data_2007 but the whole Europe data, so we're going to say europe_data equals px.data.gapminder() and then just .query, continent, actually we need to pass a string, continent equals equals Europe. If we print that, we're going to see that we have each country multiple times, and you can see that we have different years, so we can plot the evolution of the population, the evolution of the GDP per
capita and so on. We can plot evolutions, and because of that we can use a line chart. Now we're going to say figure equals px.line, and we're going to say that the x value is going to be the year, or actually, first again pass the data set. The x value is going to be the year, the y value is going to be, for example, the life expectancy, and the color is going to be the country. Besides that, we can just say figure.show(), and there you go. You can see this chart shows the development of the life expectancy in the individual countries in Europe. The interesting thing here is that this chart is already interactive, so I can say, for example, I don't care about Belgium, I don't care about Croatia, I don't care about the Czech Republic, I don't care about Denmark, and so on, and I can just check off the individual countries here. I can say I only care about Austria, for example, compared to Switzerland and Spain and Serbia and Portugal, and that's it, that's what I care about. Now I get this graph here and I can just look at this data and compare it. I can of course then change the countries again, and I can also plot different values, so I can say, okay, I don't care about the life expectancy, I care about the population development. If I do that, okay, this graph doesn't look very interesting; maybe if we zoom into the area down here we can see more nuances, but this is what we can also do here. Now we can also go ahead and look at the population development of a single country, not with a line chart but with a bar chart. So we can go ahead and say, for example, Austria, which is my country, this is why we're going to look at this one. We can say austria equals, or let's call it austria_data, equals px.data.gapminder().query, and then the country is going to be, actually let me use double quotation marks here and single quotation marks here, country is going to be Austria. Now we can just display
the data here. You can see that the country is Austria, the continent is Europe, and we have the values for the individual years. So now we can go ahead and say figure equals px.bar, and we can say austria_data is the data set, x is the year, and y is the, what was it, population; we wanted to see the population development. We can also say, for example, for the color of that individual year, we also want to know how good the life expectancy was, so we can say life expectancy is the color of the individual bar. Now we can say figure.show(), and you can see that the population is increasing, this is indicated by higher bars, but we can also see that the life expectancy is getting better and better, because the more we go to the right, so the more we go towards the present, the more the color of the bars changes. Now, I'm not sure if that is true for every country. Which country had a bad life expectancy? Turkey, or, yeah, was it life expectancy? Yes. Let's look at Turkey and see if that's the case there as well. No, it's also increasing. So it may be low, but it's also increasing. Is there a country that has decreasing life expectancy? I hope not, actually, but maybe we can see the differences there. Romania? No, here you can see the population is actually falling, but the life expectancy is increasing. Maybe we can look at a different one. Maybe we can say the color is not the life expectancy but gdpPercap, and then you can see, for example, here the GDP was rising and then falling again, then rising again, so you can see the development. Let's see what it's like in Austria. There you go. So this is how we can plot a very simple bar chart and a very simple line chart. Now let's do something else. Let's say, for example, what was the country with a very high life expectancy, or with a high GDP per capita? Norway has a high GDP per capita, so let's go ahead and look at Norway's development of the GDP per capita as the y value, so gdpPercap, and the
color is going to be gdpPercap as well. Then we're also going to say, in this case, that we want to have a different color scale. Maybe you don't like dark blue or purple to yellow, so you can say color_continuous_scale is going to be ice. You can look up the different names in the documentation, but in this case we would have these colors here, which are a little bit better, I think, because dark means a very low GDP per capita, and the more you go to brighter colors, the more the GDP per capita is increasing. You can also take bluered, I think it was called; then red would be a high GDP and blue would be a low GDP. You can also reverse that by adding _r, and then you have blue being high and red being low. So you can play around with the different color schemes. Now, last but not least, let's also look at a pie chart to see the distribution of the population inside of Europe. What we're going to do is take the Europe data, we're going to say figure equals px.pie for a pie chart, we're going to take europe_data_2007, and we're going to say that the values are the population and the names are the countries. We can also add a title, by the way, Europe's population distribution, for example. Of course we need to show that as well, and then you can see we have a very simple pie chart with the individual countries. Here again we can also say, okay, let's say some people say Turkey is almost in Asia, so I'm interested in the European-only countries for some reason, so you can click off Turkey, for example. Or you can say I don't care about specific countries, I don't care about Switzerland, Austria and Germany because I don't like the German language, so I want to know how many non-German speakers we have here, for some reason. Maybe you also want to exclude Belgium then, because I think they have some German-speaking people. But this is how you can design simple pie charts
with Plotly without specifying too many things here. Simple and straightforward. All right, so let's get to the conclusion: how does Plotly compare to Matplotlib? I was curious myself, so I went online and looked into Quora, Stack Overflow, Reddit and forums. I wanted to know when it is better to use Plotly and when it is better to use Matplotlib. It's interesting, because the two frameworks are kind of similar, they're both for data visualization after all, but then they're quite different, because Plotly is more focused on the web, on the browser and on interactive charts that are easy to make, whereas Matplotlib is more professional and more scientific because it has countless customization options. Now, in order to create some very good-looking Matplotlib charts you need to take some time; oftentimes you need to write your own little mini framework to get it to work. But when you want to have the full customization, you can compare Matplotlib to Arch Linux a little bit: you can do everything you want, but you have to customize it in somewhat more complex ways, whereas with Plotly you just run it, you set some parameters, and you can create wonderful-looking charts and visualizations. Now, Matplotlib also has some additional libraries, like seaborn, that are wrappers around Matplotlib and make it simpler and easier to use and to create better-looking charts. But at the end of the day, Matplotlib has an extremely large number of options and customization possibilities, it's more professional, it's used for scientific papers, and you should definitely know how to use it as a good Python developer. However, Plotly is perfect for web development and web visualizations, for interactive visualizations. It runs mainly in the browser, and a few lines of code are all you need to make wonderful-looking charts. It is also awesome for exploring data sets: if you have a data set and you just want to look into it, you want to
plot some different charts, you want to play around a little bit to see what this data set is about, Plotly is way more convenient than Matplotlib, because in Matplotlib you would have to write more code, and more complex code, to get the same result, the same interactive possibilities that you have with Plotly. For Plotly, of course, there are also many different additional libraries that allow you to do things easily and more intuitively. Both libraries are, at the end of the day, very good, and you should use both of them: Matplotlib more for scientific, professional stuff, if you want to have more customization, and Plotly more for web applications, for web visualizations, or for just interactively exploring new data sets. So that's it for today's video. I hope you enjoyed it, I hope you learned something. If so, let me know by hitting the like button and leaving a comment in the comment section down below, and of course don't forget to subscribe to this channel and hit the notification bell to not miss a single future video for free. Other than that, thank you very much for watching, see you in the next video, and bye. [Music]
Info
Channel: NeuralNine
Views: 59,915
Keywords: plotly, matplotlib, data visualization, data visualisation, data analysis, data science, datascience, machine learning, visualization, python, plotly python, plotly vs matplotlib, plotly tutorial, plotly course
Id: GzUVacnrgFc
Length: 22min 57sec (1377 seconds)
Published: Fri Sep 17 2021