17. Python to make nice figures. Part III: advanced plots

Hey everybody, I'm Taylor Sparks again, and this is video three on how to make beautiful figures using Python. The first video covered what makes good figures versus bad figures. The second video covered the basics of using Python to make simple figures. This third video is the more advanced class on more challenging figures. We're going to show you how to make figures that look like these: three-dimensional plots; this one, where you do fill-betweens with lots of different data sets; a generic figure, which is where we'll start; heat maps; multi-panel plots; figures with multiple axes, so the regular y-axis and then a separate y-axis sharing the same x-axis; and Rietveld refinement using diffraction data. So these are some of the things we're going to dive into.

Let's start with the simplest, which is just a generic figure. This is the one I actually go to whenever I'm plotting anything: I just pull open this file and start from there. The code is already here, so we're not going to type it together. Instead I'm going to go chunk by chunk through it and talk about what each chunk is doing. First we grab the libraries. You've seen this before; the only thing we haven't talked about yet is the seaborn library, imported as sns, which is going to give us colors. Then we grab our data, which is stored in the generic single-plot file, singleplot.csv. This is available, so you can follow along if you want to: pull it up and try it for yourself. We do that with pandas: we read that CSV into a data frame. Once we've got that data frame, we can grab the different data sets, x and y, x2 and y2, x3 and y3, based on the labels of the columns in that data frame. So we've grabbed our data; now let's go ahead and grab some colors. I love the seaborn
rocket palette; I think these look great. Since we're dealing with three different data sets, I go to the rocket palette and ask it for three colors. Along that continuous color range it gives us three: one from each end and one from the middle. If you wanted to, you could define your own colors. My postdoc advisor had this really great collection of colors that honestly look really good, and here are their hex numbers. I store them in a list so I can access the zeroth index, the first, the second, the third if I want to pull these colors out, and they look really good too.

Let's go ahead and make our plot. You've seen this before: fig equals plt.figure, and we give it a figsize. We love square plots, so a nice five-by-five-inch canvas to start with. On this we're going to do semi-log plots, which means logarithmic along the x-axis but linear along the y. We plot our three data sets with lines and with different markers: triangles, squares, and circles. We give them labels, assign them different colors from the rocket palette, make the marker fill color white, and give the markers a size. Let's go ahead and run this just as it is so far. We hit F9 on that section and see how the plot looks. If that's all you do, it's not terrible, but we could do more. It needs a legend. I like the tick marks to point in towards the plot, from the top, bottom, and sides, and not point out. Then we need to add some labels and maybe adjust things a little, because I don't know if I like how it automatically chose the limits. We can do all of that manually. Let's start with the limits: we want to plot from 1 to 10,000 on the x-axis, which is going to stretch it out
a little bit more, and on the y we want it to go from negative one-half up to 16. Now, where do we want the ticks? You'll notice that by default it puts them on the bottom and the left, but not on the top or the right-hand side. We can change that. We can manually say: point them in, and set right equal to True and top equal to True. That puts ticks all the way around. We can change the label size so that it's all 14. Then we can choose where we want the tick labels. Right now the x-axis is labeled on the bottom; if you wanted to, you could label both the bottom and the top (I don't think you should), and likewise the left and the right (I don't think you should). So we explicitly say we only want labels on the bottom and on the left; on the top and right we want ticks but not labels.

Now, where do we want the ticks to occur? By default it puts them every two. What if that's too crowded for you? For example, the separation between major tick marks here is pretty big, so what if we skipped a few and did every four? We can do that. We set yticks equal to NumPy's arange. Remember, that creates a sequence of values from the start to the stop, not inclusive. It starts at 0, it will not include 16.1 but goes up to it, and it steps by four, so it gives us ticks at 0, 4, 8, 12, and 16. We use that as our ticks instead of the defaults it chose. We can tell it more about the ticks: we make sure both the minor and the major ticks point in, and we make the major ticks 10 long and the minor ones 5.
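The limit and tick settings described above can be sketched roughly as follows. This is a minimal sketch, not the file from the video: the data are synthetic, the hex colors are hypothetical stand-ins for a palette (`sns.color_palette("rocket", 3)` would be the seaborn route), and the marker size is an assumption.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in data (assumption): three curves over a log-spaced x range.
x = np.logspace(0, 4, 20)
offsets = (2.0, 6.0, 10.0)

# Hypothetical hex colors; not the ones shown in the video.
colors = ["#35193e", "#ad1759", "#f37651"]
markers = ["^", "s", "o"]  # triangles, squares, circles

fig = plt.figure(figsize=(5, 5))
for i, (offset, color, marker) in enumerate(zip(offsets, colors, markers), start=1):
    # semilogx: logarithmic x-axis, linear y-axis
    plt.semilogx(x, offset + np.log10(x), linestyle="-", marker=marker,
                 color=color, markerfacecolor="white", markersize=8,
                 label=f"data set {i}")

# Manual limits instead of the automatic ones.
plt.xlim(1, 10_000)
plt.ylim(-0.5, 16)

# Ticks on all four sides, pointing in; labels only on the bottom and left.
plt.tick_params(axis="both", which="both", direction="in",
                top=True, right=True, labelsize=14,
                labelbottom=True, labelleft=True,
                labeltop=False, labelright=False)

# The stop value is exclusive, so 16.1 is used to keep the tick at 16.
yticks = np.arange(0, 16.1, 4)
plt.yticks(yticks)

# Longer ticks than the defaults: major 10, minor 5.
plt.minorticks_on()
plt.tick_params(which="major", length=10)
plt.tick_params(which="minor", length=5)
```

Using 16.1 rather than 16 as the stop is what keeps the last tick; `np.arange(0, 16, 4)` would end at 12.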
The default values are really small and kind of hard to see, so we make them a little bit bigger. With this portion added, let's run it again from the beginning and see how things change. Now you see we've got ticks going all the way around on all four sides, and they're big enough that they're easy to see. This now goes in increments of four, which is better, and this goes from one all the way to ten thousand. It's just better, I think. You can manually set that to whatever you want, but that's what I've got so far.

The last thing we need to do is put labels on the x- and y-axes. For x we label it particle radii, and it needs to be in micrometers, so I turn on math font using dollar signs and write backslash mu, which gives us the micro symbol. Then we give it a y label and tell it to plot a legend. As simple as that. Let's run the whole thing now, and you can see that with this generic plotting file we can generate a lot of really great stuff and control it exactly how we want, and it looks terrific. The last thing we did was plt.savefig. It saves the figure in the data-for-exercises folder, within another folder called plotting, and calls it generic plot. There are other things we can set here: I use dpi equals 300 for the resolution, and bbox_inches equals tight, which crops it to just where there's content. That gives high resolution, so you can use this as a figure in a paper. And that's what produces this beautiful figure, which is now ready for action. Notice that it rendered the Greek letters and everything, so there's really nothing left to touch up.

Let's move to our next one. On this one we're going to show you how to do a plot that fills in between the different lines. This is the code that created
this figure, the one showing how resource production changes over time: as you go from 1998 to 2015, these different countries produce different amounts. So how do you produce this? We start with the libraries, matplotlib and seaborn: just colors and the ability to plot. We don't use pandas here, because my student didn't know how to use pandas and brought in the data by hand, which is painful. This is why we learn Python, so you don't have to do that. But they've got it, so we're going to use it. We create our figure, a nice five-by-five square figure. Then over here we have a list of all 22 countries, in the order in which we're going to plot them.

Now we grab colors. This time I'm not using the rocket palette. I'm still using seaborn, but I grab the cubehelix palette, which goes from this nice sort of greenish through blue. You can change it. For example, I've got mine starting at 0.5 for the start point, and it goes through different bands of color. If I change that, you see it now goes from this kind of teal through a purple. So toy around with this and figure out which one you like. I think it's good to start at 0.5 with a rotation of negative three-quarters. Just plug in different numbers here; you can read more about it in the seaborn documentation, but I think it makes a nice palette. The point is that we want 22 different sequential colors to represent these countries, and the main ones we care about are the DRC and Zambia. We grab a list of all the years over which we're going to plot, 1998 all the way up to 2015. Then y1 all the way down to y22 each represent the annual production of cobalt by one of these countries. So y1 would be the Democratic Republic of Congo's production of cobalt for all these years, the next one would be
Australia's production, then Botswana's, and so on. To create this plot where we fill between them, we do what's called a stack plot. So plt, that's pyplot, and we use the stackplot function. We give it the x value, which is the year (that's what goes on our x-axis), and then all the different y values that we want stacked, with colors filling between them. Then we send it the labels, which is this list of countries; it has to be the same length as the number of data sets we've given it. The list of colors has to be the same length as the list of countries as well. That's it; pretty simple. Then we just set the limits, plotting from 1998 to 2015 and from 0 to 100 on the y-axis, give it some labels, and that's what produces this.

Now, you'll notice that I've got Zambia written here and the Democratic Republic of Congo here, and I've got this vertical line and this rotated text. How do you add that stuff? It's pretty easy. To add the vertical line we use plt.axvline, which puts a vertical line at some x location, so it only needs the x location. We say x equals 2003, which is when the war ended. We want it to be a dotted line, which I think makes it clear, and black. You could change other things, like the thickness, right in here. Here I wanted to label the Democratic Republic of Congo, so I went to 2006 and 10.
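As a rough illustration of the stack plot and the vertical line described so far, here is a minimal sketch. The three "countries" and their numbers are made up (the video uses 22 hand-entered cobalt series), the axis labels are assumptions, and the colors are sampled from matplotlib's built-in cubehelix colormap rather than seaborn's cubehelix palette so that the sketch depends only on matplotlib.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

years = np.arange(1998, 2016)  # 1998 through 2015; the stop value is exclusive

# Made-up series (assumption), one array per country, same length as years.
countries = ["Country A", "Country B", "Country C"]
y1 = np.linspace(20, 50, years.size)
y2 = np.full(years.size, 15.0)
y3 = np.linspace(30, 10, years.size)

# One color per series; swapped in for the seaborn cubehelix palette.
cmap = plt.get_cmap("cubehelix")
colors = [cmap(v) for v in np.linspace(0.2, 0.8, len(countries))]

fig = plt.figure(figsize=(5, 5))
# x first, then every y series; labels and colors must match the number of series.
polys = plt.stackplot(years, y1, y2, y3, labels=countries, colors=colors)

plt.xlim(1998, 2015)
plt.ylim(0, 100)
plt.xlabel("Year")                  # hypothetical label text
plt.ylabel("Production share (%)")  # hypothetical label text

# Dotted black vertical line at a single x location.
line = plt.axvline(x=2003, linestyle=":", color="black")
```

Extra keyword arguments such as `linewidth` can be passed to `axvline` in the same call to change the thickness.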
So the text goes to the year 2006 on x and up to 10 on y, and it starts the label there. Notice that I put in this backslash n, which forces a second row instead of having the text dangle off the plot. For Zambia we changed the color to white and placed it up here by plotting at 1999 and 92. And this end-of-the-Congo-war text we were able to add using rotation equals 90, which rotated it 90 degrees, so we got this nice rotated text.

Then take a look at the legend. The legend was a little bit gnarlier. There were 22 different data sets, so we had to shrink it a bit so it would fit. You'll also notice that the Democratic Republic of Congo is actually our first data set, but we want the legend to match the colors on this plot, which go from dark purple towards lighter. If we didn't reverse the legend order, it would run the opposite way from the plot, and we don't want that. So we reverse it. We do that by grabbing the handles and the labels: the handle is the little block of color, and the label is the text that goes with it. We grab those from the current axes: plt.gca and then get_legend_handles_labels. That returns a tuple, which unpacks into these two variables. Then we use those when we plot our legend: take the list of handles and reverse it, and take the list of labels and reverse it. We're also telling it to plot the legend outside of the plot, using this bbox_to_anchor argument, which we want because we didn't want to cover all this up. And again, I think that's a little bit small. There are probably still things we could work on to make this figure more legible, but it's not a bad start. Okay,
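The text annotations and the reversed legend can be sketched along these lines. Again this is only an illustration under assumptions: the data, country names, annotation strings, positions, font size, and anchor coordinates are all made up, not the values used in the video.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

years = np.arange(1998, 2016)
countries = ["Country A", "Country B", "Country C"]  # made-up names (assumption)
series = [np.full(years.size, v) for v in (30.0, 20.0, 10.0)]

fig = plt.figure(figsize=(5, 5))
plt.stackplot(years, *series, labels=countries)
plt.axvline(x=2003, linestyle=":", color="black")

# Text is placed in data coordinates; "\n" forces a second row.
plt.text(2006, 10, "Country A\n(second row)", color="white")
# rotation=90 turns the text to read bottom-to-top.
rotated = plt.text(2003.2, 35, "event marker", rotation=90)

# Grab the handles (color patches) and labels, then reverse both lists so the
# legend order runs top-to-bottom the same way the stack does.
handles, labels = plt.gca().get_legend_handles_labels()
legend = plt.legend(handles[::-1], labels[::-1],
                    bbox_to_anchor=(1.02, 1.0), loc="upper left",  # outside the axes
                    fontsize=8)
```

Because the legend sits outside the axes, saving with `bbox_inches="tight"` helps keep it from being cropped.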
so that's how you do this fill-between plot. Let's go to our next one.

Let's talk about how you make multi-panel plots. When I run this, we eventually want a plot that looks like this: maybe one set of data here, another over here, and another down there. How do you do this? Let's start with the libraries. We grab the same stuff as before plus a couple of new things: from the matplotlib library we grab gridspec, and we import matplotlib itself so that we can change some settings. These first two lines of code I put in tons of my figures. The first changes the font size universally. The rc parameters are a dictionary: this is the key, and this is the value. We're setting the font size value in that dictionary to 14, which makes the general font size 14 unless we say otherwise. Then we set the PDF font type to 42. That way, if we save the figure as a PDF, we can open it in Illustrator and modify it, if you want to move things around by hand.

All right, let's grab our data. This is spectroscopy data for electrochromics. Electrochromics are cool in that they change color when you apply voltage to them, so we're going to see a change in transparency as a function of wavelength when the device is on versus off. You have one data set over here for when it's on and one for when it's off. You see there are these three panels. I'm going to delete one of them so it's just two panels first, and then we'll add it back in a moment. So let's say we started out with just a two-panel plot. First off, how do you make the two-panel plot, and then how would you add that third panel that you
saw before? Let's go through the steps. We grab our data. On this one it's kind of interesting: if you grab all of the data, it's really, really dense. This is what it looks like; the data points are right on top of each other. Maybe you don't like that; maybe you'd like to be able to see the individual data points and make it a little clearer. You can downsample your data with this line of code here. We take the data frame, which we just read in from our CSV, and say that the data frame is now equal to the data frame before, but using iloc to grab every tenth row. So we only keep one tenth of the data. If you do that, look how much it thins this out: you can see individual data points. You'll want to make sure you say in the text that you did this, so that readers realize what you actually did. It's always important to say what you did if you modified your data in any way, but I think this is fine. You can say that the data shown is downsampled to one tenth, and they'll understand that. I think it makes it a little clearer to see.

We grab the six different data sets: 30, 60, and 90 when it's turned on, and then 30, 60, and 90 when it's turned off. You can see the transparency as a function of wavelength changes: here it's much more transparent, here it's much less transparent. So here's where we grab those data sets. Again we read them in, and we use dropna so that we drop any empty columns or empty rows as we read them. I manually set the limits for the plots up top here. I say that the x minimum and x maximum are 400 and 800, so it's plotting from 400 to 800 on both of these plots, and on y it's plotting from 60 to 120. Just like before, we're going to grab some nice
colors. You can define whatever you want yourself, or you can have seaborn produce some nice ones. I'm using the rocket palette again, but since there are six data sets this time, we ask for six colors from the rocket palette. You see that it goes sequentially from this dark purple to a lighter red, from red to orange, and eventually to pink. It's a nice sequential series that I think looks good.

Now we prepare a multi-panel plot. This first line looks just the same as before, but here it gets different: gs equals gridspec.GridSpec, four by four. Instead of just one plot, it chops the figure up into a grid with four rows and four columns in which we can create plots. I'll show you how to do that in a moment. Then there's this gs.update with wspace and hspace, which are the width spacing and height spacing. For example, you see right now that there's this gap between the two panels. We can change that with the width spacing. What if we want a really, really big gap? Let's put one there, and it puts a big gap there, which really separates these. Or if you wanted no gap at all, you could put zero, and they'll butt right up against each other. Maybe you want that. I tend to think things look a little better with a small gap, so I usually do 0.2 or 0.25, and they look pretty good. You can clearly see it's two different sets of data, but it's obvious that you only need this one label over here. You don't need another label over here for this one, because they line up nicely. That's my opinion; do what you want.

All right, let's generate this first panel over here. To do this we have to create a subplot. So xtr_subplot is equal to our figure, which we created up above, and the figure has a function called add_subplot, where it's going to
create a subplot. And where does it create it? It uses this grid spec. In the grid spec, the two entries represent the rows and then the columns. We want it to occupy all four of the rows (remember, it's a four-by-four grid), but we only want it to go from zero to two on columns, so it uses only half of our columns. That makes this panel. Then you just plot your data. We use plt.plot again; you've seen this before. It's just the x, the y, and whatever other values you want. We're giving it markers and colors; very straightforward.

We give it ticks, and I do this a couple of ways. By default, this creates ticks from zero up to the maximum you want, spaced by that maximum divided by four, so it makes four ticks. On this one that ends up creating 400, 600, and 800, and then the next panel starts at 400, so the 800 and the 400 from the two plots pile up on each other, and it doesn't look good. So here I chose to manually set where I want them, and you can do that too: you can manually choose where you want your xticks and yticks to be. Here I say I want my xticks to go from 100 up to 701 in increments of 200. Remember, if I'd only gone to 700, it would drop that 700, because the stop is not inclusive. It would give 100, 300, 500, but not 700. So I go to 701. And you don't have to start from 100; in this one we could start from 300 and it wouldn't look any different, because that part is off the plot anyway. Then for the y I want to go from 60 to 121 in increments of 20: so 60, 80, 100, 120.
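The downsampling, grid layout, and hand-built ticks described above might look roughly like this. It is a minimal sketch with a synthetic spectrum and hypothetical column names; only the grid shape, the slicing, the rc settings, and the arange arguments follow the description in the video.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib import gridspec

# Global settings: font size 14 everywhere, and TrueType fonts in saved PDFs
# so the text stays editable in a vector graphics editor.
matplotlib.rcParams["font.size"] = 14
matplotlib.rcParams["pdf.fonttype"] = 42

# Synthetic stand-in spectrum (assumption); column names are hypothetical.
wavelength = np.linspace(400, 800, 1000)
df = pd.DataFrame({"wavelength": wavelength,
                   "transparency": 90 + 20 * np.sin(wavelength / 60)})

# Keep every 10th row; this should be stated in the figure caption or text.
df = df.iloc[::10, :]

fig = plt.figure(figsize=(8, 4))
gs = gridspec.GridSpec(4, 4)          # 4 rows x 4 columns
gs.update(wspace=0.25, hspace=0.25)   # small gaps between panels

# Left panel: all four rows, columns 0 and 1. Right panel: columns 2 and 3.
left = fig.add_subplot(gs[0:4, 0:2])
right = fig.add_subplot(gs[0:4, 2:4])

for ax in (left, right):
    ax.plot(df["wavelength"], df["transparency"], marker="o",
            markerfacecolor="white", markersize=4)

# Stop values are exclusive, hence 701 and 121 rather than 700 and 120.
xticks = np.arange(100, 701, 200)
yticks = np.arange(60, 121, 20)
for ax in (left, right):
    ax.set_xticks(xticks)
    ax.set_yticks(yticks)
    ax.set_xlim(400, 800)
    ax.set_ylim(60, 120)
```

With these ticks, neither panel draws a label at 400 or 800, so the labels at the shared edge no longer collide.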
So that's how I set my major ticks on this one; I just did it by hand because I thought it would look better that way. I tell it to turn on the minor ticks and give it information about them: I want my minor ticks to be length 5 and my major ticks to be length 10. I want both minor and major ticks all the way around, left, right, top, and bottom. Then you just tell it to plot the xticks and yticks using the ranges that, in this case, we built by hand. Then we set the limits, plotting from 400 to 800 and from 60 to 120. That's what this section of code is.

We give it some labels. We label the transparency using ylabel. Now, I do not use xlabel for the wavelength, because look what happens if I do: it moves the label over here to the left, and it's no longer centered. But really this wavelength corresponds to both of these data sets, so I'd like it to be centered. You can try to center it by moving it over, but I find it's just easier to place it by hand. So instead we turn off the x label and I plot text at 650 and 52. That's x equals 650, so relative to that 700 it starts somewhere over here, and it starts at y equals 52, so if that's 60, it comes down to 52.
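A minimal sketch of the tick styling and the hand-placed shared label follows. The label strings (including the nm unit) are assumptions, no data are plotted, and the coordinates 650 and 52 are the ones quoted above, given in the left panel's data coordinates.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib import gridspec

fig = plt.figure(figsize=(8, 4))
gs = gridspec.GridSpec(4, 4)
gs.update(wspace=0.25)

left = fig.add_subplot(gs[0:4, 0:2])
right = fig.add_subplot(gs[0:4, 2:4])

for ax in (left, right):
    ax.set_xlim(400, 800)
    ax.set_ylim(60, 120)
    ax.minorticks_on()
    # Major and minor ticks on all four sides, pointing in.
    ax.tick_params(which="major", length=10, direction="in", top=True, right=True)
    ax.tick_params(which="minor", length=5, direction="in", top=True, right=True)

# Only the left panel carries the y label; the right panel shares it visually.
left.set_ylabel("Transparency (%)")   # hypothetical label text
right.tick_params(labelleft=False)

# Instead of xlabel (which would center under the left panel only), place the
# label by hand: x=650 is near the right edge of the left panel, y=52 is just
# below the bottom of the axis at 60, so the text straddles both panels.
label = left.text(650, 52, "Wavelength (nm)")  # hypothetical label text
```

The exact x position takes some trial and error and depends on the figure size and font size.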
When I run that, I find that it centers it pretty nicely. It takes a little trial and error. Maybe you don't think that's quite right; maybe you think it should be 645. You can scooch it over just a hair, and maybe that's better. That's how I do a lot of these multi-panel plots, because I want this label to correspond to both panels, and it's easier to place it myself.

Now let's create the second panel. Again we create another xtr subplot using fig.add_subplot with the grid spec. Again we want to use all four of the rows, but we only use two of the columns, the ones at the end, so that goes from two to four. We plot our data just like before, except that we change the colors and change the labels to represent 30, 60, 90 clear as opposed to the 30, 60, 90 of the first panel. We do the ticks and tick parameters exactly the same as before. Everything is the same, except you'll notice that I did not turn on my y label. What would happen if I turned on the y label for this second panel? If I turn it on, look what it does: it puts that y label right there, and that just looks ugly. We don't want that, so we turn it off, because it's obvious from the layout that the label corresponds to both. When I run it, you can see that this transparency percent clearly applies to both of these. We don't need the tick labels, and we don't need the y label.

Now, what if you wanted to do what we had initially, with a small box here and another small box here? How would you further modify this? Easy enough. Let's grab this whole second panel, copy it, and make a third panel. I'll call this "generate third panel". Then up here, let's go to our second panel and change things. Instead of plotting all four rows, coming all the way down to four, let's have it plot just the first two. So instead of zero to four, let's
have it go from zero to two. The third one obviously has to do the opposite: it has to go from two to four. Now let's run that, and there you go, we've created this figure. We basically plotted the data twice, once on this smaller grid and once on this smaller grid. Now, what's wrong with this? There are lots of things wrong with it. For one, these numbers don't need to be there, so let's go to panel 2 and turn those off. We go to our labels; this is panel 2, and labelbottom is set to True. Let's set that equal to False, because we don't need that there. But it's not really obvious that this 60 to 120 is the same as the 60 to 120 there, so maybe we need to turn on the labels for the y-axis. Let's turn them on on the right-hand side by setting this equal to True, and we'll do that for both panel 2 and panel 3. So I come down to panel 3 and tell it to plot those on the far right-hand side. Now you can see pretty clearly that this goes from 60 to 120 and this goes from 60 to 120. I think this is a pretty intelligible graph at this point. You can divide it up however you want. Here we did a square grid, four rows by four columns, but you could do whatever you want. In fact, as this video goes on I'll show you other examples where it's very different, three by fifteen or whatever, and that's no problem. So that is how you do multi-panel plotting. It's a really, really powerful tool, and one of the main reasons why I plot things in Python.

Let's go to our next one. Let's talk about how to make a heat map. Heat maps are really cool: they're a great way to show three-dimensional data in a compact way. In a heat map you plot not only x and y, but you plot z as color. You can see in this one, for example, with diffraction, how things change: the lattice parameters change, because the peak positions shift
as we increase the pressure. So how do you go about creating this? Here are our libraries. We grab everything we did before, NumPy, pyplot, seaborn, pandas, and gridspec, and then we grab this colors thing; I'll show you that in a moment. Again, I set my universal font size to 14. Then we grab our heat map data. If you look at where this data came from (let's open it in our data-for-exercises folder), this is how the heat map data was organized when my grad student gave it to me. You've got this first column over here representing Q. If you're not familiar with Q, think of it kind of like two theta. You know that in diffraction you plot the intensity against two theta; Q you get from two theta, it's a transformation of it. So think of it as the position of the peak. Then here are all the intensities for the different pressures. You've got 21 different pressures and all of the intensities associated with them.

So this is one way to organize the data. The problem is that when you plot a heat map in Python using seaborn, what it wants is three columns: x, y, and intensity. So we need to transform this data into three columns. If I insert a couple of columns here, what we want is a column representing pressure, a column representing intensity, and then Q. If this first pressure is at, let's say, 13 gigapascals, we want that 13 repeated for every single Q value in that series, and there are thousands of them: 6,782 entries in a single data set. So we need to repeat that 13 all the way down to here. Then we go to our next pressure, maybe 14 gigapascals, and repeat it another 6,700 times. That's tricky. And then we need to take
all these intensity values: we grab this one all the way to the bottom and move it over into this column, then grab the next one all the way down and put it underneath. So this is tricky; it's some data manipulation, but we can do that, and Python is a great tool for it. I'm going to exit without saving, and let me show you how I went about manipulating that data.

The first thing I did was grab the data from the comma-separated-value sheet using pandas and store it in a data frame. Then my grad student told me: even though we ran Q over about 6,700 data points, we only want to grab rows 504 to 4066, or Q values from two and a quarter up to four, because that's the range all of the data sets have in common. Some of them were only measured from two and a quarter to four, while others go a little further out in one direction or the other, but we only want the block where they're all comparable. Otherwise you'd have missing spots in your heat map, and we don't want that. So we grab the Q values by going to the Q column of the data frame and using the loc function to take rows 504 to 4066. We use dropna and convert the result to a NumPy array, because the seaborn heat map wants a matrix of NumPy arrays. No problem. We do the same to grab all the different pressures: y1 through y21 represent the intensities at the different pressures. So we're grabbing all that data.

Now we need to take that data and turn it into three columns. The first column needs to be Q. Q needs to have that same range of finely incremented values, all those thousands of data
points, and we need that set of Q values repeated 21 times in a row for the 21 different pressures that we measured. We can do that. We say: for q in range of 21. That's a for loop that runs 21 times. It takes Q, which started out as an empty list, and appends this x1 value, which is just our Q values. It repeats that 21 times, adding to Q over and over. Let's run this first section. When we open up Q, it's thousands long, but it starts out at two and a quarter, increases up to four (you can see it get very close to four), and then starts over again. It does that 21 different times as I scroll down, for the 21 different pressures investigated in this study.

Now let's move on to the next one, pressure. Here are the 21 different pressures they had. We start by making a list of those. Then we create a new, empty list called new pressures, and for each pressure in our pressure list, starting with the first one, we create a new NumPy array filled with that value. It has to be as long as the number of Q values we're considering, which turns out to be 3,563. So it puts that same pressure in all those times, to label those points as part of the same pressure group. Then we take our intensities. That's easy: we take all the individual intensities corresponding to the different pressures and concatenate them together, which just stacks them up. Now we've got our three columns: Q, pressure, and intensity. Here I scale the intensity down so that the color map can more easily distinguish the highs from the lows. It's common in X-ray diffraction
that they'll scale it by one-half or a logarithm. Then I say that a new variable called results is the NumPy transpose of the vertical stack of q, pressure, and intensity. It's taking those individual arrays and making a matrix out of them. Now we say that the data set is equal to, well, we're creating a dictionary: Q is this, pressure in GPa corresponds to pressure, and intensity is intensity. We're going to use those for labels on the plot. Then we come over here, reset the index, and do a pivot table, telling it to use q as the index, pressure as the columns, and intensity as the values for the heat map. So you're explicitly telling it which one of those you want to be in the heat map.

Then you make your figure using GridSpec. That is basically what we need to be able to create this middle panel. But you'll notice that I think a really good way to do heat maps is to not just show this middle region but to show what the individual data sets on the far left and the far right look like. So here's the one on the far left, the one at the lowest pressure, and here's the one at the highest pressure. This is technically going to be a multi-panel plot: you've got one panel over here, one middle panel, and then a right panel. So let's do that. We create our figure and we do GridSpec. I'm going to call this a four by six plot, so four rows and six columns. That allows us to do one column here, four in the middle, and then one column at the end. That's what the six columns are for. Here's the spacing: I put really small horizontal spacing between these, but you could make that larger if you wanted. I thought it looked better with a really small spacing on this one,
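The reshaping steps described in this part of the walkthrough (repeat q once per pressure, label each block with its pressure, stack the intensities, then pivot) can be sketched as below. This is a minimal sketch on synthetic data, not the script from the video: the column names, the tiny array sizes, the middle pressure value, and the use of np.tile and np.repeat in place of the explicit loops are all assumptions. Reading "scale it by one-half" as a square root is also an assumption.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data: 5 q values and 3 pressures
# (the video uses 3563 q values and 21 pressures).
q_values = np.linspace(2.25, 4.0, 5)
pressures = [13.06, 25.0, 43.0]  # 25.0 is a made-up middle pressure
rng = np.random.default_rng(0)
intensity_columns = [rng.random(q_values.size) for _ in pressures]

# Column 1: the whole q range repeated once per pressure.
q_long = np.tile(q_values, len(pressures))
# Column 2: each pressure repeated once per q value.
p_long = np.repeat(pressures, q_values.size)
# Column 3: intensity columns stacked end to end, then scaled
# (square-root scaling is an assumption about what the video does).
i_long = np.sqrt(np.concatenate(intensity_columns))

long_df = pd.DataFrame(
    {"Q": q_long, "Pressure (GPa)": p_long, "Intensity": i_long}
)

# Wide matrix for a heat map: rows are q, columns are pressure.
heat = long_df.pivot_table(
    index="Q", columns="Pressure (GPa)", values="Intensity"
)
```

The resulting heat data frame has one row per q value and one column per pressure, which is the shape a call such as sns.heatmap(heat) expects.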
so I might keep that small. Maybe we'll do just 0.1. All right, now let's fill in data in our first column on the left here. This is not a heat map over here, it's a scatter plot. You can see that as the intensity changes, the color changes. So we add a subplot with fig.add_subplot using our GridSpec. We're going to use all four rows here, but we're only going to go from 0 to 1, this first column. Great.

Now, the original colors. If you tell it to do a scatter plot and you just give it the data, it's going to take the maximum of the color range. It's going to be a bright, bright yellow; this brightest yellow here would be there. But you can see that in this first data set, the brightest it gets is less bright than in one of the later data sets. And on this one over here, this brightness, even though it's the maximum for this highest pressure, is less intense than it was in the middle. So what I needed to do was manually change my color maps on these end members to more closely match the heat map. Actually, I had the internet write it for me: I found on Stack Exchange a way to truncate your color map, where you don't use the entire range. I'm going to skip this for now. You can look at how I did it, but I'm not going to bother explaining it. Basically we started out with the rocket color palette and we only went from 10 percent to 55 percent of it. That's what you're seeing here. As opposed to going from zero to one hundred percent of the color map, we only needed part of it so that it more closely matches the heat map. That's all I'll say about that.

Okay, we did a scatter plot. Check this out: I'm sending it y1 and x1. That's because I've got this thing rotated on its side. The x1 is actually now going to be the y value and the y1 is going to be the x value. The size I set equal to one, and the color I set equal to the y1
intensity, and then I gave it our new color map. I did not give it an x label. We didn't need an x label over here; that's just intensity. I didn't give it any tick marks on that axis either. For the y label, I told it that it was in q space, which is 2 pi divided by d, the interplanar spacing, which has units of inverse angstroms. And then I labeled that this was the pressure corresponding to 13.06 gigapascals. So that's how we got this plot.

How did I make it plot in this funny way, where it's plotted backwards? Well, I set the y limit from 4 to 2.25. That's why it's going from small numbers to high numbers here. You can tell it how you want to plot. I told it I want to start at 4 and go to 2.25, so I plotted it backwards from what you would normally do, but that allows us to line up this peak here with that peak there, this peak with that peak. It lines up nicely. So I plotted that with the correct limits and I put the right ticks on it: I wanted to go from 2.5 in increments of one half, so you can see it clearly. And then I wanted to turn on the left and the right tick marks but not the top and the bottom. Pretty straightforward. So that's our left wing.

Now let's do this right wing over here. The panel on the far right is going to be exactly the same. Check it out: we add a subplot with fig.add_subplot and use all four rows, this whole region top to bottom, but we're only going to go from columns 5 to 6, right over here. We do a scatter plot again using our new color map so that it matches the heat map. By the way, what would it look like if you didn't do that? What if I didn't use the new map? Let me show you what the problem is. If I just put in rocket here, if I tell it to use the rocket map, take a look at this. It's going to take the top of that peak value and make it as bright white as possible. But we know that bright
white is not what's actually here. The bright white is reserved for these, which are the highest intensities in the actual whole data series. But it didn't know that when it did a scatter plot. It just took the maximum value in the data that was given in this series, in this one pressure, but in the whole data series that wasn't the highest. That's why we had to map it over to this new cmap. That's why we made this change where we reduced the range, and that made it look much more like it ought to. That was just starting to turn purple, and there you see it's matching; this is just turning red, and it's just turning red there. So it's a much closer match. Granted, I just eyeballed that, but I think it made this easier to interpret.

All right, on this one I didn't need any labels, so I just turned them off. I didn't need any tick labels, just tick marks, so we just turned on the tick marks. And then again I told it that this corresponds to 43 gigapascals. So, pretty straightforward stuff you've seen. Now, what about the heat map? For this middle one we add a subplot and we plot from column 1 through column 5.
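The three-panel layout and the truncated color map described above can be sketched roughly as follows. This is a hedged illustration, not the code from the video: truncate_colormap is a hypothetical helper written in the style of the commonly shared recipe, matplotlib's built-in "magma" stands in for seaborn's rocket palette, and the Gaussian peak is synthetic data.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import LinearSegmentedColormap
from matplotlib.gridspec import GridSpec


def truncate_colormap(cmap, lo=0.0, hi=1.0, n=256):
    """Hypothetical helper: new colormap spanning only lo..hi of cmap."""
    return LinearSegmentedColormap.from_list(
        f"{cmap.name}_trunc", cmap(np.linspace(lo, hi, n))
    )


base = plt.get_cmap("magma")  # stand-in for seaborn's rocket palette
side_cmap = truncate_colormap(base, 0.10, 0.55)

# Synthetic single-pressure pattern: one peak in q space.
q = np.linspace(2.25, 4.0, 200)
intensity = np.exp(-((q - 3.0) / 0.05) ** 2)

fig = plt.figure(figsize=(8, 4))
gs = GridSpec(4, 6, figure=fig, wspace=0.1)
ax_left = fig.add_subplot(gs[0:4, 0:1])   # lowest-pressure pattern
ax_mid = fig.add_subplot(gs[0:4, 1:5])    # the heat map would go here
ax_right = fig.add_subplot(gs[0:4, 5:6])  # highest-pressure pattern

for ax in (ax_left, ax_right):
    # Rotated on its side: intensity along x, q along y.
    ax.scatter(intensity, q, s=1, c=intensity, cmap=side_cmap)
    ax.set_ylim(4.0, 2.25)  # reversed limits, as in the walkthrough
    ax.set_xticks([])

fig.savefig("heatmap_layout_sketch.png", dpi=100)
```

Because the side panels only use the 10 to 55 percent slice of the color map, their brightest points stay darker than the brightest cells a full-range heat map would show in the middle panel.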
That's going to give us this big section in the middle for the heat map to be in. The heat map itself is pretty simple. You just go fig = sns.heatmap, that's seaborn's heatmap, and you send it the data set. We already packaged up that data set up here. Where was it? Right here: here's where we defined our data set and told it what the index, the columns, and the values for the intensity should be. Great, so it's ready to go. We told it to turn off the tick labels on the y-axis; we didn't want any there. For the x-axis labels we just gave it four of the data sets, and I manually told it where to insert those: on the zeroth, 7th, 13th, and 20th data set. And that's it, and then we're ready to save it. So that's one way to do it. There are other ways. You can do imshow, so I'm going to leave these in; they're commented out. Those will also produce nice color maps, but I think that this does a pretty slick job, and I think it's the right way to do heat maps, where you show the end members on the sides like this so you can really see what's happening in higher resolution than the heat map alone would tell you. Okay, so that is how you do heat maps. Let's move to our next one.

This one's pretty gnarly. This is multiple axes. What do I mean by multiple axes? I mean that when you plot, you've got your x and y, but maybe you want a y2 shown in red and a y3 shown in light blue. All three of these y's, capacity, coulombic efficiency, and discharge energy density, are three different variables, but they all correlate to the same x variable, which is cycle number. For batteries, when you cycle the battery, cycle number is really important, but there are lots of other parameters: capacity, efficiency, discharge energy density. Those are all a function of cycles. So how do you create a plot like this? It's pretty tough. Here's how you go. We're going to bring in the same
libraries, plus this MultipleLocator from matplotlib.ticker. That allows us to do different tick marks for these; we're going to use that in a minute. First I'm going to set everything to be size 14 font. We're going to grab our data, which is stored in this multiple axes Excel file. You can see for yourself how it works. Instead of reading it in as a CSV, this time we read it as an Excel sheet. That's fine, but when you read it as an Excel sheet you've got different worksheets in there, so you have to go sheet by sheet: for each sheet in the sheets of our data frame, we read it in. That brings each of those in and stores it into this data frame. No problem. Now from our data frame we grab the different values. We've got the one with cycling 100 times, cycling 20 times, the capacity, the efficiency, the discharge energy density. So we're grabbing all these different values and storing them into these variables here. Nothing too crazy so far. Let's grab some colors; here we used Rom's colors.

All right, we had to build a custom function here, and when I say we, this is all my grad student Kai, who's brilliant. See how there's this other axis over here, but we're only showing the far right-hand one? The horizontal bar going out to it has been made invisible. He wrote a function to do that, where you send it a set of axes, whether it's the regular y-axis, the y2, or this y3. It makes them invisible, and then we turn on just the far right-hand side afterwards. So he wrote a function to do that over and over for us.

Now here's where it gets very different from what we've seen before. Instead of just doing figure = plt.subplots, we do figure, host. So the figure is now going to have its regular host axis; that's the typical x and y. In addition to that, we can have twinx. So we're going to come down here and see how we're going to take our host, which is our normal y
axis, and we're going to twin it, meaning we're going to create a second y-axis that uses the same x values but has a duplicate y-axis. The function is called twinx. We're going to call that par1, so par1 is this red axis. For par2 we do it again, and that's going to create this blue axis over here. So we're making twins of this x and y axis setup. And then we can offset them. We moved the position of this light blue one. The red one was fine, we just put it on the right-hand side, but this one we moved over a little bit.

What else can we do here? He uses the function that he wrote, make_patch_spines_invisible, and he sends it par2, basically turning off all of the box. You can see this normal black box is there; he's turning that off, and then he's going to manually turn on the right-hand side. He's going to turn on this blue section on the far right-hand side for par2, making the rest of it invisible. So, some slick work there. Now he goes ahead and plots his data. On the host he's plotting some data, on par1 he's plotting different data, and on par2 he's plotting other data. You're using the same approach as before, but where before we saw plt.plot, here it's not plt.plot, it's host.plot or par1.plot, because you're being specific about which y-axis you want these to plot against. He's giving them the markers that he's chosen, the line styles, the marker face color. Here they used some alpha, meaning transparency. You can see that these are piled up on each other, and that allows you to actually see through them to the ones underneath.

For each one of these three y-axes you need to set axis limits independently, plus your x-axis, so there are four of them. We're going to set our x to go from 0 to 100 cycles. We're going to set our first axis
to go from 0 to 1200, our next one to go from 0 to 105, and our third one to go from 300 to 1800. We set these independently by doing host.set_xlim, as opposed to, say, par1.set_ylim or par2.set_ylim. We can do the same thing with the labels. We need to create four different labels: one for here, one for there, one for here, and one for there. So we set our four different labels. We can change the colors of those labels, and here we're doing that. We can change the major tick mark separations. Here they're separated every 200, but here they're separated every 20, and here they're separated every 200. So you can change the different tick marks, and here we're doing that for the host, for par1, for par2, and for the x-axis. It's a little gnarly, but you can do it, and you've got this as a guide that I've made for you. Then the tick parameters: we're telling it how long we want the ticks to be.

The last thing is that these data sets were plotted on different axes. One was on the host, one was on par1, and one was on par2. So if we plotted legends, that's a problem, because they weren't all part of the same axes. We'd split them into sort of three figures in one. What we needed to do is grab all the information from those different plots that we created, which were up here. These were the plots that we created when we plotted them, and from those we get the labels: for l in lines, so for each one of these data plots, we grab the label that was there. And we use those to plot the labels in our legend on just one of them. On just the host plot, we show all of the plots, even though they were technically plotted on par1 and par2.
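The multi-axis recipe above (a host axis, two twinx axes, independent limits and tick spacing, one combined legend) can be sketched as below. This is an illustrative sketch with synthetic battery data and made-up colors and offsets, not the script from the video. It also leaves out the make_patch_spines_invisible helper, on the assumption that a recent matplotlib version is used, where axes created by twinx already hide everything except their right spine. The spine coloring loop is one approach that may work in place of the manual edit in Illustrator; treat it as a suggestion rather than something shown in the video.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import MultipleLocator

# Synthetic stand-in data for a cycling experiment.
cycles = np.arange(1, 101)
capacity = 1100 - 3 * cycles
efficiency = np.full(cycles.shape, 99.0)
energy = 1700 - 5 * cycles

fig, host = plt.subplots(figsize=(7, 4))
fig.subplots_adjust(right=0.75)  # leave room for the offset third axis

par1 = host.twinx()  # second y-axis sharing the same x
par2 = host.twinx()  # third y-axis sharing the same x
par2.spines["right"].set_position(("axes", 1.2))  # push it outward

(p1,) = host.plot(cycles, capacity, "o", color="black", alpha=0.5,
                  label="Capacity")
(p2,) = par1.plot(cycles, efficiency, "s", color="red", alpha=0.5,
                  label="Coulombic efficiency")
(p3,) = par2.plot(cycles, energy, "^", color="skyblue", alpha=0.5,
                  label="Discharge energy density")

# One x range and three independent y ranges.
host.set_xlim(0, 100)
host.set_ylim(0, 1200)
par1.set_ylim(0, 105)
par2.set_ylim(300, 1800)

# Independent major tick spacing on each y-axis.
host.yaxis.set_major_locator(MultipleLocator(200))
par1.yaxis.set_major_locator(MultipleLocator(20))
par2.yaxis.set_major_locator(MultipleLocator(200))

host.set_xlabel("Cycle number")
host.set_ylabel("Capacity")
par1.set_ylabel("Coulombic efficiency")
par2.set_ylabel("Discharge energy density")

# Color each extra axis to match its data (assumption: this is enough
# to color the right-hand lines that the video fixes by hand).
for ax, line in ((par1, p2), (par2, p3)):
    ax.spines["right"].set_color(line.get_color())
    ax.tick_params(axis="y", colors=line.get_color())
    ax.yaxis.label.set_color(line.get_color())

# Single legend on the host that covers lines from all three axes.
lines = [p1, p2, p3]
host.legend(lines, [l.get_label() for l in lines], loc="lower left")

fig.savefig("multi_axis_sketch.png", dpi=100)
```

Collecting the line handles as they are plotted is what makes the single legend possible: each axes only knows about its own artists, so the host legend has to be handed the other two lines explicitly.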
See what we did there? That allows us to have just one legend here with all that information. Now, in this one, something that bugged me is that I was never able to get this far right-hand line to turn red and this far one to turn light blue. I didn't want to fight with it anymore, so instead I exported it as a .eps. That means it's a vector file, and we can open it up in Illustrator and I can manually change those things. So if you come over here to where it got saved, under data and then plotting, this is now something that I could open in Illustrator and manually make that one change, to make that one line red and the other one light blue. So that's how you go about making these really kind of gnarly multi-axis plots. Let's go on to our next one.

Okay, just two left. Let's do Rietveld first. If you are a materials scientist, very likely you're familiar with Rietveld refinement. It has to do with x-ray diffraction. When you plot x-ray diffraction, you've got intensity versus two theta, but it's very common for you to then say, okay, here's what I think my phase is; what would its pattern look like, and how well does that match the observed pattern? The experimental data are shown as blue data points here, and then we have our simulated pattern in this orange color. These are simulated where we expect the peaks to be for the given phases. And then you can even include the background, the sample holder, which is this light blue. Then you take the difference, the blue points minus the yellow points, and you get this difference curve. If it matches really well, then you did a good job of figuring out what your phases were. If you haven't heard of this before, don't worry about it. It's a whole field, and there are great software packages out there. I'll probably put a link to one of the tutorials in the description below. But the output is a lot of information: it's the real data, it's the fit,
it's the difference, it's the sample holder, it's where you expected the peaks to be for the different phases. These are complicated figures, so I think that this is a beautiful way to present them. It's dense with information and it just gives you what you need to know. So how do you make a figure like this? You just use this file exactly as it is. When you run your Rietveld refinement, it will spit out your final report, which is going to have all this information. I made it really simple, so that all you have to do is export it and then I read it. It will come with these names already there, so you just run this Python script and it will create that figure for you, just like that, which is really great. You've got your x values, that's your two theta values. You've got your observed data points, your fit, the background, and then the hkl positions for the different phases that you refined. Here I only did two phases in my refinement. If you did 22, it would just keep giving you more columns, and you could just keep reading them in. That would be no problem. So that's what GSAS-II, the Rietveld refinement software, will spit out.

How do you use that? Pretty straightforward. We grab the same libraries as before. We set the font and the PDF settings so that it's editable, in case we want to tweak this in Illustrator afterwards. We read that file in. It's a CSV, so we use pandas to read it in. Then we go to the data frame where we read it, and we individually grab the x data, the observed data points, the Rietveld refinement fit, the background, and the hkl positions for the two different phases. I'm going to set the limits of my plot. We want to plot from 15 to 100; that's pretty common, and 10 to 90 is also pretty common. We're going to use some colors here. I think Rom's colors look great: that blue, the pumpkin, the cerulean, the neptune. His colors look really good, so we're going to
use them. We're going to make a multi-panel plot. There's this one big panel where most of the action is happening, but you've also got these really small panels up here corresponding to the phases and where you expect to see peaks. Take this yellow peak. Now you know that it comes from this Co3O4 phase and not cobalt oxide, because we put in these histogram tick marks. That's really helpful. And this really big yellow peak comes from the CoO phase, the rock salt phase, as opposed to the spinel phase. So that's why we do those. We're going to do a multi-panel plot with GridSpec, and we're going to do 15 different rows. The first two rows can be for these histograms and the other 13 can be for the main figure. You can divide it however you want, but that's how I do it, with small spacings between them. All right, let's make this main figure first, this big one. We do the same as before: we add a subplot with fig.add_subplot, we use GridSpec, and we go from rows 2 to 15.
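A minimal sketch of this layout, assuming synthetic data: two thin rows on top for the phase tick marks and thirteen rows below for the pattern, with the difference curve computed inline as fit minus observed. The peak positions, phase labels, colors, and variable names here are invented for illustration and are not the output of any refinement program.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.gridspec import GridSpec

# Synthetic stand-in for a refinement report.
two_theta = np.linspace(15, 100, 500)
hkl1 = np.array([31.3, 36.8, 59.4, 65.2])  # made-up peak positions, phase 1
hkl2 = np.array([36.5, 42.4, 61.5])        # made-up peak positions, phase 2


def peaks(positions):
    """Sum of narrow Gaussian peaks centered at the given positions."""
    return sum(np.exp(-((two_theta - p) / 0.3) ** 2) for p in positions)


background = np.full_like(two_theta, 100.0)
y_fit = 1000 * peaks(hkl1) + 1500 * peaks(hkl2) + background
rng = np.random.default_rng(0)
y_obs = y_fit + rng.normal(0, 20, two_theta.size)

fig = plt.figure(figsize=(8, 6))
gs = GridSpec(15, 1, figure=fig, hspace=0.1)
ax_p1 = fig.add_subplot(gs[0:1, 0])     # tick marks for phase 1
ax_p2 = fig.add_subplot(gs[1:2, 0])     # tick marks for phase 2
ax_main = fig.add_subplot(gs[2:15, 0])  # the pattern itself

ax_main.plot(two_theta, y_obs, "o", mfc="none", ms=3, label="Experimental")
ax_main.plot(two_theta, y_fit, "-", label="Fit")
ax_main.plot(two_theta, background, "-", label="Background")
# The difference is computed inline rather than stored as its own data set.
ax_main.plot(two_theta, y_fit - y_obs, "-", color="gray", label="Difference")
ax_main.set_xlim(15, 100)
ax_main.set_xlabel("2 theta (degrees)")
ax_main.set_ylabel("Intensity (arb. units)")
ax_main.set_yticklabels([])  # arbitrary units: keep ticks, drop labels
ax_main.legend(fontsize=8)

# One vertical line per expected reflection, however many there are.
for ax, positions, name in ((ax_p1, hkl1, "Phase 1"), (ax_p2, hkl2, "Phase 2")):
    for xc in positions:
        ax.axvline(x=xc, color="gray")
    ax.set_xlim(15, 100)
    ax.set_xticks([])
    ax.set_yticks([])
    ax.text(1.01, 0.5, name, transform=ax.transAxes, va="center")

fig.savefig("rietveld_layout_sketch.png", dpi=100)
```

Looping over the positions with axvline means the same code handles a phase with five reflections or thirty, since nothing depends on knowing the count in advance.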
We're going to leave rows 0 to 2 for making these little histograms up top. We plot the data: we plot the x data and the y observed, and we call it the experimental data, because it's experimental data. I like to use an actual marker, and that's why you're seeing these little blue markers here. I didn't use a fill color because I think it's kind of nice to be able to see how those line up on top of each other. Then for the fit we don't use markers, but we do use a line, and we give it this color, and we give it this nice label telling it that we achieved a five percent weighted residual. Same thing with the background: we don't use markers, we use a line. So that's what this light blue is. And then the difference, check it out: I never actually defined the difference as a data set. I just do y fit minus y observed. That is the difference. So that's this gray data set down here at the bottom.

Great, we've got all of our data there. Now we need to add the other stuff: tick labels, legend. You've seen all this before. We manually have it tick every 20 on here. I don't put tick labels on the y-axis because it's arbitrary units, and we don't need them. So I put ticks, I just don't put tick labels. It's ticking every 2000 intensity units, and I just don't care what those are, so I don't label them. I turn on my ticks on all four sides. We give it x and y labels; we've seen this already. We give it a font size for the legend. I had to make this a little bit smaller than I would normally do to make it fit. That's the name of the game here. And then I actually plotted text on here to say, hey, these two phases are present: it's 60 percent rock salt, 40 percent spinel.

Okay, now how do we make these upper sections? Here we add the subplot for phase one. This is pretty cool. We plot from row zero to row one, so it's going to be just this top sliver. And here's how it works. When you run your refinement, you don't know how many hkls
are going to be present for the different phases. It could be 30, it could be five, it could be any number. We know that those are stored in this variable hkl1. If I come over here and look at hkl1, here's where they're stored. What I want to be able to do is go through this item by item and plot each one of them until there are no more. That way, even though hkl1 and hkl2 have very different numbers of entries, I use the exact same approach for both. So for hkl1 I say for xc in hkl1.values. It goes value by value in that list, and we're calling each one xc. It plots a vertical line, axvline, at x equal to that position, and gives it a color. That's it. Then you can do your tick parameters and the limits and all that jazz, just like before, to create this little mini plot. Here I don't want any tick marks, I just want the vertical lines, and I do want to plot the name of that phase, Co3O4, off to the side. Then I do the exact same thing for my second phase. I go value by value in my hkl2 value list and plot little gray lines so that you can tell where they line up. Again, I think it's just a terrific way to make a complicated plot, and one that is much better than the standard. You don't want to print the screen from your Rietveld refinement software. This is the right way to show it, I think. Okay, so that's this one. Let's do one last one and then we're done.

Let's talk a little bit about plotting in 3D. This comes from a really great tutorial I found on this guy's GitHub page, on the Data Science Handbook. It's actually taken from a textbook, and the books in this same family are really, really great. I've put the reference up top so you can check it out, and this is available under a Creative Commons license, so credit to them for making this available. I think it's good stuff. For the libraries, from mpl_toolkits we import mplot3d. That's going to allow us to make these 3D plots.

Let's start with just a basic 3D scatter plot. We declare our figure, figure = plt.figure. We didn't set the sizes here because I took it straight from their code, so I didn't make it square or whatever. We say that the projection is 3D. Now let's get the data we want to plot. We need three-dimensional data, so we need x, y, and z. For z they use NumPy linspace, which linearly spaces the data from 0 to 15 with 1,000 values. So it's going to produce 1,000 values linearly spaced between 0 and 15. It then takes those values, and for the x and the y data it just runs them through sine and cosine. That is now our x, y, and z data. Plotting is as simple as this: you take ax, that's our axes, which they define here, and call ax.plot3D, where you send it x, y, and z, and then they tell it to plot in gray. So let's go ahead and run this first cell. By the way, this is a cell. You see how I did this pound sign, space, percent, percent? That allows you to hit control-enter and it will just run what's in this cell, which is kind of nice. So now it went ahead and created a plot. Oh, I ran this other stuff too. This first part that we've gone through so far would have just produced the gray line, this sort of spring shape there. Now, how do they get these other data points? Well, they say that z data is equal to 15 multiplied by a random number, where they produce 100 random numbers between 0 and 1 and multiply those by 15.
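The helix and the scattered points around it can be sketched as follows. This follows the approach described in the video, with a few labeled assumptions: a seeded NumPy random generator is used so the result is repeatable, the noise scale of 0.1 is a guess at "a small random number", and fig.add_subplot(projection="3d") is used to create the 3D axes.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits import mplot3d  # noqa: F401  (only needed on older matplotlib)

fig = plt.figure()
ax = fig.add_subplot(projection="3d")

# Gray line: 1,000 z values from 0 to 15, with x and y from sine and cosine.
zline = np.linspace(0, 15, 1000)
xline = np.sin(zline)
yline = np.cos(zline)
ax.plot3D(xline, yline, zline, "gray")

# Scatter points: 100 random heights between 0 and 15, placed near the
# helix with a little noise (the 0.1 noise scale is an assumption).
rng = np.random.default_rng(0)
zdata = 15 * rng.random(100)
xdata = np.sin(zdata) + 0.1 * rng.standard_normal(100)
ydata = np.cos(zdata) + 0.1 * rng.standard_normal(100)

# Color encodes height, using the reversed winter color map.
ax.scatter3D(xdata, ydata, zdata, c=zdata, cmap="winter_r")

fig.savefig("helix_sketch.png", dpi=100)
```

Passing zdata as the c argument is what makes the marker color change as the points climb the helix.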
So now you've got 100 data points which are somewhere between 0 and 15 for your z. Then they run those through sine and cosine, and then they add a little bit of noise: they add a small random number to each of those, and that's what these data points represent. Now they did a scatter3D. Remember, scatter plots encode four pieces of information. You've got x and y, and this time z, so it's going to be five pieces of information: x, y, z, and then marker size and marker color. Here they don't give it a size, so they're all going to be the same size, but they do give it a color. For the color, it uses the z data. So as it goes from low z to high z, you'll notice it's also changing color as it moves vertically. A really cool way to do it, and the color palette here is nice: this is winter_r, winter in reverse.

What about contour and surface plots? Here's another way to do it. What if you wanted to make a plot that looked like this? This is another type of 3D plot. Here's how you do it. They defined a function where, when you feed it x and y, it returns the sine of the square root of x squared and y squared added together. So it's some custom math function that has this sort of shape. They start by creating the data for x and y. They use linspace from negative 6 to positive 6 with 30 values. They then take those and create a mesh grid. This is a NumPy function, NumPy meshgrid. It basically fills out the space between those: you gave it x, you gave it y, and it creates a grid in that Cartesian space. Then they send those values to the function to create the z values, and then they plot it. They come over here, they tell it that the projection is 3D, and here they're doing plot_surface, where they send it the x, y, and z values. These are rstride and cstride; that's the step size for the rows and the columns. They give it a color palette, viridis, and they say no edge
color, and that's what produced this beautiful shape here. Then they labeled it x, y, and z, and they gave it a view angle. If you wanted to view it right along the 45, you could; now we're looking right along that 45 there. Or you can move it back to whatever you wanted. If you wanted to tilt it down a little bit so it's not so tilted up, you could tilt it way down so you look more right at it. You can play with that view angle a little bit. That's one way to do it. Another way to plot this, and here I'm going to uncomment this and run it, is to plot it as contour lines. If you look at topographical maps, they use these contour lines. Let's look straight down on it. If we look straight down on it, this will now look like a topographical map. Oh, oops, we want to just run this section. F9 on that section, yeah, and sure enough, there's our topographical map of just that region, which is pretty cool.

So that is a bunch of plots, and I'm sure that there are others. You can drop a comment below if there was a type of plot that we didn't produce here today and you'd like to know how to go about doing it, and if I'm able to, I'm happy to show you how we would go about doing that. I think it's worth the time, because you can make some really complex but beautiful plots with this type of software. So thanks for joining me as we talk about plotting, and we'll move on to our next video.
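The surface and contour plots from the last part of the video can be sketched as below. The function, grid, strides, and color map follow what is described in the transcript; the specific view angles, the number of contour levels, and the side-by-side layout are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np


def f(x, y):
    """Sine of the distance from the origin, as described in the video."""
    return np.sin(np.sqrt(x ** 2 + y ** 2))


# 30 values from -6 to 6 on each axis, expanded into a 30 x 30 grid.
x = np.linspace(-6, 6, 30)
y = np.linspace(-6, 6, 30)
X, Y = np.meshgrid(x, y)
Z = f(X, Y)

fig = plt.figure(figsize=(10, 4))

# Left: surface plot with a step of 1 along rows and columns, no edges.
ax1 = fig.add_subplot(1, 2, 1, projection="3d")
ax1.plot_surface(X, Y, Z, rstride=1, cstride=1,
                 cmap="viridis", edgecolor="none")
ax1.set_xlabel("x")
ax1.set_ylabel("y")
ax1.set_zlabel("z")
ax1.view_init(elev=30, azim=45)  # assumed angles; adjust to taste

# Right: the same data as 3D contour lines, viewed from straight above
# so it reads like a topographical map (50 levels is an assumption).
ax2 = fig.add_subplot(1, 2, 2, projection="3d")
ax2.contour3D(X, Y, Z, 50, cmap="viridis")
ax2.view_init(elev=90, azim=-90)

fig.savefig("surface_contour_sketch.png", dpi=100)
```

In view_init, elev controls how far the camera is tilted above the x-y plane and azim rotates it around the z-axis, so elev=90 gives the straight-down view.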
Info
Channel: Taylor Sparks
Views: 7,093
Id: fwZahTYfyxA
Length: 56min 44sec (3404 seconds)
Published: Sun Jan 10 2021