Tidy Tuesday live screencast: Analyzing European energy in R

Captions
Hi, I'm Dave Robinson, and welcome to another screencast. I'll be using R and RStudio to analyze data I've never seen before. As usual, the data comes from the Tidy Tuesday project, an amazing weekly data project in R from the R for Data Science online learning community. If you're tuning in live, I'm excited to see you here. I really love the interactive component of these live screencasts, so as I'm going through this, be sure to ask questions and suggest ideas for the visualizations that I do. I won't always have time to get to all of them, but I really like getting your engagement; that's why I do these live. So let's see what we have today. The only thing I know about this data set is that it is European energy. Let's find out what's in the data. Ooh, a really cool graphic. I don't like to look at other people's graphics, because I like to try and come up with my own. But let's see: it comes from an Excel-based data product, in tidy format. So much great work is done by the R for Data Science community to clean these data sets. I'm going to use the tidytuesdayR package to kick off our project. First, here's my fresh session, and I'll start with tidytuesdayR and, I'm remembering what it's called, use_tidytemplate, so it creates today's Tidy Tuesday template. I load up tidyverse and tidytuesdayR. I'm also going to do theme_set(theme_light()) and library(scales); I always feel like I need all of these. Then I'm going to load in today's data set. It looks like there are two data sets that we're looking at today (is there a readme?): one is energy types and one is country totals. Let's look at them. In energy types there's a country code, a country, and there are columns for three years. I might be tidying this, because it's hard to work with these
three separate years. There's level one and level two, where level two makes up level one, which makes up the total. I don't understand what that means. It doesn't look like there's a total, level one, or level two; that doesn't quite look right. I don't see any level, so I'm not sure. Oh wait, hold on. Okay, level is limited to level one or two, and there's production by type: conventional thermal (fossil fuels), nuclear, hydro. Okay, level and type. Let's try counting level and type and see what they are. Aha, so level two in this case is just pumped hydro power. So the description doesn't seem to quite be right, but all right. This is what we were looking at: conventional thermal, geothermal, hydro, nuclear. I really don't understand what the levels are. Pumped hydro power, maybe that's a subset of hydro; I'm not 100% clear. It's energy in gigawatt hours. Is this for an entire country? If so, I would have thought that would be maybe in thousands of gigawatt hours or something like that. Let's find out, actually: Belgium gigawatt hours. Okay, this is suggesting no, I could be wrong. So let's think for a second: what is a gigawatt hour? It's been a very long time since college. A watt hour is a power consumption of one watt for one hour, so a 60-watt bulb running for one hour would take 60 watt hours. It's a measure of electricity production. Giga is a billion, so this is suggesting thousands of billions of watt hours. What would this be in Belgium in a year? 30,000. Okay, it's plausible. So here are our types of energy production. I was wrong: it's not necessarily millions of gigawatt hours, it's something like 30,000 gigawatt hours of conventional thermal and 40,000 of nuclear. Great, now we understand. Now I'm trying
to get a real feel for whether I trust this data. Okay, great: gigawatt hours, type, country name. What are our countries? I'm going to say energy_types is tt$energy_types. I don't know of any cleaning I'll do yet. Why do I do that? Because it makes the autocompleting a little bit easier. And I'll count country name. Oh, 37 countries. Great, that's, I imagine, most of the countries in Europe, and I have codes for them, so I can make a graph or put them on a map. So what will we do today? We're probably doing a map, because we have countries. We only have three years, 2016 to 2018, so not an animation; I don't think it's that interesting to do an animation with only three years of data. We're definitely going to be doing a graph of total energy consumption over time, and consumption by type over time. We might make a Shiny visualization. Should we do Shiny, because it might need to be interactive? We could; I didn't consider that. Those are some of the things we can try doing. The other one is that we have country totals. I think that might just be a variation on this. So if we say tt$country_totals and count level, all right, that's just total. And count type: oh, the types here are different. I see, it's things like energy absorbed by pumping, energy supplied, exports, imports. So energy supplied is net production plus imports minus exports minus energy absorbed by pumping. So we might look at things like exports, imports, et cetera. That's going to be interesting. All right, let's get started. What I'm going to do first is look at total energy consumption in Europe, or at least in 37 European countries. So what I'll do is look at
our energy types. I like to gather. In fact, I'm going to do this early on, as part of our cleaning, because I'm not crazy about having these as three separate columns when everything else is tidy. So I'm going to do pivot_wider, or rather pivot_longer; I want it to be longer. I'm practicing with it. I'll say names_to is year, and the columns are, well, vars: they're all the ones where, let's see, starts_with("2"); that's the way I can get the 2016 ones. The names go to year and the values go to gigawatt hours. That didn't work, I suspect because, hmm, let me try gather real quick. What if I tried gather with year, gigawatt_hours, and everything that starts with 2? All right, that one worked. Now, if I said cols equals starts_with("2"), would that have worked? I'm still learning to use pivot_longer. Yes, it didn't need vars. Great, I'll leave the pivot_longer, and I'll also say year equals as.integer(year), because it's not character, it's integer. Now we have it in a tidy form, and I'm going to keep that. Now we'll group by year and type, and we'll summarize total_power equals sum(gigawatt_hours). I'm thinking about what other things you might be interested in. I don't really care about things like average power. Maybe I've got some ideas on variation, but I'm not going to do them yet. The first graph you'd probably want to make with this data would be something like year, total_power, fill equals type, to look at how the consumption has changed. If we had a longer span of time, this might look a little more exciting. What it does look like is that the total amount of power produced is pretty steady, with a little bit more wind than
there used to be and a little bit less conventional thermal, it looks like. I'm also going to reorder this. I have to ungroup, because otherwise it'll still be grouped, and say type is fct_reorder(type, total_power), based on the sum. Notice that now I have this in an order where the biggest ones are on the bottom; this is a little bit more readable. What else would I do for this graph? I would put scale_y_continuous(labels = comma), which comes with the scales package, and I'm going to make this a little bit clearer by labeling it total power in gigawatt hours. I don't know the difference between hydro and pumped hydro power. What is the thing about level two, that level two makes up level one, which makes up the total? Does someone have a good understanding of that? I'm just worried that maybe pumped hydro power should be subtracted from hydro power or something like that; I'm not that clear. Maybe I should look only at level one. Pumped hydro is a sub-measure of hydro, okay. Based on that, yes, I understand; it comes from an Excel spreadsheet. I'm going to say level must be level one, and drop pumped hydro. I could have shown it as a subset, thanks for that suggestion, but I don't think it's quite worth putting in this visualization. I think we're fine as we are. So this is total power in gigawatt hours over time, and this is about as far as I would go with this graph. We learned that conventional thermal (I think that means things like coal), followed by nuclear, followed by hydro, are the largest sources of power. We could also have visualized this as a bar plot for 2018, and it would look something like this. We would have said, we
would have still kept the type ordering, but then I would have said total power... I need to do the grouping, yes. What I'll say is, these are totals, and I'll call it europe_totals. In fact, I'll be using that in both steps, so I'll put this in as a step in europe_totals. So I have two steps here: one where I get the energy totals, and one where I visualize them. And now, here we go: total power by type, geom_col. I changed this; I don't have level in there, but in europe_totals I have the level filter, so I don't need it anymore. I do need a year filter. Now I could just ask, okay, what was the total power? I could have shown it that way, with scale_x_continuous(labels = comma) and total power consumption in... oh, I said consumption. It's production, isn't it? I keep scrolling the wrong way. Production, yes: production in Europe in 2018. So if I just want to look at the totals, I could look at it this way: bar plot, stacked bar plot, all good. Another way I could have done the stacked bar plot would be to facet. I say facet_wrap by type and drop the fill, and I could choose to free the scale. The reason I want to do this is to compare, to look at whether each of them is increasing. So what I might do is free up the y-axis, and yeah, this gives a little bit of a sense of how other, geothermal, and solar are changing. Nuclear is steady; conventional thermal might have gone down a little bit. The good news from a sustainability perspective is that wind is increasing. Also interesting: other is decreasing a bit. That's as far as I probably want to go with Europe totals. I'm going to start looking by country. If you have questions or ideas, feel free to throw them in. I'm always going to filter for level one. Oops, type equals... nope, it's level equals
level one. I've got a question. Nuclear is steady; wind is increasing, and so is geothermal. These are both known as renewable energy sources, and so is solar. So three renewable energy sources have been increasing. What countries have been driving that? That sounds a bit interesting, so let's start looking by country. I'm going to start by asking which countries make up the most wind power. One second, just thinking about how I'm going to look at this. I'm going to start by filtering for type, focusing on one; let's focus on wind. And I'm going to start with just one year; I'm not going to keep it at that for long. 37. So now I'm going to say country name, and I'll say mutate country_name is fct_reorder... oh, I know what I'm going to do. So: gigawatt hours, country name, geom_col. This is 37 countries. Oh, I saw this in the comments a second ago, which is that the UK is listed as NA. Thanks for that data cleaning note; that's really helpful to show what that's referring to. It looks like where country name is NA, the country is UK. So I'm going to fix that back in the cleaning step. I'm going to say replace_na with list(country_name = "United Kingdom"); this one's from tidyr. And now we can run it again. Looks way better. Germany produces the most gigawatt hours from wind of any country. I'm actually going to want to facet this by type, because I might be interested in multiple of them. A challenge that this runs into is that each is going to have a different order. I'm also going to want scales free on y, because not every country produces every type. It's still not... oh, some are very low. I can actually free up both axes, and while I'm at it, I'm going to say type is fct_reorder type
by gigawatt hours, sum, and I'll make it negative, because I want to order with the biggest ones up here. I kind of like that. The problem here is that there are so many things on each axis. No good. What I'm going to do is go with only the largest countries, say just the top 10 of them for each. I could absolutely do top_n, but I can also lump into other. Now the choice: do I do the top n within each facet, or top n overall? I'm going to lump within each facet. So within each type, I'm going to say country_name is fct_lump of country_name, the top 10, weighted by gigawatt hours. Now if I do this, it's actually going to give me... oh, it's funny, it didn't give me a warning. Oh, because the levels are the same for each? Why would the levels be the same for each? That's interesting; I didn't expect that. I grouped by type, I expected these to be different, and then expected it to give me a warning about reuniting them. Hmm, well, it seems they are different. In any case, what I'm doing is lumping them together, and now let's see how this looks. I think there should be an other level here. Here's the other level. Aha, one of the problems is that we have ties. Are those ties at zero? Probably, because I see zeros. So one thing I want to do here is filter for gigawatt hours greater than zero. I don't need the zeros; they're definitely not interesting on a graph, and they're just going to make the top n really misleading. So this is already a better visualization. It is not a perfect visualization, because the order is the same for all of these, but the items on each are different. That's a recipe for confusion. Anybody know what I would use for this? I didn't give you folks
any time to answer, because the tidytext package provides this: I say reorder_within for country name, reordering within type, and then scale_y_reordered, which means I can reorder the categories within each of these. Interestingly, other always ends up on its own. Oh yes, it ends up at the end. I want to say the function is sum, so reorder_within takes a fun argument. Here we go, here are our types. I'm going to clean this up a little bit, because it's a neat-looking graph, and we're going to say scale_x_continuous with commas. Yes, someone nailed reorder_within in the chat, but there's always a delay between when I'm speaking and when you're seeing it. That was an awesome answer. So who produces the most conventional thermal energy? Well, it's Germany and Turkey. Wow, this is so interesting. I would have expected this ordering to be much more consistent between countries, but it looks like, for example, Germany produces tons of conventional thermal and solar and very little hydro; it's relatively low on some of the other categories. France is the only country that produces a large amount of nuclear power. So this is interesting. When I'm doing a facet, I like to have it go by six, so I'm going to cheat a little bit here and say type is not equal to other, dropping the other kind of energy, even though it was interesting that basically just Austria produces other. I'm still going to drop it; it's the least interesting of these. I'm going to clean this up a little bit; I don't need a y-axis label. Cool. So, for example, very few countries
use geothermal, basically just Turkey and Italy. Now, somebody's commented that this is not per capita. That is true: this isn't how much is produced per capita, and it's likely that larger countries are producing more. But one thing I'll say from looking at this is that it's not clear that population is the most important factor here for energy production, because I'm looking at this and Norway is kind of nowhere on any of these lists. There's really more variation in total energy production by country than I expected. I guess it's because countries do a lot of exports. The breakdown of their energy production is not necessarily driven by their population. That's interesting; that's not what I predicted going into this analysis. I thought we'd have to bring in some per capita data. We could still look at per capita; maybe we'll do that when we get to looking at imports and exports, where I think it might be more relevant. For now, this is a cool graph. How could I make it even cooler? I saw a suggestion a few minutes ago in chat about the ggflags package. I've never heard of the ggflags package, but my guess is it puts flags into a ggplot2 plot, and I'd love to do that. Let's try ggflags. Is it on CRAN? Nope, I was just going the wrong way. It's a proof of concept. It looks quite cool. Hold on, it's forked; am I looking at the wrong one? I'm not sure. This one has more stars, so I'm going to trust this one. I'm going to do devtools install_github for ggflags. I am behind on updating this machine. I put it in a comment so you don't run it by accident. And here's ggflags. I'm guessing it has a geom_flag; let's look at the documentation. Looks like it takes a country. That's really cool. What I'm going to do is take this plot that we already made, load in ggflags,
here we go, ggflags, and after geom_col add geom_flag on top of that, with aes country equals country, that is, the country code. Ah, let's try that one more time; sometimes that pops up. Nope. Is it possible... let's see: I gave it an x, I gave it a y, I gave it a country. Let me double-check the documentation: x, y, country. I'm going to try just giving it size equals something. Nope, didn't help. So here's what happens when we're trying something out, especially something that's a little bit experimental. It's possible we can't fix it, but the way that I go about debugging it is I go to the documentation and try running their example. Oh ho, scale_country, what's that? Ah, I think that's the issue. Nope, that wasn't the problem. Okay, so their example works, and the example looks pretty nifty. Let's look at that example again: x, y, country, and they put in a size. If I say size equals gigawatt hours, it doesn't help. What I'll do then is try this without faceting. Oh wait, country is a factor; could that cause a problem? Nope, that wasn't the problem. What I do now is create a minimal reproducible example. Jenny Bryan does a lot of great talks and documentation on how to create a minimal reproducible example. In this case, what I'm going to do is filter for one year and one type; I'm going to say type must be nuclear. I got a question: is it country equals country name? I'm pretty confident no, that it needs these to be country codes, based on the example. But I will notice those country codes are lowercase. That wasn't the problem either. So I'm going to ggplot with aes year by gigawatt hours; the year will always be the same, it doesn't matter. geom_point works. geom
flag with country equals the country code didn't work. Country equals str_to_lower of country: making these lowercase didn't work. Now I'll try just a few countries. Did any of these countries work? Yes. I'm going to start to suspect the problem is that when I introduce a country it can't recognize, it breaks there. I'm doing what's called binary search: I'm literally just looking for which one it breaks on. It's not a helpful error message, but that's okay, because we're able to just figure it out. The problem was in row 14; slice in dplyr gets us row 14. Oh, okay. I'm pretty sure the ISO two-character code is in fact GB, for Great Britain. They're not exactly the same, so that's not perfect, but we might not have a flag for UK, only a flag for GB. So what I'm going to do is mutate country; I'm going to fix it just within this example. I'm going to say country is fct_recode of country with GB equals UK, replacing UK with GB. Now that step works. But I want to check all my types, not just the 15 that are in here. So yeah, UK is not included, but GB does seem to work. It's taking a minute. How much data is this? I didn't think it was that big; it's just slowing me down a little. So the story then is, we figured out the problem: countries that are missing, and I can use fct_recode to replace them. Ooh, something's going slow. Okay, and these are the flags. It just hit me that other is going to be a problem for us. So here's what I'm going to do: I'm going to say here's my data prepared. Oh, I'm going to test a few other things. Do I need to have the country be lowercase? Gosh, something is slowing me down; I think it's geom_flag. Hmm, it's weird to me that it worked; I don't understand how that one worked. All right, it doesn't need to be lowercase, and might not even need to be
character. But the story is that I'm going to create this visualization, and I need other not to appear as a country; I think that's going to break it. Let me check here: country name or country? Oh, I did the lump on country name. I'll need to do the same thing on country, but luckily it'll work the same way. And I need country equals fct_recode of country with GB equals UK, so I'm doing a little bit of data cleaning here. Nope. Country is never GB. Let me filter for UK. Oh, it's not country name, it's country; country is UK. Oh, I see, we need a group_by. Can I tell fct_recode not to warn me? I don't want to be warned every time. I cannot do that. All right, there's a simple approach: don't use fct_recode, use ifelse: if country is UK then GB, otherwise country. Here we go. Notice the country name is going to be this, so that the reorder_within works, and country is country. Let's try this once. I suspect the problem here is the other category. I'm going to say country is ifelse country is other, then NA, else country. I think it does not care for the NA; I'm not 100% sure. Well, let's find out. What if I just made this, I don't know, what's the country code for Spain? ES. Lots of little bits of cleaning I've got to do here. What if I just made this other be ES? Huh. Count country, ungroup, count country. Maybe I got ES wrong; maybe it was Germany. No, still not working. What if I tried saying country is not equal to other? I've got this bug back. I thought that I checked this. All right, what if I tried randomly subsetting our data to figure out what my missing country is? This is called binary
search: I'm searching within each. What is 14? I'm confused. Oh, ungroup. EL... EL is Greece's ISO code? Huh. It looks like it should probably be GR. Data prepared, filter, ungroup. Huh, that looks like a coding issue. Oh, I thought that I saw something about this getting fixed, but it looks like it wasn't quite fixed. Let's say if country is EL, then GR. I think we might finally have done this. All right, this is great. I'm going to do two more things. First, I'm going to drop the legend; we do not need the legend because we have the countries here on the axes. One second, I'm removing the legend. The second thing is that I don't like geom_col for this. I like geom_errorbarh, where we say xmin is zero and xmax is gigawatt hours. Actually, I don't even need this; I can just say geom_col with width equals 0.1 and get a thinner line for each of these. Does that not work? What did I just change? One sec. Hmm. All right, well, what did I learn from this? I don't need size. Legend position. So it looks like a problem with this is that I need scale_country. Wow, this really has been an adventure, hasn't it? One moment, please. It doesn't work with everything. It worked a minute ago, didn't it? I'm going to go back to when it worked. Am I missing a country name? I thought I fixed all of them. I wonder whether ggflags has a list somewhere of all the countries that it has. Probably, but I can't find it. I remember a wondrous moment, it was seconds perhaps, but it felt like it happened, where this visualization did work. I didn't even catch my bug; I don't know what my bug is. I did country, I lumped it. If I try head(5)... does this need to be up here or something? Nope, doesn't change a thing. Country: if I try making this, let me see, str_to_lower of country... oh, that was the problem.
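A minimal sketch of the country-code cleanup that the debugging above converged on: recode UK to GB and EL to GR, then lowercase the codes. The toy tibble, its values, and the names energy_toy and flag_ready are illustrative assumptions, not the screencast's actual objects.

```r
library(dplyr)
library(stringr)

# Toy stand-in for the cleaned energy data (values are made up for illustration)
energy_toy <- tibble(
  country = c("DE", "UK", "EL", "FR"),
  country_name = c("Germany", "United Kingdom", "Greece", "France"),
  gigawatt_hours = c(100, 80, 20, 60)
)

flag_ready <- energy_toy %>%
  mutate(
    # The data uses UK and EL, while ISO 3166-1 alpha-2 uses GB and GR
    country = case_when(
      country == "UK" ~ "GB",
      country == "EL" ~ "GR",
      TRUE ~ country
    ),
    # In the screencast the flags only rendered once the codes were lowercase
    country = str_to_lower(country)
  )

flag_ready
```

With ggflags installed, a table shaped like flag_ready could then be passed to geom_flag(aes(country = country)), as in the screencast; that step is left out here so the sketch runs with only dplyr and stringr.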
Wow. Oops. Oh, that was exhausting. Sorry, folks. All right, that's close to working. I'm going to try two things. Try removing the scale: can I do that? Looks like I can. I was just confused when I thought I couldn't. I did need the country to be lowercase. I do not think I need the size. Why did I try width equals 1? Because I kind of like the 0.1. I kind of like this graph. You know, a flag goes off the edge. All right, this is a flag graph. I've never made a graph like this before. You might say, "No, Dave, you seem to make it so quickly and without any trouble," but I've never made this before. This is a really neat package. I wish that it gave a more helpful error message, but other than that, this is a cool way to get flags of countries into my visualization. I'm also going to make this look a little better by saying theme_minimal, and then theme with panel.grid... panel.grid.major.y is element_blank. What I'm doing is removing some of the lines, trying to make this look a little bit neater. Oh, I missed a plus. Cool. So this is ggflags, which I hadn't tried before, and it's kind of fun. I got a question from someone, which is whether I could remove the names on the columns. Yes, we could, but I think it would end up being a little hard to read, because we'd have to know every flag. But it is true that the flags are redundant with the names of the columns. All right, that's pretty cool. So this is a little flag plot. That was an adventure. You know what, I'd love to have flags any time we do correlations in a network of countries: having these flags pop up as individual points, then connected in ggraph, would be really cool. But that would take some work to get the node
point geom (geom_node_point) to work with flags; maybe a little experimentation. I also showed how I go about debugging ggflags, which is that I go about it poorly. That's entirely on me. All right, so let's go. I'm really interested in changes over time. I'm interested in energy types. I didn't even make a map today; that's all right. Let me see: I'm interested in changes from 2016 versus 2018. Notice here I ask who makes the most, say, nuclear energy; I might be interested in how that differed between 2016 and 2018. So what I would actually do is say gigawatt hours greater than zero, filter for year in 2016 and 2018, and type, let's say, nuclear. What I now get is 30 data points from two years, 2016 and 2018, often from the same countries, and what I can do is plot a slope graph. I'm thinking ahead, because I want to show what has been increasing and what has been decreasing across a lot of different countries. What I can say here is year, then gigawatt hours on the y-axis, geom_point, and while I'm at it, let's make it scale_y_log10 to spread the points out a little bit more. Heck, while I'm at it, I absolutely could do ggflags. Do I do ggflags? Yes. But what I'm going to do is put these steps up in my cleaning so I don't have to do them again, and now I can delete this; it was a proof of concept. Now I can say country equals country; I also need country to be str_to_lower of country. Notice that now geom_flag is useful, because I don't really have enough space to put text on every one of these. Oh, I don't need this line anymore. Oh, I'm doing the wrong graph; I want to do it on this one. Great. I'm also going to say labels equals comma for that y-axis, and I'm going to say scale_x_continuous breaks: I only need 2016 and 2018.
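The plot setup described so far might look roughly like the sketch below. It assumes a toy tibble with made-up values, and it uses geom_point in place of geom_flag so that it runs without ggflags; nuclear_toy, slope_data, and slope_plot are hypothetical names.

```r
library(dplyr)
library(ggplot2)

# Toy nuclear production figures (made up for illustration, not the real data)
nuclear_toy <- tibble(
  country_name = rep(c("France", "Germany", "Belgium", "Lithuania"), each = 3),
  year = rep(2016:2018, times = 4),
  gigawatt_hours = c(400000, 398000, 410000,
                     85000, 76000, 76000,
                     43000, 42000, 28000,
                     0, 0, 0)
)

# Keep only non-zero production in the two years being compared
slope_data <- nuclear_toy %>%
  filter(gigawatt_hours > 0, year %in% c(2016, 2018))

slope_plot <- ggplot(slope_data, aes(year, gigawatt_hours)) +
  geom_point() +
  # Log scale spreads out countries of very different sizes
  scale_y_log10(labels = scales::comma) +
  # Only the two compared years need axis breaks
  scale_x_continuous(breaks = c(2016, 2018))

slope_plot
```

Filtering out zeros before plotting matters here because a log scale cannot place a zero value.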
And the other thing that I really need to do, and it's the most important part, is under the geom_flag, add geom_line grouped by country. This is a slope graph, so called because we can see the slope of how much nuclear power was produced in this year versus this year within one country. This is a log scale, so that's quite a reduction. I'm not good at all with my European flags, so I don't know which country this is. I don't even want to guess, because I'm going to sound silly. What I'm going to do is, along with the flag, add a geom_text with label equals country name, hjust 1, and check_overlap equals TRUE. Who was saying it's Belgium? Oh, I forgot a plus. This is a slope graph. I did a vjust and meant to do an hjust. Here's a trick: this does not look good yet. I kind of want the text on the outsides. This looks okay, maybe a little bit farther. I need two things. First, I need to extend the limits to be 2015 to 2019, just to make sure I can see everything. Second, I want theme with the panel grid as element_blank; I like that a bit better. But the most important thing that I need is to have the hjust differ between these two. Is hjust an aesthetic? Yes, hjust is an aesthetic. I didn't actually know that, which means I can say ifelse year is 2016 then 1, else 0. Let me remind myself: is that the right order, or am I getting it backwards? Yeah, that's close. What I'm doing is ensuring that they're adjusted on each side, nudged in a positive and a negative direction. And here we go. But I don't like that one: United Kingdom. Let me filter. I already did the cleaning. What was it called? Filter. All right, I
renamed it gb united kingdom whole country name now the problem is not that it's not like it has extra space is it or anything uh oh maybe it's um it's just i don't know what that is oh well uh i could also change the x uh x instead of the hjust that is really true i'm going to try that it's a little bit more direct this is a good suggestion from a deep uh what i'm going to do is look at 2016 20 ah no i don't i don't want that because i need this text to be right justified uh and that's uh yes i'm going to want the text to be right justified uh for this so i'm actually going to leave the hjust check_overlap yeah check_overlap doesn't move the text uh for uh ggrepel would move the text but that's not the problem i think it's something to do with the justification but it's i think it's basically one point that the doing like 1.1 is not exactly supported uh that's fine uh all right so that i think it's time so what i'm going to do is is this is just one visualization slope graph i'm going to tidy this up it's not 100 percent perfect it's pretty close i don't even hear any gigawatt hours produced in this year nuclear i'm actually not gonna title not gonna title that all right yeah i'm actually going to do something else which is i'm going to include all three years to see whether it's continuous or what but i'm not going to put text on the middle one so i'm going to say ifelse year is 17 2017 
NA otherwise country name i mixed i missed some stuff here they missed a plot up i missed the syntax issue okay where is my syntax problem it's there this parenthesis there haha i like it it's pretty cool uh other thing i'm going to do about with this actually is turn this into a function uh because and how how do i make this a function i'm going to say visual plot slope graph it takes a data set and then it pipes it through it it applies this table to it and returns the resulting ggplot which i can then add to i can say title nuclear power production over time by country all right that looks pretty neat and now um now we want we want to try some of the other ones that were changing so how can we tell what was changing we can look back at our previous visualization not that one uh not this one i don't know this is the one and say oh it's actually uh it looks like wind and geothermal had substantial differences what countries were driving that other two but we saw there's a kind of turkey i'm not i'm not as interested i don't know what other even it means geothermal solar wind geothermal solar wind all right let's look at this who's been driving wind that's a junk data point uh i'm going to change gigawatt hours greater than zero to be gigawatt hours greater than one there was at some point there was like .01 or something like that bad data it happens and look at that it's um one moment so on a log scale you always keep in mind that like yes georgia went up went up a lot on this scale but it went from 10 to 100 gigawatt hours georgia wasn't driving a lot of the change it looks to me like germany and united kingdom would drive the change let's try not putting it on a log scale and actually i'm going to i'm going to change this code so we have a choice for a log scale after the fact increasing wind energy my suspicion is a lot of this is driven by germany yeah and this gets it a little bit more across actually i might keep it this way um and say scale_y_continuous 
labels equals comma cool this isn't even a slope graph anymore now that i'm thinking about it because it is sort of a time graph but it does kind of help communicate some things uh that's neat um all right and yeah shame of the united kingdom being a little bit off-center it happens with the adjustment yeah it looks like germany united kingdom were driving the wind power production let's try another one i said i said the solar was also increasing i'm trying this without a log go all right and it looks like solar power a lot is is driven by germany so my assumption is that germany has that has a um uh has been going through a renewable energy uh uh direction in the in 2016 to 2018. uh turkey as well that's really cool uh turkey is increasing solar power production uh all right i see a comment that it is totally true which is that we can try this on multiple facets it's going to be a little bit crowded uh but the um one i can try is just take let's say solar wind what my other types are distinct type conventional thermal because that's the kind of the the default that might have been decreasing and let's say nuclear now let's say hydro because that's a renewable one and we can facet_wrap by by type scales is pretty wide this is going to be too crowded uh when we zoom in it's gonna be a tiny bit better but uh i'm doing four to try and be a little bit less crowded uh but we probably still are gonna have to do some some messing around with it hmm is taking its time to run that to create that graph oh well i'm wow that just that uh it just took some time to create this yeah ggflags i think is a little slow in loading these these flags uh so that's why it's taking time i'm still gonna zoom in and we'll see what it what it looks like um i see a couple of great suggestions that we won't have time to do today uh because we're coming up on the hour one of them is using uh gghighlight to highlight particular countries and avoid the others um and um yeah see i've 
seen other uh good suggestions but i'm not um not gonna have a chance to visualize them even this might not be able to pop up i probably could hmm tell you what oh there it goes there it goes it's gonna it's gonna take forever to render yeah there's too many things here uh but i'll tell you what what i'll do for this is i'm gonna say here we go group_by type filter fct_lump country by uh let me say this is not other it's gonna say take only the top eight yeah top ten weighted by um the gigawatt hours is i'll try just five from each what i do is pick only the top few within each wow it really is uh slowing down my my graph rendering but yeah you get the sense here what we can do with this is um is uh create slope graphs that show changes over time communicate some information about them the points are kind of neat because when you don't have text to give you a little bit of information uh but most of this they're kind of fun but uh yeah uh all right so just to conclude while this is uh we'll see if this ever renders what do we do today we learned to do a lot of data cleaning of these country names uh we did some gathering on the gigawatt hours over time we didn't make a map but we looked at it at a couple ways of visualizing consumption over time really important ways of like getting some orientation into our data set then we looked at we loaded up the ggflags package we learned to use the ggflags package and more generally when you run into an unfamiliar bug how do you uh how do you figure out what it is diagnose it and work around it uh and then we uh we tried a couple different graphs both a uh kind of a flag plot of a geom uh of showing the the top producers of each type of energy and then we did some looking at changes over time a lot of other things we could have done we didn't bring in we could have used the world the WDI package to bring in country populations normalize this per capita we could have created a map uh just so many things we uh we could 
have done but i hope you try it yourself on this really fun tidy tuesday data set looks like my rstudio's crashed so that is where we leave it thanks so much for joining hope you had a good time i certainly did if you liked it please subscribe to my youtube channel and i'll see you next week
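A minimal sketch, in R, of the slope graph helper described in the captions above. It is not the code typed in the screencast. The function name plot_slopegraph, its log_scale argument, the column names (country_name, year, gigawatt_hours) and the toy values are all assumptions made for illustration, and geom_point() stands in for geom_flag() so the example runs without the ggflags package. It shows three ideas from the talk: hjust mapped as an aesthetic so labels are right-justified on the left edge and left-justified on the right edge, an NA label for the middle year, and a choice between a log scale and a comma-labelled linear scale.

```r
library(dplyr)
library(ggplot2)

# Made-up illustrative values, NOT the Tidy Tuesday data.
# Long format: one row per country and year.
toy_energy <- tibble::tribble(
  ~country_name,    ~year, ~gigawatt_hours,
  "Germany",        2016,  80000,
  "Germany",        2017,  105000,
  "Germany",        2018,  110000,
  "United Kingdom", 2016,  37000,
  "United Kingdom", 2017,  50000,
  "United Kingdom", 2018,  57000,
  "Georgia",        2016,  10,
  "Georgia",        2017,  90,
  "Georgia",        2018,  100
)

# Hypothetical helper: takes a long-format table and returns a ggplot
# that further layers (for example a title) can be added to.
plot_slopegraph <- function(tbl, log_scale = FALSE) {
  year_range <- range(tbl$year)

  p <- tbl %>%
    mutate(
      # Label only the first and last year; the middle year gets NA
      label = ifelse(year %in% year_range, country_name, NA_character_),
      # hjust is an aesthetic: 1 right-justifies (left edge), 0 left-justifies
      label_hjust = ifelse(year == min(year), 1, 0),
      # Push labels slightly outside the points on each side
      label_x = year + ifelse(year == min(year), -0.1, 0.1)
    ) %>%
    ggplot(aes(year, gigawatt_hours, group = country_name)) +
    geom_line() +
    geom_point() + # stand-in for ggflags::geom_flag()
    geom_text(
      aes(x = label_x, label = label, hjust = label_hjust),
      check_overlap = TRUE,
      na.rm = TRUE
    ) +
    scale_x_continuous(breaks = sort(unique(tbl$year))) +
    # Extend the x limits so the outside labels are not clipped
    expand_limits(x = year_range + c(-1, 1)) +
    theme_light() +
    theme(panel.grid = element_blank()) +
    labs(x = "", y = "Gigawatt hours produced in this year")

  if (log_scale) {
    p + scale_y_log10(labels = scales::comma)
  } else {
    p + scale_y_continuous(labels = scales::comma)
  }
}

# Usage: build the plot, then add a title after the fact
wind_plot <- plot_slopegraph(toy_energy) +
  labs(title = "Increasing wind energy (toy values)")
```

Calling plot_slopegraph(toy_energy, log_scale = TRUE) gives the log-scale view instead, where a small producer going from 10 to 100 looks like a large jump even though it contributes little to the total change.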
Info
Channel: David Robinson
Views: 2,950
Rating: 5 out of 5
Id: Rcmu5e-9FSc
Length: 61min 26sec (3686 seconds)
Published: Wed Aug 05 2020