Basketball and Soccer Heatmaps in R Tutorial by Dominic Samangy

Video Statistics and Information

Captions
What is up, everybody? I'm back with my third tutorial in R. Today we'll be going over NBA data and soccer (football) data, specifically Premier League data. Thank you again to everyone tuning in to this one. I've been getting some great feedback on the other ones, and it's awesome to see some people using them and making their own analysis with them. It's really cool to see that, and I can't thank you enough.

Since we did shot charts in basketball the other day, and in the second one we did some Premier League data, I thought a heat map would be pretty cool and visually pleasing. I've seen a bunch on Twitter, and these are just some that I've made throughout my time, so I figured I'd show you guys how to do that and how to customize them a bit. Let's get right into it. It's storming really badly right now, so if you hear some thunder, see lightning, or hear rain, that's what it is.

First I'll open up all the packages. I believe these are the same ones that we used for the NBA shot charts, so you just load those like always. We could go up here to Run Current Chunk, or Shift+Command+Return on a Mac runs all of them. Now that we have those loaded, similar again to the shot chart one, we need to load in this chunk of code, which creates the function to draw the court.

The NBA data is the same exact process, so I'm not going to go into it much this time. If you want more of an explanation, go back to the shot chart tutorial, where I talk through it a little more. But this time, instead of doing a single game, I'm going to look at all the shots that Chris Paul has taken so far in the NBA playoffs. Everything is the same except this code we used last time at the end to
filter by a date, which gave us the game that he played on that day. We did Kevin Durant's historic game in the playoffs, but this time, if we take that out and just run it, this will give us the Suns' 2021 season, season type playoffs. So we run that, and then this basically just gives us Chris Paul's data. I'm not going to spend too much time on this, so I'll run through it all, and I'll end up with a data frame called player, on my Wi-Fi. So yeah, here we go: Chris Paul's event data. There's the shot location right here. Now let's get into this.

All right, so this R package called paletteer: you can use it to find pretty much all these color palettes for plots that you want to make. This goes back to Thomas Mock. I put a bunch of resources and links at the end so you guys can look through them if you want, but this one I learned in his tutorial for heat maps, which this is very inspired by. [Music] So here we go. It basically just gives us a bunch of color palettes, and here's a guide, which I linked as well. These are a bunch of ones that you can use in scale_fill_manual. So we create our own palette here, using the yellow-orange-red. I'll come back to this and show you how it works after, but by loading this into palette, we can make that our color palette and the values for fill and color.

But let's get to plotting. Similar to the NBA one: plot court. There we go, that's our court. Now here is what we're doing for the heat map: geom_density_2d_filled. It's basically taking all of the shot locations, taking the density of where the shots are located, and then coloring them. Based on this, we'll have darker areas of red where the most shots are taken, and yellow towards the outskirts, where fewer shots are taken. We use our player data, the data frame we created up here, and then, like I went through in the
first tutorial, our aesthetics, so x equals x, so our x and y locations, and then fill equals this, which I got from Thomas Mock's blog. It pretty much just fills in the density. I don't want to go into it because I can't explain it as well as he does, so go check that out. Then this is how he contours it, how he customizes it himself, and then this 0.1 to 1: this basically takes out the part of the heat map that just fills in where the zeros are, so it cleans it up a little bit and makes it look a little better. And then alpha. I can run this and show you guys what this gives us. So there's the heat map. Here again, we're going to add our palette to it, the yellow-orange-red, but as you can see, it's a pretty clean-looking heat map. The alpha is just how transparent or strong it is. If you go to 1, it's too bright. I mean, if you like it, you like it, but I think it's a little too strong; it overlays the court too much for me. So I go to 0.5, halfway in between. You can go in between 0.5 and 1, but I think that's what looks the best, at least in my eyes.

Then we add the line of scale_fill_manual to tell the environment to use the values of our color palette for both the fill and the color. This will make it look like an actual heat map, in terms of, you know, people think red, they think of heat. So there we go. So actually, I lied: the most densely populated area is yellow. And as I said I'd come back to: direction equals negative one. It basically flips it from yellow-orange-red to red-orange-yellow. So if you take out the negative and run this again, it will flip all the colors, so we have red as the most populated, most dense area, and it fades out all the way down from there. It's kind of your taste, whatever you like.

Then I'll show you that you can also do any of these. I like Blues, so I'll show you: if you type in capital-B Blues here. I like to use
this sometimes for teams that are blue colored, whose colors are blue. So then if you run that, there we go. It goes all the way from a very faded white to a dark blue, and then you can do negative one for direction to flip them again. So then that's from the faded white to the dark blue. It's all up to what you guys want to do, but that just shows you how you can customize it a bit.

And then all of the theme stuff and the scale limits for y, I go over in the shot chart tutorial. So I go back, throw it into p1, and again I do the ggdraw, which I explained in the first tutorial, so I won't spend too much time on it. There we go, that's our final output. Here's our ggsave function to save it to whatever you want to call it; I did cp3 heat map, with our dimensions and our resolution. So, pretty... not simple, but once you get the hang of it and the data collection becomes easier, you can start making these at a much higher rate, play with the colors a bit, that type of stuff. It's just another way to look at a shot distribution, and hopefully you guys enjoy this one too.

So we'll get into the second part of the tutorial, where we'll be looking at [unclear] data. We're actually going to be using StatsBomb, which is arguably one of the biggest data companies in world football, or soccer. Over the past couple of years they've actually released their own event data to the public, just little bits and pieces, basically enticing the public to analyze their data. Obviously they can't release all of it, which would be awesome, but they can't do that because it's proprietary and they make money off of it. So we'll be going through that with their own StatsBombR package. We'll download it here; these are all the old ones, basically the ones for basketball. So we'll install it from there, from GitHub here, excuse me, and then
library the package. Then this one is a package created by FC rStats that we'll use to create our pitch, so install that, library it, there we go. Now all of our packages are ready.

Here we'll get our data. One of the commands from the StatsBomb package is FreeCompetitions. This basically shows all the competitions that they have open data for that we can look into. They have the Champions League, the men's championship game from pretty much each season since '99 or so. This is all the Messi data that they released, which a lot of people have been looking into. But today we'll be looking at the 2018-19 Champions League game, which is actually the Champions League final between Liverpool and Tottenham. So here we pipe it, filter competition name Champions League, season 2018/19, so we get this first row based on these two columns, and pipe that back into the competition. So now we just have this one.

Then there's another command, FreeMatches. You basically put in the competition you selected and it will spit out the match data for that competition. So yeah, Champions League, Tottenham, basically just describing the game. The next command is StatsBombFreeEvents. This is the one that goes in and gets the data from that match. We only have to use this data frame because there's only one game; if there's more than one, you have to filter that out. Then you basically just run this. Let's see what this gives us; let it load. You can name your data frames whatever. Okay, so this is the event data: second, possession, what the action was (pass, ball receipt, duel, carry; there are shots in there), what team has possession. I'm not going to explain all this; you can go through and check it out. There are something like 120 variables, so it's a big data set. And then this, they just suggest in their guide,
there's a guide, so I linked this in the resources below, but they have their own presentation on how to use their StatsBomb data in R. Go through that for sure; it's really cool and shows you how to do a bunch of cool stuff. They just suggest using this command, which I guess cleans the data. I don't know what it does exactly, but just run it. There we go: now we have 157 columns and 3,165 rows of data. Now we have just this game, the UCL final, and what we're going to do is look at all of Liverpool's passes from that game. So we take ucl 19, pipe it, and filter for team name double-equals Liverpool and type name equals Pass. Basically what you have to do is go up here and find what value they use for a pass; sometimes it's a capital P, sometimes it's not. So just go in here. This is important: Pass, and team name, which tells you who the action is for. I'll send that back into a data frame called liverpool to keep it simple. There we go: 326 passes right here from the game.

Basically, we don't want to have any passes that weren't completed. This is where it's a little weird. I think it's like column 55. This one takes you all the way to the end; this one takes you to the next 50.
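The data pull and the team/type filter described up to this point can be sketched roughly as below. This is a minimal sketch, not the video's exact code. The StatsBombR function names (`FreeCompetitions`, `FreeMatches`, `StatsBombFreeEvents`, `allclean`), the `season_name` value, and the column names are written as I understand that package and should be checked against the installed version. `load_ucl_final()`, `team_passes()`, and the toy data frame are hypothetical names made up for illustration. The download step is wrapped in a function so nothing is fetched until it is called.

```r
library(dplyr)

# Assumption: StatsBombR is installed from GitHub (statsbomb/StatsBombR).
# Nothing is downloaded until load_ucl_final() is actually called.
load_ucl_final <- function() {
  comp <- StatsBombR::FreeCompetitions() %>%
    filter(competition_name == "Champions League",
           season_name == "2018/2019")          # assumed season label
  matches <- StatsBombR::FreeMatches(comp)      # one row: the final
  events  <- StatsBombR::StatsBombFreeEvents(MatchesDF = matches,
                                             Parallel = TRUE)
  StatsBombR::allclean(events)                  # the suggested cleaning step
}

# Keep one team's passes. Check the exact spelling of the type value
# ("Pass" with a capital P here) in the data before filtering.
team_passes <- function(events, team) {
  events %>%
    filter(team.name == team, type.name == "Pass")
}

# Toy stand-in for the real event data, so the filter can be tried offline.
toy <- data.frame(
  team.name = c("Liverpool", "Liverpool", "Tottenham Hotspur", "Liverpool"),
  type.name = c("Pass", "Shot", "Pass", "Pass"),
  stringsAsFactors = FALSE
)
liverpool_toy <- team_passes(toy, "Liverpool")
```

With the real data, the equivalent call would be `team_passes(load_ucl_final(), "Liverpool")`.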
Yeah, so pass.outcome.name: this basically says what happens to the pass, or what the outcome is. They have Incomplete; Unknown, which I really don't know what that means; Out, so out of play; and I think there's another one somewhere, I think it's like, off... kick off, that's what it is. So basically I'm assuming everything that shows up as an NA is a completed pass. This might not be perfect reasoning, but that's what I'm assuming; it's collected a little weird in the StatsBombR data. So what this does is filter this liverpool data set with is.na. It runs through whatever column we put in here, which is the pass outcome name, and if it's an NA, it keeps it. If we wanted everything from the data set where that column isn't NA, we would put an exclamation point before it, but we want to keep just the rows that have that column as NA. So I'll do that and send it right back into the same data frame, so it overwrites it. There we go: we have 202 passes now, which seems reasonable for completed passes from this game.

Next, I'm going to go through the same data set and select only certain columns, because I don't really want to look through a data frame with 157 columns. I'm not going to run through all of these; you guys can obviously tell what I selected. If you want, you can choose more or fewer, but I think these are good to have for making a pass heat map. I included shots in here too, in case, who knows, I could look into that at some point for another tutorial. I sent this back into a data frame and added a 1 at the end, just to differentiate the two. So we're selecting all of these and sending it back, and now we have this with only 18 variables, which is much easier to look at. So: pass length, pass angle, player receiving the pass... oh, I actually didn't choose... [Music] All right, so now you've got the player who's receiving the pass,
where the pass starts, x and y, and where it ends, x and y. So I think we're good here. Now we have the data, so we'll get into creating the plot. Same thing as the NBA one: we create our palette with the color we want. I'm using the yellow-orange-red again; you guys can use whatever you want, and that guide is at the end in the resources. I think the yellow-red really symbolizes heat, so I've been using it a ton. I like it.

Now we have to create our pitch. The difference from the NBA one is that there we created our own function with our code; this command is in the package we loaded from FC rStats. So all we need to do is type the command, which is create_Pitch, and it has a few things you can edit in there: grass colour, which is basically just the plot background of the pitch, background colour, and then line colour is white. I'll run just that to show you. There you go. Change this and you can see the difference. I like all gray, but you can really do whatever. It's a little funky, in that it just does the boxes with grass colour, so you've got to change both of them. There you go. That's an awful color green, and you could go darker.

Then it's pretty much the same thing that we did in the NBA one: geom_density_2d_filled. We're doing density plots, which is basically the technical term for a heat map. Data equals liverpool1, the final data set that we made with the passes. You could do a heat map, and I can show you after this, for where the passes are played from or for where they're received, where they're played to. I'm going to do the latter, so we show where Liverpool were passing to most often. Our x variable is the x location of the end of the passes, and the y variable is the y location of the end of the passes. And I did 0.4 for the alpha for the soccer one, or the football one, just
because, I don't know, it looks a little different, so that's what I went with. Like I said, you guys can pretty much do whatever, but this is a baseline I gave you. So here we go: that's the basic plot, before we do anything to it. As you can see, it runs off the field through the lines, which we don't really want, and we have a legend here, so we'll clean that up and I'll show you how to do that. This theme, legend.position equals none, gets rid of this stuff on the right. Then scale_x_continuous and scale_y_continuous: how FC rStats builds their pitch is based on StatsBomb's data. In StatsBomb's data the x locations only go from 0 to 120 and y only goes from 0 to 80, so we want to cut this off. These two commands basically tell the plot to cut off the heat map at the ends of the pitch lines, and this code should get rid of the legend. There we go: you can see how it cuts it off, and we got rid of the legend. So this is looking really clean.

All we've got to do next is add scale_fill_manual, with the values equaling the palette that we made with the yellow-orange-red colorway. Add that, and here we go. I switched this one from the NBA one: I think direction is just equal to one, not negative one, so we've got yellow, orange, and then red being the most dense. It's just another way to visualize the density of something, instead of just looking at, say, the shot chart that we did for the NBA.

Then it's the same stuff: we're adding some themes, I'm pretty sure the same stuff that we did here, so I'll run all this. Oh, and then the labs, the labels: title equals whatever you want to call it, and subtitle. Once again, you can change this to your name or whatever; I just leave it on there. I won't run that, and the way it sets up is a little funky until you save it, when the titles will appear. So I will come back here and throw that into p1 and
then draw p1, basically just so it doesn't run off. I'll run it all; this should save it, and I'll show you guys the final thing after it's saved. The dimensions always figure themselves out. There we go. Like I said, you can change your colorway, you can change your text color, the text, pretty much anything you want to do. I just wanted to give you guys at least a base understanding of how to do it. I think it looks really cool, and you can use it for the shots in the data set; you just have to start filtering for different things.

You can also do it from where the passes start. I'll try to change it real quick and show you guys. So instead of pass end, I believe it's location.x. So basically it was just Alisson and [unclear] playing out of the back the whole time. But yeah, we'll go back and change it back to the end location.

This marks the end of the third tutorial. I hope you guys enjoyed this one too. It's always fun to do this stuff and I really enjoy it. Once again, I'll throw this code on my GitHub with the examples on there. Make sure to check out McKay's work for Python; I mean, he's the GOAT, everyone knows that, so make sure to check him out, and thanks again to him for having me on the channel. My Twitter should be in the bio, so if you have any questions or anything about the code, feel free to reach out to me and I can hopefully help you out. I think that's all I've got for this one. Oh, and also let me know in the comments if there's anything specific you guys want to see. I do have some things I'm thinking of doing next, but by all means I want to do stuff that you guys want to see first, so definitely let me know. Thank you guys again for sitting through this with me and watching it, and I'll see you next time.
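The heat map layer walked through in the captions can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the code from the video. The pass data is randomly generated stand-in data, laid out on StatsBomb's documented 120 by 80 grid with x along the length of the pitch. The column names mirror what the StatsBombR cleaning step is understood to produce. The palette comes from `RColorBrewer::brewer.pal()` rather than paletteer (it is the same YlOrRd palette, swapped in to keep dependencies small), and a plain `ggplot()` stands in for the pitch drawn by the FC rStats package.

```r
library(ggplot2)
library(dplyr)

# Stand-in data (assumption): random pass end points on a 120 x 80 pitch,
# with NA in pass.outcome.name treated as a completed pass.
set.seed(1)
passes <- data.frame(
  pass.end_location.x = runif(300, 0, 120),
  pass.end_location.y = runif(300, 0, 80),
  pass.outcome.name   = sample(c(NA, "Incomplete", "Out"), 300,
                               replace = TRUE, prob = c(0.7, 0.2, 0.1))
)

# Keep only rows where the outcome is NA; !is.na() would keep the rest.
completed <- passes %>% filter(is.na(pass.outcome.name))

# Nine colours, yellow (sparse) to red (dense); wrap in rev() to flip,
# which plays the role of direction = -1 in the video.
heat_palette <- RColorBrewer::brewer.pal(9, "YlOrRd")

p <- ggplot(completed,
            aes(x = pass.end_location.x, y = pass.end_location.y)) +
  geom_density_2d_filled(
    aes(fill = after_stat(level)),
    contour_var = "ndensity",                 # density rescaled to 0..1
    breaks = seq(0.1, 1.0, length.out = 10),  # 9 bands; drops the near-zero fill
    alpha = 0.4                               # transparency of the overlay
  ) +
  scale_fill_manual(values = heat_palette,
                    aesthetics = c("fill", "color")) +
  scale_x_continuous(limits = c(0, 120)) +    # cut the map at the pitch ends
  scale_y_continuous(limits = c(0, 80)) +
  theme(legend.position = "none")             # remove the legend on the right
```

Calling `print(p)` draws the plot. For the start-of-pass version mentioned near the end of the captions, the aesthetics would presumably switch to `location.x` and `location.y`.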
Info
Channel: McKay Johns
Views: 571
Rating: 5 out of 5
Keywords: coding, python, data science, programming, code, data analytics, sports analytics, basketball, soccer, heatmaps, tutorials for r, beginning r tutorials, r beginner, sports analytics tutorials, basketball analytics, syracuse sports analytics, soccer analytics, football analytics, heatmap tutorials, sports heatmaps
Id: rBBVSmFJqyE
Length: 26min 47sec (1607 seconds)
Published: Thu Jul 15 2021