class: inverse, center, middle

# Using R for Your Big Data Cup Project

<img src="figs/hanic/puck.png" width="250px"/>

### A gentle intro and some tips & tricks

Meghan Hall <br>
[@MeghanMHall](https://www.twitter.com/MeghanMHall) <br>
[meghan.rbind.io](https://meghan.rbind.io/) <br>
[Hockey (Analytics) Night in Canada](https://www.hanic-analytics.com) <br>
Feb. 18, 2021

---

background-image: url(figs/hanic/puck.png)
background-position: 98% 2%
background-size: 114px 104px
padding-right: 10px
layout: true

---

## Can I teach you R in 10 minutes?

...no 😞

--

But we *can*...

- talk about what's easier to do in R
- go over some common functions you might use for this project
- discuss a roadmap for learning more

<br>

--

**Disclaimer** <br>
*The speaker herein pledges as follows: no declarative statements will be uttered about how R is objectively "the best" and no moral judgments will be made on which software, IDE, programming language, packages, dark mode, etc. the viewer chooses to use and/or not use.*

**tl;dr** <br>
I don't care what you use for your project! Use Excel! Use Python! Use SAS! Use an abacus!

--

<br>
But these are my 10 minutes, so we're going to talk about R.

---

## Step 0: How does one use R, exactly?

1️⃣ the language itself: [https://cloud.r-project.org/](https://cloud.r-project.org/)

<br>

2️⃣ the IDE (integrated development environment): [https://rstudio.com/products/rstudio/download/](https://rstudio.com/products/rstudio/download/)

---

## Step 1: Data!

```r
library(tidyverse)
library(janitor)

scouting <- read_csv("https://tinyurl.com/BDCscouting") %>%
  clean_names()

nwhl <- read_csv("https://tinyurl.com/BDCnwhl") %>%
  clean_names()

women <- read_csv("https://tinyurl.com/BDCwomens") %>%
  clean_names()
```

---

## Step 1: Data!

```r
*library(tidyverse)
library(janitor)

scouting <- read_csv("https://tinyurl.com/BDCscouting") %>%
  clean_names()

nwhl <- read_csv("https://tinyurl.com/BDCnwhl") %>%
  clean_names()

women <- read_csv("https://tinyurl.com/BDCwomens") %>%
  clean_names()
```

---

## Step 1: Data!
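To see what `clean_names()` contributes in the loading code for this step, here is a minimal sketch on a made-up two-row tibble. The raw column names and values are invented for illustration; the only assumption is that the raw files use headers with capitals and spaces.

```r
library(tibble)
library(janitor)

# Hypothetical raw data: headers with capitals and spaces,
# invented for illustration (not read from the Big Data Cup files)
raw <- tibble(
  `Game Date` = c("2019-09-20", "2019-09-20"),
  `Player 2`  = c("Connor Lockhart", "Austen Swankler"),
  `Detail 1`  = c("Backhand", "Forehand")
)

# clean_names() rewrites the headers as lowercase snake_case
cleaned <- clean_names(raw)

# the cleaned headers can now be typed without backticks
names(cleaned)
```

Snake-case names are why later code can refer to columns such as `player_2` and `detail_1` directly.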
```r
library(tidyverse)
library(janitor)

*scouting <- read_csv("https://tinyurl.com/BDCscouting") %>%
  clean_names()

*nwhl <- read_csv("https://tinyurl.com/BDCnwhl") %>%
  clean_names()

*women <- read_csv("https://tinyurl.com/BDCwomens") %>%
  clean_names()
```

---

## Step 1: Data!

```r
library(tidyverse)
*library(janitor)

scouting <- read_csv("https://tinyurl.com/BDCscouting") %>%
* clean_names()

nwhl <- read_csv("https://tinyurl.com/BDCnwhl") %>%
* clean_names()

women <- read_csv("https://tinyurl.com/BDCwomens") %>%
* clean_names()
```

---

## Step 2: Explore

```r
*View(scouting)

*scouting %>%
* glimpse()

scouting %>%
  count(event)

scouting %>%
  count(event, detail_1)
```

---

## Step 2: Explore

```r
View(scouting)

scouting %>%
  glimpse()

*scouting %>%
* count(event)

scouting %>%
  count(event, detail_1)
```

<table class="table" style="font-size: 14px; width: auto !important; ">
<thead>
<tr> <th style="text-align:left;font-weight: bold;color: white !important;background-color: #5C164E !important;"> event </th> <th style="text-align:right;font-weight: bold;color: white !important;background-color: #5C164E !important;"> n </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:left;"> Dump In/Out </td> <td style="text-align:right;"> 4888 </td> </tr>
<tr> <td style="text-align:left;"> Faceoff Win </td> <td style="text-align:right;"> 2441 </td> </tr>
<tr> <td style="text-align:left;"> Goal </td> <td style="text-align:right;"> 293 </td> </tr>
<tr> <td style="text-align:left;"> Incomplete Play </td> <td style="text-align:right;"> 8890 </td> </tr>
<tr> <td style="text-align:left;"> Penalty Taken </td> <td style="text-align:right;"> 419 </td> </tr>
<tr> <td style="text-align:left;"> Play </td> <td style="text-align:right;"> 23778 </td> </tr>
<tr> <td style="text-align:left;"> Puck Recovery </td> <td style="text-align:right;"> 20667 </td> </tr>
<tr> <td style="text-align:left;"> Shot </td> <td style="text-align:right;"> 4887 </td> </tr>
</tbody>
</table>

---

## Step 2: Explore

```r
View(scouting)

scouting %>%
  glimpse()

scouting %>%
  count(event)

*scouting %>%
* count(event, detail_1)
```

<table class="table" style="font-size: 14px; width: auto !important; ">
<thead>
<tr> <th style="text-align:left;font-weight: bold;color: white !important;background-color: #5C164E !important;"> event </th> <th style="text-align:left;font-weight: bold;color: white !important;background-color: #5C164E !important;"> detail_1 </th> <th style="text-align:right;font-weight: bold;color: white !important;background-color: #5C164E !important;"> n </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:left;"> Dump In/Out </td> <td style="text-align:left;"> Lost </td> <td style="text-align:right;"> 4143 </td> </tr>
<tr> <td style="text-align:left;"> Dump In/Out </td> <td style="text-align:left;"> Retained </td> <td style="text-align:right;"> 745 </td> </tr>
<tr> <td style="text-align:left;"> Faceoff Win </td> <td style="text-align:left;"> Backhand </td> <td style="text-align:right;"> 2179 </td> </tr>
<tr> <td style="text-align:left;"> Faceoff Win </td> <td style="text-align:left;"> Feet </td> <td style="text-align:right;"> 17 </td> </tr>
<tr> <td style="text-align:left;"> Faceoff Win </td> <td style="text-align:left;"> Forehand </td> <td style="text-align:right;"> 245 </td> </tr>
<tr> <td style="text-align:left;"> Goal </td> <td style="text-align:left;"> Deflection </td> <td style="text-align:right;"> 17 </td> </tr>
<tr> <td style="text-align:left;"> Goal </td> <td style="text-align:left;"> Slapshot </td> <td style="text-align:right;"> 8 </td> </tr>
<tr> <td style="text-align:left;"> Goal </td> <td style="text-align:left;"> Snapshot </td> <td style="text-align:right;"> 148 </td> </tr>
</tbody>
</table>

---

## Step 3: A Question

From the `scouting` data, among players who've taken at least 50 faceoffs, who has the best faceoff percentage?

--

(Is this the most *rigorous* Big Data Cup question? No, but I only have 10 minutes!)
--

<table class="table" style="font-size: 14px; width: auto !important; ">
<thead>
<tr> <th style="text-align:left;font-weight: bold;color: white !important;background-color: #5C164E !important;"> game_date </th> <th style="text-align:left;font-weight: bold;color: white !important;background-color: #5C164E !important;"> team </th> <th style="text-align:left;font-weight: bold;color: white !important;background-color: #5C164E !important;"> player </th> <th style="text-align:left;font-weight: bold;color: white !important;background-color: #5C164E !important;"> event </th> <th style="text-align:left;font-weight: bold;color: white !important;background-color: #5C164E !important;"> player_2 </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:left;"> 2019-09-20 </td> <td style="text-align:left;"> Sudbury Wolves </td> <td style="text-align:left;"> Blake Murray </td> <td style="text-align:left;"> Faceoff Win </td> <td style="text-align:left;"> Connor Lockhart </td> </tr>
<tr> <td style="text-align:left;"> 2019-09-20 </td> <td style="text-align:left;"> Sudbury Wolves </td> <td style="text-align:left;"> Macauley Carson </td> <td style="text-align:left;"> Faceoff Win </td> <td style="text-align:left;"> Austen Swankler </td> </tr>
<tr> <td style="text-align:left;"> 2019-09-20 </td> <td style="text-align:left;"> Sudbury Wolves </td> <td style="text-align:left;"> Quinton Byfield </td> <td style="text-align:left;"> Faceoff Win </td> <td style="text-align:left;"> Chad Yetman </td> </tr>
<tr> <td style="text-align:left;"> 2019-09-20 </td> <td style="text-align:left;"> Sudbury Wolves </td> <td style="text-align:left;"> Macauley Carson </td> <td style="text-align:left;"> Faceoff Win </td> <td style="text-align:left;"> Austen Swankler </td> </tr>
<tr> <td style="text-align:left;"> 2019-09-20 </td> <td style="text-align:left;"> Sudbury Wolves </td> <td style="text-align:left;"> Quinton Byfield </td> <td style="text-align:left;"> Faceoff Win </td> <td style="text-align:left;"> Chad Yetman </td> </tr>
<tr> <td style="text-align:left;"> 2019-09-20 </td> <td style="text-align:left;"> Erie Otters </td> <td style="text-align:left;"> Alex Gritz </td> <td style="text-align:left;"> Faceoff Win </td> <td style="text-align:left;"> Blake Murray </td> </tr>
<tr> <td style="text-align:left;"> 2019-09-20 </td> <td style="text-align:left;"> Erie Otters </td> <td style="text-align:left;"> Connor Lockhart </td> <td style="text-align:left;"> Faceoff Win </td> <td style="text-align:left;"> Ethan Larmand </td> </tr>
<tr> <td style="text-align:left;"> 2019-09-20 </td> <td style="text-align:left;"> Erie Otters </td> <td style="text-align:left;"> Austen Swankler </td> <td style="text-align:left;"> Faceoff Win </td> <td style="text-align:left;"> Macauley Carson </td> </tr>
<tr> <td style="text-align:left;"> 2019-09-20 </td> <td style="text-align:left;"> Erie Otters </td> <td style="text-align:left;"> Chad Yetman </td> <td style="text-align:left;"> Faceoff Win </td> <td style="text-align:left;"> Quinton Byfield </td> </tr>
<tr> <td style="text-align:left;"> 2019-09-20 </td> <td style="text-align:left;"> Erie Otters </td> <td style="text-align:left;"> Brendan Hoffmann </td> <td style="text-align:left;"> Faceoff Win </td> <td style="text-align:left;"> Blake Murray </td> </tr>
</tbody>
</table>

---

## Step 3: A Question

From the `scouting` data, among players who've taken at least 50 faceoffs, who has the best faceoff percentage?

(Is this the most *rigorous* Big Data Cup question? No, but I only have 10 minutes!)
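One way to sanity-check the arithmetic behind this question: each `Faceoff Win` row is one faceoff for two players, a win for `player` and a loss for `player_2`. The sketch below runs on a made-up four-row tibble (invented player labels, not the scouting data) and counts wins and losses separately before combining them. It is just one possible approach, shown only to make the denominator explicit.

```r
library(dplyr)
library(tidyr)

# Toy data invented for illustration: one row per faceoff,
# `player` won it and `player_2` lost it
toy <- tibble::tibble(
  player   = c("A", "A", "B", "A"),
  player_2 = c("B", "B", "A", "C")
)

# wins come from the `player` column, losses from `player_2`
wins   <- count(toy, player, name = "faceoff_wins")
losses <- count(toy, player = player_2, name = "faceoff_losses")

toy_perc <- full_join(wins, losses, by = "player") %>%
  # a player with no wins (or no losses) gets NA from the join
  mutate(across(c(faceoff_wins, faceoff_losses), ~ replace_na(.x, 0L)),
         faceoffs = faceoff_wins + faceoff_losses,
         faceoff_perc = faceoff_wins / faceoffs) %>%
  arrange(desc(faceoff_perc))

toy_perc
```

Because every row contributes to two players, the `faceoffs` column sums to twice the number of rows in the toy data.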
<table class="table" style="font-size: 14px; width: auto !important; "> <thead> <tr> <th style="text-align:left;font-weight: bold;color: white !important;background-color: #5C164E !important;"> game_date </th> <th style="text-align:left;font-weight: bold;color: white !important;background-color: #5C164E !important;"> team </th> <th style="text-align:left;font-weight: bold;color: white !important;background-color: #5C164E !important;"> player </th> <th style="text-align:left;font-weight: bold;color: white !important;background-color: #5C164E !important;"> event </th> <th style="text-align:left;font-weight: bold;color: white !important;background-color: #5C164E !important;"> player_2 </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> 2019-09-20 </td> <td style="text-align:left;"> Sudbury Wolves </td> <td style="text-align:left;"> Blake Murray </td> <td style="text-align:left;"> Faceoff Win </td> <td style="text-align:left;"> Connor Lockhart </td> </tr> <tr> <td style="text-align:left;"> 2019-09-20 </td> <td style="text-align:left;"> Sudbury Wolves </td> <td style="text-align:left;"> Macauley Carson </td> <td style="text-align:left;"> Faceoff Win </td> <td style="text-align:left;"> Austen Swankler </td> </tr> <tr> <td style="text-align:left;font-weight: bold;color: white !important;background-color: #5C164E !important;"> 2019-09-20 </td> <td style="text-align:left;font-weight: bold;color: white !important;background-color: #5C164E !important;"> Sudbury Wolves </td> <td style="text-align:left;font-weight: bold;color: white !important;background-color: #5C164E !important;"> Quinton Byfield </td> <td style="text-align:left;font-weight: bold;color: white !important;background-color: #5C164E !important;"> Faceoff Win </td> <td style="text-align:left;font-weight: bold;color: white !important;background-color: #5C164E !important;"> Chad Yetman </td> </tr> <tr> <td style="text-align:left;"> 2019-09-20 </td> <td style="text-align:left;"> Sudbury Wolves </td> <td 
style="text-align:left;"> Macauley Carson </td> <td style="text-align:left;"> Faceoff Win </td> <td style="text-align:left;"> Austen Swankler </td> </tr> <tr> <td style="text-align:left;"> 2019-09-20 </td> <td style="text-align:left;"> Sudbury Wolves </td> <td style="text-align:left;"> Quinton Byfield </td> <td style="text-align:left;"> Faceoff Win </td> <td style="text-align:left;"> Chad Yetman </td> </tr> <tr> <td style="text-align:left;"> 2019-09-20 </td> <td style="text-align:left;"> Erie Otters </td> <td style="text-align:left;"> Alex Gritz </td> <td style="text-align:left;"> Faceoff Win </td> <td style="text-align:left;"> Blake Murray </td> </tr> <tr> <td style="text-align:left;"> 2019-09-20 </td> <td style="text-align:left;"> Erie Otters </td> <td style="text-align:left;"> Connor Lockhart </td> <td style="text-align:left;"> Faceoff Win </td> <td style="text-align:left;"> Ethan Larmand </td> </tr> <tr> <td style="text-align:left;"> 2019-09-20 </td> <td style="text-align:left;"> Erie Otters </td> <td style="text-align:left;"> Austen Swankler </td> <td style="text-align:left;"> Faceoff Win </td> <td style="text-align:left;"> Macauley Carson </td> </tr> <tr> <td style="text-align:left;font-weight: bold;color: white !important;background-color: #5C164E !important;"> 2019-09-20 </td> <td style="text-align:left;font-weight: bold;color: white !important;background-color: #5C164E !important;"> Erie Otters </td> <td style="text-align:left;font-weight: bold;color: white !important;background-color: #5C164E !important;"> Chad Yetman </td> <td style="text-align:left;font-weight: bold;color: white !important;background-color: #5C164E !important;"> Faceoff Win </td> <td style="text-align:left;font-weight: bold;color: white !important;background-color: #5C164E !important;"> Quinton Byfield </td> </tr> <tr> <td style="text-align:left;"> 2019-09-20 </td> <td style="text-align:left;"> Erie Otters </td> <td style="text-align:left;"> Brendan Hoffmann </td> <td 
style="text-align:left;"> Faceoff Win </td> <td style="text-align:left;"> Blake Murray </td> </tr>
</tbody>
</table>

---

## Step 3: A Question

<table class="table" style="font-size: 14px; width: auto !important; ">
<thead>
<tr> <th style="text-align:left;font-weight: bold;color: white !important;background-color: #5C164E !important;"> player </th> <th style="text-align:right;font-weight: bold;color: white !important;background-color: #5C164E !important;"> faceoffs </th> <th style="text-align:right;font-weight: bold;color: white !important;background-color: #5C164E !important;"> faceoff_wins </th> <th style="text-align:right;font-weight: bold;color: white !important;background-color: #5C164E !important;"> faceoff_perc </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:left;"> Ty Dellandrea </td> <td style="text-align:right;"> 55 </td> <td style="text-align:right;"> 35 </td> <td style="text-align:right;"> 0.6363636 </td> </tr>
<tr> <td style="text-align:left;"> Cole Schwindt </td> <td style="text-align:right;"> 57 </td> <td style="text-align:right;"> 36 </td> <td style="text-align:right;"> 0.6315789 </td> </tr>
<tr> <td style="text-align:left;"> Cam Hillis </td> <td style="text-align:right;"> 104 </td> <td style="text-align:right;"> 64 </td> <td style="text-align:right;"> 0.6153846 </td> </tr>
<tr> <td style="text-align:left;"> Lucas Theriault </td> <td style="text-align:right;"> 52 </td> <td style="text-align:right;"> 29 </td> <td style="text-align:right;"> 0.5576923 </td> </tr>
<tr> <td style="text-align:left;"> Keean Washkurak </td> <td style="text-align:right;"> 55 </td> <td style="text-align:right;"> 29 </td> <td style="text-align:right;"> 0.5272727 </td> </tr>
<tr> <td style="text-align:left;"> Danny Zhilkin </td> <td style="text-align:right;"> 67 </td> <td style="text-align:right;"> 34 </td> <td style="text-align:right;"> 0.5074627 </td> </tr>
<tr> <td style="text-align:left;"> Austen Swankler </td> <td style="text-align:right;"> 334 </td> <td style="text-align:right;"> 169 </td> <td style="text-align:right;"> 0.5059880 </td> </tr>
<tr> <td style="text-align:left;"> Ryan McGregor </td> <td style="text-align:right;"> 72 </td> <td style="text-align:right;"> 36 </td> <td style="text-align:right;"> 0.5000000 </td> </tr>
<tr> <td style="text-align:left;"> Hayden Fowler </td> <td style="text-align:right;"> 310 </td> <td style="text-align:right;"> 149 </td> <td style="text-align:right;"> 0.4806452 </td> </tr>
<tr> <td style="text-align:left;"> Chad Yetman </td> <td style="text-align:right;"> 724 </td> <td style="text-align:right;"> 343 </td> <td style="text-align:right;"> 0.4737569 </td> </tr>
<tr> <td style="text-align:left;"> Connor Lockhart </td> <td style="text-align:right;"> 207 </td> <td style="text-align:right;"> 92 </td> <td style="text-align:right;"> 0.4444444 </td> </tr>
<tr> <td style="text-align:left;"> Noah Sedore </td> <td style="text-align:right;"> 68 </td> <td style="text-align:right;"> 30 </td> <td style="text-align:right;"> 0.4411765 </td> </tr>
<tr> <td style="text-align:left;"> Brendan Hoffmann </td> <td style="text-align:right;"> 488 </td> <td style="text-align:right;"> 197 </td> <td style="text-align:right;"> 0.4036885 </td> </tr>
<tr> <td style="text-align:left;"> Elias Cohen </td> <td style="text-align:right;"> 149 </td> <td style="text-align:right;"> 57 </td> <td style="text-align:right;"> 0.3825503 </td> </tr>
</tbody>
</table>

---

## Step 3: A Question

```r
faceoffs <- scouting %>%
  filter(event == "Faceoff Win") %>%
  select(player, player_2) %>%
  rename(winner = player,
         loser = player_2) %>%
  pivot_longer(winner:loser,
               names_to = "status",
               values_to = "player") %>%
  mutate(win = ifelse(status == "winner", 1, 0)) %>%
  group_by(player) %>%
  summarize(faceoffs = n(),
            faceoff_wins = sum(win)) %>%
  mutate(faceoff_perc = faceoff_wins / faceoffs) %>%
  filter(faceoffs >= 50) %>%
  arrange(desc(faceoff_perc))
```

---

## Step 3: A Question

```r
*faceoffs <- scouting %>%
  filter(event == "Faceoff Win") %>%
  select(player, player_2) %>%
  rename(winner = player,
         loser = player_2) %>%
  pivot_longer(winner:loser,
               names_to = "status",
               values_to = "player") %>%
  mutate(win = ifelse(status == "winner", 1, 0)) %>%
  group_by(player) %>%
  summarize(faceoffs = n(),
            faceoff_wins = sum(win)) %>%
  mutate(faceoff_perc = faceoff_wins / faceoffs) %>%
  filter(faceoffs >= 50) %>%
  arrange(desc(faceoff_perc))
```

---

## Step 3: A Question

```r
faceoffs <- scouting %>%
* filter(event == "Faceoff Win") %>%
* select(player, player_2) %>%
  rename(winner = player,
         loser = player_2) %>%
  pivot_longer(winner:loser,
               names_to = "status",
               values_to = "player") %>%
  mutate(win = ifelse(status == "winner", 1, 0)) %>%
  group_by(player) %>%
  summarize(faceoffs = n(),
            faceoff_wins = sum(win)) %>%
  mutate(faceoff_perc = faceoff_wins / faceoffs) %>%
  filter(faceoffs >= 50) %>%
  arrange(desc(faceoff_perc))
```

---

## Step 3: A Question

```r
faceoffs <- scouting %>%
  filter(event == "Faceoff Win") %>%
  select(player, player_2) %>%
* rename(winner = player,
*        loser = player_2) %>%
  pivot_longer(winner:loser,
               names_to = "status",
               values_to = "player") %>%
  mutate(win = ifelse(status == "winner", 1, 0)) %>%
  group_by(player) %>%
  summarize(faceoffs = n(),
            faceoff_wins = sum(win)) %>%
  mutate(faceoff_perc = faceoff_wins / faceoffs) %>%
  filter(faceoffs >= 50) %>%
  arrange(desc(faceoff_perc))
```

---

## Step 3: A Question

```r
faceoffs <- scouting %>%
  filter(event == "Faceoff Win") %>%
  select(player, player_2) %>%
  rename(winner = player,
         loser = player_2) %>%
* pivot_longer(winner:loser,
*              names_to = "status",
*              values_to = "player") %>%
  mutate(win = ifelse(status == "winner", 1, 0)) %>%
  group_by(player) %>%
  summarize(faceoffs = n(),
            faceoff_wins = sum(win)) %>%
  mutate(faceoff_perc = faceoff_wins / faceoffs) %>%
  filter(faceoffs >= 50) %>%
  arrange(desc(faceoff_perc))
```

---

## Step 3: A Question

<table class="table" style="font-size: 14px; width: auto !important; ">
<thead>
<tr> <th style="text-align:left;font-weight: bold;color: white !important;background-color: #5C164E !important;"> status </th> <th style="text-align:left;font-weight: bold;color: white !important;background-color: #5C164E !important;"> player </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:left;"> winner </td> <td style="text-align:left;"> Blake Murray </td> </tr>
<tr> <td style="text-align:left;"> loser </td> <td style="text-align:left;"> Connor Lockhart </td> </tr>
<tr> <td style="text-align:left;"> winner </td> <td style="text-align:left;"> Macauley Carson </td> </tr>
<tr> <td style="text-align:left;"> loser </td> <td style="text-align:left;"> Austen Swankler </td> </tr>
<tr> <td style="text-align:left;"> winner </td> <td style="text-align:left;"> Quinton Byfield </td> </tr>
<tr> <td style="text-align:left;"> loser </td> <td style="text-align:left;"> Chad Yetman </td> </tr>
<tr> <td style="text-align:left;"> winner </td> <td style="text-align:left;"> Macauley Carson </td> </tr>
<tr> <td style="text-align:left;"> loser </td> <td style="text-align:left;"> Austen Swankler </td> </tr>
<tr> <td style="text-align:left;"> winner </td> <td style="text-align:left;"> Quinton Byfield </td> </tr>
<tr> <td style="text-align:left;"> loser </td> <td style="text-align:left;"> Chad Yetman </td> </tr>
</tbody>
</table>

---

## Step 3: A Question

```r
faceoffs <- scouting %>%
  filter(event == "Faceoff Win") %>%
  select(player, player_2) %>%
  rename(winner = player,
         loser = player_2) %>%
  pivot_longer(winner:loser,
               names_to = "status",
               values_to = "player") %>%
* mutate(win = ifelse(status == "winner", 1, 0)) %>%
  group_by(player) %>%
  summarize(faceoffs = n(),
            faceoff_wins = sum(win)) %>%
  mutate(faceoff_perc = faceoff_wins / faceoffs) %>%
  filter(faceoffs >= 50) %>%
  arrange(desc(faceoff_perc))
```

---

## Step 3: A Question

```r
faceoffs <- scouting %>%
  filter(event == "Faceoff Win") %>%
  select(player, player_2) %>%
  rename(winner = player,
         loser = player_2) %>%
  pivot_longer(winner:loser,
               names_to = "status",
               values_to = "player") %>%
  mutate(win = ifelse(status == "winner", 1, 0)) %>%
* group_by(player) %>%
* summarize(faceoffs = n(),
*           faceoff_wins = sum(win)) %>%
  mutate(faceoff_perc = faceoff_wins / faceoffs) %>%
  filter(faceoffs >= 50) %>%
  arrange(desc(faceoff_perc))
```

---

## Step 3: A Question

```r
faceoffs <- scouting %>%
  filter(event == "Faceoff Win") %>%
  select(player, player_2) %>%
  rename(winner = player,
         loser = player_2) %>%
  pivot_longer(winner:loser,
               names_to = "status",
               values_to = "player") %>%
  mutate(win = ifelse(status == "winner", 1, 0)) %>%
  group_by(player) %>%
  summarize(faceoffs = n(),
            faceoff_wins = sum(win)) %>%
* mutate(faceoff_perc = faceoff_wins / faceoffs) %>%
* filter(faceoffs >= 50) %>%
  arrange(desc(faceoff_perc))
```

---

## Step 3: A Question

```r
faceoffs <- scouting %>%
  filter(event == "Faceoff Win") %>%
  select(player, player_2) %>%
  rename(winner = player,
         loser = player_2) %>%
  pivot_longer(winner:loser,
               names_to = "status",
               values_to = "player") %>%
  mutate(win = ifelse(status == "winner", 1, 0)) %>%
  group_by(player) %>%
  summarize(faceoffs = n(),
            faceoff_wins = sum(win)) %>%
  mutate(faceoff_perc = faceoff_wins / faceoffs) %>%
  filter(faceoffs >= 50) %>%
* arrange(desc(faceoff_perc))
```

---

## Step 3: A Question

<table class="table" style="font-size: 14px; width: auto !important; ">
<thead>
<tr> <th style="text-align:left;font-weight: bold;color: white !important;background-color: #5C164E !important;"> player </th> <th style="text-align:right;font-weight: bold;color: white !important;background-color: #5C164E !important;"> faceoffs </th> <th style="text-align:right;font-weight: bold;color: white !important;background-color: #5C164E !important;"> faceoff_wins </th> <th style="text-align:right;font-weight: bold;color: white !important;background-color: #5C164E !important;"> faceoff_perc </th> </tr>
</thead>
<tbody>
<tr> <td style="text-align:left;"> Ty Dellandrea </td> <td style="text-align:right;"> 55 </td> <td style="text-align:right;"> 35 </td> <td
style="text-align:right;"> 0.6363636 </td> </tr> <tr> <td style="text-align:left;"> Cole Schwindt </td> <td style="text-align:right;"> 57 </td> <td style="text-align:right;"> 36 </td> <td style="text-align:right;"> 0.6315789 </td> </tr> <tr> <td style="text-align:left;"> Cam Hillis </td> <td style="text-align:right;"> 104 </td> <td style="text-align:right;"> 64 </td> <td style="text-align:right;"> 0.6153846 </td> </tr> <tr> <td style="text-align:left;"> Lucas Theriault </td> <td style="text-align:right;"> 52 </td> <td style="text-align:right;"> 29 </td> <td style="text-align:right;"> 0.5576923 </td> </tr> <tr> <td style="text-align:left;"> Keean Washkurak </td> <td style="text-align:right;"> 55 </td> <td style="text-align:right;"> 29 </td> <td style="text-align:right;"> 0.5272727 </td> </tr> <tr> <td style="text-align:left;"> Danny Zhilkin </td> <td style="text-align:right;"> 67 </td> <td style="text-align:right;"> 34 </td> <td style="text-align:right;"> 0.5074627 </td> </tr> <tr> <td style="text-align:left;"> Austen Swankler </td> <td style="text-align:right;"> 334 </td> <td style="text-align:right;"> 169 </td> <td style="text-align:right;"> 0.5059880 </td> </tr> <tr> <td style="text-align:left;"> Ryan McGregor </td> <td style="text-align:right;"> 72 </td> <td style="text-align:right;"> 36 </td> <td style="text-align:right;"> 0.5000000 </td> </tr> <tr> <td style="text-align:left;"> Hayden Fowler </td> <td style="text-align:right;"> 310 </td> <td style="text-align:right;"> 149 </td> <td style="text-align:right;"> 0.4806452 </td> </tr> <tr> <td style="text-align:left;"> Chad Yetman </td> <td style="text-align:right;"> 724 </td> <td style="text-align:right;"> 343 </td> <td style="text-align:right;"> 0.4737569 </td> </tr> <tr> <td style="text-align:left;"> Connor Lockhart </td> <td style="text-align:right;"> 207 </td> <td style="text-align:right;"> 92 </td> <td style="text-align:right;"> 0.4444444 </td> </tr> <tr> <td style="text-align:left;"> Noah Sedore </td> <td 
style="text-align:right;"> 68 </td> <td style="text-align:right;"> 30 </td> <td style="text-align:right;"> 0.4411765 </td> </tr>
<tr> <td style="text-align:left;"> Brendan Hoffmann </td> <td style="text-align:right;"> 488 </td> <td style="text-align:right;"> 197 </td> <td style="text-align:right;"> 0.4036885 </td> </tr>
<tr> <td style="text-align:left;"> Elias Cohen </td> <td style="text-align:right;"> 149 </td> <td style="text-align:right;"> 57 </td> <td style="text-align:right;"> 0.3825503 </td> </tr>
</tbody>
</table>

---

## Step 4: A graph!

<img src="figs/hanic/unnamed-chunk-24-1.png" width="504" style="display: block; margin: auto;" />

---

## Step 4: A graph!

```r
faceoffs_team %>%
  ggplot(aes(x = reorder(player, faceoff_perc),
             y = faceoff_perc, fill = team)) +
  geom_bar(stat = "identity") +
  coord_flip() +
  scale_fill_manual(values = c("#F2A900", "#FF6720", "#862633",
                               "#00205B", "#010101", "#C8C9C7")) +
  labs(title = "Faceoff Percentages",
       subtitle = "Among players with 50+ faceoffs, from Big Data Cup scouting data set") +
  ylab("Faceoff Win Percentage") +
  geom_text(aes(label = scales::percent(faceoff_perc,
                                        accuracy = 0.1)),
            family = "Seravek", hjust = -0.15) +
  scale_y_continuous(labels = scales::percent_format(accuracy = 1),
                     limits = c(0, 0.72)) +
  hanic_theme() +
  theme(legend.title = element_blank(),
        legend.position = "bottom",
        panel.grid.major.y = element_blank(),
        axis.title.y = element_blank())
```

---

## Step 4: A graph!
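The plotting code in this step starts from `faceoffs_team`, which these slides never define. Below is a minimal sketch of one way such a table could be built, assuming it is the `faceoffs` summary with a `team` column attached by player. The toy values and the `_toy` object names are invented for illustration, and the join logic is a guess rather than the code used for the talk.

```r
library(dplyr)

# Toy stand-in for the `faceoffs` summary (values invented)
faceoffs_toy <- tibble::tibble(
  player = c("A", "B"),
  faceoffs = c(60L, 80L),
  faceoff_wins = c(36L, 36L),
  faceoff_perc = faceoff_wins / faceoffs
)

# Toy stand-in for the event-level data, which carries `team`
events_toy <- tibble::tibble(
  player = c("A", "A", "B", "B", "B"),
  team = c("Team X", "Team X", "Team Y", "Team Y", "Team Y")
)

# Assumption: each player appears for a single team
player_teams <- distinct(events_toy, player, team)

# Attach the team to each player's faceoff summary
faceoffs_team_toy <- left_join(faceoffs_toy, player_teams, by = "player")

faceoffs_team_toy
```

With the real data, the lookup would presumably come from `scouting`; players who changed teams, or who appear only as `player_2`, would need extra handling before the join.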
```r
faceoffs_team %>%
* ggplot(aes(x = reorder(player, faceoff_perc),
*            y = faceoff_perc, fill = team)) +
* geom_bar(stat = "identity") +
* coord_flip() +
  scale_fill_manual(values = c("#F2A900", "#FF6720", "#862633",
                               "#00205B", "#010101", "#C8C9C7")) +
  labs(title = "Faceoff Percentages",
       subtitle = "Among players with 50+ faceoffs, from Big Data Cup scouting data set") +
  ylab("Faceoff Win Percentage") +
  geom_text(aes(label = scales::percent(faceoff_perc,
                                        accuracy = 0.1)),
            family = "Seravek", hjust = -0.15) +
  scale_y_continuous(labels = scales::percent_format(accuracy = 1),
                     limits = c(0, 0.72)) +
  hanic_theme() +
  theme(legend.title = element_blank(),
        legend.position = "bottom",
        panel.grid.major.y = element_blank(),
        axis.title.y = element_blank())
```

---

## Step 4: A graph!

```r
faceoffs_team %>%
  ggplot(aes(x = reorder(player, faceoff_perc),
             y = faceoff_perc, fill = team)) +
  geom_bar(stat = "identity") +
  coord_flip() +
* scale_fill_manual(values = c("#F2A900", "#FF6720", "#862633",
*                              "#00205B", "#010101", "#C8C9C7")) +
* labs(title = "Faceoff Percentages",
*      subtitle = "Among players with 50+ faceoffs, from Big Data
*      Cup scouting data set") +
* ylab("Faceoff Win Percentage") +
* geom_text(aes(label = scales::percent(faceoff_perc,
*                                       accuracy = 0.1)),
*           family = "Seravek", hjust = -0.15) +
* scale_y_continuous(labels = scales::percent_format(accuracy = 1),
*                    limits = c(0, 0.72)) +
* hanic_theme() +
* theme(legend.title = element_blank(),
*       legend.position = "bottom",
*       panel.grid.major.y = element_blank(),
*       axis.title.y = element_blank())
```

---

## Learn More

If you want to stick with the hockey theme:

- [my workshop from CMSAC last year](https://meghan.rbind.io/talk/cmsac/)
- [interactive tutorials in RStudio](https://github.com/meghall06/betweenthepipes)

```r
install.packages("devtools")
devtools::install_github("meghall06/betweenthepipes")

# look in the Tutorial pane in the upper-right of RStudio
# or...run:
betweenthepipes::intro()
betweenthepipes::data_manip()
```

---

## Learn More

If you want to stick with the hockey theme:

- [my workshop from CMSAC last year](https://meghan.rbind.io/talk/cmsac/)
- [interactive tutorials in RStudio](https://github.com/meghall06/betweenthepipes)

.center[<img src="figs/hanic/screenshot.png" width="700px"/>]

---

## Learn More

If you want to stick with the hockey theme:

- [my workshop from CMSAC last year](https://meghan.rbind.io/talk/cmsac/)
- [interactive tutorials in RStudio](https://github.com/meghall06/betweenthepipes)

```r
install.packages("devtools")
devtools::install_github("meghall06/betweenthepipes")

# look in the Tutorial pane in the upper-right of RStudio
# or...run:
betweenthepipes::intro()
betweenthepipes::data_manip()
```

Elsewhere:

- [R for Data Science](https://r4ds.had.co.nz/) ⭐
- [R for Excel Users](https://rstudio-conf-2020.github.io/r-for-excel/)
- [STAT 545](https://stat545.com/)

---

class: inverse, center, middle
layout: false

# Thanks!

[@MeghanMHall](https://www.twitter.com/MeghanMHall) <br>
[meghan.rbind.io](https://meghan.rbind.io/) <br>

Slides created via the R package [xaringan](https://github.com/yihui/xaringan).