betweenthepipes is an R package I developed that currently holds two tutorials, created with learnr, to facilitate learning the tidyverse through hockey data. It also includes two sample data sets that are used in the tutorials (and are useful for learning to work with hockey data).
First, download this package from GitHub. It currently includes two tutorials:
- Introduction to R with Hockey Data. A beginner-friendly introduction to R and the tidyverse with sample hockey data. Introduces the basic tidyverse functions: filter(), select(), arrange(), mutate(), group_by(), and summarize().
- More Data Manipulation. Going further into data manipulation with details on pivoting data (using pivot_longer() and pivot_wider()), joining data, and working with strings.
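To give a flavor of what the tutorials cover, here is a minimal sketch of the verbs listed above, plus one pivot. The `shots` data frame and its columns are made up for illustration and are not the package's actual data.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: one row per shot attempt (not taken from the package)
shots <- data.frame(
  team   = c("PHI", "PHI", "PHI", "NYR", "NYR", "NYR"),
  player = c("A", "A", "B", "C", "D", "D"),
  period = c(1, 2, 2, 1, 1, 3),
  goal   = c(0, 1, 1, 0, 1, 1),
  stringsAsFactors = FALSE
)

team_summary <- shots %>%
  filter(period <= 2) %>%             # keep shots from the first two periods
  select(team, player, goal) %>%      # drop columns we no longer need
  mutate(is_goal = goal == 1) %>%     # add a logical goal indicator
  group_by(team) %>%                  # one group per team
  summarize(shots = n(), goals = sum(is_goal)) %>%
  arrange(desc(goals))                # teams with the most goals first

# Reshape from one column per statistic to one row per team/statistic pair
team_long <- team_summary %>%
  pivot_longer(cols = c(shots, goals), names_to = "stat", values_to = "value")

print(team_summary)
print(team_long)
```

The tutorials work through the same ideas interactively, using the package's real play-by-play data rather than a toy table.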
Once the package has been downloaded, there are two options to access the tutorials. You can access each tutorial individually with the following code:
    library(betweenthepipes)
    intro()
    data_manip()
Or, if you have RStudio version 1.3 or later, there should be a Tutorial pane in the upper right corner (near Environment and Git). That pane should list all the tutorials available from the packages you’ve installed.
There are two data sets available in this package:
pbp_example is a data set containing NHL play-by-play data for four Philadelphia Flyers games from November 2019.
bio_example is a data set containing some NHL biographical data from 2019, useful for practicing joins with the data in pbp_example.
More information on the data sets is available in the package documentation.
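As a rough sketch of the kind of join practice these two data sets are meant for, the example below joins two small stand-in tables. The `pbp_toy` and `bio_toy` tables and the `player` key are assumptions for illustration; the real pbp_example and bio_example columns may be named differently.

```r
library(dplyr)

# Hypothetical stand-ins for play-by-play and biographical data
pbp_toy <- data.frame(
  event  = c("SHOT", "GOAL", "HIT"),
  player = c("Player A", "Player B", "Player C"),
  stringsAsFactors = FALSE
)
bio_toy <- data.frame(
  player   = c("Player A", "Player B"),
  position = c("C", "D"),
  stringsAsFactors = FALSE
)

# Keep every event and attach biographical details where a match exists
joined <- left_join(pbp_toy, bio_toy, by = "player")

# List the events whose player has no biographical record
unmatched <- anti_join(pbp_toy, bio_toy, by = "player")

print(joined)
print(unmatched)
```

A left join keeps every play-by-play row even when no biographical match exists, and the anti join is a quick way to check which players failed to match.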
In October 2020, I gave a tidyverse-focused workshop at the Carnegie Mellon Sports Analytics Conference using the data available in this package. The slides and code from the workshop are available here.
- Posted on:
- July 1, 2020