library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.5 ✓ purrr 0.3.4
## ✓ tibble 3.1.6 ✓ dplyr 1.0.7
## ✓ tidyr 1.1.4 ✓ stringr 1.4.0
## ✓ readr 2.1.1 ✓ forcats 0.5.1
## Warning: package 'stringr' was built under R version 3.5.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
# ggplot2, tidyr, and readr are already attached by library(tidyverse) above
library(knitr)
library(sp)
library(patchwork)
Football <- read_csv('https://projects.fivethirtyeight.com/nfl-api/nfl_elo_latest.csv')
## Rows: 285 Columns: 33
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (5): playoff, team1, team2, qb1, qb2
## dbl (27): season, neutral, elo1_pre, elo2_pre, elo_prob1, elo_prob2, elo1_p...
## date (1): date
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# elo_change is the ratio of team1's post-game Elo rating to its pre-game rating
Football <- Football %>%
  mutate(elo_change = elo1_post / elo1_pre)
I chose to use this year’s data on NFL teams from 538: https://github.com/fivethirtyeight/data/tree/master/nfl-elo. I chose it because football is my favorite sport, and I really enjoy looking at stats and learning about new analytics within the game, including things like predicting a game’s outcome or how well a player performs. This data set has a little bit of both.
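# Home win probability over the season, one panel per home team (team1)
p <- ggplot(Football) +
  geom_line(aes(date, elo_prob1)) +
  facet_wrap(. ~ team1) +
  ylab('Home Win Prob') +
  xlab('Date')
# Pre-game Elo ratings: listed team in blue, its opponent in red
p0 <- ggplot(Football) +
  geom_line(aes(date, elo1_pre), col = "blue") +
  geom_line(aes(date, elo2_pre), col = "red") +
  facet_wrap(. ~ team1) +
  ylab('Team Rating') +
  xlab('Date')
# patchwork places the two plots side by side
p + p0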
With this plot I was looking at each team’s win probability when playing at home. I wanted to see how playing at home affects a team’s likelihood of winning. Something that surprised me was that the Bengals’ (CIN) win probability wasn’t consistently higher, given that they went to the Super Bowl. Compared with teams like the Packers (GB), Bucs (TB), and Bills (BUF), the Bengals consistently had a lower win probability, yet they finished the season as the second-best team in the league. There were no preparation steps that I needed to take to create this plot. Update: I added the rating of each team playing. The listed team’s rating is in blue and its opponent’s rating is in red. This way the two teams can be compared, which helps explain why the win probability is what it is. For example, it makes sense that BUF has a high win probability all year, since their team rating is always higher than their opponents’.
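The comparison between teams described above could also be checked numerically. The following is only a rough sketch, not part of the original analysis: it averages the home win probability and the pre-game rating gap per home team. The `toy` data frame and its values are made up for illustration; it only borrows the column names `team1`, `elo_prob1`, `elo1_pre`, and `elo2_pre` from the 538 file, and `home_summary` is a hypothetical name.

```r
library(dplyr)

# Made-up illustrative rows that reuse the column names of the Football data
toy <- data.frame(
  team1     = c("BUF", "BUF", "CIN", "CIN"),
  elo_prob1 = c(0.80, 0.70, 0.55, 0.45),
  elo1_pre  = c(1650, 1660, 1500, 1510),
  elo2_pre  = c(1500, 1550, 1520, 1560)
)

# Average home win probability and average rating gap (home minus opponent)
home_summary <- toy %>%
  group_by(team1) %>%
  summarise(
    mean_home_prob  = mean(elo_prob1),
    mean_rating_gap = mean(elo1_pre - elo2_pre),
    .groups = "drop"
  ) %>%
  arrange(desc(mean_home_prob))

print(home_summary)
```

Replacing `toy` with `Football` should give one row per home team, assuming the same columns are present. A team with a large positive `mean_rating_gap` would be expected to show a high `mean_home_prob` as well.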
# Points scored by the listed team vs. its own QB's game value
p1 <- ggplot(Football) +
  geom_point(aes(qb1_game_value, score1)) +
  facet_wrap(. ~ team1) +
  ylab('Team Score') +
  xlab('QB Performance Level')
# Points scored by the opponent vs. the opponent QB's game value
p2 <- ggplot(Football) +
  geom_point(aes(qb2_game_value, score2)) +
  facet_wrap(. ~ team1) +
  ylab('Opponent Score') +
  xlab('Opponent QB Performance Level')
p1 + p2
In this plot I looked at how each team’s QB performance level compared with how many points the team scored. This was interesting because the quarterback is considered the most important player on a team, so if a team’s quarterback plays well, the team should score more points. What I found surprising is that this isn’t always the case. For teams like the Packers and Bengals the graph is quite linear, showing that the better the quarterback played, the more points they scored. However, teams like the Bucs, other than one outlier, scored a consistently high number of points regardless of quarterback play. Then there are teams like the Cowboys (DAL) and Pats (NE) that scored a wide range of points at the same level of quarterback performance. Overall, there are few teams for which quarterback play really seems to have a great effect on scoring. For this plot I also did not need to manipulate the data. Update: I added how many points opponents scored against the listed team, along with the opponent’s QB performance level. This lets the viewer get an idea of how good a team’s offense and defense are based on points for and against. It also lets the viewer see whether a team faced a lot of good or bad quarterbacks during the season.
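How "linear" each panel looks could be put on a common scale with a per-team correlation. This is a minimal sketch under stated assumptions, not something the original analysis did: the `toy` data frame holds made-up values under the real column names `team1`, `qb1_game_value`, and `score1`, and `qb_cor` is a hypothetical name. Pearson correlation is just one possible measure of the relationship.

```r
library(dplyr)

# Made-up illustrative rows: one team with a linear pattern, one with flat scoring
toy <- data.frame(
  team1          = rep(c("GB", "TB"), each = 4),
  qb1_game_value = c(100, 200, 300, 400, 100, 200, 300, 400),
  score1         = c(10, 20, 30, 40, 31, 30, 31, 30)
)

# Correlation between QB game value and points scored, per home team;
# use = "complete.obs" drops games where either value is missing
qb_cor <- toy %>%
  group_by(team1) %>%
  summarise(
    games        = n(),
    qb_score_cor = cor(qb1_game_value, score1, use = "complete.obs"),
    .groups = "drop"
  )

print(qb_cor)
```

Run on `Football` instead of `toy`, values near 1 would correspond to the linear-looking panels, and values near 0 to teams whose scoring varied little with quarterback play. With only a handful of home games per team, such correlations are noisy and best read alongside the plots.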