Resolving inconsistencies between columns in load_player_stats() #454

Closed
1 task done
john-b-edwards opened this issue Jan 14, 2024 · 6 comments · Fixed by #470

Comments

john-b-edwards commented Jan 14, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Is your feature request related to a problem? Please describe.

There are some notable inconsistencies in how player biographical or contextual information is represented for different stat_types in load_player_stats().

nflreadr::load_player_stats(stat_type = "offense") |>
    colnames()
#>  [1] "player_id"                   "player_name"                
#>  [3] "player_display_name"         "position"                   
#>  [5] "position_group"              "headshot_url"               
#>  [7] "recent_team"                 "season"                     
#>  [9] "week"                        "season_type"                
#> [11] "completions"                 "attempts"                   
#> [13] "passing_yards"               "passing_tds"                
#> [15] "interceptions"               "sacks"                      
#> [17] "sack_yards"                  "sack_fumbles"               
#> [19] "sack_fumbles_lost"           "passing_air_yards"          
#> [21] "passing_yards_after_catch"   "passing_first_downs"        
#> [23] "passing_epa"                 "passing_2pt_conversions"    
#> [25] "pacr"                        "dakota"                     
#> [27] "carries"                     "rushing_yards"              
#> [29] "rushing_tds"                 "rushing_fumbles"            
#> [31] "rushing_fumbles_lost"        "rushing_first_downs"        
#> [33] "rushing_epa"                 "rushing_2pt_conversions"    
#> [35] "receptions"                  "targets"                    
#> [37] "receiving_yards"             "receiving_tds"              
#> [39] "receiving_fumbles"           "receiving_fumbles_lost"     
#> [41] "receiving_air_yards"         "receiving_yards_after_catch"
#> [43] "receiving_first_downs"       "receiving_epa"              
#> [45] "receiving_2pt_conversions"   "racr"                       
#> [47] "target_share"                "air_yards_share"            
#> [49] "wopr"                        "special_teams_tds"          
#> [51] "fantasy_points"              "fantasy_points_ppr"         
#> [53] "opponent_team"

nflreadr::load_player_stats(stat_type = "defense") |>
    colnames() 
#>  [1] "season"                        "week"                         
#>  [3] "player_id"                     "player_name"                  
#>  [5] "player_display_name"           "position"                     
#>  [7] "position_group"                "headshot_url"                 
#>  [9] "team"                          "def_tackles"                  
#> [11] "def_tackles_solo"              "def_tackles_with_assist"      
#> [13] "def_tackle_assists"            "def_tackles_for_loss"         
#> [15] "def_tackles_for_loss_yards"    "def_fumbles_forced"           
#> [17] "def_sacks"                     "def_sack_yards"               
#> [19] "def_qb_hits"                   "def_interceptions"            
#> [21] "def_interception_yards"        "def_pass_defended"            
#> [23] "def_tds"                       "def_fumbles"                  
#> [25] "def_fumble_recovery_own"       "def_fumble_recovery_yards_own"
#> [27] "def_fumble_recovery_opp"       "def_fumble_recovery_yards_opp"
#> [29] "def_safety"                    "def_penalty"                  
#> [31] "def_penalty_yards"

nflreadr::load_player_stats(stat_type = "kicking") |>
    colnames()
#>  [1] "season"              "week"                "season_type"        
#>  [4] "team"                "player_name"         "player_id"          
#>  [7] "fg_made"             "fg_missed"           "fg_blocked"         
#> [10] "fg_long"             "fg_att"              "fg_pct"             
#> [13] "pat_made"            "pat_missed"          "pat_blocked"        
#> [16] "pat_att"             "pat_pct"             "fg_made_distance"   
#> [19] "fg_missed_distance"  "fg_blocked_distance" "gwfg_att"           
#> [22] "gwfg_distance"       "gwfg_made"           "gwfg_missed"        
#> [25] "gwfg_blocked"        "fg_made_0_19"        "fg_made_20_29"      
#> [28] "fg_made_30_39"       "fg_made_40_49"       "fg_made_50_59"      
#> [31] "fg_made_60_"         "fg_missed_0_19"      "fg_missed_20_29"    
#> [34] "fg_missed_30_39"     "fg_missed_40_49"     "fg_missed_50_59"    
#> [37] "fg_missed_60_"       "fg_made_list"        "fg_missed_list"     
#> [40] "fg_blocked_list"

For instance, stat_type = "defense" lacks the season_type column, and player_display_name and position are available for offense and defense but not for kicking. (position = K is presumably assumed for kicking, but that does not always hold; see Dare Ogunbowale's kicking exploits for an example.)
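
The gaps can also be listed programmatically. The following is a minimal base R sketch, not part of nflreadr: it copies the biographical and contextual columns from the three outputs above into plain character vectors (the split between "contextual" and "stat" columns is an assumption made for illustration) and reports which of them each stat_type is missing.

```r
# Biographical/contextual columns copied from the colnames() output above.
# Which columns count as "contextual" is an assumption for this sketch.
id_cols <- list(
  offense = c("player_id", "player_name", "player_display_name", "position",
              "position_group", "headshot_url", "recent_team", "season",
              "week", "season_type", "opponent_team"),
  defense = c("season", "week", "player_id", "player_name",
              "player_display_name", "position", "position_group",
              "headshot_url", "team"),
  kicking = c("season", "week", "season_type", "team", "player_name",
              "player_id")
)

# Every contextual column that appears in at least one stat_type
all_ids <- Reduce(union, id_cols)

# For each stat_type, the contextual columns it does not provide
missing_cols <- lapply(id_cols, function(cols) setdiff(all_ids, cols))
missing_cols
```

Note that this comparison also flags the team column, which is named recent_team in the offense output and team in the other two.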

Describe the solution you'd like

I think we should standardize how biographical and contextual information for player stats is represented in these columns.

Describe alternatives you've considered

No response

Additional context

No response

mrcaseb transferred this issue from nflverse/nflverse-data on Jan 15, 2024
mrcaseb (Member) commented Jan 15, 2024

Transferred to nflfastR, as we should resolve this directly in the underlying functions.

mrcaseb (Member) commented Jan 15, 2024

After cross-checking, it seems some of these inconsistencies have already been resolved in nflfastR. I guess we need to trigger the workflow to rebuild all data in nflverse-pbp at some point.

season_type is currently missing in def, so we need to add it before the rebuild.

pbp <- nflreadr::load_pbp(2023)

off <- nflfastR::calculate_player_stats(pbp, weekly = TRUE)
def <- nflfastR::calculate_player_stats_def(pbp, weekly = TRUE)
kick <- nflfastR::calculate_player_stats_kicking(pbp, weekly = TRUE)

colnames(off)
#>  [1] "player_id"                   "player_name"                
#>  [3] "player_display_name"         "position"                   
#>  [5] "position_group"              "headshot_url"               
#>  [7] "recent_team"                 "season"                     
#>  [9] "week"                        "season_type"                
#> [11] "opponent_team"               "completions"                
#> [13] "attempts"                    "passing_yards"              
#> [15] "passing_tds"                 "interceptions"              
#> [17] "sacks"                       "sack_yards"                 
#> [19] "sack_fumbles"                "sack_fumbles_lost"          
#> [21] "passing_air_yards"           "passing_yards_after_catch"  
#> [23] "passing_first_downs"         "passing_epa"                
#> [25] "passing_2pt_conversions"     "pacr"                       
#> [27] "dakota"                      "carries"                    
#> [29] "rushing_yards"               "rushing_tds"                
#> [31] "rushing_fumbles"             "rushing_fumbles_lost"       
#> [33] "rushing_first_downs"         "rushing_epa"                
#> [35] "rushing_2pt_conversions"     "receptions"                 
#> [37] "targets"                     "receiving_yards"            
#> [39] "receiving_tds"               "receiving_fumbles"          
#> [41] "receiving_fumbles_lost"      "receiving_air_yards"        
#> [43] "receiving_yards_after_catch" "receiving_first_downs"      
#> [45] "receiving_epa"               "receiving_2pt_conversions"  
#> [47] "racr"                        "target_share"               
#> [49] "air_yards_share"             "wopr"                       
#> [51] "special_teams_tds"           "fantasy_points"             
#> [53] "fantasy_points_ppr"
colnames(def)
#>  [1] "season"                        "week"                         
#>  [3] "player_id"                     "player_name"                  
#>  [5] "player_display_name"           "position"                     
#>  [7] "position_group"                "headshot_url"                 
#>  [9] "team"                          "def_tackles"                  
#> [11] "def_tackles_solo"              "def_tackles_with_assist"      
#> [13] "def_tackle_assists"            "def_tackles_for_loss"         
#> [15] "def_tackles_for_loss_yards"    "def_fumbles_forced"           
#> [17] "def_sacks"                     "def_sack_yards"               
#> [19] "def_qb_hits"                   "def_interceptions"            
#> [21] "def_interception_yards"        "def_pass_defended"            
#> [23] "def_tds"                       "def_fumbles"                  
#> [25] "def_fumble_recovery_own"       "def_fumble_recovery_yards_own"
#> [27] "def_fumble_recovery_opp"       "def_fumble_recovery_yards_opp"
#> [29] "def_safety"                    "def_penalty"                  
#> [31] "def_penalty_yards"
colnames(kick)
#>  [1] "season"              "week"                "season_type"        
#>  [4] "player_id"           "team"                "player_name"        
#>  [7] "player_display_name" "position"            "position_group"     
#> [10] "headshot_url"        "fg_made"             "fg_att"             
#> [13] "fg_missed"           "fg_blocked"          "fg_long"            
#> [16] "fg_pct"              "fg_made_0_19"        "fg_made_20_29"      
#> [19] "fg_made_30_39"       "fg_made_40_49"       "fg_made_50_59"      
#> [22] "fg_made_60_"         "fg_missed_0_19"      "fg_missed_20_29"    
#> [25] "fg_missed_30_39"     "fg_missed_40_49"     "fg_missed_50_59"    
#> [28] "fg_missed_60_"       "fg_made_list"        "fg_missed_list"     
#> [31] "fg_blocked_list"     "fg_made_distance"    "fg_missed_distance" 
#> [34] "fg_blocked_distance" "pat_made"            "pat_att"            
#> [37] "pat_missed"          "pat_blocked"         "pat_pct"            
#> [40] "gwfg_att"            "gwfg_distance"       "gwfg_made"          
#> [43] "gwfg_missed"         "gwfg_blocked"

mrcaseb (Member) commented Jan 15, 2024

season_type has been added to the defense stats. We could define a consistent column order to finish this off.
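
One way a consistent column order could be enforced is sketched below in base R. This is an illustration only, not the approach taken in nflfastR: the helper standardize_order(), the canonical order in id_order, and the toy data frame are all hypothetical.

```r
# Hypothetical canonical order for biographical/contextual columns
id_order <- c("player_id", "player_name", "player_display_name", "position",
              "position_group", "headshot_url", "season", "week",
              "season_type", "team", "opponent_team")

# Hypothetical helper: move the known id columns to the front in the
# canonical order and keep all remaining (stat) columns in their
# original order. Columns missing from df are simply skipped.
standardize_order <- function(df, id_order) {
  lead <- intersect(id_order, names(df))
  rest <- setdiff(names(df), lead)
  df[, c(lead, rest), drop = FALSE]
}

# Toy one-row data frame with made-up values, using the column order of
# the first few kicking columns shown above
kick_toy <- data.frame(
  season = 2023L, week = 1L, season_type = "REG",
  player_id = "toy-id", team = "AAA", player_name = "A.Player",
  fg_made = 1L
)

names(standardize_order(kick_toy, id_order))
```

Applying the same helper to the offense, defense, and kicking outputs would give all three the same leading columns, provided the team column is first given a common name.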

john-b-edwards (Author) commented

It seems nflverse/nflreadr#237 also falls within the scope of this issue.

mrcaseb (Member) commented Aug 5, 2024

I started a fresh player stats approach in #470, which should resolve all of this by computing all stats in one function.

mrcaseb (Member) commented Oct 16, 2024

We will deprecate the calculate_player_stats_*() functions in a future release. The new function calculate_stats() (#470) will fix the issue.

mrcaseb linked a pull request on Oct 16, 2024 that will close this issue