[data issue] import_ids() players missing gsis_id, but appear in import_weekly_data() with a player_id #123

Open
2 tasks done
LarryLoveIV opened this issue Nov 28, 2024 · 1 comment

Comments

@LarryLoveIV

Is there an existing issue for this?

  • I have searched the existing issues

Have you installed the latest development version of the package(s) in question?

  • I have installed the latest development version of the package.

If this is a data issue, have you tried clearing your nflverse cache?

I have cleared my nflverse cache and the issue persists.

What version of the package do you have?

0.3.3

Describe the bug

The following players appear in weekly data with a player_id, but have no matching gsis_id in import_ids():

['Taylor Decker',
'Trenton Scott',
'Mike Caliendo',
'Tyler Smith',
'Vederian Lowe',
'Tanner Conner',
'Tyreik McAllister',
'Juanyeh Thomas',
'Aaron Shampklin',
'Tucker Fisk',
'Ben VanSumeren',
'Grant DuBose',
'Christopher Brooks',
'Blake Whiteheart',
'Ryan Miller',
'Brycen Tremayne',
'John Samuel Shenker',
'Wanya Morris',
'Jermaine Jackson',
'Cam Grandy',
'Bryce Oliver',
'Terrell Jennings']

This makes it difficult to join these players to the other data sources in the package (pfr, ftn, etc.).

Reprex

import pandas as pd
import nfl_data_py as nfl

# Weekly player stats for 2024 and the cross-platform ID table
weekly_df = nfl.import_weekly_data([2024])
id_df = nfl.import_ids()

# Left-join on the GSIS ID; indicator=True adds a '_merge' column
weekly_df = pd.merge(weekly_df, id_df[['gsis_id', 'pfr_id']], left_on='player_id', right_on='gsis_id', how='left', indicator=True)

# Players present in weekly data with no match in import_ids()
weekly_df[weekly_df['_merge']=='left_only']['player_display_name'].unique().tolist()

Expected Behavior

I would expect filtering on _merge == 'left_only' to return no rows.
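
For anyone who wants to try the check without downloading data, here is a minimal sketch of the same idea on toy frames. The player IDs, names, and the helper `players_missing_from_ids` are made up for illustration and are not part of nfl_data_py. It uses `isin` rather than a merge, so duplicate or null gsis_id rows in the ID table cannot multiply or falsely match weekly rows.

```python
import pandas as pd

# Toy stand-ins for nfl.import_weekly_data([2024]) and nfl.import_ids().
# All IDs and names below are invented for illustration.
weekly_df = pd.DataFrame({
    "player_id": ["00-0000001", "00-0000001", "00-0000002", "00-0000003"],
    "player_display_name": ["Player A", "Player A", "Player B", "Player C"],
    "week": [1, 2, 1, 1],
})
id_df = pd.DataFrame({
    "gsis_id": ["00-0000001", None, "00-0000003"],
    "pfr_id": ["PlayA00", "PlayB00", None],
})


def players_missing_from_ids(weekly, ids):
    """Return one row per weekly player whose player_id has no gsis_id match."""
    # One row per player instead of one row per player-week
    players = weekly[["player_id", "player_display_name"]].drop_duplicates("player_id")
    # Ignore ID rows that have no gsis_id at all
    known = ids["gsis_id"].dropna().unique()
    return players[~players["player_id"].isin(known)].reset_index(drop=True)


missing = players_missing_from_ids(weekly_df, id_df)
print(missing)
```

With the real frames, `players_missing_from_ids(weekly_df, id_df)` should come back empty if every weekly player has a gsis_id in the ID table.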

nflverse_sitrep

NA

Screenshots

No response

Additional context

No response

@mrcaseb
Member

mrcaseb commented Jan 13, 2025

This is actually an issue about missing gsis_id <-> pfr_id mappings of some players.

Some notes on this:

  1. Whenever you are missing an ID of a player, you can open an issue and provide us data. See https://github.com/nflverse/nflverse-players/blob/master/.github/CONTRIBUTING.md
  2. Currently, import_ids() relies on player data from https://github.com/dynastyprocess/data/raw/master/files/db_playerids.rds which is mostly fine. However, I have implemented a new players data pipeline and we will replace old players data with the new data in the upcoming offseason.
  3. @alecglen this means that you probably need to update some download urls.
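
A rough sketch of how one might sort ID-table rows by which mapping they lack, which could help when preparing the data mentioned in note 1. The rows, the `name` key, and the helper `mapping_status` are assumptions for illustration only. The CONTRIBUTING guide linked above remains the authority on what to submit.

```python
# Hypothetical rows standing in for records from the ID table; values are invented.
# Missing IDs are represented as None. If the rows come from a DataFrame,
# convert NaN to None first, because NaN is truthy in Python.
id_rows = [
    {"name": "Player A", "gsis_id": "00-0000001", "pfr_id": "PlayA00"},
    {"name": "Player B", "gsis_id": None, "pfr_id": "PlayB00"},
    {"name": "Player C", "gsis_id": "00-0000003", "pfr_id": None},
    {"name": "Player D", "gsis_id": None, "pfr_id": None},
]


def mapping_status(row):
    """Label a row by which of gsis_id / pfr_id it lacks."""
    has_gsis = bool(row.get("gsis_id"))
    has_pfr = bool(row.get("pfr_id"))
    if has_gsis and has_pfr:
        return "complete"
    if has_pfr:
        return "missing gsis_id"
    if has_gsis:
        return "missing pfr_id"
    return "missing both"


report = {row["name"]: mapping_status(row) for row in id_rows}
for name, status in report.items():
    print(f"{name}: {status}")
```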
