[BUG] <title>AttributeError: np.float_ was removed in the NumPy 2.0 release. Use np.float64 instead. #98

Closed
2 tasks done
Brigidi opened this issue Sep 9, 2024 · 3 comments

Comments

@Brigidi

Brigidi commented Sep 9, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Have you installed the latest development version of the package(s) in question?

  • I have installed the latest development version of the package.

If this is a data issue, have you tried clearing your nflverse cache?

I have cleared my nflverse cache and the issue persists.

What version of the package do you have?

0.3.2

Describe the bug

NumPy 2.0 removed np.float_, which breaks the import_pbp_data call. Another user linked to a StackOverflow post that suggested the following:

  • uninstall numpy
  • pip install "numpy<2"

This fixes the issue and allows for the function to work as intended. I wanted to make this post for transparency, so that others who have this bug have a path to solving it.
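For anyone who wants to confirm they are affected before downgrading, here is a minimal sketch that checks the installed major version of a package. The helper names (major_version, installed_major, needs_downgrade) are made up for illustration and are not part of nfl_data_py; the limit of 2 is an assumption based on the error above.

```python
from importlib import metadata


def major_version(version: str) -> int:
    """Return the leading integer of a version string such as '1.26.4'."""
    return int(version.split(".")[0])


def installed_major(package: str):
    """Major version of an installed package, or None if it is not installed."""
    try:
        return major_version(metadata.version(package))
    except metadata.PackageNotFoundError:
        return None


def needs_downgrade(package: str = "numpy", limit: int = 2) -> bool:
    """True when the installed package is at or above the assumed major-version limit."""
    major = installed_major(package)
    return major is not None and major >= limit


if __name__ == "__main__":
    # Hypothetical usage: warn before importing nfl_data_py.
    if needs_downgrade("numpy"):
        print('numpy >= 2 detected; try: pip install "numpy<2"')
```

The same helper can be pointed at another package by passing its name, for example needs_downgrade("pandas").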

Reprex

import nfl_data_py as nfl

# explore the pbp data
df = nfl.import_pbp_data([2023])

# print the df
print(df)

Expected Behavior

Dataframe with data.

nflverse_sitrep

NA

Screenshots

No response

Additional context

No response

@connor-reidy

Tagging onto this issue thread because I just ran into the same problem as well.

I worked around the dependency incompatibility by patching read_parquet() to always use the pyarrow engine. When no engine is specified, pandas tries pyarrow first and falls back to fastparquet only if pyarrow is unavailable, so fastparquet is what runs when pyarrow is not installed. With pyarrow installed and forced, I could sidestep some issues without changing dependency versions.

Here’s the modification I made at the top of the file where dataframes are read:

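import pandas as pd

# Keep a reference to the original function so the wrapper can delegate to it.
_read_parquet = pd.read_parquet

def patched_read_parquet(*args, **kwargs):
    # Force the pyarrow engine regardless of what the caller requested.
    kwargs['engine'] = 'pyarrow'
    return _read_parquet(*args, **kwargs)

# Replace the module-level function so later calls go through the wrapper.
pd.read_parquet = patched_read_parquet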

I haven't tested this fix with all of the dataframe import functions but it worked for the handful I did test it on. Wanted to add it here for visibility.
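If a permanent, process-wide patch feels too broad, the same idea can be scoped to a single call. The following is only a sketch: force_kwargs is a hypothetical helper, not something pandas or nfl_data_py provides, and it assumes the library looks up read_parquet on the pandas module at call time and that pyarrow is installed.

```python
from contextlib import contextmanager
from functools import wraps


@contextmanager
def force_kwargs(owner, name, **forced):
    """Temporarily wrap owner.<name> so the given keyword arguments are always used."""
    original = getattr(owner, name)

    @wraps(original)
    def wrapper(*args, **kwargs):
        kwargs.update(forced)  # forced values override whatever the caller passed
        return original(*args, **kwargs)

    setattr(owner, name, wrapper)
    try:
        yield
    finally:
        setattr(owner, name, original)  # always restore the original callable


# Hypothetical usage (not run here; requires pandas, pyarrow and nfl_data_py):
#
# import pandas as pd
# import nfl_data_py as nfl
#
# with force_kwargs(pd, "read_parquet", engine="pyarrow"):
#     df = nfl.import_pbp_data([2023])
```

The original function is restored when the with block exits, even if the call raises, so other code that relies on the default engine is unaffected.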

@alecglen
Member

Hey all - while @connor-reidy's patch avoids the title error, there are a ton of API changes in numpy 2.0 and in pandas 2.0. Unfortunately, some of the changes are less obvious and lead to silent bugs that raise no error and are difficult to track down, such as what happened in #45.

For that reason, I strongly recommend that users stick with numpy and pandas < 2 when using nfl_data_py 0.x (the upcoming 0.3.3 release will make this a requirement).

nfl_data_py 1.0 is in the works, and it will include the upgrade to v2 of both packages once all of those little issues have been worked out.

@alecglen
Member

Closing, as the requirement of numpy < 2.0 and pandas < 2.0 is included in 0.3.3.
