BUG: Raise OutOfBoundsDatetime in DataFrame.replace when value exceeds datetime64[ns] bounds (GH#61671) #61717
base: main
Looking into the CI failures.
Regarding CI failures: So after the changes in Because of that, Happy to revert or gate the check if needed.
pandas/core/dtypes/cast.py
Outdated
    raise OutOfBoundsDatetime(
        f"{right!r} overflows datetime64[ns] during dtype inference"
    )
except (OverflowError, ValueError) as e:
nitpick: can you call this err instead of e? Try to avoid one-letter variable names.
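As a rough illustration of the naming suggestion, here is a minimal standard-library sketch of catching an overflow under a descriptive name and re-raising it as a more specific error. It is not the code in this PR: OutOfBoundsError is a hypothetical stand-in for pandas' OutOfBoundsDatetime, checked_int64 is a made-up helper, and chaining with `from err` is an assumption about style rather than something the diff shows.

```python
class OutOfBoundsError(ValueError):
    """Hypothetical stand-in for pandas' OutOfBoundsDatetime."""


def checked_int64(nanoseconds: int) -> int:
    """Return nanoseconds unchanged if it fits in a signed 64-bit integer."""
    try:
        # int.to_bytes raises OverflowError when the value needs more than 8 bytes.
        nanoseconds.to_bytes(8, "little", signed=True)
    except OverflowError as err:
        # A descriptive name such as `err` reads better than `e`; `from err`
        # keeps the original exception attached as __cause__.
        raise OutOfBoundsError(
            f"{nanoseconds!r} overflows a 64-bit nanosecond count"
        ) from err
    return nanoseconds
```

With this sketch, checked_int64(2**63) raises OutOfBoundsError, and the original OverflowError stays reachable through the exception's __cause__ attribute.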
@@ -685,6 +685,7 @@ Datetimelike
- Bug in :func:`tseries.frequencies.to_offset` would fail to parse frequency strings starting with "LWOM" (:issue:`59218`)
- Bug in :meth:`DataFrame.fillna` raising an ``AssertionError`` instead of ``OutOfBoundsDatetime`` when filling a ``datetime64[ns]`` column with an out-of-bounds timestamp. Now correctly raises ``OutOfBoundsDatetime``. (:issue:`61208`)
- Bug in :meth:`DataFrame.min` and :meth:`DataFrame.max` casting ``datetime64`` and ``timedelta64`` columns to ``float64`` and losing precision (:issue:`60850`)
- Bug in :meth:`DataFrame.replace` where attempting to replace a ``datetime64[ns]`` column with an out-of-bounds timestamp would raise an ``AssertionError`` or silently coerce. Now correctly raises ``OutOfBoundsDatetime``. (:issue:`61671`)
It wasn't just replace, was it? It was also happening with .iloc and __setitem__:
import datetime
import numpy as np
import pandas as pd

df = pd.DataFrame([np.nan], dtype="datetime64[ns]")
df.iloc[0, 0] = datetime.datetime(3000, 1, 1)
# AssertionError: Something has gone wrong, please report a bug at https://github.com/pandas-dev/pandas/issues
maybe a more generic note and tests for the other cases?
…time64[ns] (GH#61671)
cd0dd3a to d41405c
Thanks for the review and suggestions @simonjayhawkins @jbrockmendel!
Observed Behavior (Confirmed via Logs)
Additional Context
Next Steps from My Side
Before I add tests for these cases, just wanted to check:
Happy to add the tests once I get a bit of guidance on how we want to handle these edge cases consistently.
Fixes a bug where DataFrame.replace would raise a generic AssertionError when trying to replace np.nan in a datetime64[ns] column with an out-of-bounds datetime.datetime object (e.g., datetime(3000, 1, 1)).
This PR fixes that by explicitly raising OutOfBoundsDatetime when the replacement datetime can't safely fit into the datetime64[ns] dtype.
Datetimelike for 3.0.0
Let me know if you'd like to test other edge cases or if there's a more idiomatic way to handle this!
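For context on why datetime(3000, 1, 1) cannot fit: datetime64[ns] stores a signed 64-bit count of nanoseconds since 1970-01-01, with the lowest value reserved for NaT, so the representable range runs from roughly 1677-09-21 to 2262-04-11. The following is a minimal standard-library sketch of that bounds check, not the check implemented in this PR. The helper names to_epoch_ns and fits_in_datetime64_ns are hypothetical, and the sketch assumes naive (timezone-unaware) datetimes.

```python
import datetime

INT64_MAX = 2**63 - 1
INT64_MIN = -(2**63) + 1  # the lowest int64 value is reserved for NaT

EPOCH = datetime.datetime(1970, 1, 1)


def to_epoch_ns(value: datetime.datetime) -> int:
    """Nanoseconds between the Unix epoch and a naive datetime (hypothetical helper)."""
    delta = value - EPOCH
    # Python integers are arbitrary precision, so this never overflows.
    micros = (delta.days * 86_400 + delta.seconds) * 1_000_000 + delta.microseconds
    return micros * 1_000


def fits_in_datetime64_ns(value: datetime.datetime) -> bool:
    """True if value is representable as a 64-bit nanosecond count (hypothetical helper)."""
    return INT64_MIN <= to_epoch_ns(value) <= INT64_MAX


print(fits_in_datetime64_ns(datetime.datetime(2000, 1, 1)))  # True
print(fits_in_datetime64_ns(datetime.datetime(3000, 1, 1)))  # False
```

A value that fails this kind of check is the case where the PR description says OutOfBoundsDatetime should be raised instead of a generic AssertionError.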