DOC: Clarify groupby operates on axis 0 and remove 'selected axis' re… #62853
…ference

This commit addresses issue pandas-dev#56397 by removing outdated references to "the selected axis" in groupby documentation and clarifying that:

1. DataFrame.groupby() always operates along axis 0 (rows)
2. The axis parameter was removed in pandas 3.0
3. To group by columns, users must transpose the DataFrame first

Changes:
- Updated API reference docstring in DataFrame.groupby() to replace "selected axis" with "number of rows"
- Enhanced user guide to explicitly state groupby operates on axis 0
- Added note explaining the removal of the axis parameter and the need to use .T for column-wise grouping

Fixes pandas-dev#56397
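As a rough illustration of item 3 (not part of the PR diff), here is a minimal sketch of column-wise grouping via a transpose. The frame, its labels, and the prefix-based grouping rule are all made up for the example:

```python
import pandas as pd

# Hypothetical data: three columns, two of which share the prefix "a".
df = pd.DataFrame(
    {"a1": [1, 2], "a2": [3, 4], "b1": [5, 6]},
    index=["x", "y"],
)

# groupby splits along rows, so to group columns we transpose first,
# group the (former) column labels, aggregate, and transpose back.
# The callable is applied to each index label of the transposed frame.
by_prefix = df.T.groupby(lambda label: label[0]).sum().T

print(by_prefix)
```

The result keeps the original row labels (`x`, `y`) and has one column per group (`a`, `b`).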
Thanks for the PR!
doc/source/user_guide/groupby.rst
Outdated
The above GroupBy will split the DataFrame on its index (rows). DataFrame groupby
always operates along axis 0 (rows). To split by columns instead, first transpose
the DataFrame:
I like the addition of "always" here, clarifying that it isn't just the example above. However these two sentences seem redundant:
DataFrame groupby always operates along axis 0 (rows).
and
The above GroupBy will split the DataFrame on its index (rows).
Can you combine them?
Also I don't see any significant difference between
To split by columns, first do a transpose:
and
To split by columns instead, first transpose the DataFrame:
This seems to me to just be a matter of personal style or taste. I generally think we should not make changes unless they are objectively positive.
doc/source/user_guide/groupby.rst
Outdated
.. note::

   Prior to pandas 3.0, groupby had an ``axis`` parameter. This has been removed.
   To group by columns, transpose your DataFrame using ``.T`` before calling groupby.
Why do we need to repeat it again here?
  will be used to determine the groups (the Series' values are first
  aligned; see ``.align()`` method). If a list or ndarray of length
- equal to the selected axis is passed (see the `groupby user guide
+ equal to the number of rows is passed (see the `groupby user guide
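For context, a small sketch of what the reworded docstring line describes: passing a list or ndarray with one entry per row as the grouping key. The data and labels are invented for the example, and the list labels are deliberately not column names so that they are treated as per-row values rather than column keys:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with four rows.
df = pd.DataFrame({"value": [10, 20, 30, 40]})

# A list with one label per row; its length equals the number of rows.
labels = ["even", "odd", "even", "odd"]
totals = df.groupby(labels)["value"].sum()

# An ndarray of the same length works the same way.
keys = np.array([0, 0, 1, 1])
means = df.groupby(keys)["value"].mean()

print(totals)
print(means)
```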
Great catch!
- Combine redundant sentences about axis 0
- Keep original transpose wording
- Remove redundant note block
Good day, Mr. @rhshadrach. Thanks so much for the feedback! I've updated the PR to address your comments: I combined the redundant sentences and removed the repeated note. Let me know if there's anything else you'd like me to adjust!
doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.