@justmhie

…ference

This commit addresses issue #56397 by removing outdated references to "the selected axis" in groupby documentation and clarifying that:

  1. DataFrame.groupby() always operates along axis 0 (rows)
  2. The axis parameter was removed in pandas 3.0
  3. To group by columns, users must transpose the DataFrame first
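
As a rough illustration of point 3, here is a minimal sketch of column-wise grouping via a transpose. The frame `df` and the mapping `col_groups` are made up for this example and do not come from the pandas docs or this PR.

```python
import pandas as pd

# Hypothetical frame: the columns are what we want to group.
df = pd.DataFrame(
    {"a1": [1, 2], "a2": [3, 4], "b1": [5, 6]},
    index=["x", "y"],
)

# Hypothetical mapping from column label to group label.
col_groups = {"a1": "a", "a2": "a", "b1": "b"}

# groupby works on rows, so transpose first, group what used to be
# the columns, aggregate, and transpose back to the original layout.
result = df.T.groupby(col_groups).sum().T
print(result)
```

The final `.T` is optional; it only restores the original orientation so that the grouped labels end up as columns again.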

Changes:

  • Updated API reference docstring in DataFrame.groupby() to replace "selected axis" with "number of rows"
  • Enhanced user guide to explicitly state groupby operates on axis 0
  • Added note explaining the removal of the axis parameter and the need to use .T for column-wise grouping

Fixes #56397

  • closes #xxxx (Replace xxxx with the GitHub issue number)
  • Tests added and passed if fixing a bug or adding a new feature
  • All code checks passed.
  • Added type annotations to new arguments/methods/functions.
  • Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

@mroeschke mroeschke requested a review from rhshadrach October 27, 2025 20:33
Member

@rhshadrach rhshadrach left a comment

Thanks for the PR!

Comment on lines 140 to 142
The above GroupBy will split the DataFrame on its index (rows). DataFrame groupby
always operates along axis 0 (rows). To split by columns instead, first transpose
the DataFrame:
Member

I like the addition of "always" here, clarifying that it isn't just the example above. However, these two sentences seem redundant:

DataFrame groupby always operates along axis 0 (rows).

and

The above GroupBy will split the DataFrame on its index (rows).

Can you combine them?

Member

Also I don't see any significant difference between

To split by columns, first do a transpose:

and

To split by columns instead, first transpose the DataFrame:

This seems to me to just be a matter of personal style or taste. I generally think we should not make changes unless they are objectively positive.

Comment on lines 155 to 158
.. note::

   Prior to pandas 3.0, groupby had an ``axis`` parameter. This has been removed.
   To group by columns, transpose your DataFrame using ``.T`` before calling groupby.
Member

Why do we need to repeat it again here?

  will be used to determine the groups (the Series' values are first
  aligned; see ``.align()`` method). If a list or ndarray of length
- equal to the selected axis is passed (see the `groupby user guide
+ equal to the number of rows is passed (see the `groupby user guide
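
For context only (this is not part of the diff), a minimal sketch of what the reworded sentence describes: an ndarray with one label per row used as the grouping key. The frame `df` and the array `labels` are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with four rows.
df = pd.DataFrame({"value": [10, 20, 30, 40]})

# One label per row, so the key's length equals the number of rows.
labels = np.array(["u", "v", "u", "v"])
assert len(labels) == len(df)

# The array values are used as-is to determine the groups.
totals = df.groupby(labels)["value"].sum()
print(totals)
```
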
Member

Great catch!

- Combine redundant sentences about axis 0
- Keep original transpose wording
- Remove redundant note block
@justmhie
Author

Good day, Mr. @rhshadrach. Thanks so much for the feedback! I've updated the PR to address your comments: I combined the redundant sentences, kept the original transpose wording, and removed the repeated note. Let me know if there's anything else you'd like me to adjust!

Development

Successfully merging this pull request may close these issues.

DOC: groupby with column name
