Closed as not planned
Labels
Enhancement, Needs Triage (issue that has not been reviewed by a pandas team member)
Description
Feature Type
- Adding new functionality to pandas
- Changing existing functionality in pandas
- Removing existing functionality in pandas
 
Problem Description
I wanted to get the nunique for each column in my df, but some columns contained unhashable values like lists, so I got TypeError: unhashable type: 'list'. It would be nice if df.nunique() could skip columns like that, putting NaN for them.
I got around the problem myself like this:
def nunique_if_hashable(s: pd.Series) -> float:
    try:
        return s.nunique()
    except TypeError:
        return np.nan
df.apply(nunique_if_hashable)
With a result like this:
A    0.0
B    1.0
C    3.0
D    NaN
dtype: float64
Since D contains at least one list, and lists aren't hashable, it's skipped.
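To make the cause concrete, here is a minimal sketch that inspects a column shaped like D on its own. It assumes `pandas.api.types.is_hashable` is an acceptable way to test individual values; the names `column_d` and `unhashable_values` are only illustrative.

```python
import pandas as pd
from pandas.api.types import is_hashable

# A column shaped like D: three empty lists and one None.
column_d = pd.Series([[], [], [], None], name="D")

# None is hashable, lists are not, so only the lists are collected here.
unhashable_values = [value for value in column_d if not is_hashable(value)]
print(len(unhashable_values))

# nunique() relies on hashing the values, so the lists trigger the error.
try:
    column_d.nunique()
except TypeError as exc:
    print(f"nunique failed: {exc}")
```

A single unhashable value is enough to make the whole column fail, which is why the helper catches the error per column rather than per value.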
Setup:
import numpy as np
import pandas as pd
df = pd.DataFrame({
    'A': [np.nan] * 4,
    'B': [1] * 4,
    'C': [5, 5, 6, 7],
    'D': [[], [], [], None]})
Feature Description
I'm imagining a parameter like, say, skip_unhashable: bool = False that would do the equivalent of the above:
>>> df.nunique(skip_unhashable=True)
A      0
B      1
C      3
D    NaN
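The proposed behaviour can be approximated today with a wrapper. The sketch below is hypothetical: `nunique_skip_unhashable` is not a pandas function, and `skip_unhashable` is only the parameter name suggested in this issue. It also assumes the nullable `Int64` dtype is acceptable, so the counts stay integers and the skipped column shows `<NA>` rather than `NaN`.

```python
import numpy as np
import pandas as pd


def nunique_skip_unhashable(df: pd.DataFrame, skip_unhashable: bool = False) -> pd.Series:
    """Hypothetical stand-in for the proposed df.nunique(skip_unhashable=...)."""
    if not skip_unhashable:
        # Current behaviour: raises TypeError if a column holds unhashable values.
        return df.nunique()

    def count_or_na(s: pd.Series):
        try:
            return s.nunique()
        except TypeError:
            # The column has at least one unhashable value, so skip it.
            return pd.NA

    counts = [count_or_na(column) for _, column in df.items()]
    # Nullable Int64 keeps integer counts next to the missing marker.
    return pd.Series(pd.array(counts, dtype="Int64"), index=df.columns)


df = pd.DataFrame({
    'A': [np.nan] * 4,
    'B': [1] * 4,
    'C': [5, 5, 6, 7],
    'D': [[], [], [], None]})

result = nunique_skip_unhashable(df, skip_unhashable=True)
print(result)
```

With the default `skip_unhashable=False`, the wrapper simply defers to `df.nunique()`, so existing behaviour, including the TypeError, is unchanged.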
Alternative Solutions
The helper function I wrote above isn't so bad. It's not crucial to put this functionality in pandas; it would just be nice, is all.
Additional Context
I loaded this df from JSON.