Open
Description
Expected Behavior
If I set n_components to 8, for example, I should be able to encode columns with up to 2^8 = 256 distinct values.
Actual Behavior
The number of unique encoded records is always equal to whatever I set n_components to. For example, if I set n_components to 32, the encoded columns contain only 32 unique records. This will not do as an encoder for a high-cardinality column, because many distinct categories collide.
This issue does not come up with binary encoding, for example. I can successfully encode my high-cardinality categorical column with it.
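The gap between the expected and actual behavior can be illustrated without the library. The following is only a minimal sketch, assuming the encoder in question applies the hashing trick (each value is hashed into one of n_components buckets and that bucket's count is incremented). The helpers `hashed_row` and `binary_row` and the use of MD5 are assumptions made for illustration, not the library's actual code.

```python
import hashlib
import math


def hashed_row(value, n_components):
    """Hypothetical helper: hash one value into a count vector with one bucket set."""
    digest = hashlib.md5(str(value).encode("utf-8")).hexdigest()
    row = [0] * n_components
    row[int(digest, 16) % n_components] += 1
    return tuple(row)


def binary_row(ordinal, n_bits):
    """Hypothetical helper: write an ordinal index as n_bits 0/1 columns."""
    return tuple((ordinal >> bit) & 1 for bit in reversed(range(n_bits)))


values = [f"category_{i}" for i in range(1000)]
n_components = 8

# Hashing trick: a single column can produce at most n_components distinct rows.
hashed = {hashed_row(v, n_components) for v in values}

# Binary coding: ceil(log2(cardinality)) columns are enough for distinct rows.
n_bits = math.ceil(math.log2(len(values)))
binary = {binary_row(i, n_bits) for i in range(len(values))}

print("unique hashed rows:", len(hashed))
print("unique binary rows:", len(binary))
```

Under this assumption, a single hashed column yields one-hot-like rows, so the number of unique rows is bounded by n_components rather than by 2^n_components, which would match the behavior reported here.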
Steps to Reproduce the Problem
- Fit the encoder to some high-cardinality categorical column and transform it
- Count the unique records across the encoded columns (the col_0, ..., col_7 combinations)
Specifications
- Version: category_encoders 2.6.0
- Platform:
- Subsystem: