I would love to have an optimised (or at least CUDA) implementation of the K-Prototypes algorithm (the package I currently use is kmodes). A lot of data science deals with categorical data, and it would be great not to have to use TargetEncoders or, worse, pd.get_dummies() on categorical features with a lot of categories.
Right now, my workaround is to apply a TargetEncoder to the categorical variables and then use the kmeans/knn in this package. That feels like a bit of a hack: numerical data is continuous and its values relate to one another (greater than/less than), whereas categorical values do not necessarily have any such relation.
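To make the request concrete, here is a minimal pure-Python sketch of the mixed dissimilarity that K-Prototypes is generally described as using: squared Euclidean distance on the numerical features plus a weighted count of mismatches on the categorical ones. The function name `mixed_dissimilarity`, the `gamma` weight value, and the example records are made up for illustration; this is not the kmodes implementation or any API of this package.

```python
def mixed_dissimilarity(a_num, a_cat, b_num, b_cat, gamma=1.0):
    """K-Prototypes-style distance between two records (illustrative sketch).

    a_num / b_num: numerical feature values.
    a_cat / b_cat: categorical feature values, compared only for equality.
    gamma: assumed weight balancing the categorical part against the numeric part.
    """
    # Numerical part: squared Euclidean distance.
    numeric_part = sum((x - y) ** 2 for x, y in zip(a_num, b_num))
    # Categorical part: simple matching, 1 per mismatching attribute.
    # No ordering between categories is assumed, unlike with target encoding.
    categorical_part = sum(x != y for x, y in zip(a_cat, b_cat))
    return numeric_part + gamma * categorical_part


# Hypothetical records: (numerical features, categorical features)
p = ([1.0, 2.0], ["red", "suv"])
q = ([1.0, 4.0], ["blue", "suv"])

# Numeric part is (2.0 - 4.0)^2 = 4.0, one categorical mismatch weighted by 0.5.
d = mixed_dissimilarity(p[0], p[1], q[0], q[1], gamma=0.5)
```

Because the categorical part only checks equality, "red" vs "blue" costs the same as any other mismatch, which is the behaviour I lose when I target-encode the categories into numbers.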