Feature: Possibility to process PySpark DataFrames? #148

Open
ZlaTanskY opened this issue Feb 24, 2023 · 1 comment
Labels
question Further information is requested

Comments

@ZlaTanskY
Contributor

Task: Should cobra be able to process PySpark DataFrames?

There is currently a use case at Telenet in which they work with PySpark DataFrames and would like to use the cobra preprocessing to create their model. Since it is unclear whether this is currently possible, I am opening this issue.

@ZlaTanskY ZlaTanskY added the question Further information is requested label Feb 24, 2023
@sandervh14
Contributor

Hi Jano!

We have a branch, spark-cobra, that was created for this at one point.
You (or the person who contacted you about this) can try it out to see whether that branch does the job.
A word of warning: the branch was created to transform DataFrames into the target encoding etc. needed for the PIGs etc., but it is no longer up to date with our dev/main branch. It may still do what is needed, though; that remains to be tested.

Sander
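
For readers unfamiliar with the target encoding mentioned above, here is a minimal plain-Python sketch of the idea: each category value is replaced by the mean of the target observed for that value, with the overall mean as a fallback for unseen values. This is not the cobra or spark-cobra implementation (which may, for instance, apply smoothing); the function names and the toy data are hypothetical and exist only for illustration.

```python
from collections import defaultdict


def fit_target_encoding(categories, targets):
    """Learn the mean target per category value (hypothetical helper)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, targets):
        sums[cat] += y
        counts[cat] += 1
    mapping = {cat: sums[cat] / counts[cat] for cat in sums}
    # Overall mean, used as a fallback for categories not seen during fitting.
    global_mean = sum(targets) / len(targets)
    return mapping, global_mean


def transform_target_encoding(categories, mapping, global_mean):
    """Replace each category value by its learned mean target."""
    return [mapping.get(cat, global_mean) for cat in categories]


# Toy data (made up): a categorical predictor and a binary target.
segments = ["a", "a", "b", "b", "b"]
churned = [1, 0, 1, 1, 0]

mapping, global_mean = fit_target_encoding(segments, churned)
encoded = transform_target_encoding(["a", "b", "c"], mapping, global_mean)
```

On a PySpark DataFrame, the same per-category mean could presumably be obtained with a `groupBy(...).agg(...)` aggregation joined back onto the original rows; whether the spark-cobra branch does it this way is not stated in this thread.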

@sandervh14 sandervh14 added this to the 2023-03 milestone Mar 9, 2023
@sandervh14 sandervh14 modified the milestones: 2023-03, 2023-04, New features development Apr 7, 2023
Development

No branches or pull requests

2 participants