No clean way to return empty dataframe #96

Open
ishmandoo opened this issue Jan 27, 2021 · 2 comments

@ishmandoo

Right now, the only way to return an empty dataset or null result is to construct an empty Spark DataFrame, which is clunky.

Would it make sense for session.read to accept a variable set to None and interpret it as an empty DataFrame?

@acroz
Owner

acroz commented Jan 27, 2021

Hi Ben, thanks for the suggestion!

Could you help me understand your use case a little better? It's not clear to me in what situation you'd have a variable in a (presumably PySpark) session set to None and want it interpreted as an empty DataFrame when you attempt to download it. With this suggestion, the distinction between None (no DataFrame at all) and an empty result would be lost, and that distinction seems valuable to keep.

@ishmandoo
Author

My understanding is that trying to read a variable whose value is None will result in an error. For my application, I sometimes want to return a null result that will be interpreted as an empty DataFrame. Right now I build an empty Spark DataFrame to return, using spark.createDataFrame([], T.StructType([])). I was hoping to avoid having to do that.
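
One way to keep the workaround in a single place, without changing how session.read treats None, might be a small helper on the Spark side that substitutes an empty frame only when the result is None. This is a minimal sketch, not part of the library: the name or_empty and its make_empty parameter are hypothetical, and the usage comment assumes spark is the active SparkSession and T is pyspark.sql.types.

```python
def or_empty(df, make_empty):
    """Return df unchanged unless it is None; then build an empty frame.

    make_empty is a zero-argument callable (hypothetical parameter), so the
    empty frame is only constructed when it is actually needed.
    """
    if df is None:
        return make_empty()
    return df


# Possible use inside the PySpark session (assumes `spark` is the active
# SparkSession and `T` is `pyspark.sql.types`):
#
#     result = or_empty(result, lambda: spark.createDataFrame([], T.StructType([])))
```

Because the substitution happens explicitly in the session code, a caller that needs to tell None apart from an empty result can skip the helper.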
