Replace pandas with polars #486

Open
nleroy917 opened this issue Jun 6, 2024 · 0 comments

Comments

@nleroy917
Member

I want to bring up the idea of replacing pandas with polars. I can think of three reasons why this would be beneficial:

Processing speed

polars is much faster than pandas. @khoroshevskyi has been investigating this, and adopting polars could drastically reduce the time it takes to process PEPs on the PEPhub server, enabling real-time edits to PEPs.

It's hard to find unbiased, fair comparisons, especially considering the polars hype, but this post does a pretty good job of highlighting some of the large improvements.

Import speed

From my own experimentation, importing polars is almost 4 times faster than importing pandas. This would help with things like the looper CLI import issues: pepkit/looper#476
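
For anyone who wants to check the import-time claim on their own machine, here is a rough sketch of one way to measure it. It is not the exact method behind the number above: the helper name `time_fresh_import` is made up for illustration, and the timing includes interpreter startup, so a `pass` baseline is printed for comparison. Each import runs in a fresh interpreter so that already-cached modules don't skew the result.

```python
import subprocess
import sys
import time


def time_statement(stmt: str, repeats: int = 3) -> float:
    """Best wall-clock time (seconds) to run `stmt` in a fresh Python process."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        # check=True raises CalledProcessError if the statement fails,
        # e.g. because the module is not installed.
        subprocess.run(
            [sys.executable, "-c", stmt],
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        best = min(best, time.perf_counter() - start)
    return best


def time_fresh_import(module: str, repeats: int = 3) -> float:
    """Hypothetical helper: time `import <module>`, including interpreter startup."""
    return time_statement(f"import {module}", repeats)


if __name__ == "__main__":
    # Interpreter startup alone, for reference.
    print(f"baseline: {time_statement('pass'):.3f}s")
    for name in ("pandas", "polars"):
        try:
            print(f"{name}: {time_fresh_import(name):.3f}s")
        except subprocess.CalledProcessError:
            print(f"{name}: import failed (not installed?)")
```

Running `python -X importtime -c "import pandas"` gives a per-module breakdown if more detail is needed.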

Interface with genimtools

Genimtools is written natively in Rust with pyo3 bindings, and polars follows the same model. Because of this, integrating peppy objects with genimtools becomes seamless. In fact, there is an entire crate maintained by the polars group dedicated to this interface.

This sets the stage for processing PEPs and their data in genimtools, further improving server speeds for real-time PEP editing. eido comes to mind as a potential bottleneck here.

Potential downsides

I think some downsides to such a switch are:

  • polars is new, and not as "battle-tested" as pandas.
  • polars gets awkward for data visualization, since libraries like matplotlib don't natively support it.
  • A refactor of the sample table in peppy would take time.