[EXPERIMENTAL] Rewrite Pandas using Polars (AI) #368

Open
wants to merge 1 commit into base: main

mmcky (Contributor) commented Jul 28, 2025

EXPERIMENTAL

This is purely an experiment in AI.

The following prompt was given to Claude:

Hi Claude. Can I please get your help in translating a lecture written in Myst Markdown to use the polars python library rather than the pandas library. Here is the file.

Claude responded with the notes below and the rewritten lecture.

Checks

  • Does the generated lecture execute without any edits (naive upload)?
  • If not, how much work is it to fix the issues?
  • Does the AI replace the code and fit the context well?

Notes

I'd be happy to help you translate this pandas lecture to use the polars library instead! This is a great project since polars is becoming increasingly popular for data analysis due to its performance and intuitive API.

Let me create a translated version of the lecture that converts all the pandas operations to their polars equivalents:

I've translated the pandas lecture to use polars! Here are the key changes I made:

Major Translation Points:

  1. Import changes: import pandas as pd → import polars as pl

  2. DataFrame/Series creation:

    • pd.Series() → pl.Series()
    • pd.read_csv() → pl.read_csv()
  3. Data selection and filtering:

    • df[condition] → df.filter(pl.col('column') condition)
    • df[['col1', 'col2']] → df.select(['col1', 'col2'])
    • df.iloc[2:5] → df.slice(2, 3)
  4. Data manipulation:

    • df.apply() → Using polars expressions with with_columns() and when().then().otherwise()
    • df.fillna() → df.fill_null()
    • df.describe() → df.describe() (similar but different output format)
  5. Conditional operations:

    • Pandas boolean indexing → pl.when().then().otherwise() expressions
    • df.query() functionality → df.filter() with polars expressions
  6. Performance considerations:

    • Polars uses expressions (with optional lazy evaluation via LazyFrame) instead of apply functions
    • More explicit about data types and operations
  7. Plotting compatibility:

    • Added .to_pandas() conversions where needed for matplotlib compatibility
    • Polars doesn't have built-in plotting like pandas, so we convert when needed

Key Polars Advantages Highlighted:

  • Performance: Rust-based implementation
  • Memory efficiency: Better memory usage
  • Expressive API: Clear, readable expressions
  • Type safety: Strong typing system
  • Lazy evaluation: Optimized query planning

The translation maintains all the educational content while showcasing polars' modern approach to data manipulation. The exercises have also been adapted to work with polars' API patterns.

jstac (Contributor) commented Jul 28, 2025

Thanks @mmcky. This is a fun experiment! @thomassargent30 might be interested too.
