diff --git a/docs/why.md b/docs/why.md
index 648b402d9..3c6f4825c 100644
--- a/docs/why.md
+++ b/docs/why.md
@@ -15,21 +15,20 @@ print(3 in pl.Series([1, 2, 3]))
 Try it out and see ;)
 
 Spoiler alert: they don't. pandas checks if `3` is in the index, Polars checks if it's in the values.
-How about
-```python
-df_left = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})
-df_right = pd.DataFrame({'a': [1, 2, 3], 'c': [4, 5, 6]})
-df_left.merge(df_right, left_on='b', right_on='c', how='left')
-```
-versus
+For another example, try running the code below - note how the outputs have different column names after the join!
 
 ```python
-df_left = pl.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})
-df_right = pl.DataFrame({'a': [1, 2, 3], 'c': [4, 5, 6]})
-df_left.join(df_right, left_on='b', right_on='c', how='left')
-```
+pd_df_left = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})
+pd_df_right = pd.DataFrame({'a': [1, 2, 3], 'c': [4, 5, 6]})
+pd_left_merge = pd_df_left.merge(pd_df_right, left_on='b', right_on='c', how='left')
 
-?
+pl_df_left = pl.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})
+pl_df_right = pl.DataFrame({'a': [1, 2, 3], 'c': [4, 5, 6]})
+pl_left_merge = pl_df_left.join(pl_df_right, left_on='b', right_on='c', how='left')
+
+print(pd_left_merge.columns)
+print(pl_left_merge.columns)
+```
 There are several such subtle difference between the libraries. Writing
 dataframe-agnostic code is hard! But by having a unified, simple, and
 predictable API, you can focus on behaviour rather than on subtle