pl.Array + pl.lit PanicException Cannot apply operation on arrays of different lengths #18831

Closed
2 tasks done
cmdlineluser opened this issue Sep 20, 2024 · 7 comments
Assignees
Labels
bug Something isn't working needs triage Awaiting prioritization by a maintainer python Related to Python Polars

Comments

@cmdlineluser
Contributor

Checks

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

import polars as pl

df = pl.DataFrame({
    "A": [[0.1, 0.2], [0.3, 0.4]]
}).cast(pl.Array(float, 2))

df.select(pl.all() * pl.lit([3, 5], pl.Array(float, 2)))
# thread '<unnamed>' panicked at crates/polars-core/src/chunked_array/ops/arity.rs:815:14:
# Cannot apply operation on arrays of different lengths
# PanicException: Cannot apply operation on arrays of different lengths

Log output

No response

Issue description

Adding as a column first works as expected.

(df.with_columns(pl.lit([3, 5], pl.Array(float, 2)))
   .select(pl.nth(0) * pl.nth(1))
)

# shape: (2, 1)
# ┌───────────────┐
# │ A             │
# │ ---           │
# │ array[f64, 2] │
# ╞═══════════════╡
# │ [0.3, 1.0]    │
# │ [0.9, 2.0]    │
# └───────────────┘

Expected behavior

No panic.

Installed versions

--------Version info---------
Polars:              1.7.1
Index type:          UInt32
Platform:            macOS-13.6.1-arm64-arm-64bit
Python:              3.12.6 (main, Sep  6 2024, 19:03:47) [Clang 15.0.0 (clang-1500.1.0.2.5)]

----Optional dependencies----
adbc_driver_manager  <not installed>
altair               <not installed>
cloudpickle          <not installed>
connectorx           <not installed>
deltalake            <not installed>
fastexcel            <not installed>
fsspec               <not installed>
gevent               <not installed>
great_tables         <not installed>
matplotlib           <not installed>
nest_asyncio         <not installed>
numpy                1.26.4
openpyxl             <not installed>
pandas               2.2.1
pyarrow              15.0.2
pydantic             <not installed>
pyiceberg            <not installed>
sqlalchemy           <not installed>
torch                <not installed>
xlsx2csv             <not installed>
xlsxwriter           <not installed>
@cmdlineluser cmdlineluser added bug Something isn't working needs triage Awaiting prioritization by a maintainer python Related to Python Polars labels Sep 20, 2024
@mcrumiller
Contributor

Looks like the arrays are expanded into their underlying physical arrays (e.g. f64 here), so the operation sees the literal value as length 2 rather than as a single row of length 1.
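
To illustrate the mismatch described here, a plain-Python model may help. This is only a sketch and not Polars internals: it assumes an Array column is held as one flat buffer of inner values, and the helper name `flatten` is hypothetical. Comparing flat-buffer lengths instead of row counts makes the single-row literal look like length 2, so it no longer qualifies as a unit-length operand that can be broadcast.

```python
# Plain-Python model of the length confusion; this is NOT actual Polars code.

def flatten(rows):
    """Concatenate fixed-width rows into one flat list of inner values."""
    return [value for row in rows for value in row]

column_rows = [[0.1, 0.2], [0.3, 0.4]]  # 2 rows of width 2
literal_rows = [[3.0, 5.0]]             # 1 row of width 2, should broadcast

column_flat = flatten(column_rows)      # 4 inner values
literal_flat = flatten(literal_rows)    # 2 inner values

# Row counts are 2 and 1: the literal is a unit-length operand.
row_lengths = (len(column_rows), len(literal_rows))

# Flat lengths are 4 and 2: neither side looks unit-length any more.
flat_lengths = (len(column_flat), len(literal_flat))

print(row_lengths, flat_lengths)
```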

@ritchie46 ritchie46 self-assigned this Sep 23, 2024
@ritchie46
Member

Fixed by #18851.

@cmdlineluser
Contributor Author

Thanks @ritchie46

Just testing the new list arithmetic with the same example: it doesn't panic, but it does raise an error.

Would a similar broadcast_list() need to be added?

df = pl.DataFrame({
    "A": [[0.1, 0.2], [0.3, 0.4]]
})

df.with_columns(pl.all() * pl.lit([3, 5]))
# InvalidOperationError: can only do arithmetic operations on Series of the same size; got 2 and 1

(df.with_columns(pl.lit([3, 5]))
   .select(pl.nth(0) * pl.nth(1))
)
# shape: (2, 1)
# ┌────────────┐
# │ A          │
# │ ---        │
# │ list[f64]  │
# ╞════════════╡
# │ [0.3, 1.0] │
# │ [0.9, 2.0] │
# └────────────┘

@ritchie46
Member

Yes, the new list arithmetic should indeed broadcast. @itamarst, would you be interested in that one, as you worked on that feature?
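
For reference, the broadcasting behaviour being asked for can be sketched in plain Python. This is an assumption about the intended semantics, not the Polars implementation; `multiply_rows` is a hypothetical name. A side with exactly one row is repeated to match the other side, and any other length mismatch is an error.

```python
# Plain-Python sketch of row broadcasting for list arithmetic.
# Hypothetical helper; it does not mirror Polars internals.

def multiply_rows(left, right):
    """Multiply two list columns row by row, broadcasting a single-row side."""
    if len(left) == 1:
        left = left * len(right)    # repeat the single row
    elif len(right) == 1:
        right = right * len(left)
    if len(left) != len(right):
        raise ValueError(
            f"cannot broadcast columns of length {len(left)} and {len(right)}"
        )
    return [
        [a * b for a, b in zip(l_row, r_row)]
        for l_row, r_row in zip(left, right)
    ]

column = [[0.1, 0.2], [0.3, 0.4]]
literal = [[3, 5]]  # one row, as produced by a list literal

result = multiply_rows(column, literal)
print(result)
```

Under these assumptions the result matches, up to floating-point rounding, what the issue obtains by adding the literal as a column first.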

@itamarst
Contributor

itamarst commented Sep 23, 2024

My plan, once my fix for #8006 is merged, is to work on scalars next, yes (but it also looks like someone has come up with a PR already?)

@itamarst
Contributor

Oh, I guess this isn't quite the same as numeric scalars, which is what I was thinking of, but an extension.

@itamarst
Contributor

Definitely not an expert on how broadcasting ought to work, but I did add a bunch of comments on additional testing that would be useful based on my experience so far.

4 participants