[R] Cannot correctly handle arrow metadata derived from the R arrow package #713

Open
eitsupi opened this issue Feb 19, 2025 · 3 comments
@eitsupi
Contributor

eitsupi commented Feb 19, 2025

It seems that nanoarrow cannot correctly handle the Arrow extension metadata that the R arrow package generates for extension types, such as the one created from a package_version vector.

arrow::as_arrow_array(
  package_version(c(
    "1.2-4", "1.2-3", "2.1"
  ))
) |>
  nanoarrow::as_nanoarrow_array() |>
  as.vector()
#> Error: lexical error: invalid char in json text.
#>                                        A 3 263170 197888 5 UTF-8 787 0
#>                      (right here) ------^

Created on 2025-02-19 with reprex v2.1.1

@paleolimbot
Member

Yes, I think arrow's vctrs_extension_type uses a different serialization from the one in nanoarrow, which was changed here for security reasons. I think the fastest solution is to use a different extension name in nanoarrow, so that an extension array from arrow is implicitly converted as its storage type, and vice versa.

@eitsupi
Contributor Author

eitsupi commented Mar 6, 2025

Thank you for your comment!
The error message is not very clear here, so I am wondering whether it would be possible to display a hint, ignore the extension type, and convert the value to R anyway. (Is such a thing possible?)

@paleolimbot
Member

Agreed. The easiest way to work around this is to change the extension name, since the arrow and nanoarrow extensions are both valid but are not the same extension. I should have done this when I fixed the security issue in nanoarrow.

na_extension(storage_type, "arrow.r.vctrs", serialize_ptype(ptype))

(+ a few more references to arrow.r.vctrs would have to be renamed)
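
Until the extension name is changed, one possible user-side workaround is to drop the extension wrapper on the arrow side and pass only the storage array to nanoarrow. This is a minimal sketch, not the maintainers' fix. It assumes that the $storage() method of arrow's extension arrays and nanoarrow's default conversion of the storage type behave as documented; the variable names are illustrative only.

```r
versions <- package_version(c("1.2-4", "1.2-3", "2.1"))

# arrow wraps the vector in its own vctrs extension type.
ext_array <- arrow::as_arrow_array(versions)
ext_name <- ext_array$type$extension_name()

# Option 1: convert with arrow itself, which understands its own
# serialization of the extension metadata.
restored <- as.vector(ext_array)

# Option 2 (assumption: a plain storage value is acceptable): strip the
# extension wrapper so that nanoarrow only sees the storage array.
storage <- ext_array$storage()
plain <- storage |>
  nanoarrow::as_nanoarrow_array() |>
  as.vector()
```

With option 2 the result carries only the storage values, so the package_version class would have to be restored by hand if it is needed.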
