R arrow example - enable S3 support #84

Open
agmeiner-bhs opened this issue Mar 19, 2024 · 7 comments

@agmeiner-bhs

Hi,
thanks for the example installing arrow in alpine.
To check the capabilities of the installed package, you can use arrow_info(), which shows that S3 support is OFF.
The "Installing on Linux" documentation says that S3 support is turned on during the build if the system dependencies

CURL: install libcurl-devel (rpm) or libcurl4-openssl-dev (deb)
OpenSSL >= 1.0.2: install openssl-devel (rpm) or libssl-dev (deb)

are found.
The following extension of the example

RUN mkdir -p ~/.R && echo "LDFLAGS+=-fPIC" >> ~/.R/Makevars

RUN installr -d \
    -t "make cmake linux-headers" \
    -a "openssl openssl-dev curl-dev" arrow

RUN R -q -e 'library(arrow)'

was not successful in my case.

@gaborcsardi
Collaborator

gaborcsardi commented Mar 19, 2024

Maybe arrow does not find the system requirements?

FWIW, when I run the install manually I see

--   ARROW_S3=OFF [default=OFF]
--       Build Arrow with S3 support (requires the AWS SDK for C++)

@agmeiner-bhs
Author

Thanks! This worked:

ENV ARROW_S3=ON

RUN mkdir -p ~/.R && echo "LDFLAGS+=-fPIC" >> ~/.R/Makevars
RUN installr -d \
    -t "make cmake linux-headers" \
    -a "openssl openssl-dev curl-dev" arrow
RUN R -q -e 'library(arrow)'

@gaborcsardi
Collaborator

Cool, do you want to add a note to the example that points to your comment with the solution?

@agmeiner-bhs
Author

Hi,

After some time, I have to come back to this topic.
I can't reproduce my result with the code above. Can anybody confirm this, given all the changes made to the image since March?

@agmeiner-bhs agmeiner-bhs reopened this Dec 13, 2024
@gaborcsardi
Collaborator

The example now uses libarrow from Alpine, which it seems was not built with S3 support:

d8d9ce40e19a:~# apk list --installed | grep arrow
apache-arrow-14.0.1-r3 aarch64 {apache-arrow} (Apache-2.0) [installed]
libarrow-14.0.1-r3 aarch64 {apache-arrow} (Apache-2.0) [installed]
libarrow_acero-14.0.1-r3 aarch64 {apache-arrow} (Apache-2.0) [installed]
libarrow_dataset-14.0.1-r3 aarch64 {apache-arrow} (Apache-2.0) [installed]
libparquet-14.0.1-r3 aarch64 {apache-arrow} (Apache-2.0) [installed]
d8d9ce40e19a:~# R -q -e 'arrow::arrow_info()' | grep s3
s3        FALSE
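
The check above can be turned into a pass/fail test, for example to stop an image build when S3 support is missing. This is only a minimal sketch: `s3_enabled` is a hypothetical helper name (not part of arrow or r-minimal), and it assumes the capability line is printed as `s3` followed by `TRUE` or `FALSE`, as in the output above.

```shell
# Hypothetical helper: reads the printed output of arrow::arrow_info() on stdin
# and exits 0 only if a line of the form "s3  TRUE" is present.
s3_enabled() {
  awk '$1 == "s3" { found = ($2 == "TRUE") } END { exit !found }'
}
```

It could be used as `R -q -e 'arrow::arrow_info()' | s3_enabled || echo "S3 support is off"`.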

We can have another example that builds arrow from source (I assume that's still possible), and turns on S3 support. Do you want to submit a PR? (No pressure!)

@agmeiner-bhs
Author

Hi Gabor, sorry, I don't have a solution yet. When I do, I will definitely submit a PR...

@gaborcsardi
Collaborator

So S3 support is off in libarrow on Alpine. I tried to rebuild the Alpine package with S3, but the build fails.

If I install the AWS SDK packages and try to build the R package from source, without Alpine's arrow, that fails as well, for all the R package versions I tried (~5). So this does not seem easy.

This was all on aarch64; it is entirely possible that x86_64 works better.
