-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor setup scripts #10670
base: main
Are you sure you want to change the base?
Refactor setup scripts #10670
Conversation
✅ Deploy Preview for meta-velox canceled.
|
764aae4
to
35b2e65
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@czentgr Looks great! Some comments.
README.md
Outdated
$ make | ||
``` | ||
|
||
Note that `setup-adapters.sh` supports MacOS and Ubuntu 20.04 or later. | ||
Note that the `install_adapters` command is available for the supports MacOS and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: supported
scripts/centos.dockerfile
Outdated
@@ -16,6 +16,8 @@ ARG image=quay.io/centos/centos:stream9 | |||
FROM $image | |||
|
|||
COPY scripts/setup-helper-functions.sh / | |||
COPY scripts/setup-common.sh / | |||
COPY scripts/setup-linux.sh / |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
copy the versions file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup. Adding it.
scripts/setup-linux.sh
Outdated
SCRIPTDIR=$(dirname "${BASH_SOURCE[0]}") | ||
source $SCRIPTDIR/setup-common.sh | ||
|
||
function install_hdfs { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this not in common?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The macos version below does not have hadoop installed. Everything else is the same. Hadoop is required for testing. Let's add this hadoop install directly to ubuntu/centos install_hdfs and remove setup-linux.sh
35b2e65
to
8c44bdc
Compare
scripts/setup-centos9.sh
Outdated
@@ -64,7 +60,7 @@ function install_velox_deps_from_dnf { | |||
dnf_install libevent-devel \ | |||
openssl-devel re2-devel libzstd-devel lz4-devel double-conversion-devel \ | |||
libdwarf-devel elfutils-libelf-devel curl-devel libicu-devel bison flex \ | |||
libsodium-devel zlib-devel | |||
libsodium-devel zlib-devel gmock gtest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need the gmock-devel
here. This pulls in gtest-devel too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since #10422 we now properly detect system gtest, so having that in the images would be nice anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. Actually, building the Arrow dependency failed for me from a fresh system until I had gtest-devel installed. Arrow has testing turned on...
On a side note: pip isn't used to install regex
and cmake-format
anymore? Was that removed?
On a developer machine this would be needed to run make format-check
/ make-format-fix
. In the CI pipeline it has its own container for checking.
Do we want to add this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #10668 for the python changes, the CI container for the check job is mostly for clang-format I think because different versions can show very different results.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, the venv was added for MacOS. So far the python versions on Linux don't cause a problem but we likely will need to expand it. I was just wondering why we install clang-format/regex on MacOS but not the Linux platforms and whether or not we should have it commonly done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point actually, not sure? They probably should? @majetideepak
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a separate script for linux scripts/setup-check.sh
. This was likely moved out to simplify building the check container.
Developers mainly use MacOS and they need the check set up. Linux is used for deployment and the checks are not required.
We can include setup-check.sh inside the linux scripts and install as well.
We need to ensure Linux and MacOS use the same clang-format version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Linux is used for deployment and the checks are not required.
I think development on Linux is done. I suppose the developers did the setup themselves. There is an assumption that pip would install the same version on all platforms but it could be fixed.
Let me undo the change though and deal with this later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Developers mainly use MacOS
eh, maybe (written on linux^^) but even then we want to make it easy to contribute so this should be included in the devsetup for linux as well
8c44bdc
to
7474bae
Compare
a0eed92
to
e80ea28
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes look good. @kgpai Any thoughts on this?
94e6ace
to
8d6d937
Compare
Could you add the macos setup script as a trigger to the macos workflow so it get's run with these changes as well? Overall a long overdue refactor 👏 nice job! |
run_and_time install_s3 | ||
run_and_time install_gcs | ||
run_and_time install_abfs | ||
run_and_time install_hdfs | ||
} | ||
|
||
function install_velox_deps { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@majetideepak Question: we are calling install_fmt
but also install it through brew earlier. Should we pick one or the other? I see it was added a while back in one of your PRs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's remove the brew install. I think fmt version should align with the other FB library versions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks will do. And now this has come up with boost. Brew keeps everything up to date and can cause issues on update. Now seen with boost 1.86.0 while Linux uses boost 1.84.0. Maybe for this we should also switch to the common install function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, lets use a common install function for boost as well.
6eb7dfa
to
33686b5
Compare
CI should go green if you rebase |
33686b5
to
100065f
Compare
.github/workflows/macos.yml
Outdated
source scripts/setup-macos.sh | ||
brew install $MACOS_BUILD_DEPS $MACOS_VELOX_DEPS | ||
|
||
echo "OS used" ${{ matrix.os }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO remove.
100065f
to
ebc5cf6
Compare
ebc5cf6
to
c79ab70
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@czentgr Final set of comments.
Glad to see a bunch of duplication go away.
Thanks for working on this!
scripts/setup-common.sh
Outdated
cmake_install re2 -DRE2_BUILD_TESTING=OFF | ||
} | ||
|
||
function install_gflags { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need install_gflags
and install_glog
now that they are installed via the system package manager?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is still built in Centos9 and doesn't come from the system install.
I checked and Centos9 has gflags 2.2.2 (from epel) which is the same version as it is currently built. Glog is also available as 0.3.5 but that is much outdated compared to 0.6.0 which is installed.
We could install gflags by the system and build glog.
The glog system install depends on gflags.
So I suppose we can move gflags to the system install and keep glog.
scripts/setup-common.sh
Outdated
|
||
function install_aws_deps { | ||
local AWS_REPO_NAME="aws/aws-sdk-cpp" | ||
local |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove empty local
?
scripts/setup-common.sh
Outdated
} | ||
|
||
function install_minio { | ||
local MINIO_ARCH=$1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Maybe add defaults for macos arm (more devs use this platform).
scripts/setup-macos.sh
Outdated
function install_mvfst { | ||
wget_and_untar https://github.com/facebook/mvfst/archive/refs/tags/${FB_OS_VERSION}.tar.gz mvfst | ||
cmake_install mvfst -DBUILD_TESTS=OFF | ||
local MINIO_ARCH=$MACHINE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MACHINE is defined in the setup-common.sh.
This if-else block here and inside other files can go to install_minio
there.
c79ab70
to
d5fe2be
Compare
d5fe2be
to
8569270
Compare
scripts/setup-macos.sh
Outdated
|
||
DEPENDENCY_DIR=${DEPENDENCY_DIR:-$(pwd)} | ||
MACOS_VELOX_DEPS="bison boost double-conversion flex fmt gflags glog googletest icu4c libevent libsodium lz4 lzo openssl protobuf@21 simdjson snappy thrift xz xsimd zstd" | ||
MACOS_VELOX_DEPS="bison double-conversion flex gflags glog googletest icu4c libevent libsodium lz4 lzo openssl protobuf@21 simdjson snappy thrift xz zstd" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove double-conversion
. It is installed using the install function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@majetideepak I tried to resolve this dual install.
The problem with removing it here is that this is needed for boost. And the CI pipleine has a path where the actual dependencies are not installed. Only the brew dependencies are installed and the rest comes from bundling.
double-conversion is not bundled and so if the setup script does not run the installation (remember we only turned it on for maOS 13) of the deps it'll fail.
So a couple of options:
- Add it to be bundled
- Remove install from being called and use version from brew (this just like with other dependencies means it can be updated at any time and cause issues).
- Have the CI pipeline always run the setup script for macOS.
@assignUser What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could run the install scripts and cache the dependencies, that would give the best control over versions and have the smallest impact on ci times.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How could we do that? It wouldn't be via container? And being on different versions of MacOS it isn't like we have a common build step that needs to be executed only once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or we can do that later and just get the scripts in for now and optimize later?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
with the cache or stash action (though for this cache actually makes more sense) but yeah it's something we have to change in the workflow not the script, this PR is big enough as is we can do it as a follow up!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me then build the dependencies always. There is a current problem with fmt due to it being upgraded in homebrew (because ccache upgrade). We want to make changes to use the predefined and installed versions instead of using uncontrolled build deps from homebrew (which will always be found first because of the linkage). So we don't want to just get the macOS deps from homebrew. Lets see what happens.
This affects this PR as well.
Edit: using the bundled fmt will should work for now.
8569270
to
d0e9c1f
Compare
Use a common file to download and install external dependencies. Extract versions for each library.
d0e9c1f
to
6a4dc12
Compare
@@ -78,7 +76,7 @@ jobs: | |||
|
|||
- name: Configure Build | |||
env: | |||
folly_SOURCE: BUNDLED #brew folly does not have int128 | |||
fmt_SOURCE: BUNDLED #picked up brew fmt (from ccache) is too new |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Folly doesn't need to be force bundled because it is not coming from brew anymore but is installed by the setup script.
It uses fmt as bundled because the fmt version is too new. This need to be undone once the build script picks up the installed version via the setup script (and not the one present in homebrew from some dependency).
I am not sure if this should be an addition to the scripts or the docs but it's easy to avoid system install of dependencies (see ftm version issue) by setting |
This pull request has been automatically marked as stale because it has not had recent activity. If you'd still like this PR merged, please comment on the PR, make sure you've addressed reviewer comments, and rebase on the latest main. Thank you for your contributions! |
Will come back to it soon. |
Use a common file to download and install external dependencies.
Extract versions for each library.
This also addresses #10860
xsimd is removed from brew and instead installed using the install function. The issue is caused by xsimd being newer than works for Velox at this point.