Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change routing build to t-route master branch #318

Closed
wants to merge 8 commits into from

Conversation

mattw-nws
Copy link
Contributor

@mattw-nws mattw-nws commented Apr 10, 2023

Switch to building/installing t-route master instead of t-route ngen branch.

Additions

  • Several small build system improvements, including using ${BUILD_PARALLEL_JOBS} instead of $(nproc), and a way to optionally pass a pre-downloaded boost tarball to the image creation process.

Removals

  • Does not work out of the box to use legacy t-route (e.g. ngen branch), which might limit capability with some older hydrofabric datasets.

Changes

  • Now compiles and builds master t-route by default (instead of ngen branch)
  • Several Python elements are now installed in the global (root) site. This enables use-cases where a user other than mpi(1000) is used, e.g. using docker run --user=$(uid -u), so now the routing modules are available to that user (tested!).

Testing

  1. Built and ran the gauge_01073000 dataset, including using --user to change the uid

Screenshots

Notes

Todos

Checklist

  • PR has an informative and human-readable title
  • Changes are limited to a single goal (no scope creep)
  • Code can be automatically merged (no conflicts)
  • Code follows project standards (link if applicable)
  • Passes all existing automated tests
  • Any change in functionality is tested
  • New functions are documented (with a description, list of inputs, and expected output)
  • Placeholder code is flagged / future todos are captured in comments
  • Visually tested in supported browsers and devices (see checklist below 👇)
  • Project documentation has been updated (including the "Unreleased" section of the CHANGELOG)
  • Reviewers requested with the Reviewers tool ➡️

Testing checklist

Target Environment support

  • Linux

…er python modules will be installed there (instead of in mpi's user site), and can be readily used if the container is launched with uid=X... a useful scenario for reusing this container image.
@mattw-nws mattw-nws changed the title T-route-master Change routing build to t-route master branch Apr 13, 2023
@mattw-nws mattw-nws marked this pull request as ready for review April 13, 2023 15:11
@robertbartel robertbartel added enhancement New feature or request maas MaaS Workstream labels Apr 13, 2023
Copy link
Contributor

@robertbartel robertbartel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few things I suggest changing and a couple others I had questions about.

&& mkdir /boost \
&& mv boost_${BOOST_VERSION//./_}.tar.bz2 /boost/.
# Copy of entrypoint.sh here is just to avoid failure if there is no boost_*.tar.* present,,,
COPY entrypoint.sh boost_*.tar.* ${WORKDIR}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably don't want to do this, at least not exactly this way. This would cause rebuilding of anything depending on Boost any time the entrypoint script changes.

@@ -252,8 +258,8 @@ ARG ROCKY_NGEN_DEPS_REQUIRED
# Note that this includes numpy, which is needed for Python BMI support, regardless of BMI module
USER root
RUN dnf update -y && dnf install -y ${ROCKY_NGEN_DEPS_REQUIRED} \
&& pip3 install --upgrade pip \
&& if [ "${NGEN_ACTIVATE_PYTHON}" == "ON" ]; then pip3 install numpy; fi
&& pip3 install "pip>=23.0,<23.1" wheel packaging \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change to how pip is installed feels somewhat brittle. What was the reason behind it?

@@ -310,20 +316,20 @@ RUN echo "export HYDRA_HOST_FILE=${HYDRA_HOST_FILE}" >> /etc/profile \
&& tar xfz mpich-${MPICH_VERSION}.tar.gz \
&& cd mpich-${MPICH_VERSION} \
&& ./configure ${MPICH_CONFIGURE_OPTIONS} \
&& make -j $(nproc) ${MPICH_MAKE_OPTIONS} && make install \
&& make -j ${BUILD_PARALLEL_JOBS} ${MPICH_MAKE_OPTIONS} && make install \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some flaws with both the old approach and this change. The problem with going this way is the lack of safeguards around BUILD_PARALLEL_JOBS having a sane value (e.g., unset, double the CPUs on the host, etc.). Using the nproc approach provides protections but offers less control.

Ideally, we'd control this a different way anyway, but I don't think it's quite that simple to do with our tools. They are based on the old style docker-compose (not the same as docker compose), and it isn't clear at glance how we would accomplish this at image build time.

For the moment, I recommend leaving this as it was. We need to revise things, but that should be handled separately in something like #325.

&& tar -xjf boost_${BOOST_VERSION//./_}.tar.bz2 \
&& rm boost_${BOOST_VERSION//./_}.tar.bz2 \
&& tar -xf boost_tarball.blob \
&& rm boost_tarball.blob \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may end up being irrelevant, depending on what's done about an earlier comment ...

I'm curious as to why this file name change was done. I don't see any issues with it beyond a few uncommon and essentially trivial side effects when debugging image stages. But the old way didn't exactly seem troublesome.

Also, are we certain the -j flag isn't needed?

@robertbartel
Copy link
Contributor

@mattw-nws, pinging you again on this one, just to keep it on your radar. In addition to the stuff I commented on, it looks like the branch will need rebasing, probably due to #353.

@robertbartel
Copy link
Contributor

This is now superseded by #429.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request maas MaaS Workstream
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants