Remove dependence on build script #139

Merged
SeanBryan51 merged 3 commits into master from 138-remove-dependence-on-build-script on Sep 20, 2023

Collaborator

@SeanBryan51 SeanBryan51 commented Sep 15, 2023

Currently benchcab uses the offline/build3.sh script to build the CABLE executable. build3.sh is a wrapper around the Makefile, parallel_cable and serial_cable scripts. Using build3.sh is convenient for CABLE developers, but it has drawbacks when building executables for benchcab; these are outlined in #138.

This change removes the dependence on build3.sh from the build and instead invokes the Makefile, parallel_cable and serial_cable scripts directly. The new build system currently reproduces the previous build system exactly.

Fixes #138

@SeanBryan51 SeanBryan51 linked an issue Sep 15, 2023 that may be closed by this pull request
@codecov

codecov bot commented Sep 15, 2023

Codecov Report

Merging #139 (24b7fe4) into master (44873fc) will increase coverage by 0.29%.
Report is 4 commits behind head on master.
The diff coverage is 94.26%.

@@            Coverage Diff             @@
##           master     #139      +/-   ##
==========================================
+ Coverage   88.47%   88.77%   +0.29%     
==========================================
  Files          27       27              
  Lines        1527     1630     +103     
==========================================
+ Hits         1351     1447      +96     
- Misses        176      183       +7     
| Files Changed | Coverage Δ | |
|---|---|---|
| benchcab/benchcab.py | 32.58% <0.00%> (-1.34%) | ⬇️ |
| benchcab/repository.py | 98.98% <97.14%> (+0.37%) | ⬆️ |
| benchcab/fluxsite.py | 83.13% <100.00%> (ø) | |
| benchcab/internal.py | 90.47% <100.00%> (+0.23%) | ⬆️ |
| benchcab/utils/fs.py | 100.00% <100.00%> (ø) | |
| benchcab/utils/subprocess.py | 100.00% <100.00%> (ø) | |
| tests/common.py | 89.47% <100.00%> (+0.90%) | ⬆️ |
| tests/conftest.py | 100.00% <100.00%> (ø) | |
| tests/test_repository.py | 100.00% <100.00%> (ø) | |
| tests/test_subprocess.py | 100.00% <100.00%> (ø) | |

@SeanBryan51 SeanBryan51 force-pushed the 138-remove-dependence-on-build-script branch 6 times, most recently from 0a7c6c5 to 69a0ef2 on September 15, 2023 04:20
Currently benchcab uses the `offline/build3.sh` script to build the
CABLE executable. `build3.sh` is a wrapper around the `Makefile`,
`parallel_cable` and `serial_cable` scripts. Using `build3.sh` is
convenient for CABLE developers, but it has drawbacks when building
executables for benchcab; these are outlined in #138.

This change removes the dependence on `build3.sh` from the build and
instead invokes the `Makefile`, `parallel_cable` and `serial_cable`
scripts directly.

Fixes #138
@SeanBryan51 SeanBryan51 force-pushed the 138-remove-dependence-on-build-script branch from 69a0ef2 to 5cf2f4a on September 15, 2023 04:29
@SeanBryan51
Collaborator Author

Integration tests

Test: compare new build against old build

#!/bin/bash
bench_example_dir='bench_example_test_build'
rm -rf "$bench_example_dir"
git clone git@github.com:CABLE-LSM/bench_example.git "$bench_example_dir"
cd "$bench_example_dir"
git reset --hard 6287539e96fc8ef36dc578201fbf9847314147fb
cat > config.yaml << EOL
project: tm70

experiment: AU-Tum

realisations:
  - path: trunk
  - path: trunk
    name: trunk_with_build_script
    build_script: offline/build3.sh

modules: [
  intel-compiler/2021.1.1,
  netcdf/4.7.4,
  openmpi/4.1.0
]

science_configurations:
  - cable:
      cable_user:
        CONSISTENCY_CHECK: False
EOL
benchcab run -v

Standard output from job:

Running fluxsite tasks...
Running task AU-Tum_2002-2017_OzFlux_Met_R1_S0... CABLE standard output saved in /scratch/tm70/sb8430/benchcab_integration_tests/bench_example_test_build/runs/fluxsite/tasks/AU-Tum_2002-2017_OzFlux_Met_R1_S0/out.txt
./cable cable.nml
Adding attributes to output file: /scratch/tm70/sb8430/benchcab_integration_tests/bench_example_test_build/runs/fluxsite/outputs/AU-Tum_2002-2017_OzFlux_Met_R1_S0_out.nc
Running task AU-Tum_2002-2017_OzFlux_Met_R0_S0... CABLE standard output saved in /scratch/tm70/sb8430/benchcab_integration_tests/bench_example_test_build/runs/fluxsite/tasks/AU-Tum_2002-2017_OzFlux_Met_R0_S0/out.txt
./cable cable.nml
Adding attributes to output file: /scratch/tm70/sb8430/benchcab_integration_tests/bench_example_test_build/runs/fluxsite/outputs/AU-Tum_2002-2017_OzFlux_Met_R0_S0_out.nc
Successfully ran fluxsite tasks

Running comparison tasks...
Comparing files AU-Tum_2002-2017_OzFlux_Met_R0_S0_out.nc and AU-Tum_2002-2017_OzFlux_Met_R1_S0_out.nc bitwise...
nccmp -df /scratch/tm70/sb8430/benchcab_integration_tests/bench_example_test_build/runs/fluxsite/outputs/AU-Tum_2002-2017_OzFlux_Met_R0_S0_out.nc /scratch/tm70/sb8430/benchcab_integration_tests/bench_example_test_build/runs/fluxsite/outputs/AU-Tum_2002-2017_OzFlux_Met_R1_S0_out.nc
Success: files AU-Tum_2002-2017_OzFlux_Met_R0_S0_out.nc AU-Tum_2002-2017_OzFlux_Met_R1_S0_out.nc are identical
Successfully ran comparison tasks

======================================================================================
                  Resource Usage on 2023-09-15 16:37:05:
   Job Id:             95201446.gadi-pbs
   Project:            tm70
   Exit Status:        0
   Service Units:      1.23
   NCPUs Requested:    18                     NCPUs Used: 18              
                                           CPU Time Used: 00:02:53        
   Memory Requested:   30.0GB                Memory Used: 409.2MB         
   Walltime requested: 06:00:00            Walltime Used: 00:02:03        
   JobFS requested:    100.0MB                JobFS used: 0B              
======================================================================================

Collaborator

@ccarouge ccarouge left a comment

I think I am done. I think we really need to work on the organisation of the tests but that's for another issue. Maybe something to bring in Ben's view on.

Split up the default build function into smaller `pre_build`,
`run_build` and `post_build` functions to improve code readability.

Simplify unit tests for `*_build` functions.

Add `benchcab/utils/fs.py` for defining utility functions for
interacting with the file system.
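
The split described in this commit message can be pictured with a minimal sketch. This is an illustration under stated assumptions, not benchcab's actual code: the class name `BuildSketch`, the `.tmp` scratch directory, the `*.F90` glob, the `"make"` command string and the injected `run_cmd` callable are all hypothetical. Only the three phase names and the fact that `post_build` places the executable in the offline directory come from this pull request.

```python
import shutil
from pathlib import Path
from typing import Callable

CABLE_EXE = "cable"  # assumed executable name (stand-in for internal.CABLE_EXE)


class BuildSketch:
    """Illustrative three-phase build; NOT benchcab's actual implementation."""

    def __init__(self, offline_dir: Path, run_cmd: Callable[[str, Path], None]):
        self.offline_dir = offline_dir
        self.tmp_dir = offline_dir / ".tmp"  # hypothetical scratch directory
        self.run_cmd = run_cmd  # injected so tests can substitute a fake

    def pre_build(self) -> None:
        """Stage the sources in a scratch directory (assumed behaviour)."""
        self.tmp_dir.mkdir(exist_ok=True)
        for src in self.offline_dir.glob("*.F90"):
            shutil.copy(src, self.tmp_dir)

    def run_build(self) -> None:
        """Compile inside the scratch directory; "make" is a placeholder."""
        self.run_cmd("make", self.tmp_dir)

    def post_build(self) -> None:
        """Move the executable back next to the sources."""
        shutil.move(str(self.tmp_dir / CABLE_EXE), str(self.offline_dir / CABLE_EXE))

    def build(self) -> None:
        self.pre_build()
        self.run_build()
        self.post_build()
```

Keeping each phase as its own method means a unit test can exercise one phase at a time, with a fake `run_cmd` standing in for the compiler.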
@SeanBryan51
Collaborator Author

> I think I am done. I think we really need to work on the organisation of the tests but that's for another issue. Maybe something to bring in Ben's view on.

I'm definitely happy to discuss the unit testing approach and general code architecture. It is good to get a fresh pair of eyes on the code base.

Collaborator

@ccarouge ccarouge left a comment

Just one minor thing in the tests. Otherwise it looks good to me.

(tmp_dir / internal.CABLE_EXE).touch()
repo = get_mock_repo()
repo.post_build()
assert (offline_dir / internal.CABLE_EXE).exists()
Collaborator

If we actually want to test a move, we should add an assert to check the file isn't under tmp_dir. It's a detail but it might be important to get it to fail if we decide to change to a copy later on.
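
A minimal sketch of the extra assertion suggested here, assuming the move is done with `shutil.move`. The `post_build` function below is a hypothetical stand-in for `repo.post_build()`, and `CABLE_EXE` and the `.tmp` directory layout are assumptions made so the example is self-contained.

```python
import shutil
import tempfile
from pathlib import Path

CABLE_EXE = "cable"  # stand-in for internal.CABLE_EXE


def post_build(tmp_dir: Path, offline_dir: Path) -> None:
    """Hypothetical stand-in for repo.post_build()."""
    shutil.move(str(tmp_dir / CABLE_EXE), str(offline_dir / CABLE_EXE))


def test_post_build_moves_executable() -> None:
    with tempfile.TemporaryDirectory() as root:
        offline_dir = Path(root) / "offline"
        tmp_dir = offline_dir / ".tmp"
        tmp_dir.mkdir(parents=True)
        (tmp_dir / CABLE_EXE).touch()

        post_build(tmp_dir, offline_dir)

        # Destination exists: passes for both a move and a copy.
        assert (offline_dir / CABLE_EXE).exists()
        # Source is gone: passes only for a move.
        assert not (tmp_dir / CABLE_EXE).exists()
```

With only the first assertion, swapping the move for a copy would go unnoticed; the second assertion makes that change fail the test.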

Collaborator

agreed.

Collaborator

@bschroeter bschroeter left a comment

Just some minor suggestions from me. I'm not yet well-versed enough in the codebase to form strong opinions that would block a merge. It would be good to have a chat about the coding approaches you use so I know where my reviews can be more helpful.

    )
else:
    print(
        f"Compiling CABLE {'with MPI' if internal.MPI else 'serially'} for "
Collaborator

This may be semantics, but you don't often see conditionals embedded directly in an f-string. More commonly, the conditional parts are evaluated beforehand and then substituted in, for clarity.
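
As a rough illustration of that style, the conditional can be bound to a name first. The function `compile_message` and its `target` parameter are hypothetical; the quoted line is truncated after "for", so `target` only stands in for whatever follows.

```python
def compile_message(mpi: bool, target: str) -> str:
    """Build the status message with the conditional evaluated up front."""
    # Evaluate the conditional once, then substitute the plain name.
    mode = "with MPI" if mpi else "serially"
    return f"Compiling CABLE {mode} for {target}"
```

Both forms produce the same string; the difference is purely readability.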

(tmp_dir / internal.CABLE_EXE).touch()
repo = get_mock_repo()
repo.post_build()
assert (offline_dir / internal.CABLE_EXE).exists()
Collaborator

agreed.

with chdir(build_script_path.parent), self.modules_handler.load(
    modules, verbose=verbose
):
    self.subprocess_handler.run_cmd(
        shlex.join([f"./{tmp_script_path.name}", *args]),
Collaborator

Suggestion, but why not abstract the shlex wrapping in the subprocess handler itself, rather than having to manually wrap arguments each time it is called?

Consider changing the subprocess handler(?) to accept the script and subsequent arguments, applying the shlex wrapping and executing.

i.e. something along the lines of...

# UNTESTED!!!
import shlex
import subprocess

def execute(script, *args):
    subprocess.run(shlex.join([script, *args]), shell=True)

...obviously following the execution logic you already have in place.

Collaborator Author

I think I will leave out the shlex wrapping feature for now as this was actually the only place where we used shlex.join() and it has now been removed. But it might be an idea for the future since shell-escaped commands are currently not enforced by the subprocess handler.
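
For future reference, a minimal sketch of what enforcing shell escaping could look like. The names `build_command` and `run_escaped` are hypothetical and do not reflect benchcab's subprocess handler; the sketch uses only the standard-library `shlex.join` and `subprocess.run`, and assumes a POSIX shell.

```python
import shlex
import subprocess


def build_command(script: str, *args: str) -> str:
    """Return a shell-escaped command line for script and its arguments."""
    return shlex.join([script, *args])


def run_escaped(script: str, *args: str) -> subprocess.CompletedProcess:
    """Run the escaped command through the shell and capture its output."""
    return subprocess.run(
        build_command(script, *args),
        shell=True,
        check=True,
        capture_output=True,
        text=True,
    )
```

For example, `build_command("./build.sh", "my file")` returns `./build.sh 'my file'`, so an argument containing spaces or shell metacharacters reaches the script as a single word.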

@SeanBryan51 SeanBryan51 merged commit b5fe7fd into master Sep 20, 2023
4 checks passed
@SeanBryan51 SeanBryan51 deleted the 138-remove-dependence-on-build-script branch September 20, 2023 01:50
Successfully merging this pull request may close these issues.

Remove dependence on build3.sh
3 participants