Simplify conda env yaml files and combine into single env #144
- Combine `proc.yml` with `pub.yml`
- Remove `autocurator` as a dependency
- Update dependencies using looser constraints
- Make `esgconfigparser >=1.0.0a1`
Hey @TonyB9000, I kickstarted some work to simplify the conda env yaml files. Please review the description and update the dependencies as needed. There are a few action items for you to complete.
dependencies:
  # Base
  # ==================
  - python >=3.9
  - pip
  - distributed
  - ipdb
  - matplotlib
  - netcdf4
  - numpy >=1.23.0 # This version of numpy includes support for Python 3.11.
  - pyyaml
  - termcolor
  - tqdm
  - watchdog
  - xarray >=2022.02.0 # This version of Xarray drops support for Python 3.8.
Packages are considered direct dependencies if they are directly imported by `datasm`, or if some `datasm` module call requires an 'optional' dependency (e.g., xarray with the matplotlib optional dependency for xarray plotting).
Update this section by removing or adding any dependencies as needed.
  - pip:
      - esgcet>=5.2.0
`esgcet>=5.2.0` now includes an `-xarray` flag to replace `autocurator`. I removed `autocurator` as a result, which allowed me to combine `proc.yml` with `pub.yml`.
Wow - Thanks Tom! This is great. I have some 5/6 different environments (some for dev, some for pub), some customized for v1 or v2 Large Ensemble processing, some with/without e2c v1.10.0rc1 (nee rc2) - and it will be great to cut them down.
I have long-running CMIP6 jobs running (expected to complete around Sept 5) and just finished a publication run, so I'll need to create a new environment (or two) to test these - don't want to destabilize running stuff...
I need to think about how to test this (publish without actually publishing, etc) or else publish something small.
> Wow - Thanks Tom! This is great. I have some 5/6 different environments (some for dev, some for pub), some customized for v1 or v2 Large Ensemble processing, some with/without e2c v1.10.0rc1 (nee rc2) - and it will be great to cut them down.
No problem Tony! I hope this optimizes your workflow so you don't need to manage many different conda environments.
Ideally, you should have just two environments: 1 for production and 1 for development.
- The production environment should include official, stable releases of packages (no release candidates) since it is meant for production usage.
- The development environment has more flexibility and can use package release candidates (e.g., `e3sm_to_cmip=v1.10.rc1`) to test `datasm` on.
This brings up a good point: we might need a `dev.yml` to define the development environment.
For the custom environments, it will be harder to version control if you install packages manually without using the yaml file specs (as you are probably aware by now). Also, as I mentioned before, `pip` installing a local build of `e3sm_to_cmip` can cause issues too.
@tomvothecoder Given that there has never been a datasm "stable release", this would leave me with only a "dev" environment ...
The scope of datasm is so broad, no regression testing makes headway before operational exigencies demand changes to accommodate new data irregularities. Not to mention - I have scores of big jobs queued up that are pushing against slurm "PENDING (resources)". I suppose I need to develop a "non-slurm" test regime.
@tomvothecoder Although the prod.yml includes
- cwltool >=3.1.20220202173120
when I build a prod environment and pip install datasm (and e3sm_to_cmip, FWIW), datasm postprocess fails with
/var/spool/slurmd/job210469/slurm_script: line 7: cwltool: command not found
The environment I built shows (mamba list):
cryptography 41.0.3 py310h75e40e8_0 conda-forge
curl 8.2.1 hca28451_0 conda-forge
cwl-upgrader 1.2.8 pyhd8ed1ab_0 conda-forge
cwl-utils 0.28 pyh1d7be83_0 conda-forge
cwlformat 2022.02.18 pyhd8ed1ab_0 conda-forge
cycler 0.11.0 pyhd8ed1ab_0 conda-forge
(not that I ever looked for cwltool before, so I don't know what to expect.)
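A quick pre-flight check before submitting a slurm job could catch a missing executable like this earlier. The following is a minimal sketch using only the standard library; `missing_commands` is a hypothetical helper, and the list of tools to check is an assumption to adjust to whatever the job script actually calls.

```python
import shutil


def missing_commands(commands):
    """Return the commands that cannot be found on the current PATH."""
    return [cmd for cmd in commands if shutil.which(cmd) is None]


# Example usage (hypothetical tool list), run inside the activated environment:
#   missing = missing_commands(["cwltool", "e3sm_to_cmip", "esgpublish"])
#   if missing:
#       raise SystemExit(f"Not found on PATH: {', '.join(missing)}")
```

Because `shutil.which` searches the same `PATH` the shell would use, running this inside the activated environment on the submitting node gives a reasonable, though not guaranteed, preview of what the job script will see.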
@tomvothecoder This could be my fault (naturally...). I changed "- default" to "- nodefault", having read a comment about that. Now that I've changed it back and rebuilt, cwltool appears... magic!
That's weird... `cwltool` only exists in conda-forge (i.e., `nodefaults` shouldn't impact it ...) ... hmmm 🤔
Could be superstition on my part. Other intervening changes may have occurred. I should create another env with "nodefaults" and check again. Maybe the (acme1) base environment is different. (too many variables...)
I wouldn't worry about it --- whatever gets the job done! 😄
Glad that the recipes can be consolidated into one. The changes look good to me based on a visual check. We will need Tony's confirmation with a test.
Co-authored-by: Tom Vo <[email protected]>
@chengzhuzhang Do you feel it is OK to test the resulting "prod" environment with a publication? Nothing (at present) is publication-authorized, and (AFAIK) the only way to test esgpublish is to publish ;). I can retract/delete an E3SM set (and only retract a CMIP6 set).
I have built a new environment based upon the prod.yml. It includes "- e3sm_to_cmip >=1.9.1", but issuing e3sm_to_cmip --version produces
Also, "pip install ." (datasm) returns:
Finally, if I "pip install ." (e3sm_to_cmip), to obtain v1.10.0rc1, the e2c version works: Successfully installed e3sm-to-cmip-1.10.0rc1
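To compare what actually landed in an environment against the minimums in the yaml file, something like the sketch below may help. It reads installed versions through `importlib.metadata` and compares only the numeric release part, so a release candidate such as `1.10.0rc1` is treated the same as `1.10.0`. The helper names are hypothetical; the distribution names would be the pip-style ones seen in this thread (`e3sm-to-cmip`, `esgcet`).

```python
import re
from importlib.metadata import PackageNotFoundError, version


def release_tuple(version_string):
    """Extract the leading numeric release, e.g. '1.10.0rc1' -> (1, 10, 0)."""
    match = re.match(r"v?(\d+(?:\.\d+)*)", version_string)
    if match is None:
        raise ValueError(f"Unrecognized version: {version_string!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(installed, minimum):
    """Compare numeric releases only; pre-release tags such as 'rc1' are ignored."""
    a, b = release_tuple(installed), release_tuple(minimum)
    width = max(len(a), len(b))
    a = a + (0,) * (width - len(a))  # pad so that '5.2' compares equal to '5.2.0'
    b = b + (0,) * (width - len(b))
    return a >= b


def report(minimums):
    """Map each distribution name to (installed version or None, meets minimum?)."""
    result = {}
    for name, minimum in minimums.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            result[name] = (None, False)
            continue
        result[name] = (installed, meets_minimum(installed, minimum))
    return result
```

For instance, `report({"e3sm-to-cmip": "1.9.1", "esgcet": "5.2.0"})` would show at a glance whether the environment satisfies both constraints. A stricter comparison that orders pre-releases correctly would need a real version parser, which this sketch deliberately avoids.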
name: datasm_prod
channels:
  - conda-forge
  - defaults
We should probably do `nodefaults` here and across the entire E3SM ecosystem, but I would defer to Xylar to make a recommendation.
https://conda-forge.org/docs/user/tipsandtricks.html#using-multiple-channels
Good idea. I need to start replacing `defaults` with `nodefaults` across the many conda env yaml files in E3SM repos.
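If that replacement ends up being done across many files, a small text-level rewrite might be less error-prone than editing by hand. This is only a sketch under simple assumptions: `use_nodefaults` is a hypothetical helper, it expects a top-level `channels:` key followed by a plain `- item` list, and it edits text line by line instead of parsing YAML so that comments and ordering are preserved.

```python
import re

# Matches a list item that is exactly "defaults", keeping indentation and any trailing comment.
_DEFAULTS_ITEM = re.compile(r"^(\s*-\s*)defaults(\s*(?:#.*)?)$")


def use_nodefaults(yaml_text):
    """Replace '- defaults' with '- nodefaults' inside the top-level 'channels:' list."""
    out, in_channels = [], False
    for line in yaml_text.splitlines():
        stripped = line.strip()
        if re.match(r"^channels\s*:", line):
            in_channels = True
        elif in_channels and stripped and not stripped.startswith(("-", "#")):
            in_channels = False  # reached the next top-level key
        if in_channels:
            line = _DEFAULTS_ITEM.sub(r"\1nodefaults\2", line)
        out.append(line)
    return "\n".join(out) + ("\n" if yaml_text.endswith("\n") else "")
```

Entries outside the `channels:` list are left untouched, so a dependency whose name merely starts with `defaults` would not be rewritten.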
Yes, I think you can test publishing an ensemble from v1 ssp375, because we are only waiting for the final 4 ensembles to be processed, so that we can push all ensembles together. Test publishing already cmorized data should be fine and you don't need to retract. v1.10.0rc1 fixed the deprecated numpy.warning, though as @tomvothecoder suggested we need to be careful with the fix, see here E3SM-Project/e3sm_to_cmip#205
Datasm has a "setup.py" (written by Sterling I suppose) that I have never knowingly invoked, and it appears to be the source of the error
I assume pip reads this file. (# TODO -lol) Can I simply change the line
I added a
Yes, pip reads this file when you invoke
I set the constraint to The line should be updated to
@tomvothecoder Thanks Tom! I've managed a conda/mamba build (and datasm_install) having made the adjustments. (Full-up testing is another matter, but will require a bit more time.)
ASIDE: When I recently did "pip install", I added "-vvv" to get more information, and was surprised to see a syntax error in a module that had long ago been "git removed". I poked around and found that "build" and "datasm-egg-info" listed old files. What process is supposed to update these?
I have not encountered the issue of old files being present in those directories so I'm not exactly sure how they are handled. I think they usually get overwritten after running Are those directories in the root of the
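Since `build` and the egg-info directory are generated by the install step, one option is to clear them before reinstalling so that files from removed modules cannot linger. A minimal sketch, assuming the directories sit directly under the repository root and follow the usual `<name>.egg-info` naming (both assumptions; the function names are hypothetical):

```python
import shutil
from pathlib import Path


def stale_build_dirs(repo_root):
    """List 'build' and '*.egg-info' directories directly under repo_root."""
    root = Path(repo_root)
    candidates = [root / "build", *root.glob("*.egg-info")]
    return sorted(p for p in candidates if p.is_dir())


def remove_stale_build_dirs(repo_root, dry_run=True):
    """Delete the directories found above; with dry_run=True only report them."""
    found = stale_build_dirs(repo_root)
    if not dry_run:
        for path in found:
            shutil.rmtree(path)
    return found
```

Calling it with the default `dry_run=True` first shows what would be deleted; rerunning `pip install .` afterwards regenerates both directories from the current source tree.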
From email chain 9/11/23: Tony received this error when building and using the
Traceback (most recent call last):
File "/home/bartoletti1/mambaforge/envs/dsm_prod_test2/bin/esgpublish", line 8, in <module>
sys.exit(main())
File "/home/bartoletti1/mambaforge/envs/dsm_prod_test2/lib/python3.10/site-packages/esgcet/pub_internal.py", line 83, in main
pub = pub_args.get_args()
File "/home/bartoletti1/mambaforge/envs/dsm_prod_test2/lib/python3.10/site-packages/esgcet/args.py", line 46, in get_args
parser.add_argument("--version", action="version", version=f"esgpublish v{esgcet.__version__}",help="Print the version and exit")
NameError: name 'esgcet' is not defined
For what it's worth, my environment (dsm_prod_test2) lists this esgcet:
(dsm_prod_test2) -bash-4.2$ mamba list | grep esgcet
esgcet 5.2.0 pypi_0 pypi
Sasha replied that the
We are waiting on a new version of esgcet to be released from Sasha.
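The traceback shows `esgcet.__version__` being evaluated in `args.py` at a point where the name `esgcet` is not bound. The sketch below is not the upstream fix, just an illustration of one way to build a `--version` option that does not depend on a module-level name: it looks the version up from the installed distribution metadata instead. The helper names `installed_version` and `build_parser` are hypothetical.

```python
import argparse
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or 'unknown' if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return "unknown"


def build_parser(dist_name="esgcet"):
    """Build a parser whose --version text does not depend on a module-level name."""
    parser = argparse.ArgumentParser(prog="esgpublish")
    parser.add_argument(
        "--version",
        action="version",
        version=f"esgpublish v{installed_version(dist_name)}",
        help="Print the version and exit",
    )
    return parser
```

With this shape, a missing or broken package yields `esgpublish vunknown` from `--version` and leaves every other subcommand usable, whereas the `NameError` above aborts argument parsing entirely.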
Looks good to me.
Great, this is a big W and should make your life easier. Thanks @TonyB9000.
@tomvothecoder Just a side note - during publication of the v2 Large Ensemble datasets, the publisher balked at the CFmon variables. Sasha recognized the problem and was to address it - but it necessitated reverting to the "pub-only" environment (older publisher), albeit only for the CFmon publications. I am ignorant of the range of specific checks that the publisher makes during publication.
@TonyB9000 could you submit an issue to https://github.com/ESGF/esg-publisher so that this won't slip?
@chengzhuzhang Done (Issue #226)
Summary of changes
- `prod.yml`
- `proc.yml` with `pub.yml`
- `esgcet>=5.2.0`
- `autocurator` as a dependency
- `esgcet>=5.2.0` with the `-xarray` flag
- `pub.yml` dependency `autocurator` is not maintained, only supports Linux and Python <=3.8 #143
- `pub.yml` and `proc.yml`
- `ci.yml` -- this repo has no GitHub Actions CI/CD workflows to run builds

Action Items
- `prod.yml` (works on my end -- @tomvothecoder), also works for @TonyB9000
- `datasm` -- in progress (@TonyB9000)
- `esgcet>=5.2.0` and the `-xarray` command which replaces `autocurator` -- in progress (@TonyB9000)