Create section or page for requirements/tips for reporting issues with Python/Anaconda modules on our machines #546

Open
felker opened this issue Nov 9, 2024 · 0 comments
felker commented Nov 9, 2024

Users are constantly extending the existing public Anaconda modules on Polaris, Sophia, etc.: installing new Python packages and conda libraries, older or newer versions of existing packages, and so on. When they run into issues, the [email protected] tickets often require many rounds of back-and-forth before we have the information needed to resolve the problem.

They could be reporting a legitimate issue with the installed PyTorch, DeepSpeed, etc. package; it could be basic user error in how they launch their jobs (e.g. on a UAN); or they might not be using the Python and/or installed package that they think they are using (e.g. ~/.local/ user site-packages installs, or a cloned conda env that reinstalled PyTorch from a generic wheel from pip/conda-forge).
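For the last case, a quick check of where a package actually resolves from can settle the question before a ticket is opened. The following is only a minimal sketch using the Python standard library; the helper names `classify_origin` and `package_origin` are hypothetical, it assumes Python 3.9+ (for `Path.is_relative_to`), and the default package name `torch` is just an example.

```python
import importlib.util
import site
import sys
from pathlib import Path


def classify_origin(module_path, user_site, prefix):
    """Label a file path as 'user-site', 'environment', or 'other'."""
    path = Path(module_path).resolve()
    user_site = Path(user_site).resolve()
    prefix = Path(prefix).resolve()
    # Check the user site first: a ~/.local install can shadow the
    # copy that ships with the active environment.
    if path.is_relative_to(user_site):  # requires Python 3.9+
        return "user-site"
    if path.is_relative_to(prefix):
        return "environment"
    return "other"


def package_origin(name):
    """Return (path, label) for a top-level package without importing it."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None, "not found"
    if not spec.has_location:
        # Built-in, frozen, or namespace packages have no single file on disk.
        return spec.origin, "no file location"
    label = classify_origin(spec.origin, site.getusersitepackages(), sys.prefix)
    return spec.origin, label


if __name__ == "__main__":
    print("python:", sys.executable)
    # Package names come from the command line; 'torch' is only a default.
    for pkg in sys.argv[1:] or ["torch"]:
        print(pkg, *package_origin(pkg))
```

A result of `user-site` for a framework such as PyTorch would suggest that a `~/.local/` install is shadowing the module-provided build.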

It might be worth creating a new page in the user docs that gives reporting requirements/tips for opening tickets about Python/conda/ML frameworks. It could be a top-level page like the recently created page https://docs.alcf.anl.gov/issues/ "Questions/Issues on ALCF Docs", but titled "Questions/Issues on ALCF Installed Software".

Or it could be a section in https://docs.alcf.anl.gov/polaris/data-science-workflows/python/ and the corresponding pages for the other machines.

The page should ask users to include all of the following details:

  1. Which base conda module and environment are you using?
  2. The “module list” output.
  3. Have you extended the base environment via venv, conda clone, etc.?
  4. Have you installed new packages or removed existing ones? If so, include the script and commands.
    ..
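Much of this could be gathered by a small script that users paste into their ticket. The sketch below is one possible shape, not an existing ALCF tool: the function name `collect_report` and the default package list are hypothetical, and it assumes the module system exports the colon-separated `LOADEDMODULES` variable (as Environment Modules and Lmod do) and that conda sets `CONDA_DEFAULT_ENV` and `CONDA_PREFIX`.

```python
import json
import os
import site
import sys
from importlib import metadata


def collect_report(packages=("torch", "deepspeed"), environ=None):
    """Gather interpreter, environment, and package details for a ticket."""
    environ = os.environ if environ is None else environ
    # Assumption: the module system exports LOADEDMODULES, colon-separated.
    loaded = environ.get("LOADEDMODULES", "")
    report = {
        "python_executable": sys.executable,
        "python_version": sys.version.split()[0],
        "sys_prefix": sys.prefix,
        "sys_base_prefix": sys.base_prefix,
        # True when running inside a venv built on top of another Python.
        "in_venv": sys.prefix != sys.base_prefix,
        "conda_env": environ.get("CONDA_DEFAULT_ENV"),
        "conda_prefix": environ.get("CONDA_PREFIX"),
        "virtual_env": environ.get("VIRTUAL_ENV"),
        "loaded_modules": [m for m in loaded.split(":") if m],
        "user_site_enabled": site.ENABLE_USER_SITE,
        "user_site": site.getusersitepackages(),
        "packages": {},
    }
    for name in packages:
        try:
            report["packages"][name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report["packages"][name] = None  # not installed in this Python
    return report


if __name__ == "__main__":
    print(json.dumps(collect_report(), indent=2))
```

This covers items 1 to 3 only in part; the install script and commands from item 4 would still need to be attached by hand.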