MLVM and horovod init action fails for 2.1 and 2.2 #1196

Open
prince-cs opened this issue Jul 9, 2024 · 1 comment

@prince-cs
Collaborator

While fixing the mlvm and horovod integration tests for 2.1 and 2.2, I found that cluster creation fails because of dependency issues between torch, torchvision and torchaudio.

For mlvm, installing torch, torchvision and torchaudio with conda takes so long that the command times out. For 2.0, installing torch, torchvision and torchaudio with pip, and installing the pip packages before the conda packages, fixed the issue for me.
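The ordering that worked on 2.0 can be sketched roughly as below. This is only an illustration, not the init action's code: `run`, `install_packages`, `DRY_RUN`, the variable names and the `bin/pip` path are assumptions, and the conda package list is simply copied from the failing mamba command in this issue.

```shell
# Sketch only: function and variable names are illustrative assumptions.
CONDA_DIR=/opt/conda/miniconda3
# pip packages named in this issue.
PIP_PACKAGES="torch torchvision torchaudio"
# conda packages copied from the failing mamba command in this issue.
CONDA_PACKAGES="r-dplyr=1.0 r-essentials=4.1 r-sparklyr=1.7 scikit-learn=0.24 xgboost=1.4 r-xgboost=1.4"

# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

install_packages() {
  # pip first, then conda: the order reported to work on 2.0.
  # The lists are left unquoted on purpose so they split into arguments.
  # Assumes pip lives under ${CONDA_DIR}/bin.
  run "${CONDA_DIR}/bin/pip" install ${PIP_PACKAGES} || return 1
  run "${CONDA_DIR}/bin/mamba" install -y ${CONDA_PACKAGES} -p "${CONDA_DIR}" || return 1
}
```

Setting `DRY_RUN=1` before calling `install_packages` prints the two commands in order without running them, which is a cheap way to check the ordering before creating a cluster.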

2.1 and 2.2 still don't work: installing the conda packages fails because of the Python version being used in conda.

python-3.12.4 | h194c7f8_0_cpython 30.6 MB conda-forge
python_abi-3.12 | 4_cp312 6 KB conda-forge
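The version mismatch can be checked by hand. The solver output in this issue says the py39 builds of xgboost 1.4.0 and scikit-learn 0.24.0 require `python >=3.9,<3.10.0a0`. The sketch below tests whether a Python version falls in that range, with the upper bound simplified to `3.10`. The function names are hypothetical, `3.9.18` is just an example of a version that would fit, and the comparison relies on `sort -V` from GNU coreutils.

```shell
# Sketch only: helper names are illustrative; requires `sort -V`.

# True when version $1 >= version $2 (the smaller one sorts first).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# True when version $1 lies in the half-open range [$2, $3).
version_in_range() {
  version_ge "$1" "$2" && ! version_ge "$1" "$3"
}

# 3.10.0 stands in for the "python 3.10.*" pin, 3.12.4 for the build listed above.
for v in 3.9.18 3.10.0 3.12.4; do
  if version_in_range "$v" 3.9 3.10; then
    echo "python $v satisfies >=3.9,<3.10"
  else
    echo "python $v conflicts with >=3.9,<3.10"
  fi
done
```

Both the pinned 3.10 line and 3.12.4 fall outside the range, which is consistent with the solver refusing the `scikit-learn=0.24` and `xgboost=1.4` specs.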

`+ eval '/opt/conda/miniconda3/bin/mamba install -y r-dplyr=1.0 r-essentials=4.1 r-sparklyr=1.7 scikit-learn=0.24 xgboost=1.4 r-xgboost=1.4 -p /opt/conda/miniconda3'
++ /opt/conda/miniconda3/bin/mamba install -y r-dplyr=1.0 r-essentials=4.1 r-sparklyr=1.7 scikit-learn=0.24 xgboost=1.4 r-xgboost=1.4 -p /opt/conda/miniconda3
conda-forge/linux-64 Using cache
conda-forge/noarch Using cache
pytorch/linux-64 Using cache
pytorch/noarch Using cache

Looking for: ['r-dplyr=1.0', 'r-essentials=4.1', 'r-sparklyr=1.7', 'scikit-learn=0.24', 'xgboost=1.4', 'r-xgboost=1.4']

Pinned packages:

  - python 3.10.*
  - conda 22.9.*
  - python 3.10.*
  - r-base 4.1.*
  - r-recommended 4.1.*

Could not solve for environment specs
Encountered problems while solving:

  - package xgboost-1.4.0-py39hf3d152e_0 requires python >=3.9,<3.10.0a0, but none of the providers can be installed
  - package scikit-learn-0.24.0-py39h4dfa638_0 requires python >=3.9,<3.10.0a0, but none of the providers can be installed

The environment can't be solved, aborting the operation

+ sleep 5
+ (( i++ ))
+ (( i < 10 ))
+ echo 'Cmd '\''/opt/conda/miniconda3/bin/mamba install -y r-dplyr=1.0 r-essentials=4.1 r-sparklyr=1.7 scikit-learn=0.24 xgboost=1.4 r-xgboost=1.4 -p /opt/conda/miniconda3'\'' failed.'
Cmd '/opt/conda/miniconda3/bin/mamba install -y r-dplyr=1.0 r-essentials=4.1 r-sparklyr=1.7 scikit-learn=0.24 xgboost=1.4 r-xgboost=1.4 -p /opt/conda/miniconda3' failed.
+ return 1
`
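The tail of the trace (`sleep 5`, `(( i++ ))`, `(( i < 10 ))`, then `Cmd ... failed.`) suggests the command runs inside a retry loop with up to 10 attempts and a 5 second pause. A solver conflict is deterministic, so repeating it probably only adds delay. Below is a rough sketch of a retry helper that stops early on that error. It is not the init action's actual helper: `retry_unless_unsolvable`, `RETRY_ATTEMPTS`, `RETRY_DELAY` and the return code 2 are all assumptions made for illustration.

```shell
# Hypothetical helper, not the init action's actual retry function.
# Retries a command, but gives up at once when its output shows a solver
# conflict, since re-running the same solve cannot succeed.
RETRY_ATTEMPTS="${RETRY_ATTEMPTS:-10}"  # attempt limit seen in the trace
RETRY_DELAY="${RETRY_DELAY:-5}"         # seconds, matches "sleep 5" in the trace

retry_unless_unsolvable() {
  cmd="$1"
  i=0
  while [ "$i" -lt "$RETRY_ATTEMPTS" ]; do
    # Capture stdout and stderr together so the error text can be inspected.
    if output="$(eval "$cmd" 2>&1)"; then
      printf '%s\n' "$output"
      return 0
    fi
    printf '%s\n' "$output"
    case "$output" in
      *"Could not solve for environment specs"*)
        echo "Cmd '${cmd}' hit a solver conflict; not retrying." >&2
        return 2
        ;;
    esac
    i=$((i + 1))
    sleep "$RETRY_DELAY"
  done
  echo "Cmd '${cmd}' failed." >&2
  return 1
}
```

The distinct return code lets a caller tell an unsolvable environment apart from a transient failure such as a network error, which may still be worth retrying.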
@cjac
Contributor

cjac commented Jul 28, 2024

@bradmiro - to discuss in our upcoming meeting
