Venv creation and uv support #245

Open
wants to merge 61 commits into
base: main
fa3dc1f
add empty requirements file for cuda
jarlsondre Nov 11, 2024
e9babf9
add requirements files and update pyproject toml
jarlsondre Nov 11, 2024
e994bf4
update pyproject
jarlsondre Nov 11, 2024
4b32a05
update installation in pyproject.toml
jarlsondre Nov 12, 2024
39e5801
update readme and horovod installation script
jarlsondre Nov 12, 2024
c9d786b
update readme with horovod explanation
jarlsondre Nov 12, 2024
8932f36
update horovod installation script
jarlsondre Nov 13, 2024
0906e33
update readme with -e flag
jarlsondre Nov 13, 2024
0d588ad
fix linter readme errors
jarlsondre Nov 13, 2024
750618f
add more info to readme
jarlsondre Nov 13, 2024
00f4454
trailing whitespace 🙃
jarlsondre Nov 13, 2024
ae89e0c
trailing whitespace 🙃 (again)
jarlsondre Nov 13, 2024
149a536
add draft of table of contents to readme
jarlsondre Nov 13, 2024
337ebd9
update readme toc
jarlsondre Nov 13, 2024
7b1cff9
update readme toc again
jarlsondre Nov 13, 2024
2457826
add section about uv lock to readme
jarlsondre Nov 13, 2024
4940963
update toc of readme
jarlsondre Nov 13, 2024
ddc7d13
fix errors in readme
jarlsondre Nov 14, 2024
abff6c1
add version numbers to packages in pyproject.toml
jarlsondre Nov 14, 2024
4eb5352
remove uv.lock (for now)
jarlsondre Nov 14, 2024
c9cbcef
remove link from readme
jarlsondre Nov 14, 2024
eb163ef
put toc in html comment
jarlsondre Nov 14, 2024
a99a674
remove toc, remove ds and horovod from reqs, add docs comment to pyproj
jarlsondre Nov 14, 2024
61e8574
Itwinai jlab Docker image (#236)
matbun Nov 14, 2024
d38385e
Virgo HDF5 file format (#240)
jarlsondre Nov 15, 2024
c51a1c4
add requirements files and update pyproject toml
jarlsondre Nov 11, 2024
76c7863
update installation in pyproject.toml
jarlsondre Nov 12, 2024
468ef94
add pytorch extra to horovod and remove redundant script
jarlsondre Nov 15, 2024
b0cd8ac
update readme tutorial with pip installation
jarlsondre Nov 15, 2024
0bd9a0a
add uv tutorial in separate file
jarlsondre Nov 15, 2024
4b1876b
fix linting errors
jarlsondre Nov 15, 2024
737f70b
update horovod install script
jarlsondre Nov 15, 2024
b8863bd
Merge branch 'uv-package-manager' of github.com:interTwin-eu/itwinai …
jarlsondre Nov 15, 2024
eb8cb08
fix dead link
jarlsondre Nov 15, 2024
7a784f5
update readme
jarlsondre Nov 19, 2024
3ac9313
add uv installation command to readme
jarlsondre Nov 19, 2024
f751912
add requirements files and update pyproject toml
jarlsondre Nov 11, 2024
6f9c5c1
update pyproject
jarlsondre Nov 11, 2024
6e65624
update installation in pyproject.toml
jarlsondre Nov 12, 2024
0a731ed
add version numbers to packages in pyproject.toml
jarlsondre Nov 14, 2024
def18fd
update horovod install script and add pip as dependency
jarlsondre Nov 19, 2024
7379659
fix merge conflicts
jarlsondre Nov 19, 2024
6c8f4db
formatting
jarlsondre Nov 19, 2024
690bed3
fix linting
jarlsondre Nov 19, 2024
9412e48
trailing whitespace
jarlsondre Nov 19, 2024
a23583a
remove comment from readme
jarlsondre Nov 19, 2024
60cbc6f
remove comments and small formatting difference
jarlsondre Nov 19, 2024
8f88c3e
move uv tutorial under docs/
jarlsondre Nov 28, 2024
de202e2
merge with main
jarlsondre Nov 28, 2024
018cc47
update readme with nvidia and amd instead of linux
jarlsondre Nov 28, 2024
6895472
remove duplicate entries in pyproject and reformat distributed file
jarlsondre Nov 28, 2024
69e1dd2
update readmes
jarlsondre Nov 28, 2024
bb815e6
separate horovod ds installation script into two files
jarlsondre Nov 28, 2024
d06dfe9
fix linting errors and update dependencies
jarlsondre Nov 28, 2024
a368cc0
fix tests and update lockfile
jarlsondre Nov 28, 2024
166f1ec
fix linting errors
jarlsondre Nov 28, 2024
59302f5
update installation scripts for testing
jarlsondre Nov 29, 2024
402598c
add local test command
jarlsondre Nov 29, 2024
93af263
add tf to installation in readme
jarlsondre Nov 29, 2024
81fa4a3
add torch cuda to project dependencies
jarlsondre Nov 29, 2024
d02a9cf
remove index from tutorial
jarlsondre Nov 29, 2024
7 changes: 5 additions & 2 deletions Makefile
@@ -11,7 +11,7 @@ torch-env-cpu: env-files/torch/generic_torch.sh
env ENV_NAME=.venv-pytorch \
NO_CUDA=1 \
bash -c 'bash env-files/torch/generic_torch.sh'
.venv-pytorch/bin/horovodrun --check-build
# .venv-pytorch/bin/horovodrun --check-build

# Install TensorFlow env (GPU support)
tensorflow-env: env-files/tensorflow/generic_tf.sh
@@ -44,7 +44,10 @@ tf-env-vega: env-files/tensorflow/createEnvVegaTF.sh env-files/tensorflow/generi


test:
PYTORCH_ENABLE_MPS_FALLBACK=1 .venv-pytorch/bin/pytest -v tests/ -m "not slurm"
.venv/bin/pytest -v tests/

test-local:
PYTORCH_ENABLE_MPS_FALLBACK=1 .venv/bin/pytest -v tests/ -m "not hpc"

test-jsc: tests/run_on_jsc.sh
bash tests/run_on_jsc.sh
80 changes: 77 additions & 3 deletions README.md
@@ -128,10 +128,84 @@ git clone [--recurse-submodules] git@github.com:interTwin-eu/itwinai.git

### Install itwinai environment

You can create the
Python virtual environments using our predefined Makefile targets.
This project uses `uv` as its project-wide package manager. If you are a developer,
read the [uv tutorial](/docs/uv-tutorial.md) after working through the `pip` tutorial
below.

#### PyTorch (+ Lightning) virtual environment
#### Installation using pip

##### Creating a venv

You can install the `itwinai` environment for development using `pip`. First, create
a Python venv if you don't already have one. Make sure Python is available (on HPC
you have to load it with `module load Python`), then create the venv with the
following command:

```bash
python -m venv <name-of-venv>
```

For example, to create a venv in the directory `.venv` (which is useful if you use
e.g. `uv`), run:

```bash
python -m venv .venv
```

After this you can activate your venv using the following command:

```bash
source .venv/bin/activate
```

Now anything you pip install will be installed in your venv and if you run any python
commands they will use the version from your venv.
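
To check which venv (if any) is active in the current shell, something like the
following minimal sketch may help. It relies on the `VIRTUAL_ENV` variable that the
`activate` script exports. The helper name `venv_status` is hypothetical and not part
of `itwinai`.

```bash
# Report whether a virtual environment is active in the current shell.
# `venv_status` is a hypothetical helper name used only for illustration.
venv_status() {
    if [ -n "${VIRTUAL_ENV:-}" ]; then
        echo "active: ${VIRTUAL_ENV}"
    else
        echo "inactive"
    fi
}

venv_status
```

If this reports `inactive` right after you expected to be inside a venv, the `source`
step above was most likely skipped or run in a different shell.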

##### Installation of packages

We provide some _extras_ that you can select depending on your platform and use
case:

- `macos`, `amd` or `nvidia`, depending on which platform you use. This changes the
version of `prov4ML`.
- `dev` for development purposes. Includes libraries for testing, TensorBoard, etc.
- `torch` for installation with PyTorch.
- `tf` for installation with TensorFlow.

If you want to install PyTorch with CUDA support, you also have to pass an
`--extra-index-url` pointing to the PyTorch wheel index for the CUDA version you want.
Since you are developing the library, you also want the editable flag, `-e`, so that
you don't have to reinstall everything each time you make a change. On HPC, you will
usually want to add the `--no-cache-dir` flag as well, since the cache in `~/.cache`
can otherwise quickly fill up your disk quota. A complete command for installing as a
developer on HPC with CUDA thus becomes:

```bash
pip install -e ".[torch,dev,nvidia,tf]" \
--no-cache-dir \
--extra-index-url https://download.pytorch.org/whl/cu121
```

If you wanted to install this locally on macOS (i.e. without CUDA) with PyTorch, you
would do the following instead:

```bash
pip install -e ".[torch,dev,macos,tf]"
```
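
The two commands above differ only in the platform extra and the CUDA-specific flags.
As a rough sketch, a small helper could print the matching command for a given
platform. The function name `itwinai_pip_cmd` is hypothetical. It only prints the
command rather than running it, and it covers only the two cases shown above, since
no complete `amd` command is given here.

```bash
# Print (not run) the developer install command for a platform extra.
# `itwinai_pip_cmd` is a hypothetical helper; the extras and the cu121
# index URL are copied from the two example commands above.
itwinai_pip_cmd() {
    platform=$1
    case "$platform" in
        nvidia)
            printf '%s\n' 'pip install -e ".[torch,dev,nvidia,tf]" --no-cache-dir --extra-index-url https://download.pytorch.org/whl/cu121'
            ;;
        macos)
            printf '%s\n' 'pip install -e ".[torch,dev,macos,tf]"'
            ;;
        *)
            # No complete example exists above for other platforms.
            echo "no example command for platform: $platform" >&2
            return 1
            ;;
    esac
}

itwinai_pip_cmd macos
```

Printing the command first makes it easy to review the extras before anything is
installed into the venv.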

<!-- You can create the Python virtual environments using our predefined Makefile targets. -->

#### Horovod and DeepSpeed

The above does not install `Horovod` and `DeepSpeed`, as they require a specialized
[script](env-files/torch/install-horovod-deepspeed-cuda.sh) when CUDA is needed. If
you do not require CUDA, you can install them using `pip` as follows:

```bash
pip install --no-cache-dir --no-build-isolation git+https://github.com/horovod/horovod.git
pip install --no-cache-dir --no-build-isolation deepspeed
```
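
After installing, you may want to confirm that the command-line tools ended up in
your venv. The following is a minimal sketch, assuming that `horovodrun` (used with
`--check-build` in this project's Makefile) and DeepSpeed's `ds_report` are the
relevant entry points. The helper name `check_tool` is hypothetical.

```bash
# Report whether a command is available on the current PATH.
# `check_tool` is a hypothetical helper name used only for illustration.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
    fi
}

check_tool horovodrun
check_tool ds_report
```

If `horovodrun` is found, running `horovodrun --check-build` shows which frameworks
and controllers Horovod was built with.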

#### PyTorch (+ Lightning) virtual environment with makefiles

Makefile targets for environment installation:

98 changes: 98 additions & 0 deletions docs/uv-tutorial.md
@@ -0,0 +1,98 @@
# Tutorial for using the uv package manager

[uv](https://docs.astral.sh/uv/) is a Python package manager meant to act as a drop-in
replacement for `pip` (and many more tools). In this project, we use it to manage our
packages, similar to how `poetry` works. This is done using a lockfile called
`uv.lock`.

## uv as a drop-in replacement for pip

`uv` is a lot faster than `pip`, so we recommend installing packages from `PyPI`
with `uv pip install <package>` instead of `pip install <package>`. You don't need to
change anything in your project to use this feature, as it works as a drop-in
replacement for `pip`.

## uv as a project-wide package manager

The `uv sync` and `uv lock` commands are how you use `uv` to manage all your project
packages. Note that, by default, these commands only work with the directory called
`.venv` in the project directory. This can be inconvenient, especially if you already
have a venv under a different name, so we recommend using a
[symlink](https://en.wikipedia.org/wiki/Symbolic_link). If you need to switch between
multiple venvs, you can update the symlink to point to whichever one you want to use
at the moment. In SLURM scripts, you can hardcode the path to the venv if need be.

### Symlinking .venv

To create a symlink between your venv and the `.venv` directory, you can use the
following command:

```bash
ln -s <path/to/your_venv> <path/to/.venv>
```

As an example, if I am in the `itwinai/` folder and my venv is called `envAI_juwels`,
then the following will create the wanted symlink:

```bash
ln -s envAI_juwels .venv
```
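
When `.venv` already exists as a symlink, re-pointing it needs a little care. The
sketch below runs in a scratch directory and uses `ln -sfn`: `-f` replaces the
existing link, and `-n` stops `ln` from following `.venv` into the directory it
currently points to. The second venv name, `envAI_local`, is made up for
illustration.

```bash
# Use a scratch directory so nothing in a real checkout is touched.
workdir="$(mktemp -d)"

# Two stand-in venv directories (envAI_local is a hypothetical name).
mkdir "$workdir/envAI_juwels" "$workdir/envAI_local"

# Initial link, equivalent to the command above.
ln -s envAI_juwels "$workdir/.venv"

# Re-point the link to the other venv without descending into the old target.
ln -sfn envAI_local "$workdir/.venv"

# Show where .venv points now.
readlink "$workdir/.venv"
```

Without `-n`, `ln -sf` would instead create a new link inside the directory that
`.venv` currently points to, leaving `.venv` itself unchanged.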

### Installing from uv.lock

> [!Warning]
> If `uv` creates your venv for you, the venv will not contain `pip`. However, you need
> to have `pip` installed to be able to run the installation scripts for `Horovod` and
> `DeepSpeed`, so we have included `pip` in the dependencies in `pyproject.toml`.

To install from the `uv.lock` file into the `.venv` venv, you can do the following:

```bash
uv sync
```

If the `uv.lock` file has optional dependencies (e.g. `macos` or `torch`), then these
can be added with the `--extra` flag as follows:

```bash
uv sync --extra torch --extra macos
```

These will usually correspond to the optional dependencies in the `pyproject.toml`. In
particular, if you are a developer, you would use one of the following two commands.
If you are on HPC with CUDA, you would use:

```bash
uv sync --extra dev --extra nvidia --extra torch --extra tf --no-cache
```

If you are developing on your local computer with macOS, then you would use:

```bash
uv sync --extra torch --extra tf --extra dev --extra macos
```
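
Since each extra needs its own `--extra` flag, the commands get long quickly. As a
small sketch, a helper can assemble the command from a list of extras. The name
`uv_sync_cmd` is hypothetical, and the function only prints the command rather than
running `uv`.

```bash
# Build a `uv sync` command line from a list of extras and print it.
# `uv_sync_cmd` is a hypothetical helper name used only for illustration.
uv_sync_cmd() {
    cmd="uv sync"
    for extra in "$@"; do
        cmd="$cmd --extra $extra"
    done
    echo "$cmd"
}

uv_sync_cmd dev nvidia torch tf
# prints: uv sync --extra dev --extra nvidia --extra torch --extra tf
```

Additional flags such as `--no-cache` can then be appended to the printed command
before running it.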

### Updating the uv.lock file

To update the project's `uv.lock` file with the dependencies of the project, you can
use the command:

```bash
uv lock
```

This will create a `uv.lock` file if it doesn't already exist, using the dependencies
from the `pyproject.toml`.

## Adding new packages to the project

To add a new package to the project (i.e. to the `pyproject.toml` file) with `uv`, you
can use the following command:

```bash
uv add <package>
```

> [!Warning]
> This will add the package to your `.venv` venv, so make sure that `.venv` is
> symlinked to the venv you intend to use.
134 changes: 66 additions & 68 deletions env-files/tensorflow/generic_tf.sh
@@ -9,89 +9,87 @@ if [ -z "$ENV_NAME" ]; then
ENV_NAME=".venv-tf"
fi

if [ -z "$NO_CUDA" ]; then
echo "Installing itwinai and its dependencies in '$ENV_NAME' virtual env (CUDA enabled)"
else
echo "Installing itwinai and its dependencies in '$ENV_NAME' virtual env (CUDA disabled)"
fi

# get python version
pver="$(python --version 2>&1 | awk '{print $2}' | cut -f1-2 -d.)"

# use pyenv if exist
if [ -d "$HOME/.pyenv" ];then
export PYENV_ROOT="$HOME/.pyenv"
export PATH="$PYENV_ROOT/bin:$PATH"
fi

# if [ -z "$NO_CUDA" ]; then
# echo "Installing itwinai and its dependencies in '$ENV_NAME' virtual env (CUDA enabled)"
# else
# echo "Installing itwinai and its dependencies in '$ENV_NAME' virtual env (CUDA disabled)"
# fi
#
# # get python version
# pver="$(python --version 2>&1 | awk '{print $2}' | cut -f1-2 -d.)"
#
# # use pyenv if exist
# if [ -d "$HOME/.pyenv" ];then
# export PYENV_ROOT="$HOME/.pyenv"
# export PATH="$PYENV_ROOT/bin:$PATH"
# fi
#
# set dir
cDir=$PWD

# create environment
if [ -d "${cDir}/$ENV_NAME" ];then
echo "env $ENV_NAME already exists"

source $ENV_NAME/bin/activate
else
python3 -m venv $ENV_NAME

# activate env
source $ENV_NAME/bin/activate

echo "$ENV_NAME environment is created in ${cDir}"
fi

pip3 install --no-cache-dir --upgrade pip

# get wheel -- setuptools extension
pip3 install --no-cache-dir wheel

# install TF
if [ -f "${cDir}/$ENV_NAME/bin/tensorboard" ]; then
echo 'TF already installed'
echo
else
if [ -z "$NO_CUDA" ]; then
pip3 install tensorflow[and-cuda]==2.16.* --no-cache-dir
else
# CPU only installation
pip3 install tensorflow==2.16.* --no-cache-dir
fi
fi

# CURRENTLY, horovod is not used with TF. Skipped.
# # install horovod
# if [ -f "${cDir}/$ENV_NAME/bin/horovodrun" ]; then
# echo 'Horovod already installed'
source $ENV_NAME/bin/activate
pip install --no-cache-dir -e ".[dev,nvidia,tf]"

# pip3 install --no-cache-dir --upgrade pip
#
# # get wheel -- setuptools extension
# pip3 install --no-cache-dir wheel
#
# # install TF
# if [ -f "${cDir}/$ENV_NAME/bin/tensorboard" ]; then
# echo 'TF already installed'
# echo
# else
# if [ -z "$NO_CUDA" ]; then
# export HOROVOD_GPU=CUDA
# export HOROVOD_GPU_OPERATIONS=NCCL
# export HOROVOD_WITH_TENSORFLOW=1
# # export TMPDIR=${cDir}
# pip3 install tensorflow[and-cuda]==2.16.* --no-cache-dir
# else
# # CPU only installation
# export HOROVOD_WITH_TENSORFLOW=1
# # export TMPDIR=${cDir}
# pip3 install tensorflow==2.16.* --no-cache-dir
# fi

# pip3 install --no-cache-dir horovod[tensorflow,keras] # --ignore-installed
# fi

# WHEN USING TF >= 2.16:
# install legacy version of keras (2.16)
# Since TF 2.16, keras updated to 3.3,
# which leads to an error when more than 1 node is used
# https://keras.io/getting_started/
pip3 install --no-cache-dir tf_keras==2.16.*

# Install Pov4ML
if [[ "$OSTYPE" =~ ^darwin ]] ; then
pip install "prov4ml[apple,nvidia]@git+https://github.com/matbun/ProvML@new-main" || exit 1
else
pip install "prov4ml[nvidia]@git+https://github.com/matbun/ProvML@new-main" || exit 1
fi

# Install itwinai: MUST be last line of the script for the user installation script to work!
pip3 install --no-cache-dir -e .[dev]
#
# # CURRENTLY, horovod is not used with TF. Skipped.
# # # install horovod
# # if [ -f "${cDir}/$ENV_NAME/bin/horovodrun" ]; then
# # echo 'Horovod already installed'
# # echo
# # else
# # if [ -z "$NO_CUDA" ]; then
# # export HOROVOD_GPU=CUDA
# # export HOROVOD_GPU_OPERATIONS=NCCL
# # export HOROVOD_WITH_TENSORFLOW=1
# # # export TMPDIR=${cDir}
# # else
# # # CPU only installation
# # export HOROVOD_WITH_TENSORFLOW=1
# # # export TMPDIR=${cDir}
# # fi
#
# # pip3 install --no-cache-dir horovod[tensorflow,keras] # --ignore-installed
# # fi
#
# # WHEN USING TF >= 2.16:
# # install legacy version of keras (2.16)
# # Since TF 2.16, keras updated to 3.3,
# # which leads to an error when more than 1 node is used
# # https://keras.io/getting_started/
# pip3 install --no-cache-dir tf_keras==2.16.*
#
# # Install Pov4ML
# if [[ "$OSTYPE" =~ ^darwin ]] ; then
# pip install "prov4ml[apple,nvidia]@git+https://github.com/matbun/ProvML@new-main" || exit 1
# else
# pip install "prov4ml[nvidia]@git+https://github.com/matbun/ProvML@new-main" || exit 1
# fi
#
#
# # Install itwinai: MUST be last line of the script for the user installation script to work!
# pip3 install --no-cache-dir -e .[dev]