separate dependencies #418

Merged 8 commits on Jan 31, 2024
12 changes: 8 additions & 4 deletions .github/actions/setup-venv/action.yml
@@ -8,6 +8,10 @@ inputs:
     description: Update this to invalidate the cache
     required: true
     default: v0
+  torch-version:
+    description: The PyTorch version to install
+    required: false
+    default: '<2.2'
 runs:
   using: composite
   steps:
@@ -30,24 +34,24 @@ runs:
       id: virtualenv-cache
       with:
         path: .venv
-        key: ${{ inputs.cache-prefix }}-${{ runner.os }}-${{ env.PYTHON_VERSION }}-${{ hashFiles('requirements.txt', 'dev-requirements.txt', 'hf_olmo/requirements.txt') }}
+        key: ${{ inputs.cache-prefix }}-${{ runner.os }}-${{ env.PYTHON_VERSION }}-${{ hashFiles('*requirements.txt', '*pyproject.toml') }}

     - if: steps.virtualenv-cache.outputs.cache-hit != 'true'
       shell: bash
       run: |
         # Set up virtual environment without cache hit.
         test -d .venv || virtualenv -p $(which python) --copies --reset-app-data .venv
         . .venv/bin/activate
-        pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cpu
-        pip install -e .[dev]
+        pip install 'torch${{ inputs.torch-version }}' --extra-index-url https://download.pytorch.org/whl/cpu
+        pip install -e .[all]
         pip install -e hf_olmo

     - if: steps.virtualenv-cache.outputs.cache-hit == 'true'
       shell: bash
       run: |
         # Set up virtual environment from cache hit.
         . .venv/bin/activate
-        pip install --no-deps -e .[dev]
+        pip install --no-deps -e .[all]
         pip install --no-deps -e hf_olmo

     - shell: bash
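The cache key in this action now hashes every file matching `*requirements.txt` or `*pyproject.toml`, so editing `pyproject.toml` invalidates the cached virtualenv. A minimal Python sketch of that idea follows. It only approximates GitHub's `hashFiles` (SHA-256 of each matched file, then a final hash over those digests) and will not reproduce the same hash value; the helper name `cache_key` and the default prefix, OS, and Python values are illustrative assumptions.

```python
import hashlib
from pathlib import Path


def cache_key(root, patterns, prefix="v2", os_name="Linux", python="3.10"):
    """Build a key of the form prefix-os-python-hash (illustrative only)."""
    # Collect every file matching any pattern; de-duplicate and sort for stability.
    files = sorted(
        {p for pattern in patterns for p in Path(root).glob(pattern) if p.is_file()}
    )
    combined = hashlib.sha256()
    for path in files:
        # Hash each file on its own, then fold its digest into the combined hash.
        combined.update(hashlib.sha256(path.read_bytes()).digest())
    return f"{prefix}-{os_name}-{python}-{combined.hexdigest()}"
```

Calling `cache_key(".", ["*requirements.txt", "*pyproject.toml"])` before and after editing `pyproject.toml` yields two different keys, which is what forces the "without cache hit" branch to run.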
17 changes: 9 additions & 8 deletions .github/workflows/main.yml
@@ -16,7 +16,7 @@ on:

 env:
   # Change this to invalidate existing cache.
-  CACHE_PREFIX: v1
+  CACHE_PREFIX: v2
   PYTHONPATH: ./
   TOKENIZERS_PARALLELISM: 'false'

@@ -78,10 +78,10 @@ jobs:
         uses: actions/cache@v3
         with:
           path: .mypy_cache
-          key: mypy-${{ env.CACHE_PREFIX }}-${{ runner.os }}-${{ matrix.python }}-${{ hashFiles('*requirements.txt') }}-${{ github.ref }}-${{ github.sha }}
+          key: mypy-${{ env.CACHE_PREFIX }}-${{ runner.os }}-${{ matrix.python }}-${{ hashFiles('*requirements.txt', '*pyproject.toml') }}-${{ github.ref }}-${{ github.sha }}
           restore-keys: |
-            mypy-${{ env.CACHE_PREFIX }}-${{ runner.os }}-${{ matrix.python }}-${{ hashFiles('*requirements.txt') }}-${{ github.ref }}
-            mypy-${{ env.CACHE_PREFIX }}-${{ runner.os }}-${{ matrix.python }}-${{ hashFiles('*requirements.txt') }}
+            mypy-${{ env.CACHE_PREFIX }}-${{ runner.os }}-${{ matrix.python }}-${{ hashFiles('*requirements.txt', '*pyproject.toml') }}-${{ github.ref }}
+            mypy-${{ env.CACHE_PREFIX }}-${{ runner.os }}-${{ matrix.python }}-${{ hashFiles('*requirements.txt', '*pyproject.toml') }}

       - name: ${{ matrix.task.name }}
         run: |
@@ -172,10 +172,11 @@ jobs:
         with:
           python-version: '3.10'

-      - name: Install requirements
-        run: |
-          pip install --upgrade pip setuptools wheel build
-          pip install -r dev-requirements.txt
+      - name: Setup Python environment
+        uses: ./.github/actions/setup-venv
+        with:
+          python-version: '3.10'
+          cache-prefix: ${{ env.CACHE_PREFIX }}

       - name: Prepare environment
         run: |
2 changes: 1 addition & 1 deletion README.md
@@ -25,7 +25,7 @@ To install from source (recommended for training/fine-tuning) run:
 ```bash
 git clone https://github.com/allenai/OLMo.git
 cd OLMo
-pip install -e .
+pip install -e .[all]
 ```

 Otherwise you can install the model code by itself directly from PyPI with:
10 changes: 0 additions & 10 deletions dev-requirements.txt

This file was deleted.

7 changes: 5 additions & 2 deletions docker/Dockerfile.gantry
@@ -8,7 +8,10 @@ FROM olmo-torch2-base

 WORKDIR /stage

-COPY requirements.txt .
-RUN pip install --no-cache-dir -r requirements.txt
+COPY pyproject.toml .
+RUN mkdir olmo && touch olmo/__init__.py && \
+    pip install --no-cache-dir .[all] && \
+    pip uninstall -y ai2-olmo && \
+    rm -rf olmo/
Review comment on lines +12 to +15:
Contributor: Why do we uninstall?
Member Author: We don't actually want the codebase built into these Dockerfiles, just the dependencies.


WORKDIR /app/olmo
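
As the review exchange above notes, the goal is an image that carries the project's dependencies but not the `ai2-olmo` distribution itself. One way to sanity-check that from inside a built image is sketched below in Python, using only the standard library's `importlib.metadata` (Python 3.8+). The helper names `is_installed` and `check_image`, and the sample dependency names passed by default, are illustrative assumptions and not part of this PR.

```python
from importlib import metadata


def is_installed(dist_name):
    """Return True if a distribution with this name is installed."""
    try:
        metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return False
    return True


def check_image(project="ai2-olmo", required=("numpy", "omegaconf")):
    """Report whether the project is absent and which required deps are missing."""
    missing = [name for name in required if not is_installed(name)]
    return {"project_absent": not is_installed(project), "missing_deps": missing}
```

In an image built as intended, `check_image()` would be expected to report `project_absent` as true with an empty `missing_deps` list.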
8 changes: 6 additions & 2 deletions docker/Dockerfile.lumi
@@ -88,8 +88,12 @@ RUN cd /opt && \
     sed -i 's/hostname -I/hostname -s/g' /usr/local/lib/python3.10/dist-packages/deepspeed/comm/comm.py

 # Install more dependencies
-COPY requirements.txt requirements.txt
-RUN pip install --no-cache-dir -r requirements.txt
+COPY pyproject.toml .
+RUN mkdir olmo && touch olmo/__init__.py && \
+    pip install --no-cache-dir .[all] && \
+    pip uninstall -y ai2-olmo && \
+    rm -rf olmo/
+
 RUN pip install --no-cache-dir py-spy
 RUN pip install --no-cache-dir wandb --upgrade

46 changes: 42 additions & 4 deletions pyproject.toml
@@ -4,14 +4,54 @@ build-backend = "setuptools.build_meta"

 [project]
 name = "ai2-olmo"
-dynamic = ["version", "dependencies", "optional-dependencies"]
+dynamic = ["version"]
 readme = "README.md"
 description = "Open Language Model (OLMo)"
 authors = [
     { name = "Allen Institute for Artificial Intelligence", email = "[email protected]" }
 ]
 requires-python = ">=3.8"
-license = {file = "LICENSE"}
+license = { file = "LICENSE" }
+dependencies = [
+    "numpy",
+    "torch>=2.0,<2.2",
+    "omegaconf",
+    "rich",
+    "boto3",
+    "google-cloud-storage",
+    "tokenizers",
+    "packaging",
+    "cached_path",
+    "transformers",
+]
+
+[project.optional-dependencies]
+dev = [
+    "ruff",
+    "mypy>=1.0,<1.4",
+    "black>=23.1,<24.0",
+    "isort>=5.12,<5.13",
+    "pytest",
+    "pytest-sphinx",
+    "twine>=1.11.0",
+    "setuptools",
+    "wheel",
+    "build",
+]
+train = [
+    "wandb",
+    "beaker-gantry",
+    "click",
+    "torchmetrics",
+    "smashed[remote]>=0.21.1",
+    "safetensors",
+    "datasets",
+    "scikit-learn",
+    "msgspec>=0.14.0",
+]
+all = [
+    "ai2-olmo[dev,train]",
+]

 [project.urls]
 Homepage = "https://github.com/allenai/OLMo"
@@ -25,8 +65,6 @@ olmo = ["py.typed"]

 [tool.setuptools.dynamic]
 version = { attr = "olmo.version.VERSION" }
-dependencies = { file = ["requirements.txt", "hf_olmo/requirements.txt"] }
-optional-dependencies = { dev = { file = ["dev-requirements.txt"] } }

 [tool.setuptools.packages.find]
 include = ["olmo*", "hf_olmo*"]
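
The new `all` extra refers back to the package itself (`ai2-olmo[dev,train]`), so installing `.[all]` pulls in both the `dev` and `train` lists. A minimal Python sketch of that expansion is given below. `expand_extra` is a hypothetical helper written for illustration (pip's actual resolver is considerably more involved), and the `EXTRAS` dictionary simply mirrors the `[project.optional-dependencies]` table from the diff above.

```python
import re

# Mirrors [project.optional-dependencies] from the pyproject.toml diff.
EXTRAS = {
    "dev": [
        "ruff", "mypy>=1.0,<1.4", "black>=23.1,<24.0", "isort>=5.12,<5.13",
        "pytest", "pytest-sphinx", "twine>=1.11.0", "setuptools", "wheel", "build",
    ],
    "train": [
        "wandb", "beaker-gantry", "click", "torchmetrics", "smashed[remote]>=0.21.1",
        "safetensors", "datasets", "scikit-learn", "msgspec>=0.14.0",
    ],
    "all": ["ai2-olmo[dev,train]"],
}


def expand_extra(project, extras, name, _seen=None):
    """Flatten an extra, following self-references such as 'project[a,b]'."""
    seen = _seen if _seen is not None else set()
    if name in seen:  # guard against circular references
        return []
    seen.add(name)
    self_ref = re.compile(rf"^{re.escape(project)}\[([^\]]+)\]$")
    result = []
    for req in extras[name]:
        match = self_ref.match(req.strip())
        if match:
            # Recurse into each extra named inside the brackets.
            for sub in match.group(1).split(","):
                result.extend(expand_extra(project, extras, sub.strip(), seen))
        else:
            result.append(req)
    return result
```

Here `expand_extra("ai2-olmo", EXTRAS, "all")` returns the `dev` entries followed by the `train` entries, while a third-party requirement with its own extra, such as `smashed[remote]>=0.21.1`, is passed through unchanged.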
19 changes: 0 additions & 19 deletions requirements.txt

This file was deleted.

2 changes: 1 addition & 1 deletion scripts/beaker/beaker_interactive.sh
@@ -49,4 +49,4 @@ gh repo clone allenai/LLM
 cd LLM

 # Install other dependencies.
-pip install -e '.[dev]'
+pip install -e '.[all]'
2 changes: 1 addition & 1 deletion scripts/test_entrypoint.sh
@@ -29,7 +29,7 @@ git checkout --quiet "$COMMIT_SHA"

 # Install dependencies.
 pip install --upgrade pip
-pip install --no-cache-dir '.[dev]'
+pip install --no-cache-dir '.[all]'

 # Create directory for results.
 mkdir -p /results