diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
new file mode 100644
index 00000000..0d8cb6fc
--- /dev/null
+++ b/.github/copilot-instructions.md
@@ -0,0 +1,140 @@
+# PyRenew: Multi-signal Bayesian Renewal Modeling
+
+PyRenew is a Python package for simulation and statistical inference of epidemiological models using JAX and NumPyro, emphasizing renewal models for infectious disease forecasting and analysis.
+
+Always reference these instructions first and fall back to additional search or context gathering only when you encounter unexpected information that does not match the info here.
+
+## Working Effectively
+
+### Prerequisites and Installation
+- **CRITICAL**: Install `uv` package manager first: `pip install uv`
+- **Python version requirement**: Python 3.13+ (as specified in pyproject.toml)
+- Install development environment: `uv sync --extra dev`
+- Install test dependencies: `uv sync --extra test`
+- Install documentation dependencies: `uv sync --extra docs`
+- Install ALL dependencies at once: `uv sync --all-extras`
+
+### Build and Test Process
+- **NEVER CANCEL**: Test suite takes ~2.5 minutes. ALWAYS set timeout to 5+ minutes for pytest commands.
+- **NEVER CANCEL**: Documentation build takes ~11 seconds, but first-time dependency installation takes 1-2 minutes.
+- Test installation: `uv sync --extra dev` -- takes ~1 minute on first run, ~0.3 seconds on subsequent runs
+- Run tests: `uv run pytest --mpl --mpl-default-tolerance=10` -- takes 2.5 minutes. NEVER CANCEL.
+- Run tests with coverage: `uv run pytest --mpl --mpl-default-tolerance=10 --cov=pyrenew --cov-report term --cov-report xml`
+
+### Documentation
+- **Requires Quarto CLI**: Install from https://github.com/quarto-dev/quarto-cli/releases
+- Build documentation: `cd docs && uv run make html` -- takes ~11 seconds
+- Documentation source: `docs/source/`
+- Tutorials: Quarto (.qmd) files in `docs/source/tutorials/`
+- API reference: Auto-generated from docstrings
+
+### Pre-commit and Code Quality
+- Install pre-commit manually: `uv pip install pre-commit` (not included in project dependencies)
+- Set up hooks: `uv run pre-commit install`
+- **NOTE**: Pre-commit is NOT included in the dev dependencies by design
+- Linting tools used: ruff (formatting and linting), numpydoc-validation, secret detection, typo checking
+
+### Key Commands Summary
+```bash
+# Installation (choose one based on needs)
+uv sync --extra dev      # Development dependencies only
+uv sync --extra test     # Test dependencies only
+uv sync --extra docs     # Documentation dependencies only
+uv sync --all-extras     # All dependencies (recommended for full development)
+
+# Testing - NEVER CANCEL, takes ~2.5 minutes
+uv run pytest --mpl --mpl-default-tolerance=10
+
+# Documentation build
+cd docs && uv run make html
+
+# Basic Python module test
+uv run python -c "import pyrenew; print('PyRenew imported successfully')"
+```
+
+## Validation Scenarios
+
+### After Making Code Changes
+1. **ALWAYS** test import: `uv run python -c "import pyrenew; print('Import successful')"`
+2. **ALWAYS** run affected tests: `uv run pytest test/test_[relevant_module].py -v`
+3. **ALWAYS** run full test suite before committing: `uv run pytest --mpl --mpl-default-tolerance=10` -- NEVER CANCEL, takes 2.5 minutes
+4. **Check code style**: Run ruff formatting (if pre-commit installed)
+
+### For Documentation Changes
+1. **ALWAYS** rebuild docs: `cd docs && uv run make html`
+2. **Check tutorials**: Ensure .qmd files in tutorials/ render correctly
+3. **Verify API docs**: Check that module docstrings appear in generated docs
+
+### For New Features or Models
+1. **Test basic functionality**: Create minimal example using the new feature
+2. **Run relevant test modules**: Focus on affected areas (e.g., test_model.py for model changes)
+3. **Validate with real data**: Use example datasets in pyrenew.datasets if available
+
+## Repository Structure
+
+### Key Directories
+- `pyrenew/`: Main package source code
+  - `deterministic/`: Deterministic variables and components
+  - `distributions/`: Custom probability distributions
+  - `latent/`: Latent variable models
+  - `model/`: Complete model implementations
+  - `observation/`: Observation process models
+  - `process/`: Time series and stochastic processes
+  - `randomvariable/`: Random variable abstractions
+- `test/`: Comprehensive test suite (174 tests)
+- `docs/`: Sphinx documentation with Quarto tutorials
+- `hook_scripts/`: Pre-commit utility scripts
+
+### Important Files
+- `pyproject.toml`: Package configuration, dependencies, and tool settings
+- `Makefile`: Simplified commands (install, test targets)
+- `.pre-commit-config.yaml`: Code quality automation
+- `docs/source/conf.py`: Sphinx documentation configuration
+
+## Common Issues and Solutions
+
+### Build Issues
+- **Missing uv**: Install with `pip install uv`
+- **Wrong Python version**: Requires Python 3.13+, check with `python --version`
+- **Missing Quarto**: Required for documentation, install from GitHub releases
+- **Test failures**: Some tests require specific plot tolerances (--mpl flags)
+
+### Import Errors
+- **Missing dependencies**: Run `uv sync` with appropriate extras
+- **Path issues**: Ensure working directory is repository root
+- **JAX/NumPyro issues**: These are heavy dependencies; consider environment compatibility
+
+### Documentation Warnings
+- **Import warnings during doc build**: Normal for mocked heavy dependencies (JAX, NumPyro)
+- **Duplicate labels**: Expected for placeholder tutorial files
+
+## Development Workflow
+
+### Typical Development Session
+1. `uv sync --all-extras` (first time or after dependency changes)
+2. Make code changes
+3. `uv run python -c "import pyrenew"` (quick import test)
+4. `uv run pytest test/test_[relevant].py -v` (focused testing)
+5. `uv run pytest --mpl --mpl-default-tolerance=10` (full test suite - NEVER CANCEL)
+6. `cd docs && uv run make html` (if documentation changed)
+
+### Adding New Code
+- **Follow existing patterns**: Check similar modules for structure
+- **Add tests**: Every new feature should have corresponding tests
+- **Document thoroughly**: Use NumPy-style docstrings
+- **Consider tutorials**: Complex features may benefit from Quarto tutorial examples
+
+## Timing Expectations
+- **uv sync operations**: 0.3 seconds to 2 minutes depending on cache state
+- **Test suite**: 2.5 minutes (NEVER CANCEL)
+- **Documentation build**: 11 seconds
+- **Simple imports**: <1 second
+- **Full environment setup**: 3-5 minutes for completely fresh install
+
+## CI/CD Integration
+- **GitHub Actions**: Uses `uv` for package management
+- **Test workflow**: Runs on Ubuntu with Python 3.13
+- **Documentation**: Auto-deployed to GitHub Pages
+- **Coverage**: Integrated with Codecov
+
+Remember: This is a scientific computing package dealing with epidemiological modeling. Precision and reproducibility are critical - always validate mathematical components thoroughly and never skip the full test suite.
\ No newline at end of file
diff --git a/docs/source/tutorials/getting_started.md b/docs/source/tutorials/getting_started.md
index 8e2d77b7..1dcbc84d 100644
--- a/docs/source/tutorials/getting_started.md
+++ b/docs/source/tutorials/getting_started.md
@@ -1,5 +1,97 @@
-## Placeholder file
- Please do not edit this file directly.
- This file is just a placeholder.
- For the source file, see:
- https://github.com/CDCgov/PyRenew/tree/main/docs/source/tutorials/getting_started.qmd
+# Getting started with pyrenew
+
+
+`pyrenew` is a flexible tool for simulating and making statistical
+inferences from epidemiologic models, with an emphasis on renewal
+models. Built on `numpyro`, `pyrenew` provides core components for model
+building and pre-defined models for processing various observational
+processes.
+
+## Prerequisites
+
+This tutorial assumes some pre-existing knowledge of infectious disease
+dynamics and Python programming. Before you dive in, we recommend:
+
+- Installing Python3 (use tools like
+  [pyenv](https://realpython.com/intro-to-pyenv/) or [compile and
+  install](https://ubuntuhandbook.org/index.php/2023/05/install-python-3-12-ubuntu/)
+  from the [release page](https://www.python.org/downloads/))
+- Familiarity with installing and loading modules in Python, and with
+  virtual environment management (we recommend
+  [uv](https://docs.astral.sh/uv/))
+- Familiarity with the concept of a
+  [class](https://realpython.com/python-classes/) and
+  [metaclass](https://realpython.com/courses/python-metaclasses/) in
+  Python
+- Familiarity with Bayesian inference, and a working understanding of
+  MCMC methods used to fit Bayesian models to data (some resources are
+  available
+  [here](https://mc-stan.org/docs/2_18/reference-manual/effective-sample-size-section.html),
+  and [here](https://xcelab.net/rm/))
+
+## Installing pyrenew
+
+You’ll need to install `pyrenew` using either uv or pip. To install
+`pyrenew` using `uv`, run the following command from within the
+directory containing the `pyrenew` project:
+
+``` bash
+uv sync
+```
+
+To install `pyrenew` using pip, run the following command:
+
+``` bash
+pip install git+https://github.com/CDCgov/PyRenew@main
+```
+
+## The fundamentals
+
+`pyrenew`’s core components are the metaclasses `RandomVariable` and
+`Model` (in Python, a *metaclass* is a class whose instances are also
+classes, where a *class* is a template for making objects). Within the
+`pyrenew` package, a `RandomVariable` is a quantity that models can
+estimate and sample from, **including deterministic quantities**. The
+benefit of this design is that the definition of the `sample()` function
+can be arbitrary, allowing the user to either sample from a distribution
+using `numpyro.sample()`, compute fixed quantities (like a mechanistic
+equation), or return a fixed value (like a pre-computed PMF). For
+instance, when estimating a PMF, the `RandomVariable` sampling function
+may roughly be defined as:
+
+``` python
+# define a new class called MyRandVar that inherits from the RandomVariable class
+class MyRandVar(RandomVariable):
+    # define a method called sample that returns an object of type ArrayLike
+    def sample(self, **kwargs) -> ArrayLike:
+        # calls sample function from NumPyro package
+        return numpyro.sample(...)
+```
+
+Whereas, in some other cases, we may instead use a fixed quantity for
+that variable (like a pre-computed PMF), where the `RandomVariable`’s
+sample function could instead be defined as:
+
+``` python
+# instead define MyRandVar to still inherit from the RandomVariable class
+class MyRandVar(RandomVariable):
+    # define sample method that still returns an ArrayLike object
+    def sample(self, **kwargs) -> ArrayLike:
+        # return a pre-computed PMF, a JAX NumPy array with explicit elements
+        return jax.numpy.array([0.2, 0.7, 0.1])
+```
+
+Thus, when a `Model` samples from `MyRandVar`, it could be either adding
+random variables to be estimated (first case) or just retrieving some
+quantity needed for other calculations (second case).
+
+The `Model` metaclass provides basic functionality for estimation and
+simulation. Like `RandomVariable`, the `Model` metaclass has a
+`sample()` method that defines the model structure. Ultimately, models
+can be nested (or inherited), providing a straightforward way to add
+layers of complexity. At this stage, the `Model` metaclass consists of
+two model classes: `RtInfectionsRenewalModel`, which is a basic renewal
+model consisting of infections and reproduction numbers, and
+`HospitalAdmissionsModel`, which includes the basic renewal model and
+hospital admissions. In the subsequent sections, we provide examples of
+fitting each of these models.