Final review for docs formatting (#159)
* update models page link, limit CLI demo vid width

* highlight correct lines

* remove spaces from chimp&see

* general densepose page edits

* standard capitalization DensePose

* densepose typos

* remove megadetectorlite config default None

* add note about pulling in model defaults

* config formatting fixes

* typos

* correct link

* copy editing

* Update docs/docs/models/densepose.md

* Update docs/docs/yaml-config.md

Co-authored-by: Emily Miller <[email protected]>
klwetstone and ejm714 authored Oct 26, 2021
1 parent d72bc89 commit b4b706a
Showing 11 changed files with 48 additions and 47 deletions.
2 changes: 1 addition & 1 deletion HISTORY.md
@@ -12,7 +12,7 @@ The core algorithm in `zamba` v1 was a [stacked ensemble](https://en.wikipedia.o
learning models, whose individual predictions were combined in the second level
of the stack to form the final prediction.

In v2, the stacked ensemble algorithm from v1 is replaced with three more powerful [single-model options](../models/index.md): `time_distributed`, `slowfast`, and `european`. The new models utilize state-of-the-art image and video classification architectures, and are able to outperform the much more computationally intensive stacked ensemble model.
In v2, the stacked ensemble algorithm from v1 is replaced with three more powerful [single-model options](models/species-detection.md): `time_distributed`, `slowfast`, and `european`. The new models utilize state-of-the-art image and video classification architectures, and are able to outperform the much more computationally intensive stacked ensemble model.

### New geographies and species

18 changes: 10 additions & 8 deletions docs/docs/configurations.md
@@ -16,7 +16,9 @@ Here's a helpful diagram which shows how everything is related.

The [`VideoLoaderConfig` class](api-reference/data-video.md#zamba.data.video.VideoLoaderConfig) defines all of the optional parameters that can be specified for how videos are loaded before either inference or training. This includes selecting which frames to use from each video.

All video loading arguments can be specified either in a [YAML file](yaml-config.md) or when instantiating the [`VideoLoaderConfig`](configurations.md#video-loading-arguments) class in Python. Some can also be specified directly in the command line.
All video loading arguments can be specified either in a [YAML file](yaml-config.md) or when instantiating the [`VideoLoaderConfig` class](api-reference/data-video.md#zamba.data.video.VideoLoaderConfig) in Python. Some can also be specified directly in the command line.

Each model comes with a default video loading configuration. If no user-specified video loading configuration is passed - either through a YAML file or the Python `VideoLoaderConfig` class - all video loading arguments will be set based on the defaults for the given model.

=== "YAML file"
```yaml
@@ -87,7 +89,7 @@ Only load frames that correspond to [scene changes](http://www.ffmpeg.org/ffmpeg

#### `megadetector_lite_config (MegadetectorLiteYoloXConfig, optional)`

The `megadetector_lite_config` is used to specify any parameters that should be passed to the [MegadetectorLite model](models/index.md#megadetectorlite) for frame selection. For all possible options, see the [`MegadetectorLiteYoloXConfig` class](api-reference/models-megadetector_lite_yolox.md#zamba.models.megadetector_lite_yolox.MegadetectorLiteYoloXConfig). If `megadetector_lite_config` is `None` (the default), the MegadetectorLite model will not be used to select frames.
The `megadetector_lite_config` is used to specify any parameters that should be passed to the [MegadetectorLite model](models/species-detection.md#megadetectorlite) for frame selection. For all possible options, see the [`MegadetectorLiteYoloXConfig` class](api-reference/models-megadetector_lite_yolox.md#zamba.models.megadetector_lite_yolox.MegadetectorLiteYoloXConfig). If `megadetector_lite_config` is `None` (the default), the MegadetectorLite model will not be used to select frames.
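
For illustration, a minimal Python sketch of passing MegadetectorLite options through `VideoLoaderConfig` — the specific field names (`confidence`, `n_frames`, `fill_mode`, `total_frames`) are assumptions here; see the `MegadetectorLiteYoloXConfig` API reference for the authoritative options:

```python
from zamba.data.video import VideoLoaderConfig
from zamba.models.megadetector_lite_yolox import MegadetectorLiteYoloXConfig

# Assumed field names: keep the 16 highest-scoring detections per video,
# ignoring detections below a 0.25 confidence threshold.
megadetector_config = MegadetectorLiteYoloXConfig(
    confidence=0.25,
    n_frames=16,
    fill_mode="score_sorted",
)

video_loader_config = VideoLoaderConfig(
    megadetector_lite_config=megadetector_config,
    total_frames=16,  # assumed field name for the frame count passed to the model
)
```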

#### `frame_selection_height (int, optional), frame_selection_width (int, optional)`

@@ -182,7 +184,7 @@ Path to a model checkpoint to load and use for inference. The default is `None`,

#### `model_name (time_distributed|slowfast|european, optional)`

Name of the model to use for inference. The three model options that ship with `zamba` are `time_distributed`, `slowfast`, and `european`. See the [Available Models](models/index.md) page for details. Defaults to `time_distributed`
Name of the model to use for inference. The three model options that ship with `zamba` are `time_distributed`, `slowfast`, and `european`. See the [Available Models](models/species-detection.md) page for details. Defaults to `time_distributed`
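
As a hedged sketch, choosing a non-default model in Python might look like this (assuming `PredictConfig` accepts `data_dir` and `model_name`, and that `predict_model` takes the config via the `predict_config` keyword; the video directory is a placeholder):

```python
from zamba.models.config import PredictConfig
from zamba.models.model_manager import predict_model

# Use the slowfast model instead of the default time_distributed model.
# "example_vids/" is a placeholder directory of videos.
predict_config = PredictConfig(data_dir="example_vids/", model_name="slowfast")
predict_model(predict_config=predict_config)
```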

#### `gpus (int, optional)`

@@ -233,7 +235,7 @@ By default, before kicking off inference `zamba` will iterate through all of the

#### `model_cache_dir (Path, optional)`

Cache directory where downloaded model weights will be saved. If None and the MODEL_CACHE_DIR environment variable is not set, will use your default cache directory (e.g. `~/.cache`). Defaults to `None`
Cache directory where downloaded model weights will be saved. If None and the `MODEL_CACHE_DIR` environment variable is not set, will use your default cache directory (e.g. `~/.cache`). Defaults to `None`
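
A small sketch of redirecting the weights cache, assuming `model_cache_dir` can be set directly on `PredictConfig` (the directory is a placeholder); setting the `MODEL_CACHE_DIR` environment variable accomplishes the same thing:

```python
from pathlib import Path

from zamba.models.config import PredictConfig

# Save downloaded weights to a larger drive instead of the default ~/.cache.
predict_config = PredictConfig(
    data_dir="example_vids/",                     # placeholder video directory
    model_cache_dir=Path("/data/zamba_weights"),  # placeholder cache location
)
```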

<a id='training-arguments'></a>

@@ -291,11 +293,11 @@ Path to a model checkpoint to load and resume training from. The default is `Non

#### `scheduler_config (zamba.models.config.SchedulerConfig, optional)`

A [PyTorch learning rate schedule](https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate) to adjust the learning rate based on the number of epochs. Scheduler can either be `default` (the default), `None`, or a [`torch.optim.lr_scheduler`](https://github.com/pytorch/pytorch/blob/master/torch/optim/lr_scheduler.py). If `default`,
A [PyTorch learning rate schedule](https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate) to adjust the learning rate based on the number of epochs. Scheduler can either be `default` (the default), `None`, or a [`torch.optim.lr_scheduler`](https://github.com/pytorch/pytorch/blob/master/torch/optim/lr_scheduler.py).
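
For example, a custom scheduler might be wired up roughly as follows — the `scheduler` and `scheduler_params` field names, and the `TrainConfig` arguments, are assumptions:

```python
from zamba.models.config import SchedulerConfig, TrainConfig

# Assumed fields: the name of a torch.optim.lr_scheduler class plus its keyword arguments.
scheduler_config = SchedulerConfig(
    scheduler="MultiStepLR",
    scheduler_params={"milestones": [3], "gamma": 0.5},
)

train_config = TrainConfig(
    data_dir="example_vids/",     # placeholder video directory
    labels="example_labels.csv",  # placeholder labels file
    scheduler_config=scheduler_config,
)
```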

#### `model_name (time_distributed|slowfast|european, optional)`

Name of the model to use for inference. The three model options that ship with `zamba` are `time_distributed`, `slowfast`, and `european`. See the [Available Models](models/index.md) page for details. Defaults to `time_distributed`
Name of the model to use for inference. The three model options that ship with `zamba` are `time_distributed`, `slowfast`, and `european`. See the [Available Models](models/species-detection.md) page for details. Defaults to `time_distributed`

#### `dry_run (bool, optional)`

@@ -307,7 +309,7 @@ The batch size to use for training. Defaults to `2`

#### `auto_lr_find (bool, optional)`

Whether to run a [learning rate finder algorithm](https://arxiv.org/abs/1506.01186) when calling `pytorch_lightning.trainer.tune()` to try to find an optimal initial learning rate. The learning rate finder is not guaranteed to find a good learning rate; depending on the dataset, it can select a learning rate that leads to poor model training. Use with caution. See the PyTorch Lightning [docs](https://pytorch-lightning.readthedocs.io/en/latest/common/trainer.html#auto-lr-find) for more details. Defaults to `False`.
Whether to run a [learning rate finder algorithm](https://arxiv.org/abs/1506.01186) when calling `pytorch_lightning.trainer.tune()` to try to find an optimal initial learning rate. The learning rate finder is not guaranteed to find a good learning rate; depending on the dataset, it can select a learning rate that leads to poor model training. Use with caution. See the PyTorch Lightning [docs](https://pytorch-lightning.readthedocs.io/en/latest/common/trainer.html#auto-lr-find) for more details. Defaults to `False`
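
A minimal sketch of opting into the learning rate finder (the `TrainConfig` arguments and paths are placeholders):

```python
from zamba.models.config import TrainConfig
from zamba.models.model_manager import train_model

train_config = TrainConfig(
    data_dir="example_vids/",     # placeholder video directory
    labels="example_labels.csv",  # placeholder labels file
    auto_lr_find=True,            # let the tuner propose an initial learning rate
)
train_model(train_config=train_config)
```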

#### `backbone_finetune_config (zamba.models.config.BackboneFinetuneConfig, optional)`

@@ -343,7 +345,7 @@ Directory in which to save model checkpoint and configuration file. If not speci

#### `overwrite (bool, optional)`

If `True`, will save outputs in `save_dir` and overwrite the directory if it exists. If False, will create an auto-incremented `version_n` folder within `save_dir` with model outputs. Defaults to `False`.
If `True`, will save outputs in `save_dir` and overwrite the directory if it exists. If False, will create an auto-incremented `version_n` folder within `save_dir` with model outputs. Defaults to `False`

#### `skip_load_validation (bool, optional)`

10 changes: 5 additions & 5 deletions docs/docs/contribute/index.md
@@ -2,12 +2,12 @@

`zamba` is an open source project, which means _you_ can help make it better!

## Develop the github repository
## Develop the GitHub repository

To get involved, check out the Github [code repository](https://github.com/drivendataorg/zamba).
To get involved, check out the GitHub [code repository](https://github.com/drivendataorg/zamba).
There you can find [open issues](https://github.com/drivendataorg/zamba/issues) with comments and links to help you along.

`zamba` uses continuous integration and test-driven-development to ensure that we always have a working project. So what are you waiting for? `git` going!
`zamba` uses continuous integration and test-driven development to ensure that we always have a working project. So what are you waiting for? `git` going!

## Installation for development

@@ -22,9 +22,9 @@ $ pip install -r requirements-dev.txt

## Running the `zamba` test suite

The included `Makefile` contains code that uses pytest to run all tests in `zamba/tests`.
The included [`Makefile`](https://github.com/drivendataorg/zamba/blob/master/Makefile) contains code that uses pytest to run all tests in `zamba/tests`.

The command is (from the project root),
The command is (from the project root):

```console
$ make tests
6 changes: 3 additions & 3 deletions docs/docs/extra-options.md
@@ -45,7 +45,7 @@ Say that you have a large number of videos, and you are more concerned with dete
=== "Python"
In Python, video resizing can be specified when `VideoLoaderConfig` is instantiated:

```python hl_lines="6 7 8"
```python hl_lines="7 8 9"
from zamba.data.video import VideoLoaderConfig
from zamba.models.config import PredictConfig
from zamba.models.model_manager import predict_model
@@ -111,7 +111,7 @@ A simple option is to sample frames that are evenly distributed throughout a vid

### MegadetectorLite

You can use a pretrained object detection model called [MegadetectorLite](models/index.md#megadetectorlite) to select only the frames that are mostly likely to contain an animal. This is the default strategy for all three pretrained models. The parameter `megadetector_lite_config` is used to specify any arguments that should be passed to the MegadetectorLite model. If `megadetector_lite_config` is None, the MegadetectorLite model will not be used.
You can use a pretrained object detection model called [MegadetectorLite](models/species-detection.md#megadetectorlite) to select only the frames that are mostly likely to contain an animal. This is the default strategy for all three pretrained models. The parameter `megadetector_lite_config` is used to specify any arguments that should be passed to the MegadetectorLite model. If `megadetector_lite_config` is None, the MegadetectorLite model will not be used.

For example, to take the 16 frames with the highest probability of detection:

@@ -144,7 +144,7 @@ For example, to take the 16 frames with the highest probability of detection:
train_model(video_loader_config=video_loader_config, train_config=train_config)
```

If you are using the [MegadetectorLite](models/index.md#megadetectorlite) for frame selection, there are two ways that you can specify frame resizing:
If you are using the [MegadetectorLite](models/species-detection.md#megadetectorlite) for frame selection, there are two ways that you can specify frame resizing:

- `frame_selection_width` and `frame_selection_height` resize images *before* they are input to the frame selection method. If both are `None`, the full size images will be used during frame selection. Using full size images for selection is recommended for better detection of smaller species, but will slow down training and inference.
- `model_input_height` and `model_input_width` resize images *after* frame selection. These specify the image size that is passed to the actual model.
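
As a sketch of how the two sizing stages above might be combined (sizes are illustrative, and some field names, such as `total_frames` and the `MegadetectorLiteYoloXConfig` options, are assumptions):

```python
from zamba.data.video import VideoLoaderConfig
from zamba.models.megadetector_lite_yolox import MegadetectorLiteYoloXConfig

video_loader_config = VideoLoaderConfig(
    megadetector_lite_config=MegadetectorLiteYoloXConfig(n_frames=16),  # assumed field name
    # Downscale frames *before* frame selection to speed up MegadetectorLite;
    # leave these as None to run selection on full-size frames.
    frame_selection_height=640,
    frame_selection_width=640,
    # Resize the selected frames *after* selection to the size the model expects.
    model_input_height=240,
    model_input_width=426,
    total_frames=16,
)
```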
20 changes: 10 additions & 10 deletions docs/docs/models/denspose.md → docs/docs/models/densepose.md
@@ -1,31 +1,31 @@
# Densepose
# DensePose

## Background

Facebook AI Research has published a model, DensePose ([Neverova et al, 2021](https://arxiv.org/abs/2011.12438v1)), which can be used to get segmentations for animals that appear in videos. This was trained on the following animals, but often works for other species as well: sheep, zebra, horse, giraffe, elephant, cow, ear, cat, dog. Here's an example of the segmentation output for a frame:
DensePose ([Neverova et al, 2021](https://arxiv.org/abs/2011.12438v1)) is a model published by Facebook AI Research that can be used to get segmentations for animals that appear in videos. The model was trained on the following animals, but often works for other species as well: bear, cat, cow, dog, elephant, giraffe, horse, sheep, zebra. Here's an example of the segmentation output for a frame:

![segmentation of duiker](../media/seg_out.jpg)

Additionally, the model provides mapping of the segmentation output to specific anatomy for chimpanzees. This can be helpful for determining the orientation of chimpanzees in videos and for their behaviors. Here is an example of what that output looks like:
Additionally, the model provides mapping of the segmentation output to specific anatomy for chimpanzees. This can be helpful for determining the orientation of chimpanzees in videos and for understanding their behaviors. Here is an example of what that output looks like:

![chimpanzee texture output](../media/texture_out.png)

For more information on the algorithms and outputs of the DensePose model, see the [Facebook DensePose Github Repository](https://github.com/facebookresearch/detectron2/tree/main/projects/DensePose).

## Outputs

The Zamba package supports running Densepose on videos to generate three types of outputs:
The Zamba package supports running DensePose on videos to generate three types of outputs:

- A `.json` file with details of segmentations per video frame.
- A `.mp4` file where the original video has the segmentation rendered on top of animal so that the output can be vsiually inspected.
- A `.csv` (when `--output-type chimp_anatomy`) that contains the height and width of the bounding box around each chimpanzee, the frame number and timestamp of the observation, and the percentage of pixels in the bounding box that correspond with each anatomical part.
- A `.mp4` file where the original video has the segmentation rendered on top of animal so that the output can be visually inspected.
- A `.csv` that contains the height and width of the bounding box around each chimpanzee, the frame number and timestamp of the observation, and the percentage of pixels in the bounding box that correspond with each anatomical part. This is specified by adding `--output-type chimp_anatomy`.

Generally, running the densepose model is computationally intensive. It is recommended to run the model at a relatively low framerate (e.g., 1 frame per second) to generate outputs for a video. Another caveat is that because the output JSON output contains the full embedding, these files can be quite large. These are not written out by default.
Running the DensePose model is fairly computationally intensive. It is recommended to run the model at a relatively low framerate (e.g., 1 frame per second) to generate outputs for a video. JSON output files can also be quite large because they contain the full embedding. These are not written out by default.

In order to use the densepose model, you must have PyTorch already installed on your system, and then you must install the `densepose` extra:
In order to use the DensePose model, you must have [PyTorch](https://pytorch.org/get-started/locally/) already installed on your system. Then you must install the `densepose` extra:

```bash
pip install torch # see https://pytorch.org/get-started/locally/
pip install torch
pip install "zamba[densepose]"
```

@@ -47,7 +47,7 @@ Once that is done, here's how to run the DensePose model:

<video controls>
<source src="../../media/densepose_zamba_vid.mp4" type="video/mp4">
</videp>
</video>


## Getting help
5 changes: 2 additions & 3 deletions docs/docs/models/species-detection.md
@@ -99,9 +99,7 @@ The `time_distributed` model was built by re-training a well-known image classif

### Training data

`time_distributed` was trained using data collected and annotated by partners at [The Max Planck Institute for
Evolutionary Anthropology](https://www.eva.mpg.de/index.html) and [Chimp &
See](https://www.chimpandsee.org/).
`time_distributed` was trained using data collected and annotated by partners at [The Max Planck Institute for Evolutionary Anthropology](https://www.eva.mpg.de/index.html) and [Chimp&See](https://www.chimpandsee.org/).

The data included camera trap videos from:

@@ -266,6 +264,7 @@ video_loader_config:
```

You can choose different frame selection methods and vary the size of the images that are used by passing in a custom [YAML configuration file](../yaml-config.md). The two requirements for the `slowfast` model are that:

- the video loader must return 32 frames
- videos inputted into the model must be at least 200 x 200 pixels
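
A hedged sketch of a custom loader that satisfies both constraints (sizes are illustrative; `total_frames` is an assumed field name):

```python
from zamba.data.video import VideoLoaderConfig

# Meets both slowfast requirements: exactly 32 frames, at least 200 x 200 pixels.
video_loader_config = VideoLoaderConfig(
    total_frames=32,
    model_input_height=224,
    model_input_width=224,
)
```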
