CLIP Training documentation #14

Open · wants to merge 1 commit into base: main
34 changes: 21 additions & 13 deletions README.md
@@ -15,41 +15,44 @@ We are exploring the generation of new Bliss vocabulary using emerging AI techni

### Prerequisites

- [Python 3](https://www.python.org/downloads/)
  - Version 3.9+. On Mac, Homebrew is the easiest way to install.

### Clone the Repository

- Clone the project from GitHub. [Create a fork](https://help.github.com/en/github/getting-started-with-github/fork-a-repo)
  with your GitHub account, then run the following in your command line (make sure to replace `your-username` with
  your username):

```bash
git clone https://github.com/your-username/baby-bliss-bot
cd baby-bliss-bot
```

### Create/Activate Virtual Environment

Always activate and use the Python virtual environment to maintain an isolated environment for the project's dependencies.

- [Create the virtual environment](https://docs.python.org/3/library/venv.html)
  (one time setup):

  - `python -m venv .venv`

- Activate (every command-line session):
  - Windows: `.\.venv\Scripts\activate`
  - Mac/Linux: `source .venv/bin/activate`

### Install Python Dependencies

Run in the baby-bliss-bot directory:

- `pip install -r requirements.txt`

## Linting

Run the following command to lint all Python scripts:

- `flake8`

## Model Experiments

@@ -65,21 +68,26 @@ on how to train this model, training results and the conclusion about how useful

### Texture Inversion

Conclusion: not useful

See the [Texture Inversion documentation](./notebooks/README.md) for details.

### CLIP Training

See the [CLIP Training documentation](./docs/CLIP-Training.md) for details.

## Notebooks

The [`/notebooks`](./notebooks/) directory contains all notebooks used for training or fine-tuning various models.
Each notebook usually comes with an accompanying `dockerfile.yml` that describes the environment the notebook was
run in.

## Jobs

The [`/jobs`](./jobs/) directory contains all jobs used for training or fine-tuning various models.

## Utility Scripts

All utility functions are in the [`utils`](./utils) directory.

See [README.md](./utils/README.md) in the [`utils`](./utils) directory for details.
51 changes: 51 additions & 0 deletions docs/CLIP-Training.md
@@ -0,0 +1,51 @@
# Train CLIP

This article documents training a CLIP model with Bliss symbol image files and an annotated JSON file containing metadata for each image.

The following steps were followed for this training:

## Set Up the Environment

```bash
# create new env clip_train
conda create -n clip_train python=3.8.5

# activate clip_train
conda activate clip_train

# install pytorch, torchvision
conda install pytorch==1.7.0 torchvision==0.8.0 cudatoolkit=10.2 -c pytorch

# install an additional dependency
pip install future

# install other dependencies
pip install -r requirements.txt
```
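
Optionally, a quick sanity check can confirm that the pinned versions installed correctly. This snippet is only an illustration and is not part of the original setup steps:

```python
# Illustrative environment check (not part of the original setup steps).
import torch
import torchvision

print("torch:", torch.__version__)              # expected: 1.7.0
print("torchvision:", torchvision.__version__)  # expected: 0.8.0
print("CUDA available:", torch.cuda.is_available())
```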

## Clone the repository [clip-training](https://github.com/revantteotia/clip-training)

This repository contains code to train [CLIP](https://github.com/openai/CLIP) on [MS-COCO](https://cocodataset.org/#home) captions.

## Extract the Bliss dataset into the `data` directory

The structure of the COCO dataset was used to prepare the annotated Bliss dataset. The Bliss dataset, including images and annotations, can be downloaded [here](https://drive.google.com/file/d/1kSE4egEvg2g5wKZLHCFTE1ZijUf0ZC2_/view?usp=sharing).
> **Reviewer comment (Contributor):** Can you include instructions, or scripts if any, on how to convert the Bliss data into the structure of the COCO dataset? Thanks.
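
No conversion script is included in this PR; the sketch below shows one possible way to produce a COCO-captions-style annotation file. The input file `bliss_metadata.json` and its field names are hypothetical assumptions, not something provided by this repository; only the output path matches the configuration used in the next step:

```python
# Illustrative sketch only; the input file bliss_metadata.json and its field
# names are hypothetical, not something provided by this repository.
import json

with open("bliss_metadata.json") as f:   # hypothetical input: [{"filename": ..., "gloss": ...}, ...]
    records = json.load(f)

images, annotations = [], []
for idx, rec in enumerate(records):
    images.append({"id": idx, "file_name": rec["filename"]})
    annotations.append({"id": idx, "image_id": idx, "caption": rec["gloss"]})

# COCO-captions-style layout: top-level "images" and "annotations" lists.
coco = {"images": images, "annotations": annotations}
with open("data/bliss/annotations/bliss_data_annotated_CLIP.json", "w") as f:
    json.dump(coco, f, indent=2)
```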


## Update [./dataloader/data_config.yaml](./dataloader/data_config.yaml)

```yaml
train_img_dir : 'data/bliss/train'
train_annotation_file : 'data/bliss/annotations/bliss_data_annotated_CLIP.json'
```
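
Before launching training, it can help to confirm that the configured paths resolve and that the annotation file parses. This minimal check assumes the annotation file follows the COCO-captions layout sketched above; it is not part of the clip-training repository:

```python
# Illustrative sanity check; not part of the clip-training repository.
import json
import os

import yaml  # requires PyYAML (pip install pyyaml)

with open("dataloader/data_config.yaml") as f:
    cfg = yaml.safe_load(f)

assert os.path.isdir(cfg["train_img_dir"]), f"missing image dir: {cfg['train_img_dir']}"

with open(cfg["train_annotation_file"]) as f:
    annotations = json.load(f)

print(len(annotations["images"]), "images,", len(annotations["annotations"]), "captions")
```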

## Run Training

`train.py` reads the dataset paths from `dataloader/data_config.yaml`. Run:

```bash
$ python train.py
```

## Results

The results of the training can be downloaded [here - checkpoint_34_3395.pt.tar.gz](https://drive.google.com/file/d/1J_U2yW9MmRa4f23044brM_Winku507ZL/view?usp=sharing).
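
The checkpoint is not consumed anywhere else in this documentation, so how to load it depends on how `train.py` saved it. The sketch below assumes the archive has already been extracted to `checkpoint_34_3395.pt`, inspects the checkpoint's layout, and then shows the kind of image-text scoring a trained CLIP model performs. The scoring half uses the stock OpenAI ViT-B/32 weights purely as a placeholder (not the fine-tuned ones), and the example image path is hypothetical:

```python
# Illustrative sketch only; the checkpoint's internal layout and the example
# image path are assumptions, not part of this repository.
import clip   # pip install git+https://github.com/openai/CLIP.git
import torch
from PIL import Image

# 1. Inspect how train.py saved the checkpoint before trying to restore it.
ckpt = torch.load("checkpoint_34_3395.pt", map_location="cpu")
if isinstance(ckpt, dict):
    print("checkpoint keys:", list(ckpt.keys()))

# 2. The kind of image-text scoring a trained CLIP model performs.
#    Stock OpenAI weights are used here only as a placeholder; restoring the
#    fine-tuned weights depends on the key layout printed above.
device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)
model.eval()

image = preprocess(Image.open("data/bliss/train/example.png")).unsqueeze(0).to(device)  # hypothetical file
text = clip.tokenize(["house", "water", "person"]).to(device)

with torch.no_grad():
    logits_per_image, _ = model(image, text)
    print(logits_per_image.softmax(dim=-1).cpu().numpy())
```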
24 changes: 15 additions & 9 deletions jobs/README.md
@@ -3,24 +3,30 @@
This directory contains all jobs used for training or fine-tuning various models.

## StyleGAN2-ADA

The [stylegan2-ada](./stylegan2-ada) folder contains:

- `def-styleGan2AdaPytorchDataSetupBatch.sh` is the SBatch script for preparing the training dataset for StyleGAN2-ADA. The script uses the `def-whkchun` cluster.
- `def-styleGAN2AdaPytorchTrainBatch.sh` is the SBatch script for training. The script uses the `def-whkchun` cluster.
- `ctb-styleGAN2AdaPytorchGenerateBatch.sh` is the SBatch script for generating an image from the StyleGAN2-ADA model. The script uses the `ctb-whkchun` cluster.
- `def-styleGAN2AdaPytorchGenerateBatch.sh` is the SBatch script that also can be used to generate images from the StyleGAN2-ADA model. This version uses the `def-whkchun` cluster.
- `requirements.txt` shows the packages used by the PyTorch implementation of StyleGAN2-ADA. Note that this is not used to create the environment, but to document the environment after it was created.

See the [StyleGAN2-ADATraining.md](../docs/StyleGAN2-ADATraining.md) in the [documentation](../docs) folder for details on how to set up the environment.

## StyleGAN3

The [stylegan3](./stylegan3) directory contains:

- `requirements.txt` is used with other module installations to set up the environment for training
[the stylegan3 model](https://github.com/NVlabs/stylegan3) with the Bliss single characters.
- `job_stylegan3.sh` is the job script submitted in [the Cedar platform](https://docs.alliancecan.ca/wiki/Cedar)
to perform the training.

See the [TrainStyleGAN3Model.md](../docs/TrainStyleGAN3Model.md) in the [documentation](../docs) folder for details on
how to train this model, training results and the conclusion about how useful it is.

## CLIP

See the [CLIP-Training.md](../docs/CLIP-Training.md) in the [documentation](../docs) folder for details on
how to train this model, download the dataset used, and obtain the model resulting from the training.
typo: "how to about how to" -> "how to".

I made the same mistake at line 27, could you help to fix it too? Thanks.