chore: update docs and versions to v1.2.0
1 parent 375987a · commit 334c738
Showing 7 changed files with 187 additions and 10 deletions.
@@ -0,0 +1,121 @@
# Finetuning

## Overview

Since v1.2.0, the complete training code is available at [hsf_train](https://github.com/clementpoiret/hsf_train).
The `hsf_train` repository also contains easy-to-use scripts to train or **finetune your own models**.

## Purpose & Use Cases

The goal of HSF is to provide foundational models that can be used as a starting point for further development and customization in hippocampal subfield segmentation.

We are aware that the provided models may not be perfect for every use case, which is why we provide the training code and the possibility to finetune the models.

If you want to use HSF on MRIs that are very different from the ones used for training, or if you want to segment the hippocampal subfields in a specific way, you are in the right place.

## Configuration

The `hsf_train` repository contains a `conf` directory, analogous to the `conf` directory in the `hsf` repository. It holds the configuration files for the training and finetuning scripts.

Here is the default configuration file for the finetuning script:

```yaml
mode: decoder  # encoder or decoder, defines which part of the model to finetune (see below)
depth: -1  # -1 for all layers, 0 for the first layer, 1 for the second layer, etc.
unfreeze_frequency: 4  # how often to unfreeze a layer
out_channels: 6  # number of output channels (subfields), in case you want to segment a different number of subfields
```
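
These settings can also be overridden at the command line. The sketch below is only an assumption-based example: it presumes the finetuning script exposes these keys under a `finetuning.` prefix, as the `finetuning.out_channels=8` override in the command further down this page suggests.

```bash
# Hypothetical overrides: finetune only the last decoder layer and widen the output to 8 classes.
# The finetuning.* key names mirror the config above; they are assumed, not verified.
python finetune.py \
    finetuning.mode=decoder \
    finetuning.depth=0 \
    finetuning.out_channels=8
```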

## Getting Started

### Installation

You will need to clone the repository, install PyTorch, and install the required packages:

```bash
git clone https://github.com/clementpoiret/hsf_train.git
cd hsf_train
conda create -n hsf_train python=3.10
conda activate hsf_train
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
pip install -r requirements.txt
```
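
As a quick sanity check (not part of the repository's official setup steps), you can confirm that the installed PyTorch build sees your GPU before launching any training:

```bash
# Optional check: prints the installed PyTorch version and whether a CUDA GPU is visible.
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```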

### Custom Dataset

To finetune the models, you will need to provide your own dataset. The larger the differences between the original training dataset and your custom dataset, the more data you will need.

You will need to adapt the example `custom_dataset.yaml` file to your needs, for example:

```yaml
main_path: "/mnt/data/hsf/"
output_path: "/mnt/hsf/models/"
batch_size: 1
num_workers: 16
pin_memory: True
train_ratio: .9
replace: False
k_sample: Null  # i.e. k = train_ratio * num_samples
train_val_test_idx: Null
train_on_all: False

datasets:
  clark:
    path: "hippocampus_clark_3T"
    ca_type: "1/23"
    patterns:
      right_t2:
        mri: "**/t2w_Hippocampus_right_ElasticSyN_crop.nii.gz"
        label: "t2w_Hippocampus_right_ElasticSyN_seg_crop.nii.gz"
      left_t2:
        mri: "**/t2w_Hippocampus_left_ElasticSyN_crop.nii.gz"
        label: "t2w_Hippocampus_left_ElasticSyN_seg_crop.nii.gz"
      averaged_right_t2:
        mri: "**/averaged_t2w_Hippocampus_right_ElasticSyN_crop.nii.gz"
        label: "averaged_t2w_Hippocampus_right_ElasticSyN_seg_crop.nii.gz"
      averaged_left_t2:
        mri: "**/averaged_t2w_Hippocampus_left_ElasticSyN_crop.nii.gz"
        label: "averaged_t2w_Hippocampus_left_ElasticSyN_seg_crop.nii.gz"
    labels:
      1: 1
      2: 2
      3: 3
      4: 4
      5: 5
      6: 6
      7: 7
    labels_names:
      1: "DG"
      2: "CA2/3"
      3: "CA1"
      4: "PRESUB"
      5: "UNCUS"
      6: "PARASUB"
      7: "KYST"
```

??? info "Comment on the `out_channels` parameter"
    You can see in the example above that we have 7 subfields. However, the `out_channels` parameter also includes the background (class 0), so you should set it to 8.
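
To make the path-related keys concrete, here is a hedged sketch of how `main_path`, the dataset `path`, and the MRI glob `patterns` above might combine on disk. It assumes `hsf_train` joins `main_path` with each dataset's `path` and resolves the globs below that directory; the `sub-001` folder name is purely illustrative.

```bash
# Hypothetical: list the files the "right_t2" MRI pattern above would match,
# assuming the glob is resolved under main_path + the dataset's path.
find /mnt/data/hsf/hippocampus_clark_3T -name "t2w_Hippocampus_right_ElasticSyN_crop.nii.gz"
# e.g. /mnt/data/hsf/hippocampus_clark_3T/sub-001/t2w_Hippocampus_right_ElasticSyN_crop.nii.gz
```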

### Finetuning

Then, you can run the finetuning script:

```bash
python finetune.py \
    datasets=custom_dataset \
    finetuning.out_channels=8 \
    models.lr=1e-3 \
    models.use_forgiving_loss=False
```

## Training Tips

Numerous factors can influence the quality of the finetuned model. Here are some tips to help you get the best results:

- **Number of manual segmentations**: The more manual segmentations you have, the better. The more subtle the differences between our original dataset and your custom dataset, the less data you will need.
- **Learning rate**: We recommend starting with a learning rate of 1e-3.
- **Depth**: We recommend starting with a depth of -1, which means that all layers will be finetuned. If your changes are small, you can finetune fewer layers, or even only the final layer (depth = 0).
- **Number of epochs**: We recommend starting with 16 epochs if the depth is -1; otherwise, a rule of thumb might be `epochs = (depth + 1) * unfreeze_frequency` (see the worked example after this list).
- **Mode**: It depends on the type of changes you want to make. If you change the image modality (e.g. a contrast other than T1w or T2w, a magnetic field strength higher than 7T, etc.), you might want to finetune the encoder. If you want to segment a different number of subfields or use another segmentation guideline, you might want to finetune the decoder.
- **Unfreeze frequency**: We recommend starting with an unfreeze frequency of 4. Check the learning curves: you should reach a plateau before unfreezing the next layer.
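
As a worked example of the rule of thumb above: with `depth = 3` and `unfreeze_frequency = 4`, you would budget `epochs = (3 + 1) * 4 = 16`, the idea being that each newly unfrozen layer gets roughly four epochs to reach a plateau before the next one is unfrozen.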

@@ -16,7 +16,7 @@
def test_version():
-    assert __version__ == '1.1.3'
+    assert __version__ == '1.2.0'

# SETUP FIXTURES