Commit
docs: update docs
dgcnz committed Oct 31, 2024
1 parent 62de7db commit 052d4e1
Showing 4 changed files with 26 additions and 9 deletions.
2 changes: 1 addition & 1 deletion docs/buildpdf
@@ -1,2 +1,2 @@
#!/bin/bash
-jb build --path-output _build/pdf src/ --builder pdfhtml
+jb build --path-output _build/pdf src/ --builder pdflatex
6 changes: 5 additions & 1 deletion docs/src/_config.yml
@@ -34,4 +34,8 @@ html:

sphinx:
  config:
-    html_show_copyright: false
+    html_show_copyright: false
+    latex_elements:
+      preamble:
+        \usepackage{etoolbox}
+        \AtBeginEnvironment{figure}{\pretocmd{\hyperlink}{\protect}{}{}}
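The preamble addition uses etoolbox's patching hooks to guard `\hyperlink` inside figure environments during the pdflatex build (MyST cross-references expand to `\hyperlink`, which is fragile in moving arguments such as captions). A minimal standalone sketch of the same two hooks, assuming only the etoolbox package (this is a toy document, not the project's configuration):

```latex
% Sketch of the two etoolbox hooks used in the preamble above.
\documentclass{article}
\usepackage{etoolbox}

% Run extra code at the start of every figure environment:
\AtBeginEnvironment{figure}{\small}

% Prepend code to an existing macro's body; the last two arguments
% are the success and failure branches of the patch:
\pretocmd{\tableofcontents}{\clearpage}{}{}

\begin{document}
\tableofcontents
\begin{figure}[h]
  \centering
  This text is set at the size installed by the hook.
\end{figure}
\end{document}
```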
17 changes: 10 additions & 7 deletions docs/src/part1/problem.md
@@ -5,7 +5,7 @@

## Motivation

-Currently, the most popular approach for deploying an object detection model in a production environment is to use YOLO because it's fast and easy to use. However, to make it usable for your specific task you need to recollect a domain-specific dataset and fine-tune or train the model from scratch. This is a time-consuming and expensive process because such a dataset needs to be comprehensive enough to cover most of the possible scenarios that the model will encounter in the real world (weather conditions, different object scales and textures, etc).
+Currently, the most popular approach for deploying an object detection model in a production environment is to use YOLO {cite}`yolo` because it is fast and easy to use. However, to make it usable for your specific task you need to collect a domain-specific dataset and fine-tune or train the model from scratch. This is a time-consuming and expensive process, because such a dataset needs to be comprehensive enough to cover most of the scenarios the model will encounter in the real world (weather conditions, different object scales and textures, etc.).

Recently, a paradigm shift has emerged in the field, as described by {cite}`bommasani2022`: instead of training a model from scratch for a specific task, you can use a model that was pre-trained on a generic task with massive data and compute as a backbone for your model, and only fine-tune a decoder/head for your specific task (see {numref}`Figure {number} <knowledgetransfer>`). These pre-trained models are called Foundation Models and are great at capturing features that are useful for a wide range of tasks.

@@ -33,14 +33,17 @@ The drawback of using these foundation models is that they are large and computa

## Objectives

-According to {cite}`mcip`, practitioners at Apple do the following when asked to deploy a model to some edge device.
-Find a feasibility model A.
-Find a smaller model B equivalent to A that is suitable for the device.
-Compress model B to reach production-ready model C.
+To ground our problem, we can use the framework described in {cite}`mcip` that Apple engineers follow to deploy machine learning models on their devices. As shown in {numref}`Figure {number} <apple_practice>`, it consists of three steps:

+1. **Find a feasibility model A**: This involves training a model that can achieve the desired accuracy for the task.
+2. **Find a smaller model B equivalent to A that is suitable for the device**: This may involve distilling the knowledge of model A into a smaller and/or more architecturally efficient model B, which can be found through (neural) architecture search.
+3. **Compress model B to reach production-ready model C**: This involves tuning the precision of the model to reduce its latency and memory cost to within the performance budget.
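The knowledge distillation mentioned in step 2 can be sketched as a temperature-scaled soft-target loss. The sketch below uses only the standard library and hypothetical logits (the function names and values are illustrative, not the project's code), assuming the usual Hinton-style formulation:

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between teacher and student soft targets.

    Scaled by T^2 to keep gradient magnitudes comparable across
    temperatures, as in the standard distillation setup.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return temperature ** 2 * sum(
        pi * math.log(pi / qi) for pi, qi in zip(p, q)
    )

# Hypothetical logits: the student roughly mimics the teacher,
# so the loss is small but nonzero.
teacher = [4.0, 1.0, 0.2]
student = [3.5, 1.2, 0.1]
loss = distillation_loss(teacher, student)
```

In training, this term is typically mixed with the ordinary hard-label loss; here it is shown in isolation.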


:::{figure-md} apple_practice
<img src="apple_practice.png" alt="">

-Caption
+Source: {cite}`mcip`
:::

In this work, we will focus on steps 1 and 3 which will be covered by Part 1 (Finding a Feasibility Model) and Part 2 (Optimization) respectively.
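The precision tuning in step 3 (covered by Part 2) can be sketched as uniform affine quantization: map floats to a small signed-integer range and back, trading a bounded rounding error for lower memory cost. A minimal pure-Python sketch with hypothetical weights (illustrative only, not the project's implementation):

```python
def quantize(weights, num_bits=8):
    """Uniformly quantize a list of floats to signed num_bits integers.

    Returns the integer codes plus the (scale, zero_point) needed to
    map them back to floats.
    """
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / (qmax - qmin) if hi > lo else 1.0
    zero_point = round(qmin - lo / scale)
    codes = [
        max(qmin, min(qmax, round(w / scale) + zero_point))
        for w in weights
    ]
    return codes, scale, zero_point

def dequantize(codes, scale, zero_point):
    """Map integer codes back to approximate float weights."""
    return [(c - zero_point) * scale for c in codes]

# Hypothetical weights: each round-tripped value is within one
# quantization step of the original.
weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
codes, scale, zero_point = quantize(weights)
restored = dequantize(codes, scale, zero_point)
```

Production pipelines use per-channel scales and calibration data, but the accuracy/latency trade-off is the same idea.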
10 changes: 10 additions & 0 deletions docs/src/references.bib
@@ -151,4 +151,14 @@ @Article{mae2021
  journal = {arXiv:2111.06377},
  title   = {Masked Autoencoders Are Scalable Vision Learners},
  year    = {2021},
}
+
+@misc{yolo,
+  author  = {Jocher, Glenn and Qiu, Jing and Chaurasia, Ayush},
+  license = {AGPL-3.0},
+  month   = jan,
+  title   = {{Ultralytics YOLO}},
+  url     = {https://github.com/ultralytics/ultralytics},
+  version = {8.0.0},
+  year    = {2023},
+}
