diff --git a/docs/cookbook/chain_of_thought.md b/docs/cookbook/chain_of_thought.md
index d320feb8d..cc079a7ff 100644
--- a/docs/cookbook/chain_of_thought.md
+++ b/docs/cookbook/chain_of_thought.md
@@ -7,13 +7,13 @@ In this guide, we use [outlines](https://outlines-dev.github.io/outlines/) to ap
 
 We use [llama.cpp](https://github.com/ggerganov/llama.cpp) using the [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) library. Outlines supports llama-cpp-python, but we need to install it ourselves:
 
-```shell
+```bash
 pip install llama-cpp-python
 ```
 
 We pull a quantized GGUF model, in this guide we pull [Hermes-2-Pro-Llama-3-8B](https://huggingface.co/NousResearch/Hermes-2-Theta-Llama-3-8B-GGUF) by [NousResearch](https://nousresearch.com/) from [HuggingFace](https://huggingface.co/):
 
-```shell
+```bash
 wget https://hf.co/NousResearch/Hermes-2-Pro-Llama-3-8B-GGUF/resolve/main/Hermes-2-Pro-Llama-3-8B-Q4_K_M.gguf
 ```
 
diff --git a/docs/cookbook/dating_profiles.md b/docs/cookbook/dating_profiles.md
index f839b65fe..d0fb9b576 100644
--- a/docs/cookbook/dating_profiles.md
+++ b/docs/cookbook/dating_profiles.md
@@ -170,7 +170,7 @@ parsed_profile = DatingProfile.model_validate_json(profile)
 
 Here are a couple of results:
 
-```
+```json
 {
     "bio": """I'm an ambitious lawyer with a casual and fashionable style. I love
     games and sports, but my true passion is preparing refreshing cocktails at
@@ -199,7 +199,7 @@ Here are a couple of results:
 }
 ```
 
-```
+```json
 {
     "bio": """I’m a sexy lawyer with time on my hands. I love to game and
     play ping pong, but the real reason you should swipe to the right
diff --git a/docs/cookbook/index.md b/docs/cookbook/index.md
index 58e84ae96..a844ce240 100644
--- a/docs/cookbook/index.md
+++ b/docs/cookbook/index.md
@@ -1,5 +1,7 @@
 # Examples
 
+This part of the documentation provides a few cookbooks that you can browse to get acquainted with the library and find inspiration for what you can do with structured generation. Remember that you can easily change the model used in each example!
+
 - [Classification](classification.md): Classify customer requests.
 - [Named Entity Extraction](extraction.md): Extract information from pizza orders.
 - [Dating Profile](dating_profiles.md): Build dating profiles from descriptions using prompt templating and JSON-structured generation.
diff --git a/docs/cookbook/knowledge_graph_extraction.md b/docs/cookbook/knowledge_graph_extraction.md
index c7e347dd4..c4c1dc75c 100644
--- a/docs/cookbook/knowledge_graph_extraction.md
+++ b/docs/cookbook/knowledge_graph_extraction.md
@@ -4,13 +4,13 @@ In this guide, we use [outlines](https://outlines-dev.github.io/outlines/) to ex
 
 We will use [llama.cpp](https://github.com/ggerganov/llama.cpp) using the [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) library. Outlines supports llama-cpp-python, but we need to install it ourselves:
 
-```shell
+```bash
 pip install llama-cpp-python
 ```
 
 We pull a quantized GGUF model, in this guide we pull [Hermes-2-Pro-Llama-3-8B](https://huggingface.co/NousResearch/Hermes-2-Theta-Llama-3-8B-GGUF) by [NousResearch](https://nousresearch.com/) from [HuggingFace](https://huggingface.co/):
 
-```shell
+```bash
 wget https://hf.co/NousResearch/Hermes-2-Pro-Llama-3-8B-GGUF/resolve/main/Hermes-2-Pro-Llama-3-8B-Q4_K_M.gguf
 ```
 
diff --git a/docs/cookbook/qa-with-citations.md b/docs/cookbook/qa-with-citations.md
index cb39befe9..c2111617f 100644
--- a/docs/cookbook/qa-with-citations.md
+++ b/docs/cookbook/qa-with-citations.md
@@ -4,13 +4,13 @@ This tutorial is adapted from the [instructor-ollama notebook](https://github.co
 
 We will use [llama.cpp](https://github.com/ggerganov/llama.cpp) using the [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) library. Outlines supports llama-cpp-python, but we need to install it ourselves:
 
-```shell
+```bash
 pip install llama-cpp-python
 ```
 
 We pull a quantized GGUF model [Hermes-2-Pro-Llama-3-8B](https://huggingface.co/NousResearch/Hermes-2-Theta-Llama-3-8B-GGUF) by [NousResearch](https://nousresearch.com/) from [HuggingFace](https://huggingface.co/):
 
-```shell
+```bash
 wget https://hf.co/NousResearch/Hermes-2-Pro-Llama-3-8B-GGUF/resolve/main/Hermes-2-Pro-Llama-3-8B-Q4_K_M.gguf
 ```
 
diff --git a/docs/cookbook/react_agent.md b/docs/cookbook/react_agent.md
index 74930b70b..15fb964a0 100644
--- a/docs/cookbook/react_agent.md
+++ b/docs/cookbook/react_agent.md
@@ -8,13 +8,13 @@ Additionally, we give the LLM the possibility of using a scratchpad described in
 
 We use [llama.cpp](https://github.com/ggerganov/llama.cpp) using the [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) library. Outlines supports llama-cpp-python, but we need to install it ourselves:
 
-```shell
+```bash
 pip install llama-cpp-python
 ```
 
 We pull a quantized GGUF model, in this guide we pull [Hermes-2-Pro-Llama-3-8B](https://huggingface.co/NousResearch/Hermes-2-Theta-Llama-3-8B-GGUF) by [NousResearch](https://nousresearch.com/) from [HuggingFace](https://huggingface.co/):
 
-```shell
+```bash
 wget https://hf.co/NousResearch/Hermes-2-Pro-Llama-3-8B-GGUF/resolve/main/Hermes-2-Pro-Llama-3-8B-Q4_K_M.gguf
 ```
 
@@ -55,9 +55,8 @@ def wikipedia(q):
         "srsearch": q,
         "format": "json"
     }).json()["query"]["search"][0]["snippet"]
-```
 
-```python
+
 def calculate(numexp):
     return eval(numexp)
 ```
diff --git a/docs/reference/generation/generation.md b/docs/reference/generation/generation.md
index 88b963c72..0c090f8a7 100644
--- a/docs/reference/generation/generation.md
+++ b/docs/reference/generation/generation.md
@@ -208,3 +208,9 @@ result = generator("What is 2+2?")
 print(result)
 # 4
 ```
+
+
+[jsonschema]: https://json-schema.org/learn/getting-started-step-by-step
+[pydantic]: https://docs.pydantic.dev/latest
+[cfg]: https://en.wikipedia.org/wiki/Context-free_grammar
+[ebnf]: https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form
diff --git a/docs/reference/generation/structured_generation_explanation.md b/docs/reference/generation/structured_generation_explanation.md
index 0dedf060b..aa27a7a85 100644
--- a/docs/reference/generation/structured_generation_explanation.md
+++ b/docs/reference/generation/structured_generation_explanation.md
@@ -1,8 +1,4 @@
----
-title: Structured Generation Explanation
----
-
-# Structured Generation Explanation
+# How does Outlines work?
 
 
 Language models generate text token by token, using the previous token sequence as input and sampled logits as output. This document explains the structured generation process, where only legal tokens are considered for the next step based on a predefined automata, e.g. a regex-defined [finite-state machine](https://en.wikipedia.org/wiki/Finite-state_machine) (FSM) or [Lark](https://lark-parser.readthedocs.io/en/stable/) grammar.`
diff --git a/docs/reference/models/models.md b/docs/reference/models/models.md
index dadfd34ad..34b5be4cf 100644
--- a/docs/reference/models/models.md
+++ b/docs/reference/models/models.md
@@ -4,60 +4,20 @@ title: Models
 
 # Models
 
-Outlines supports generation using a number of inference engines (`outlines.models`)
-
-Loading a model using outlines follows a similar interface between inference engines.
+Outlines supports generation using a number of inference engines (`outlines.models`). Loading a model follows a similar interface across inference engines:
 
 ```python
 import outlines
-```
-
-## [Transformers](./transformers.md)
-
-```python
-model = outlines.models.transformers("microsoft/Phi-3-mini-128k-instruct", model_kwargs={})
-```
-
-For additional arguments and use of other Huggingface Transformers model types see [Outlines' Transformers documentation](./transformers.md).
-
 
-## [Transformers Vision](./transformers_vision.md)
-
-```python
+model = outlines.models.transformers("microsoft/Phi-3-mini-128k-instruct")
 model = outlines.models.transformers_vision("llava-hf/llava-v1.6-mistral-7b-hf")
-```
-
-For examples of generation and other details, see [Outlines' Transformers Vision documentation](./transformers_vision.md).
-
-## [vLLM](./vllm.md)
-
-```python
 model = outlines.models.vllm("microsoft/Phi-3-mini-128k-instruct")
-```
-
-## [llama.cpp](./llamacpp.md)
-
-```python
-model = outlines.models.llamacpp("microsoft/Phi-3-mini-4k-instruct-gguf", "Phi-3-mini-4k-instruct-q4.gguf")
-```
-
-Additional llama.cpp parameters can be found in the [Outlines' llama.cpp documentation](./llamacpp.md).
-
-## [ExLlamaV2](./exllamav2.md)
-
-```python
+model = outlines.models.llamacpp(
+    "microsoft/Phi-3-mini-4k-instruct-gguf", "Phi-3-mini-4k-instruct-q4.gguf"
+)
 model = outlines.models.exllamav2("bartowski/Phi-3-mini-128k-instruct-exl2")
-```
-
-## [MLXLM](./mlxlmx.md)
-
-```python
 model = outlines.models.mlxlm("mlx-community/Phi-3-mini-4k-instruct-4bit")
-```
 
-## [OpenAI](./openai.md)
-
-```python
 model = outlines.models.openai(
     "gpt-4o-mini",
     api_key=os.environ["OPENAI_API_KEY"]
@@ -66,7 +26,7 @@ model = outlines.models.openai(
 
 
 # Feature Matrix
-|                   | Transformers | Transformers Vision | vLLM | llama.cpp | ExLlamaV2 | MLXLM | OpenAI* |
+|                   | [Transformers](transformers.md) | [Transformers Vision](transformers_vision.md) | [vLLM](vllm.md) | [llama.cpp](llamacpp.md) | [ExLlamaV2](exllamav2.md) | [MLXLM](mlxlm.md) | [OpenAI](openai.md)* |
 |-------------------|--------------|---------------------|------|-----------|-----------|-------|---------|
 | **Device**        |              |                     |      |           |           |       |         |
 | Cuda              | ✅           | ✅                  | ✅   | ✅        | ✅        | ❌    | N/A     |
diff --git a/docs/reference/models/transformers.md b/docs/reference/models/transformers.md
index 15eabb682..2a13e28ec 100644
--- a/docs/reference/models/transformers.md
+++ b/docs/reference/models/transformers.md
@@ -33,14 +33,15 @@ model = models.Transformers(llm, tokenizer)
 # Using Logits Processors
 
 There are two ways to use Outlines Structured Generation with HuggingFace Transformers:
-- 1) Use Outlines generation wrapper, `outlines.models.transformers`
-- 2) Use `OutlinesLogitsProcessor` with `transformers.AutoModelForCausalLM`
+
+1. Use the Outlines generation wrapper, `outlines.models.transformers`
+2. Use `OutlinesLogitsProcessor` with `transformers.AutoModelForCausalLM`
 
 Outlines supports a myriad of logits processors for structured generation. In these example, we will use the `RegexLogitsProcessor` which guarantees generated text matches the specified pattern.
 
-## Example: `outlines.models.transformers`
+## Using `outlines.models.transformers`
 
-```
+```python
 import outlines
 
 time_regex_pattern = r"(0?[1-9]|1[0-2]):[0-5]\d\s?(am|pm)?"
@@ -53,9 +54,9 @@ print(output)
 # 2:30 pm
 ```
 
-## Example: Direct `transformers` library use
+## Using models initialized via the `transformers` library
 
-```
+```python
 import outlines
 import transformers
 
@@ -117,8 +118,9 @@ model = outlines.models.transformers(
 )
 ```
 
-Further Reading:
-- https://huggingface.co/docs/transformers/en/model_doc/mamba
+
+
+Read [`transformers`'s documentation](https://huggingface.co/docs/transformers/en/model_doc/mamba) for more information.
 
 ### Encoder-Decoder Models
 
@@ -144,8 +146,3 @@ model_bart = models.transformers(
     model_class=AutoModelForSeq2SeqLM,
 )
 ```
-
-
-### Multi-Modal Models
-
-/Coming soon/
diff --git a/docs/stylesheets/extra.css b/docs/stylesheets/extra.css
index 4078215af..c4539ab80 100644
--- a/docs/stylesheets/extra.css
+++ b/docs/stylesheets/extra.css
@@ -96,6 +96,14 @@
   background: #FFFFFF ! important
 }
 
+.language-text {
+  background: #FFFFFF ! important
+}
+
+.language-json {
+  background: #FFFFFF ! important
+}
+
 h1.title {
   color: #FFFFFF;
   margin: 0px 0px 5px;
diff --git a/docs/welcome.md b/docs/welcome.md
index 4c327c020..a7800f7ad 100644
--- a/docs/welcome.md
+++ b/docs/welcome.md
@@ -6,7 +6,7 @@ Outlines〰 is a Python library that allows you to use Large Language Model in a
 
 ## What models do you support?
 
-We support [Openai](reference/models/openai.md), but the true power of Outlines〰 is unleashed with Open Source models available via the [transformers](reference/models/transformers.md), [llama.cpp](reference/models/llamacpp.md), [exllama2](reference/models/exllamav2.md) and [mamba_ssm](reference/models/mamba.md) libraries. If you want to build and maintain an integration with another library, [get in touch][discord].
+We support [OpenAI](reference/models/openai.md), but the true power of Outlines〰 is unleashed with Open Source models available via the [transformers](reference/models/transformers.md), [llama.cpp](reference/models/llamacpp.md), [exllama2](reference/models/exllamav2.md), [mlx-lm](reference/models/mlxlm.md) and [vllm](reference/models/vllm.md) libraries. If you want to build and maintain an integration with another library, [get in touch][discord].
 
 ## What are the main features?
 
@@ -17,7 +17,7 @@ We support [Openai](reference/models/openai.md), but the true power of Outlines
 
     No more invalid JSON outputs, 100% guaranteed
 
-    [:octicons-arrow-right-24: Generate JSON](reference/json.md)
+    [:octicons-arrow-right-24: Generate JSON](reference/generation/json.md)
 
 -   :material-keyboard-outline:{ .lg .middle } __JSON mode for vLLM__
 
@@ -34,7 +34,7 @@ We support [Openai](reference/models/openai.md), but the true power of Outlines
 
     Generate text that parses correctly 100% of the time
 
-    [:octicons-arrow-right-24: Guide LLMs](reference/regex.md)
+    [:octicons-arrow-right-24: Guide LLMs](reference/generation/regex.md)
 
 -    :material-chat-processing-outline:{ .lg .middle } __Powerful Prompt Templating__
 
diff --git a/mkdocs.yml b/mkdocs.yml
index d24ca9a63..afc56528b 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -121,24 +121,24 @@ nav:
   - Docs:
     - reference/index.md
     - Generation:
-        - Generation Overview: reference/generation/generation.md
+        - Overview: reference/generation/generation.md
         - Text: reference/text.md
         - Samplers: reference/samplers.md
         - Structured generation:
+            - How does it work?: reference/generation/structured_generation_explanation.md
             - Classification: reference/generation/choices.md
             - Regex: reference/generation/regex.md
             - Type constraints: reference/generation/format.md
             - JSON (function calling): reference/generation/json.md
             - Grammar: reference/generation/cfg.md
             - Custom FSM operations: reference/generation/custom_fsm_ops.md
-            - Structured Generation Technical Explanation: reference/generation/structured_generation_explanation.md
     - Utilities:
         - Serve with vLLM: reference/serve/vllm.md
-        - Custom types: reference/types.md
+        - Custom types: reference/generation/types.md
         - Prompt templating: reference/prompting.md
         - Outlines functions: reference/functions.md
     - Models:
-        - Models Overview: reference/models/models.md
+        - Overview: reference/models/models.md
         - Open source:
           - Transformers: reference/models/transformers.md
           - Transformers Vision: reference/models/transformers_vision.md