Rename hyperparameter tuning to refine the model #380

Merged (2 commits) on Nov 1, 2023
14 changes: 10 additions & 4 deletions episodes/1-introduction.Rmd
@@ -38,7 +38,7 @@
The image below shows some differences between Artificial Intelligence, Machine Learning and Deep Learning.


![](../fig/01_AI_ML_DL_differences.png){alt='An infographics showing the relation of AI, ML, NN and DL. NN are methods in DL which is a subset of ML algorithms that falls within the umbrella of AI'}


The image above is by Tukijaaliwa, CC BY-SA 4.0, via Wikimedia Commons, [original source](https://en.wikipedia.org/wiki/File:AI-ML-DL.svg)

@@ -59,13 +59,13 @@
- one example equation to calculate the output for a neuron is: $output = ReLU(\sum_{i} (x_i*w_i) + bias)$
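The example equation above can be written as a short Python sketch (the input, weight, and bias values below are invented for illustration):

```python
def relu(x):
    # ReLU returns 0 for inputs at or below 0 and the input itself otherwise
    return max(0.0, x)

def neuron_output(inputs, weights, bias):
    # Weighted sum of the inputs plus the bias, passed through the activation
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return relu(weighted_sum)

print(neuron_output([0.5, -1.0], [2.0, 1.0], 0.5))  # 0.5
```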


![](../fig/01_neuron.png){alt='A diagram of a single artificial neuron combining inputs and weights using an activation function.' width='600px'}


##### Combining multiple neurons into a network
Multiple neurons can be joined together by connecting the output of one to the input of another. These connections are associated with weights that determine the 'strength' of the connection; the weights are adjusted during training. In this way, the combination of neurons and connections describes a computational graph; an example can be seen in the image below. In most neural networks neurons are aggregated into layers. Signals travel from the input layer to the output layer, possibly through one or more intermediate layers called hidden layers.
The image below shows an example of a neural network with three layers: each circle is a neuron, each line is an edge, and the arrows indicate the direction data moves in.

![](../fig/01_neural_net.png){alt='A diagram of a three layer neural network with an input layer, one hidden layer, and an output layer.'}

The image above is by Glosser.ca, CC BY-SA 3.0 <https://creativecommons.org/licenses/by-sa/3.0>, via Wikimedia Commons, [original source](https://commons.wikimedia.org/wiki/File:Colored_neural_network.svg)
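The layered structure described above amounts to repeated matrix multiplications. A minimal sketch with randomly chosen weights (the layer sizes are arbitrary, not those of the pictured network):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

rng = np.random.default_rng(42)
# 4 inputs -> 5 hidden neurons -> 2 outputs, with random (untrained) weights
w_hidden = rng.normal(size=(4, 5))
w_output = rng.normal(size=(5, 2))

x = np.array([0.1, 0.2, 0.3, 0.4])
hidden = relu(x @ w_hidden)   # input layer -> hidden layer
output = hidden @ w_output    # hidden layer -> output layer
print(output.shape)  # (2,)
```

Training would adjust `w_hidden` and `w_output`; here they stay random, so only the shapes are meaningful.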

::: challenge
@@ -88,7 +88,7 @@

Have a look at the following network:

![](../fig/01_xor_exercise.png){alt='A diagram of a neural network with 2 inputs, 2 hidden layer neurons, and 1 output.' width='400px'}


a. Calculate the output of the network for the following combinations of inputs:

@@ -127,13 +127,13 @@
## Activation functions
Look at the following activation functions:

![](../fig/01_sigmoid.svg){alt='Plot of the sigmoid function' width='200px'}

A. Sigmoid activation function

![](../fig/01_relu.svg){alt='Plot of the ReLU function' width='200px'}

B. ReLU activation function

![](../fig/01_identity_function.svg){alt='Plot of the Identity function' width='200px'}

C. Identity (or linear) activation function

Match the following statements to the correct activation function:
@@ -172,7 +172,7 @@
The input (leftmost) layer of the network is an image and the final (rightmost) layer of the network outputs a zero or one to determine if the input data belongs to the class of data we are interested in.
This image is from the paper ["An Efficient Pedestrian Detection Method Based on YOLOv2" by Zhongmin Liu, Zhicai Chen, Zhanming Li, and Wenjin Hu published in Mathematical Problems in Engineering, Volume 2018](https://doi.org/10.1155/2018/3518959)

![](../fig/01_deep_network.png){alt='An example of a deep neural network'}


### How do neural networks learn?
What happens in a neural network during the training process?
@@ -207,7 +207,7 @@
Below you see the Huber loss (green, delta = 1) and Squared error loss (blue)
as a function of `y_true - y_pred`.

![](../fig/01_huber_loss.png){alt='Huber loss (green, delta = 1) and squared error loss (blue) as a function of y_true - y_pred' width='400px'}
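The two losses can also be checked numerically (a sketch; `huber` follows the standard piecewise definition with delta = 1):

```python
import numpy as np

def squared_error(diff):
    return diff ** 2

def huber(diff, delta=1.0):
    # Quadratic near zero, linear beyond +/- delta
    abs_diff = np.abs(diff)
    return np.where(abs_diff <= delta,
                    0.5 * diff ** 2,
                    delta * (abs_diff - 0.5 * delta))

diff = np.array([0.5, 2.0, 10.0])  # y_true - y_pred
print(squared_error(diff))  # 0.25, 4.0, 100.0
print(huber(diff))          # 0.125, 1.5, 9.5
```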

Which loss function is more sensitive to outliers?
@@ -325,7 +325,10 @@

Many datasets are not ready for immediate use in a neural network and will require some preparation. Neural networks can only really deal with numerical data, so any non-numerical data (for example words) will have to be somehow converted to numerical data.
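One common conversion for categorical words is one-hot encoding, sketched here with pandas (the column and category names are invented):

```python
import pandas as pd

df = pd.DataFrame({"color": ["red", "green", "blue", "green"]})
# Each category becomes its own 0/1 column that a network can consume
encoded = pd.get_dummies(df, columns=["color"])
print(encoded.shape)  # (4, 3)
```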

Next we will need to divide the data into multiple sets. One of these will be used by the training process and we will call it the training set. Another will be used to evaluate the accuracy of the training and we will call that one the test set. Sometimes we will also use a 3rd set known as a validation set to tune hyperparameters.
Next we will need to divide the data into multiple sets.
One of these will be used by the training process and we will call it the training set.
Another will be used to evaluate the accuracy of the training and we will call that one the test set.
Sometimes we will also use a 3rd set known as a validation set to refine the model.
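A sketch of such a split using scikit-learn's `train_test_split` (the 70/15/15 proportions are an arbitrary choice):

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(100).reshape(100, 1)
y = np.arange(100)

# First carve off 30% of the data, then split that half-and-half
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.3, random_state=0)
X_test, X_val, y_test, y_val = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

print(len(X_train), len(X_test), len(X_val))  # 70 15 15
```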

### 4. Choose a pre-trained model or build a new architecture from scratch

@@ -345,7 +348,7 @@

We can now go ahead and start training our neural network. We will probably keep doing this for a given number of iterations through our training dataset (referred to as _epochs_) or until the loss function gives a value under a certain threshold. The graph below shows the loss against the number of _epochs_; generally the loss will go down with each _epoch_, but occasionally it will rise slightly.

![](../fig/training-0_to_1500.svg){alt='A graph showing an exponentially decreasing loss over the first 1500 epochs of training an example network.'}
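The way the loss falls over epochs can be illustrated with a tiny hand-written gradient-descent loop (a toy one-weight model, not the network that produced the figure):

```python
# Fit a single weight w so that w * x approximates y = 2 * x
data = [(x, 2.0 * x) for x in range(1, 5)]
w = 0.0
learning_rate = 0.01

losses = []
for epoch in range(100):
    # Mean squared error over the data, then the gradient with respect to w
    loss = sum((w * x - y) ** 2 for x, y in data) / len(data)
    losses.append(loss)
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    w -= learning_rate * grad

print(round(w, 3))  # 2.0 -- and losses[0] is far larger than losses[-1]
```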


### 7. Perform a Prediction/Classification

@@ -358,9 +361,12 @@

Once we have trained the network we want to measure its performance. To do this we use some additional data that was not part of the training; this is known as a test set. There are many different methods available for measuring performance and which one is best depends on the type of task we are attempting. These metrics are often published as an indication of how well our network performs.
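A sketch of computing performance metrics on a test set with scikit-learn (the label values here are invented):

```python
from sklearn.metrics import accuracy_score, confusion_matrix

y_test = [0, 1, 1, 0, 1, 0]   # true labels from the test set
y_pred = [0, 1, 0, 0, 1, 1]   # what the trained network predicted

print(accuracy_score(y_test, y_pred))  # 4 of 6 correct: 0.666...
print(confusion_matrix(y_test, y_pred))
```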

### 9. Tune Hyperparameters
### 9. Refine the model

Hyperparameters are all the parameters set by the person configuring the machine learning instead of those learned by the algorithm itself. The hyperparameters include the number of epochs or the parameters for the optimizer. It might be necessary to adjust these and re-run the training many times before we are happy with the result.
We can refine the model further, for example by slightly changing its architecture or the number of nodes in a layer.
Hyperparameters are all the parameters set by the person configuring the machine learning instead of those learned by the algorithm itself.
The hyperparameters include the number of epochs or the parameters for the optimizer.
It might be necessary to adjust these and re-run the training many times before we are happy with the result; this is often done automatically, which is referred to as hyperparameter tuning.
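At its simplest, automated hyperparameter tuning is a loop over candidate settings that keeps the best-scoring one. A sketch in which `train_and_score` is a made-up stand-in for a real training-plus-evaluation run:

```python
import itertools

def train_and_score(epochs, learning_rate):
    # Stand-in for an actual training run; a real version would fit the
    # network with these settings and return e.g. validation accuracy
    return 1.0 - abs(learning_rate - 0.01) - 0.001 * abs(epochs - 50)

grid = itertools.product([10, 50, 100], [0.1, 0.01, 0.001])
best = max(grid, key=lambda params: train_and_score(*params))
print(best)  # (50, 0.01)
```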

### 10. Share Model

@@ -452,7 +458,7 @@
- "Deep Learning is a machine learning technique based on using many artificial neurons arranged in layers."
- "Neural networks learn by minimizing a loss function."
- "Deep Learning is well suited to classification and prediction problems such as image recognition."
- "To use Deep Learning effectively we need to go through a workflow of: defining the problem, identifying inputs and outputs, preparing data, choosing the type of network, choosing a loss function, training the model, tuning Hyperparameters, measuring performance before we can classify data."
- "To use Deep Learning effectively we need to go through a workflow of: defining the problem, identifying inputs and outputs, preparing data, choosing the type of network, choosing a loss function, training the model, refining the model, and measuring performance before we can classify data."
- "Keras is a Deep Learning library that is easier to use than many of the alternatives such as TensorFlow and PyTorch."

::::::::::::::::::::::::::::::::::::::::::::::::
15 changes: 8 additions & 7 deletions episodes/2-keras.Rmd
@@ -40,7 +40,7 @@ As a reminder below are the steps of the deep learning workflow:
6. Train the model
7. Perform a Prediction/Classification
8. Measure performance
9. Tune hyperparameters
9. Refine the model
10. Save model

In this episode we will focus on a minimal example for each of these steps, later episodes will build on this knowledge to go into greater depth for some or all of these steps.
@@ -327,7 +327,7 @@ The instantiation here has 2 parameters and a seemingly strange combination of p
let us take a closer look.
The first parameter `10` is the number of neurons we want in this layer; this is one of the
hyperparameters of our system and needs to be chosen carefully. We will get back to this in the section
on refining the model.
The second parameter is the activation function to use, here we choose relu which is 0
for inputs that are 0 and below and the identity function (returning the same value)
for inputs above 0.
@@ -593,7 +593,7 @@ Length: 69, dtype: object
## 8. Measuring performance
Now that we have a trained neural network it is important to assess how well it performs.
We want to know how well it will perform in a realistic prediction scenario; measuring
performance will also come back when refining the model.
performance will also come back when refining the model.

We have created a test set (i.e. y_test) during the data preparation stage which we will use
now to create a confusion matrix.
@@ -667,18 +667,19 @@ We can try many things to improve the performance from here.
One of the first things we can try is to balance the dataset better.
Other options include changing the network architecture or changing the
training parameters.

Note that the outcome you have might be slightly different from what is shown in this tutorial.
dsmits marked this conversation as resolved.
::::
:::

## 9. Tune hyperparameters
## 9. Refine the model
As we discussed before the design and training of a neural network comes with
many hyper parameter choices.
We will go into more depth of these hyperparameters in later episodes.
many hyperparameter and model architecture choices.
We will go into more depth on these choices in later episodes.
For now it is important to realize that the parameters we chose were
somewhat arbitrary; more careful consideration needs to be given to
picking hyperparameter values.

Note that the outcome you have might be slightly different from what is shown in this tutorial.

## 10. Share model
It is very useful to be able to use the trained neural network at a later
2 changes: 1 addition & 1 deletion episodes/3-monitor-the-model.Rmd
@@ -505,7 +505,7 @@ randomly predicting a number, so the problem is not impossible to solve with mac
::::
:::

## 9. Tune hyperparameters
## 9. Refine the model

### Watch your model training closely

2 changes: 1 addition & 1 deletion episodes/4-advanced-layer-types.Rmd
@@ -465,7 +465,7 @@ As you can see the validation accuracy only reaches about 35%, whereas the CNN r
This demonstrates that convolutional layers are a big improvement over dense layers for this kind of dataset.
:::

## 9. Tune hyperparameters
## 9. Refine the model

::: challenge
## Network depth
4 changes: 2 additions & 2 deletions episodes/fig/graphviz/pipeline.dot
@@ -12,11 +12,11 @@ digraph {
train [label=<<B>Train</B><BR/>the model>]
predict [label=<<B>Perform</B><BR/>Prediction>]
quality [label=<<B>Measure</B><BR/>Performance>]
tune [label=<<B>Tune</B><BR/>Hyperparameters>]
refine [label=<<B>Refine</B><BR/>the model>]
share [label=<<B>Share</B><BR/>the model>]

#the graph
formulate -> i_o -> prepare
prepare -> create_model -> loss
loss -> train -> predict -> quality -> tune -> share
loss -> train -> predict -> quality -> refine -> share
}