markdown source builds
Auto-generated via {sandpaper}
Source  : 40fe409
Branch  : main
Author  : Carsten Schnober <[email protected]>
Time    : 2024-07-22 12:51:30 +0000
Message : Merge pull request #496 from carpentries-incubator/fix/minor_comments_review_sarah

Fix/minor comments review sarah
actions-user committed Jul 22, 2024
1 parent 1496a74 commit 6b6c891
Showing 7 changed files with 63 additions and 51 deletions.
37 changes: 22 additions & 15 deletions 1-introduction.md
@@ -127,7 +127,9 @@ https://commons.wikimedia.org/w/index.php?curid=44920600, https://commons.wikime


##### Combining multiple neurons into a network
Multiple neurons can be joined together by connecting the output of one to the input of another. These connections are associated with weights that determine the 'strength' of the connection, the weights are adjusted during training. In this way, the combination of neurons and connections describe a computational graph, an example can be seen in the image below. In most neural networks neurons are aggregated into layers. Signals travel from the input layer to the output layer, possibly through one or more intermediate layers called hidden layers.
Multiple neurons can be joined together by connecting the output of one to the input of another. These connections are associated with weights that determine the 'strength' of the connection; the weights are adjusted during training. In this way, the combination of neurons and connections describes a computational graph; an example can be seen in the image below.
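
To make the idea of weighted connections concrete, here is a minimal sketch (not part of the lesson code, using made-up weights and a simple threshold activation) of two neurons feeding a third:

```python
# Minimal sketch with made-up weights: two hidden neurons feeding one output neuron.
import numpy as np

def neuron(inputs, weights, bias):
    # weighted sum of the inputs, followed by a simple threshold activation
    return 1.0 if np.dot(inputs, weights) + bias > 0 else 0.0

x = np.array([0.5, -1.2])                                       # two input values
h1 = neuron(x, np.array([0.8, 0.3]), bias=0.1)                  # first hidden neuron
h2 = neuron(x, np.array([-0.5, 0.9]), bias=0.0)                 # second hidden neuron
y = neuron(np.array([h1, h2]), np.array([1.0, -1.0]), bias=0.2) # output neuron
print(y)
```

During training it is these weights and biases that would be adjusted; here they are fixed, made-up numbers.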

In most neural networks, neurons are aggregated into layers. Signals travel from the input layer to the output layer, possibly through one or more intermediate layers called hidden layers.
The image below shows an example of a neural network with three layers: each circle is a neuron, each line is an edge, and the arrows indicate the direction data moves in.

![
@@ -197,9 +199,13 @@ b. This solves the XOR logical problem, the output is 1 if only one of the two i
:::

##### What makes deep learning deep learning?
Neural networks aren't a new technique, they have been around since the late 1940s. But until around 2010 neural networks tended to be quite small, consisting of only 10s or perhaps 100s of neurons. This limited them to only solving quite basic problems. Around 2010 improvements in computing power and the algorithms for training the networks made much larger and more powerful networks practical. These are known as deep neural networks or deep learning.
Neural networks aren't a new technique; they have been around since the late 1940s. But until around 2010, neural networks tended to be quite small, consisting of only tens or perhaps hundreds of neurons. This limited them to solving only quite basic problems. Around 2010, improvements in computing power and the algorithms for training the networks made much larger and more powerful networks practical. These are known as deep neural networks or deep learning.

Deep learning requires extensive training using example data which shows the network what output it should produce for a given input. One common application of deep learning is [classifying](https://glosario.carpentries.org/en/#classification) images. Here the network will be trained by being "shown" a series of images and told what they contain. Once the network is trained it should be able to take another image and correctly classify its contents.

But we are not restricted to just using images; any kind of data can be learned by a deep learning neural network. This makes them able to appear to learn a set of complex rules only by being shown what the inputs and outputs of those rules are, instead of being taught the actual rules. Using these approaches, deep learning networks have been taught to play video games and even drive cars.

Deep Learning requires extensive training using example data which shows the network what output it should produce for a given input. One common application of deep learning is classifying images. Here the network will be trained by being "shown" a series of images and told what they contain. Once the network is trained it should be able to take another image and correctly classify its contents. But we are not restricted to just using images, any kind of data can be learned by a deep learning neural network. This makes them able to appear to learn a set of complex rules only by being shown what the inputs and outputs of those rules are instead of being taught the actual rules. Using these approaches deep learning networks have been taught to play video games and even drive cars. The data on which networks are trained usually has to be quite extensive, typically including thousands of examples. For this reason they are not suited to all applications and should be considered just one of many machine learning techniques which are available.
The data on which networks are trained usually has to be quite extensive, typically including thousands of examples. For this reason, they are not suited to all applications and should be considered just one of many machine learning techniques which are available.

While traditional "shallow" networks might have had between three and five layers, deep networks often have tens or even hundreds of layers. This leads to them having millions of individual weights.
The image below shows a diagram of all the layers (there are too many neurons to draw them all) on a deep learning network designed to detect pedestrians in images.
@@ -302,13 +308,13 @@ Here are just a few examples of how deep learning has been applied to some resea

### What sort of problems can deep learning solve, but should not be used for?

Deep learning needs a lot of computational power, for this reason it often relies on specialised hardware like graphical processing units (GPUs). Many computational problems can be solved using less intensive techniques, but could still technically be solved with deep learning.
Deep learning needs a lot of computational power; for this reason it often relies on specialised hardware like [graphical processing units (GPUs)](https://glosario.carpentries.org/en/#gpu). Many computational problems can be solved using less intensive techniques, but could still technically be solved with deep learning.

The following could technically be achieved using deep learning, but it would probably be a very wasteful way to do it:

* Logic operations, such as computing totals, averages, ranges etc. (see [this example](https://joelgrus.com/2016/05/23/fizz-buzz-in-tensorflow) applying deep learning to solve the "FizzBuzz" problem often used for programming interviews)
* Logic operations, such as computing totals, averages, ranges etc. (see [this example](https://joelgrus.com/2016/05/23/fizz-buzz-in-tensorflow) applying deep learning to solve the ["FizzBuzz" problem](https://en.wikipedia.org/wiki/Fizz_buzz) often used for programming interviews)
* Modelling well defined systems, where the equations governing them are known and understood.
* Basic computer vision tasks such as edge detection, decreasing colour depth or blurring an image.
* Basic computer vision tasks such as [edge detection](https://en.wikipedia.org/wiki/Edge_detection), decreasing colour depth or blurring an image.

::: challenge
## Deep Learning Problems Exercise
@@ -347,7 +353,7 @@ In case you have too little data available to train a complex network from scrat

To apply deep learning to a problem there are several steps we need to go through:

### 1. Formulate/ Outline the problem
### 1. Formulate/Outline the problem

Firstly, we must decide what it is we want our deep learning system to do. Is it going to classify some data into one of a few categories? For example, if we have an image of some handwritten characters, the neural network could classify which character it is being shown. Or is it going to perform a prediction? For example, trying to predict what the price of something will be tomorrow, given some historical data on pricing and current trends.

@@ -444,7 +450,7 @@ There are many software libraries available for deep learning including:

[Keras](https://keras.io/) is designed to be easy to use and usually requires fewer lines of code than other libraries. We have chosen it for this workshop for that reason. Keras can actually work on top of TensorFlow (and several other libraries), hiding away the complexities of TensorFlow while still allowing you to make use of their features.

The performance of Keras is sometimes not as good as other libraries and if you are going to move on to create very large networks using very large datasets then you might want to consider one of the other libraries. But for many applications the performance difference will not be enough to worry about and the time you will save with simpler code will exceed what you will save by having the code run a little faster.
The processing speed of Keras is sometimes not as high as with other libraries and if you are going to move on to create very large networks using very large datasets then you might want to consider one of the other libraries. But for many applications, the difference will not be enough to worry about and the time you will save with simpler code will exceed what you will save by having the code run a little faster.

Keras also benefits from a very good set of [online documentation](https://keras.io/guides/) and a large user community. You will find that most of the concepts from Keras translate very well across to the other libraries if you wish to learn them at a later date.

@@ -453,16 +459,17 @@ Keras also benefits from a very good set of [online documentation](https://keras
Follow the instructions in the [setup]({{ page.root }}//setup) document to install Keras, Seaborn and scikit-learn.

## Testing Keras Installation
Lets check you have a suitable version of tensorflow installed.
Keras is available as a module within TensorFlow, as described in the [setup]({{ page.root }}//setup).
Let's therefore check whether you have a suitable version of TensorFlow installed.
Open up a new Jupyter notebook or interactive python console and run the following commands:
```python
import tensorflow
print(tensorflow.__version__)
```
```output
2.15.0
2.17.0
```
You should get a version number reported. At the time of writing 2.15.0 is the latest version.
You should get a version number reported. At the time of writing 2.17.0 is the latest version.

## Testing Seaborn Installation
Let's check you have a suitable version of seaborn installed.
@@ -472,9 +479,9 @@ import seaborn
print(seaborn.__version__)
```
```output
0.12.2
0.13.2
```
You should get a version number reported. At the time of writing 0.12.2 is the latest version.
You should get a version number reported. At the time of writing 0.13.2 is the latest version.

## Testing scikit-learn Installation
Let's check you have a suitable version of scikit-learn installed.
@@ -484,9 +491,9 @@ import sklearn
print(sklearn.__version__)
```
```output
1.2.2
1.5.1
```
You should get a version number reported. At the time of writing 1.2.2 is the latest version.
You should get a version number reported. At the time of writing 1.5.1 is the latest version.


:::::::::::::::::::::::::::::::::::::: keypoints
32 changes: 17 additions & 15 deletions 2-keras.md
@@ -55,15 +55,15 @@ It can be a good idea to park some of these questions for discussion in episode

::: callout
## GPU usage
For this lesson having a GPU (graphics card) available is not needed.
For this lesson having a [GPU (graphics processing unit)](https://glosario.carpentries.org/en/#gpu) available is not needed.
We specifically use very small toy problems so that you do not need one.
However, Keras will use your GPU automatically when it is available (see the quick check sketched after this callout).
Using a GPU becomes necessary when tackling larger datasets or complex problems which
require a more complex neural network.
:::
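
As an aside, here is the quick check mentioned in the callout (a minimal sketch, assuming the TensorFlow 2.x installation used in this lesson); an empty list simply means training will run on the CPU:

```python
import tensorflow as tf

# List the GPUs TensorFlow can use; an empty list means it will fall back to the CPU.
print(tf.config.list_physical_devices("GPU"))
```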

## 1. Formulate/outline the problem: penguin classification
In this episode we will be using the [penguin dataset](https://zenodo.org/record/3960218), this is a dataset that was published in 2020 by Allison Horst and contains data on three different species of the penguins.
In this episode we will be using the [penguin dataset](https://zenodo.org/record/3960218). This is a dataset that was published in 2020 by Allison Horst and contains data on three different species of penguins.

We will use the penguin dataset to train a neural network which can classify which species a
penguin belongs to, based on their physical characteristics.
@@ -117,15 +117,15 @@ penguins.head()
| 3 | Adelie | Torgersen | NaN | NaN | NaN | NaN | NaN |
| 4 | Adelie | Torgersen | 36.7 | 19.3 | 193.0 | 3450.0 | Female |

All columns but the 'species' columns are features that we can use.
We can use all columns as features to predict the species of the penguin, except for the `species` column itself.

Let's look at the shape of the dataset:

```python
penguins.shape
```

There are 344 samples and 7 columns, so 6 features
There are 344 samples and 7 columns (plus the index column), so 6 features.

### Visualization
Looking at numbers like this usually does not give a very good intuition about the data we are
@@ -218,15 +218,15 @@ Second, the target data is also in a format that cannot be used in training.
A neural network can only take numerical inputs and outputs, and learns by
calculating how "far away" the species predicted by the neural network is
from the true species.
When the target is a string category column as we have here it is very difficult to determine this "distance" or error.
Therefore we will transform this column into a more suitable format.
Again there are many ways to do this, however we will be using the one-hot encoding.

When the target is a string category column as we have here, we need to transform this column into a numerical format first.
Again, there are many ways to do this. We will be using the one-hot encoding.
This encoding creates multiple columns, as many as there are unique values, and
puts a 1 in the column with the corresponding correct class, and 0's in
the other columns.
For instance, for a penguin of the Adelie species the one-hot encoding would be 1 0 0
For instance, for a penguin of the Adelie species the one-hot encoding would be 1 0 0.
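
As a rough illustration (not the lesson's own code, and the column order is only an assumption), this maps each species to a short vector:

```python
# Rough illustration of one-hot encoding for the three penguin species;
# the column order used here is an assumption.
species = ["Adelie", "Chinstrap", "Gentoo"]

def one_hot(label, categories):
    # 1 in the position of the matching category, 0 everywhere else
    return [1 if category == label else 0 for category in categories]

print(one_hot("Adelie", species))  # [1, 0, 0]
print(one_hot("Gentoo", species))  # [0, 0, 1]
```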

Fortunately pandas is able to generate this encoding for us.
Fortunately, Pandas is able to generate this encoding for us.
```python
import pandas as pd

@@ -285,7 +285,6 @@ This is a good time for switching instructor and/or a break.
## 4. Build an architecture from scratch or choose a pretrained model

### Keras for neural networks
We will now build our first neural network from scratch. Although this sounds like a daunting task, you will experience that with [Keras](https://keras.io/) it is actually surprisingly straightforward.

Keras is a machine learning framework with ease of use as one of its main features.
It is part of the tensorflow python package and can be imported using `from tensorflow import keras`.
@@ -342,24 +341,27 @@ let us take a closer look.
The first parameter `10` is the number of neurons we want in this layer; this is one of the
hyperparameters of our system and needs to be chosen carefully. We will get back to this in the section
on refining the model.
The second parameter is the activation function to use, here we choose relu which is 0

The second parameter is the activation function to use. We choose `relu`, which returns 0
for inputs that are 0 and below and the identity function (returning the same value)
for inputs above 0.
This is a commonly used activation function in deep neural networks that is proven to work well.
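
Written out as a tiny sketch (not part of the lesson code), `relu` is simply:

```python
def relu(x):
    # 0 for inputs at or below 0, the input itself for anything above 0
    return x if x > 0 else 0.0

print(relu(-2.0))  # 0.0
print(relu(3.5))   # 3.5
```
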
Next we see an extra set of parenthenses with inputs in them, this means that after creating an

Next we see an extra set of parentheses with inputs in them. This means that after creating an
instance of the Dense layer we call it as if it was a function.
This tells the Dense layer to connect the layer passed as a parameter, in this case the inputs.
Finally we store a reference so we can pass it to the output layer in a minute.

Finally we store a reference in the `hidden_layer` variable so we can pass it to the output layer in a minute.

Now we create another layer that will be our output layer.
Again we use a Dense layer and so the call is very similar to the previous one.
```python
output_layer = keras.layers.Dense(3, activation="softmax")(hidden_layer)
```

Because we chose the one-hot encoding, we use `3` neurons for the output layer.
Because we chose the one-hot encoding, we use three neurons for the output layer.

The softmax activation ensures that the three output neurons produce values in the range
The `softmax` activation ensures that the three output neurons produce values in the range
(0, 1) and they sum to 1.
We can interpret this as a kind of 'probability' that the sample belongs to a certain
species.
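
As a small sketch (not the lesson's code, using made-up raw outputs), softmax turns three arbitrary numbers into values that behave like probabilities:

```python
import numpy as np

def softmax(z):
    # subtract the maximum for numerical stability, then normalise the exponentials
    e = np.exp(z - np.max(z))
    return e / e.sum()

raw_outputs = np.array([2.0, 1.0, 0.1])  # made-up outputs of the three neurons
probabilities = softmax(raw_outputs)
print(probabilities)        # each value lies in (0, 1) ...
print(probabilities.sum())  # ... and together they sum to 1
```
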
9 changes: 5 additions & 4 deletions 3-monitor-the-model.md
@@ -28,7 +28,7 @@ exercises: 80
In this episode we first introduce a simple approach to the problem,
then we iterate on that a few times, step by step,
working towards a more complex solution.
Unfortunately this involves using the same code repeatedly over and over again,
Unfortunately, this involves using the same code over and over again,
only slightly adapting it.

To avoid too much typing, it can help to copy-paste code from higher up in the notebook.
@@ -160,7 +160,7 @@ This is a good time for switching instructor and/or a break.

In episode 2 we trained a dense neural network on a *classification task*. For this, one-hot encoding was used together with a `Categorical Crossentropy` loss function.
This measured how closely the distribution of the neural network outputs corresponds to the distribution of the three values in the one-hot encoding.
Now we want to work on a *regression task*, thus not predicting a class label (or integer number) for a datapoint. In regression, we like to predict one (and sometimes many) values of a feature. This is typically a floating point number.
Now we want to work on a *regression task*, thus not predicting a class label (or integer number) for a datapoint. In regression, we predict one (and sometimes many) values of a feature. This is typically a floating point number.

::: challenge
## Exercise: Architecture of the network
@@ -336,8 +336,9 @@ That is indeed a good first indicator if things are working alright, i.e. if the
However, when models become more complicated, the loss functions often become less intuitive as well.
That is why it is good practice to monitor the training process with additional, more intuitive metrics.
They are not used to optimize the model, but are simply recorded during training.
With Keras such additional metrics can be added via `metrics=[...]` parameter and can contain one or multiple metrics of interest.
Here we could for instance chose to use `'mae'` the mean absolute error, or the the *root mean squared error* (RMSE) which unlike the *mse* has the same units as the predicted values. For the sake of units, we choose the latter.

With Keras, such additional metrics can be added via the `metrics=[...]` parameter, which can contain one or multiple metrics of interest.
Here we could, for instance, choose `mae` ([mean absolute error](https://glosario.carpentries.org/en/#mean_absolute_error)), or the [*root mean squared error* (RMSE)](https://glosario.carpentries.org/en/#root_mean_squared_error), which unlike the *mse* has the same units as the predicted values. For the sake of units, we choose the latter.
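
For intuition, here is a small sketch with made-up numbers (not the lesson's code) showing that RMSE is simply the square root of the MSE and therefore stays in the same units as the predictions:

```python
import numpy as np

y_true = np.array([3.2, 4.8, 5.1])  # made-up observed values
y_pred = np.array([3.0, 5.0, 4.5])  # made-up predictions

mse = np.mean((y_pred - y_true) ** 2)  # squared units
rmse = np.sqrt(mse)                    # same units as the predictions
print(mse, rmse)
```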

```python
model.compile(optimizer='adam',
