markdown source builds
Auto-generated via {sandpaper}
Source  : 40fe409
Branch  : main
Author  : Carsten Schnober <[email protected]>
Time    : 2024-07-22 12:51:30 +0000
Message : Merge pull request #496 from carpentries-incubator/fix/minor_comments_review_sarah

Fix/minor comments review sarah
actions-user committed Jul 22, 2024
1 parent 1496a74 commit 6b6c891
Showing 7 changed files with 63 additions and 51 deletions.
37 changes: 22 additions & 15 deletions 1-introduction.md
@@ -127,7 +127,9 @@ https://commons.wikimedia.org/w/index.php?curid=44920600, https://commons.wikime


##### Combining multiple neurons into a network
Multiple neurons can be joined together by connecting the output of one to the input of another. These connections are associated with weights that determine the 'strength' of the connection, the weights are adjusted during training. In this way, the combination of neurons and connections describe a computational graph, an example can be seen in the image below. In most neural networks neurons are aggregated into layers. Signals travel from the input layer to the output layer, possibly through one or more intermediate layers called hidden layers.
Multiple neurons can be joined together by connecting the output of one to the input of another. These connections are associated with weights that determine the 'strength' of the connection; the weights are adjusted during training. In this way, the combination of neurons and connections describes a computational graph; an example can be seen in the image below.
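
To make the idea of weighted connections concrete, here is a minimal sketch (not part of the lesson code, using made-up weights and a simple threshold activation) of two neurons feeding a third:

```python
# Minimal sketch with made-up weights: two hidden neurons feeding one output neuron.
import numpy as np

def neuron(inputs, weights, bias):
    # weighted sum of the inputs, followed by a simple threshold activation
    return 1.0 if np.dot(inputs, weights) + bias > 0 else 0.0

x = np.array([0.5, -1.2])                                       # two input values
h1 = neuron(x, np.array([0.8, 0.3]), bias=0.1)                  # first hidden neuron
h2 = neuron(x, np.array([-0.5, 0.9]), bias=0.0)                 # second hidden neuron
y = neuron(np.array([h1, h2]), np.array([1.0, -1.0]), bias=0.2) # output neuron
print(y)
```

During training it is these weights and biases that would be adjusted; here they are fixed, made-up numbers.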

In most neural networks, neurons are aggregated into layers. Signals travel from the input layer to the output layer, possibly through one or more intermediate layers called hidden layers.
The image below shows an example of a neural network with three layers: each circle is a neuron, each line is an edge, and the arrows indicate the direction data moves in.

![
@@ -197,9 +199,13 @@ b. This solves the XOR logical problem, the output is 1 if only one of the two i
:::

##### What makes deep learning deep learning?
Neural networks aren't a new technique, they have been around since the late 1940s. But until around 2010 neural networks tended to be quite small, consisting of only 10s or perhaps 100s of neurons. This limited them to only solving quite basic problems. Around 2010 improvements in computing power and the algorithms for training the networks made much larger and more powerful networks practical. These are known as deep neural networks or deep learning.
Neural networks aren't a new technique; they have been around since the late 1940s. But until around 2010, neural networks tended to be quite small, consisting of only tens or perhaps hundreds of neurons. This limited them to solving only quite basic problems. Around 2010, improvements in computing power and the algorithms for training the networks made much larger and more powerful networks practical. These are known as deep neural networks or deep learning.

Deep learning requires extensive training using example data which shows the network what output it should produce for a given input. One common application of deep learning is [classifying](https://glosario.carpentries.org/en/#classification) images. Here the network will be trained by being "shown" a series of images and told what they contain. Once the network is trained it should be able to take another image and correctly classify its contents.

But we are not restricted to just using images; any kind of data can be learned by a deep learning neural network. This makes them able to appear to learn a set of complex rules only by being shown what the inputs and outputs of those rules are, instead of being taught the actual rules. Using these approaches, deep learning networks have been taught to play video games and even drive cars.

Deep Learning requires extensive training using example data which shows the network what output it should produce for a given input. One common application of deep learning is classifying images. Here the network will be trained by being "shown" a series of images and told what they contain. Once the network is trained it should be able to take another image and correctly classify its contents. But we are not restricted to just using images, any kind of data can be learned by a deep learning neural network. This makes them able to appear to learn a set of complex rules only by being shown what the inputs and outputs of those rules are instead of being taught the actual rules. Using these approaches deep learning networks have been taught to play video games and even drive cars. The data on which networks are trained usually has to be quite extensive, typically including thousands of examples. For this reason they are not suited to all applications and should be considered just one of many machine learning techniques which are available.
The data on which networks are trained usually has to be quite extensive, typically including thousands of examples. For this reason, they are not suited to all applications and should be considered just one of many machine learning techniques which are available.

While traditional "shallow" networks might have had between three and five layers, deep networks often have tens or even hundreds of layers. This leads to them having millions of individual weights.
The image below shows a diagram of all the layers (there are too many neurons to draw them all) on a deep learning network designed to detect pedestrians in images.
@@ -302,13 +308,13 @@ Here are just a few examples of how deep learning has been applied to some resea

### What sort of problems can deep learning solve, but should not be used for?

Deep learning needs a lot of computational power, for this reason it often relies on specialised hardware like graphical processing units (GPUs). Many computational problems can be solved using less intensive techniques, but could still technically be solved with deep learning.
Deep learning needs a lot of computational power; for this reason it often relies on specialised hardware like [graphical processing units (GPUs)](https://glosario.carpentries.org/en/#gpu). Many computational problems can be solved using less intensive techniques, but could still technically be solved with deep learning.

The following could technically be achieved using deep learning, but it would probably be a very wasteful way to do it:

* Logic operations, such as computing totals, averages, ranges etc. (see [this example](https://joelgrus.com/2016/05/23/fizz-buzz-in-tensorflow) applying deep learning to solve the "FizzBuzz" problem often used for programming interviews)
* Logic operations, such as computing totals, averages, ranges etc. (see [this example](https://joelgrus.com/2016/05/23/fizz-buzz-in-tensorflow) applying deep learning to solve the ["FizzBuzz" problem](https://en.wikipedia.org/wiki/Fizz_buzz) often used for programming interviews)
* Modelling well defined systems, where the equations governing them are known and understood.
* Basic computer vision tasks such as edge detection, decreasing colour depth or blurring an image.
* Basic computer vision tasks such as [edge detection](https://en.wikipedia.org/wiki/Edge_detection), decreasing colour depth or blurring an image.

::: challenge
## Deep Learning Problems Exercise
@@ -347,7 +353,7 @@ In case you have too little data available to train a complex network from scrat

To apply deep learning to a problem there are several steps we need to go through:

### 1. Formulate/ Outline the problem
### 1. Formulate/Outline the problem

Firstly, we must decide what it is we want our deep learning system to do. Is it going to classify some data into one of a few categories? For example, if we have an image of some handwritten characters, the neural network could classify which character it is being shown. Or is it going to perform a prediction? For example, trying to predict what the price of something will be tomorrow, given some historical data on pricing and current trends.

@@ -444,7 +450,7 @@ There are many software libraries available for deep learning including:

[Keras](https://keras.io/) is designed to be easy to use and usually requires fewer lines of code than other libraries. We have chosen it for this workshop for that reason. Keras can actually work on top of TensorFlow (and several other libraries), hiding away the complexities of TensorFlow while still allowing you to make use of their features.

The performance of Keras is sometimes not as good as other libraries and if you are going to move on to create very large networks using very large datasets then you might want to consider one of the other libraries. But for many applications the performance difference will not be enough to worry about and the time you will save with simpler code will exceed what you will save by having the code run a little faster.
The processing speed of Keras is sometimes not as high as with other libraries and if you are going to move on to create very large networks using very large datasets then you might want to consider one of the other libraries. But for many applications, the difference will not be enough to worry about and the time you will save with simpler code will exceed what you will save by having the code run a little faster.

Keras also benefits from a very good set of [online documentation](https://keras.io/guides/) and a large user community. You will find that most of the concepts from Keras translate very well across to the other libraries if you wish to learn them at a later date.

@@ -453,16 +459,17 @@ Keras also benefits from a very good set of [online documentation](https://keras
Follow the instructions in the [setup]({{ page.root }}//setup) document to install Keras, Seaborn and scikit-learn.

## Testing Keras Installation
Lets check you have a suitable version of tensorflow installed.
Keras is available as a module within TensorFlow, as described in the [setup]({{ page.root }}//setup).
Let's therefore check whether you have a suitable version of TensorFlow installed.
Open up a new Jupyter notebook or interactive python console and run the following commands:
```python
import tensorflow
print(tensorflow.__version__)
```
```output
2.15.0
2.17.0
```
You should get a version number reported. At the time of writing 2.15.0 is the latest version.
You should get a version number reported. At the time of writing 2.17.0 is the latest version.

## Testing Seaborn Installation
Let's check you have a suitable version of seaborn installed.
@@ -472,9 +479,9 @@ import seaborn
print(seaborn.__version__)
```
```output
0.12.2
0.13.2
```
You should get a version number reported. At the time of writing 0.12.2 is the latest version.
You should get a version number reported. At the time of writing 0.13.2 is the latest version.

## Testing scikit-learn Installation
Let's check you have a suitable version of scikit-learn installed.
@@ -484,9 +491,9 @@ import sklearn
print(sklearn.__version__)
```
```output
1.2.2
1.5.1
```
You should get a version number reported. At the time of writing 1.2.2 is the latest version.
You should get a version number reported. At the time of writing 1.5.1 is the latest version.


:::::::::::::::::::::::::::::::::::::: keypoints
32 changes: 17 additions & 15 deletions 2-keras.md
@@ -55,15 +55,15 @@ It can be a good idea to park some of these questions for discussion in episode

::: callout
## GPU usage
For this lesson having a GPU (graphics card) available is not needed.
For this lesson having a [GPU (graphics processing unit)](https://glosario.carpentries.org/en/#gpu) available is not needed.
We specifically use very small toy problems so that you do not need one.
However, Keras will use your GPU automatically when it is available (see the quick check sketched after this callout).
Using a GPU becomes necessary when tackling larger datasets or complex problems which
require a more complex neural network.
:::
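
As an aside, here is the quick check mentioned in the callout (a minimal sketch, assuming the TensorFlow 2.x installation used in this lesson); an empty list simply means training will run on the CPU:

```python
import tensorflow as tf

# List the GPUs TensorFlow can use; an empty list means it will fall back to the CPU.
print(tf.config.list_physical_devices("GPU"))
```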

## 1. Formulate/outline the problem: penguin classification
In this episode we will be using the [penguin dataset](https://zenodo.org/record/3960218), this is a dataset that was published in 2020 by Allison Horst and contains data on three different species of the penguins.
In this episode we will be using the [penguin dataset](https://zenodo.org/record/3960218). This is a dataset that was published in 2020 by Allison Horst and contains data on three different species of penguins.

We will use the penguin dataset to train a neural network which can classify which species a
penguin belongs to, based on their physical characteristics.
@@ -117,15 +117,15 @@ penguins.head()
| 3 | Adelie | Torgersen | NaN | NaN | NaN | NaN | NaN |
| 4 | Adelie | Torgersen | 36.7 | 19.3 | 193.0 | 3450.0 | Female |

All columns but the 'species' columns are features that we can use.
We can use all columns as features to predict the species of the penguin, except for the `species` column itself.

Let's look at the shape of the dataset:

```python
penguins.shape
```

There are 344 samples and 7 columns, so 6 features
There are 344 samples and 7 columns (plus the index column), so 6 features.

### Visualization
Looking at numbers like this usually does not give a very good intuition about the data we are
@@ -218,15 +218,15 @@ Second, the target data is also in a format that cannot be used in training.
A neural network can only take numerical inputs and outputs, and learns by
calculating how "far away" the species predicted by the neural network is
from the true species.
When the target is a string category column as we have here it is very difficult to determine this "distance" or error.
Therefore we will transform this column into a more suitable format.
Again there are many ways to do this, however we will be using the one-hot encoding.

When the target is a string category column as we have here, we need to transform this column into a numerical format first.
Again, there are many ways to do this. We will be using the one-hot encoding.
This encoding creates multiple columns, as many as there are unique values, and
puts a 1 in the column with the corresponding correct class, and 0's in
the other columns.
For instance, for a penguin of the Adelie species the one-hot encoding would be 1 0 0
For instance, for a penguin of the Adelie species the one-hot encoding would be 1 0 0.
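
As a rough illustration (not the lesson's own code, and the column order is only an assumption), this maps each species to a short vector:

```python
# Rough illustration of one-hot encoding for the three penguin species;
# the column order used here is an assumption.
species = ["Adelie", "Chinstrap", "Gentoo"]

def one_hot(label, categories):
    # 1 in the position of the matching category, 0 everywhere else
    return [1 if category == label else 0 for category in categories]

print(one_hot("Adelie", species))  # [1, 0, 0]
print(one_hot("Gentoo", species))  # [0, 0, 1]
```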

Fortunately pandas is able to generate this encoding for us.
Fortunately, Pandas is able to generate this encoding for us.
```python
import pandas as pd

@@ -285,7 +285,6 @@ This is a good time for switching instructor and/or a break.
## 4. Build an architecture from scratch or choose a pretrained model

### Keras for neural networks
We will now build our first neural network from scratch. Although this sounds like a daunting task, you will experience that with [Keras](https://keras.io/) it is actually surprisingly straightforward.

Keras is a machine learning framework with ease of use as one of its main features.
It is part of the tensorflow python package and can be imported using `from tensorflow import keras`.
@@ -342,24 +341,27 @@ let us take a closer look.
The first parameter `10` is the number of neurons we want in this layer; this is one of the
hyperparameters of our system and needs to be chosen carefully. We will get back to this in the section
on refining the model.
The second parameter is the activation function to use, here we choose relu which is 0

The second parameter is the activation function to use. We choose `relu`, which returns 0
for inputs that are 0 and below and the identity function (returning the same value)
for inputs above 0.
This is a commonly used activation function in deep neural networks that is proven to work well.
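
Written out as a tiny sketch (not part of the lesson code), `relu` is simply:

```python
def relu(x):
    # 0 for inputs at or below 0, the input itself for anything above 0
    return x if x > 0 else 0.0

print(relu(-2.0))  # 0.0
print(relu(3.5))   # 3.5
```
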
Next we see an extra set of parenthenses with inputs in them, this means that after creating an

Next we see an extra set of parentheses with inputs in them. This means that after creating an
instance of the Dense layer we call it as if it was a function.
This tells the Dense layer to connect the layer passed as a parameter, in this case the inputs.
Finally we store a reference so we can pass it to the output layer in a minute.

Finally we store a reference in the `hidden_layer` variable so we can pass it to the output layer in a minute.

Now we create another layer that will be our output layer.
Again we use a Dense layer and so the call is very similar to the previous one.
```python
output_layer = keras.layers.Dense(3, activation="softmax")(hidden_layer)
```

Because we chose the one-hot encoding, we use `3` neurons for the output layer.
Because we chose the one-hot encoding, we use three neurons for the output layer.

The softmax activation ensures that the three output neurons produce values in the range
The `softmax` activation ensures that the three output neurons produce values in the range
(0, 1) and they sum to 1.
We can interpret this as a kind of 'probability' that the sample belongs to a certain
species.
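
As a small sketch (not the lesson's code, using made-up raw outputs), softmax turns three arbitrary numbers into values that behave like probabilities:

```python
import numpy as np

def softmax(z):
    # subtract the maximum for numerical stability, then normalise the exponentials
    e = np.exp(z - np.max(z))
    return e / e.sum()

raw_outputs = np.array([2.0, 1.0, 0.1])  # made-up outputs of the three neurons
probabilities = softmax(raw_outputs)
print(probabilities)        # each value lies in (0, 1) ...
print(probabilities.sum())  # ... and together they sum to 1
```
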
9 changes: 5 additions & 4 deletions 3-monitor-the-model.md
@@ -28,7 +28,7 @@ exercises: 80
In this episode we first introduce a simple approach to the problem,
then we iterate on that a few times, step by step,
working towards a more complex solution.
Unfortunately this involves using the same code repeatedly over and over again,
Unfortunately, this involves using the same code over and over again,
only slightly adapting it.

To avoid too much typing, it can help to copy-paste code from higher up in the notebook.
@@ -160,7 +160,7 @@ This is a good time for switching instructor and/or a break.

In episode 2 we trained a dense neural network on a *classification task*. For this, one-hot encoding was used together with a `Categorical Crossentropy` loss function.
This measured how closely the distribution of the neural network outputs corresponds to the distribution of the three values in the one-hot encoding.
Now we want to work on a *regression task*, thus not predicting a class label (or integer number) for a datapoint. In regression, we like to predict one (and sometimes many) values of a feature. This is typically a floating point number.
Now we want to work on a *regression task*, thus not predicting a class label (or integer number) for a datapoint. In regression, we predict one (and sometimes many) values of a feature. This is typically a floating point number.

::: challenge
## Exercise: Architecture of the network
@@ -336,8 +336,9 @@ That is indeed a good first indicator if things are working alright, i.e. if the
However, when models become more complicated, the loss functions often become less intuitive as well.
That is why it is good practice to monitor the training process with additional, more intuitive metrics.
They are not used to optimize the model, but are simply recorded during training.
With Keras such additional metrics can be added via `metrics=[...]` parameter and can contain one or multiple metrics of interest.
Here we could for instance chose to use `'mae'` the mean absolute error, or the the *root mean squared error* (RMSE) which unlike the *mse* has the same units as the predicted values. For the sake of units, we choose the latter.

With Keras, such additional metrics can be added via the `metrics=[...]` parameter, which can contain one or multiple metrics of interest.
Here we could, for instance, choose `mae` ([mean absolute error](https://glosario.carpentries.org/en/#mean_absolute_error)), or the [*root mean squared error* (RMSE)](https://glosario.carpentries.org/en/#root_mean_squared_error), which unlike the *mse* has the same units as the predicted values. For the sake of units, we choose the latter.
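
For intuition, here is a small sketch with made-up numbers (not the lesson's code) showing that RMSE is simply the square root of the MSE and therefore stays in the same units as the predictions:

```python
import numpy as np

y_true = np.array([3.2, 4.8, 5.1])  # made-up observed values
y_pred = np.array([3.0, 5.0, 4.5])  # made-up predictions

mse = np.mean((y_pred - y_true) ** 2)  # squared units
rmse = np.sqrt(mse)                    # same units as the predictions
print(mse, rmse)
```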

```python
model.compile(optimizer='adam',
