Enable doctests
Saransh-cpp committed Jul 6, 2022
1 parent 07b8305 commit 69215fe
Showing 3 changed files with 18 additions and 16 deletions.
1 change: 1 addition & 0 deletions docs/Project.toml
@@ -1,5 +1,6 @@
[deps]
BSON = "fbb218c0-5317-5bc6-957e-2ee96dd4b1f0"
+DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
Functors = "d9f16b24-f501-4c13-a1f2-28368ffc5196"
MLDatasets = "eb30cadb-4394-5ae3-aed4-317e484a6458"
4 changes: 2 additions & 2 deletions docs/make.jl
@@ -1,10 +1,10 @@
-using Documenter, Flux, NNlib, Functors, MLUtils, BSON, Plots, MLDatasets, Statistics
+using Documenter, Flux, NNlib, Functors, MLUtils, BSON, Plots, MLDatasets, Statistics, DataFrames


DocMeta.setdocmeta!(Flux, :DocTestSetup, :(using Flux); recursive = true)

makedocs(
-    modules = [Flux, NNlib, Functors, MLUtils, BSON, Plots, MLDatasets, Statistics],
+    modules = [Flux, NNlib, Functors, MLUtils, BSON, Plots, MLDatasets, Statistics, DataFrames],
    doctest = false,
    sitename = "Flux",
    pages = [
29 changes: 15 additions & 14 deletions docs/src/getting_started/linear_regression.md
@@ -270,20 +270,22 @@ julia> using MLDatasets: BostonHousing
### Data
Let's start by initializing our dataset. We will be using the [`BostonHousing`](https://juliaml.github.io/MLDatasets.jl/stable/datasets/misc/#MLDatasets.BostonHousing) dataset, which consists of `506` data points. Each data point has `13` features and a corresponding label, the house's price. The `x`s are still mapped to a single `y`, but now each `x` data point has `13` features.

-```julia linear_regression_complex
+```jldoctest linear_regression_complex
+julia> using DataFrames
+
julia> dataset = BostonHousing()
dataset BostonHousing:
  metadata => Dict{String, Any} with 5 entries
  features => 506×13 DataFrame
  targets => 506×1 DataFrame
  dataframe => 506×14 DataFrame

-julia> x, y = BostonHousing(as_df=false)[:]
+julia> x, y = BostonHousing(as_df=false)[:];
```

We can now split the obtained data into training and testing sets -

-```julia linear_regression_complex
+```jldoctest linear_regression_complex
julia> x_train, x_test, y_train, y_test = x[:, 1:400], x[:, 401:end], y[:, 1:400], y[:, 401:end];

julia> x_train |> size, x_test |> size, y_train |> size, y_test |> size
@@ -292,14 +294,14 @@ julia> x_train |> size, x_test |> size, y_train |> size, y_test |> size

This data has a diverse set of features, which means the features are on very different scales. A wise option here is to `normalise` the data, making the training process more efficient and faster. Let's check the standard deviation of the training data before normalising it.

-```julia linear_regression_complex; filter = r"[+-]?([0-9]*[.])?[0-9]+"
+```jldoctest linear_regression_complex; filter = r"[+-]?([0-9]*[.])?[0-9]+"
julia> std(x_train)
134.06784844377117
```
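As an editorial aside (not part of the original tutorial), the scale differences are easier to see per feature. A minimal sketch, assuming `Statistics` is available and `x_train` is the `13×400` matrix from above; `feature_stds` is a hypothetical name:

```julia
using Statistics

# Each row of x_train holds one feature, so reducing over dims = 2 yields
# one standard deviation per feature, exposing the very different scales.
feature_stds = std(x_train; dims = 2)   # 13×1 matrix
```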

The data is indeed not normalised. We can use the [`Flux.normalise`](@ref) function to normalise the training data.

-```julia linear_regression_complex; filter = r"[+-]?([0-9]*[.])?[0-9]+"
+```jldoctest linear_regression_complex; filter = r"[+-]?([0-9]*[.])?[0-9]+"
julia> x_train_n = Flux.normalise(x_train);

julia> std(x_train_n)
@@ -308,7 +310,7 @@ julia> std(x_train_n)

The standard deviation is now close to one! The last step for this section would be to wrap the `x`s and `y`s together to create the training data.

-```julia linear_regression_complex
+```jldoctest linear_regression_complex
julia> train_data = [(x_train_n, y_train)];
```
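`Flux.train!` iterates over this collection and treats each `(input, label)` tuple as one batch, so here the whole training set forms a single batch. As a sketch (editorial, not in this diff), the same data could instead be supplied as two hypothetical half-batches:

```julia
# train_data may be any iterable of (x, y) tuples; Flux.train! takes one
# gradient step per tuple. Splitting our 400 samples into two batches:
train_data_batched = [(x_train_n[:, 1:200],   y_train[:, 1:200]),
                      (x_train_n[:, 201:400], y_train[:, 201:400])]
```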

@@ -317,14 +319,14 @@ Our data is ready!
### Model
We can now directly use `Flux` and let it do all the work internally! Let's define a model that takes in 13 inputs (the 13 features) and gives us a single output (the label). We will then pass our entire data through this model in one go, and `Flux` will handle everything for us! Remember, we could have declared a model in plain `Julia` as well; see the sketch after the next code block. The model will have 14 parameters: 13 weights and one bias.

-```julia linear_regression_complex
+```jldoctest linear_regression_complex
julia> model = Dense(13 => 1)
Dense(13 => 1) # 14 parameters
```
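For comparison, a minimal sketch (editorial, not in this diff) of what that plain-`Julia` declaration might look like; `W`, `b`, and `custom_model` are hypothetical names:

```julia
# Dense(13 => 1) stores a 1×13 weight matrix and a 1-element bias, and with
# its default identity activation computes exactly this affine map.
W = rand(1, 13)
b = [0.0]

custom_model(x) = W * x .+ b
```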

Same as before, our next step is to define a loss function to quantify the model's accuracy. The lower the loss, the better the model!

-```julia linear_regression_complex; filter = r"[+-]?([0-9]*[.])?[0-9]+"
+```jldoctest linear_regression_complex; filter = r"[+-]?([0-9]*[.])?[0-9]+"
julia> function loss(x, y)
           ŷ = model(x)
           Flux.mse(ŷ, y)
@@ -339,7 +341,7 @@ We can now proceed to the training phase!
### Training
Before training the model, let's initialize the optimiser and let `Flux` know that we want the derivatives of all the parameters of our `model`.

-```julia linear_regression_complex
+```jldoctest linear_regression_complex
julia> opt = Descent(0.05);

julia> params = Flux.params(model);
@@ -348,16 +350,15 @@ julia> params = Flux.params(model);
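As a side note (editorial, not part of this diff), each `Flux.train!` call over our single-batch `train_data` boils down to roughly one gradient-and-update step. A sketch under that assumption, using Flux's implicit-parameters API:

```julia
# Take gradients of the loss w.r.t. every parameter tracked in params,
# then apply one Descent step (p .-= 0.05 .* gradient).
gs = Flux.gradient(() -> loss(x_train_n, y_train), params)
Flux.Optimise.update!(opt, params, gs)
```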
Unlike our last training procedure, let's say that this time we don't want to hardcode the number of epochs. We want the training procedure to stop when the loss converges, that is, when `change in loss < δ`. The quantity `δ` can be altered according to the user's needs, but let's fix it at `10⁻³` for this tutorial.

We can write such custom training loops effortlessly using Flux and plain Julia!
-```julia linear_regression_complex
+```jldoctest linear_regression_complex
julia> loss_init = Inf;

julia> while true
-           Flux.train!(loss, params, data, opt)
+           Flux.train!(loss, params, train_data, opt)
           if loss_init == Inf
               loss_init = loss(x_train_n, y_train)
               continue
           end
-
           if abs(loss_init - loss(x_train_n, y_train)) < 1e-3
               break
           else
@@ -372,7 +373,7 @@ This custom loop works! This shows how easily a user can write down any custom t
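The diff truncates the loop at its `else` branch. A self-contained sketch of one plausible completion (editorial; it assumes the elided branch simply refreshes `loss_init`, and adds `global` so the sketch also runs as a script):

```julia
loss_init = Inf
while true
    Flux.train!(loss, params, train_data, opt)
    if loss_init == Inf
        global loss_init = loss(x_train_n, y_train)   # first epoch: record a baseline
        continue
    end
    if abs(loss_init - loss(x_train_n, y_train)) < 1e-3   # converged: change in loss < δ = 10⁻³
        break
    else
        global loss_init = loss(x_train_n, y_train)   # assumed: update baseline, keep training
    end
end
```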

Let's have a look at the loss -

-```julia linear_regression_complex; filter = r"[+-]?([0-9]*[.])?[0-9]+"
+```jldoctest linear_regression_complex; filter = r"[+-]?([0-9]*[.])?[0-9]+"
julia> loss(x_train_n, y_train)
27.127200028562164
```
@@ -382,7 +383,7 @@ The loss went down significantly! It can be minimized further by choosing an eve
### Testing
The last step of this tutorial would be to test our model using the testing data. We will first normalise the testing data and then calculate the corresponding loss.

-```julia linear_regression_complex; filter = r"[+-]?([0-9]*[.])?[0-9]+"
+```jldoctest linear_regression_complex; filter = r"[+-]?([0-9]*[.])?[0-9]+"
julia> x_test_n = Flux.normalise(x_test);

julia> loss(x_test_n, y_test)