diff --git a/docs/en/week02/02-2.md b/docs/en/week02/02-2.md
index 6c57b9520..0e9bba7e8 100644
--- a/docs/en/week02/02-2.md
+++ b/docs/en/week02/02-2.md
@@ -194,7 +194,7 @@ Essentially, regularization tries to tell the system to minimize the cost functi
 
 ### Weight initialisation
 
-The weights need to be initialised at random, however, they shouldn't be too large or too small such that output is roughly of the same variance as that of input. There are various weight initialisation tricks built into PyTorch. One of the tricks that works well for deep models is Kaiming initialisation where the variance of the weights is inversely proportional to square root of number of inputs.
+The weights need to be initialised at random, however, they shouldn't be too large or too small such that output is roughly of the same variance as that of input. There are various weight initialisation tricks built into PyTorch. One of the tricks that works well for deep models is Kaiming initialisation where the standard deviation of the weights is inversely proportional to square root of number of inputs.
 
 ### Use dropout
 
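
Note on the weight-initialisation hunk: the wording change (variance to standard deviation) can be sanity-checked numerically. The following is a minimal sketch using only the Python standard library, not PyTorch's own implementation. The helper names `kaiming_normal` and `empirical_std` are hypothetical, and the gain of `sqrt(2)` is an assumption corresponding to ReLU layers. It samples weights with standard deviation `gain / sqrt(fan_in)` and compares the measured spread against that target.

```python
import math
import random
import statistics


def kaiming_normal(fan_in, fan_out, gain=math.sqrt(2.0), seed=0):
    """Sample a fan_out x fan_in weight matrix with std = gain / sqrt(fan_in).

    Hypothetical helper for illustration; gain=sqrt(2) assumes ReLU units.
    """
    rng = random.Random(seed)
    std = gain / math.sqrt(fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_in)] for _ in range(fan_out)]


def empirical_std(weights):
    """Population standard deviation over all entries of the matrix."""
    flat = [w for row in weights for w in row]
    return statistics.pstdev(flat)


# Compare measured and target standard deviation for a few layer widths.
for fan_in in (64, 256, 1024):
    w = kaiming_normal(fan_in, fan_out=200)
    target = math.sqrt(2.0 / fan_in)
    print(fan_in, round(empirical_std(w), 4), round(target, 4))
```

Quadrupling `fan_in` should roughly halve the measured standard deviation, which matches the "inversely proportional to square root of number of inputs" wording on the `+` line. The variance, by contrast, scales as `1 / fan_in`. In PyTorch itself, the built-in `torch.nn.init.kaiming_normal_` would normally be used instead of a hand-rolled sampler like this one.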