Merge pull request fengdu78#45 from Liam-Yang/master
Fix the regularization term of the neural network cost function: (s_l)+1 ---> s_(l+1)
fengdu78 authored Apr 11, 2019
2 parents 6667635 + 36636b7 commit 12acc43
Showing 15 changed files with 19 additions and 13 deletions.
Binary file removed images/1fd3017dfa554642a5e1805d6d2b1fa6.jpg
Binary file added images/1fd3017dfa554642a5e1805d6d2b1fa6.png
Binary file removed images/2ea8f5ce4c3df931ee49cf8d987ef25d.jpg
Binary file added images/2ea8f5ce4c3df931ee49cf8d987ef25d.png
Binary file removed images/432c906875baca78031bd337fe0c8682.jpg
Binary file added images/432c906875baca78031bd337fe0c8682.png
Binary file removed images/4c44e69a12b48efdff2fe92a0a698768.jpg
Binary file added images/4c44e69a12b48efdff2fe92a0a698768.png
Binary file removed images/57480b04956f1dc54ecfc64d68a6b357.jpg
Binary file added images/57480b04956f1dc54ecfc64d68a6b357.png
Binary file removed images/6a0954ad41f959d7f272e8f53d4ee2de.jpg
Binary file removed images/7527e61b1612dcf84dadbcf7a26a22fb.jpg
Binary file added images/7527e61b1612dcf84dadbcf7a26a22fb.png
16 changes: 8 additions & 8 deletions markdown/week4.md
@@ -199,24 +199,24 @@ ${{z}^{\left( 2 \right)}}={{\Theta }^{\left( 1 \right)}}\times {{X}^{T}} $

The neuron shown below (its three weights are -30, 20, 20) can be viewed as acting equivalently to logical AND (**AND**):

-![](../images/57480b04956f1dc54ecfc64d68a6b357.jpg)
+![](../images/57480b04956f1dc54ecfc64d68a6b357.png)

The neuron shown below (its three weights are -10, 20, 20) can be viewed as acting equivalently to logical OR (**OR**):

-![](../images/7527e61b1612dcf84dadbcf7a26a22fb.jpg)
+![](../images/7527e61b1612dcf84dadbcf7a26a22fb.png)

The neuron shown below (its two weights are 10, -20) can be viewed as acting equivalently to logical NOT (**NOT**):

-![](../images/1fd3017dfa554642a5e1805d6d2b1fa6.jpg)
+![](../images/1fd3017dfa554642a5e1805d6d2b1fa6.png)
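
As a quick sanity check (an added sketch, not part of the original notes), the following Octave snippet evaluates the three neurons above on all binary inputs, using exactly the weights quoted in the text:

```octave
% Sketch: truth tables for the single-neuron AND, OR and NOT units above.
g = @(z) 1 ./ (1 + exp(-z));              % sigmoid activation

andN = @(x1, x2) g(-30 + 20*x1 + 20*x2);  % ~1 only when x1 = x2 = 1
orN  = @(x1, x2) g(-10 + 20*x1 + 20*x2);  % ~1 when x1 = 1 or x2 = 1
notN = @(x1)     g( 10 - 20*x1);          % ~1 only when x1 = 0

for x1 = [0 1]
  for x2 = [0 1]
    printf("x1=%d x2=%d  AND=%.3f  OR=%.3f\n", x1, x2, andN(x1, x2), orN(x1, x2));
  end
end
printf("NOT(0)=%.3f  NOT(1)=%.3f\n", notN(0), notN(1));
```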

We can combine neurons to build more complex neural networks that implement more complex operations. For example, suppose we want to implement the **XNOR** function (the two inputs must be the same, both 1 or both 0), i.e. $\text{XNOR}=( \text{x}_1\, \text{AND}\, \text{x}_2 )\, \text{OR} \left( \left( \text{NOT}\, \text{x}_1 \right) \text{AND} \left( \text{NOT}\, \text{x}_2 \right) \right)$
We first construct a neuron that expresses the $\left( \text{NOT}\, \text{x}_1 \right) \text{AND} \left( \text{NOT}\, \text{x}_2 \right)$ part:

-![](../images/4c44e69a12b48efdff2fe92a0a698768.jpg)
+![](../images/4c44e69a12b48efdff2fe92a0a698768.png)

Then we combine the neuron representing **AND**, the neuron representing $\left( \text{NOT}\, \text{x}_1 \right) \text{AND} \left( \text{NOT}\, \text{x}_2 \right)$, and the neuron representing **OR**:

-![](../images/432c906875baca78031bd337fe0c8682.jpg)
+![](../images/432c906875baca78031bd337fe0c8682.png)
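
A minimal sketch of this composition (the weights 10, -20, -20 for the $(\text{NOT}\, \text{x}_1)\, \text{AND}\, (\text{NOT}\, \text{x}_2)$ unit are an assumed choice; the other weights are the ones quoted above):

```octave
% Sketch: XNOR built from the units above.
% Hidden unit a1 = x1 AND x2, hidden unit a2 = (NOT x1) AND (NOT x2),
% output unit = a1 OR a2.
g = @(z) 1 ./ (1 + exp(-z));

Theta1 = [-30 20 20;      % a1: AND(x1, x2)
           10 -20 -20];   % a2: (NOT x1) AND (NOT x2)  (assumed weights)
Theta2 = [-10 20 20];     % output: OR(a1, a2)

for x1 = [0 1]
  for x2 = [0 1]
    a = g(Theta1 * [1; x1; x2]);   % hidden layer, with bias unit prepended
    h = g(Theta2 * [1; a]);        % output layer
    printf("x1=%d x2=%d  XNOR~=%.3f\n", x1, x2, h);
  end
end
```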

This gives us a neural network that implements the $\text{XNOR}$ operator.

16 changes: 11 additions & 5 deletions markdown/week5.md
@@ -28,11 +28,11 @@ $ J\left(\theta \right)=-\frac{1}{m}\left[\sum_\limits{i=1}^{m}{y}^{(i)}\log{h_
In logistic regression we had only one output variable, a scalar (**scalar**), and only one dependent variable $y$. In a neural network, however, we can have many output variables: our $h_\theta(x)$ is a vector of dimension $K$, and the dependent variables in the training set are vectors of the same dimension. Our cost function is therefore somewhat more complex than that of logistic regression:$\newcommand{\subk}[1]{ #1_k }$
$$h_\theta\left(x\right)\in \mathbb{R}^{K}$$ $${\left({h_\theta}\left(x\right)\right)}_{i}={i}^{th} \text{output}$$

-$J(\Theta) = -\frac{1}{m} \left[ \sum\limits_{i=1}^{m} \sum\limits_{k=1}^{k} {y_k}^{(i)} \log \subk{(h_\Theta(x^{(i)}))} + \left( 1 - y_k^{(i)} \right) \log \left( 1- \subk{\left( h_\Theta \left( x^{(i)} \right) \right)} \right) \right] + \frac{\lambda}{2m} \sum\limits_{l=1}^{L-1} \sum\limits_{i=1}^{s_l} \sum\limits_{j=1}^{s_l+1} \left( \Theta_{ji}^{(l)} \right)^2$
+$J(\Theta) = -\frac{1}{m} \left[ \sum\limits_{i=1}^{m} \sum\limits_{k=1}^{k} {y_k}^{(i)} \log \subk{(h_\Theta(x^{(i)}))} + \left( 1 - y_k^{(i)} \right) \log \left( 1- \subk{\left( h_\Theta \left( x^{(i)} \right) \right)} \right) \right] + \frac{\lambda}{2m} \sum\limits_{l=1}^{L-1} \sum\limits_{i=1}^{s_l} \sum\limits_{j=1}^{s_{l+1}} \left( \Theta_{ji}^{(l)} \right)^2$

The idea behind this seemingly much more complicated cost function is still the same: we want the cost function to tell us how far the algorithm's predictions are from the ground truth. The only difference is that for each row of features we now produce $K$ predictions. Essentially, we can use a loop to make $K$ different predictions for each row of features, then use another loop to pick the most probable one among the $K$ predictions and compare it with the actual value in $y$.

-The regularization term is simply the sum over each layer's $\theta$ matrix after excluding that layer's $\theta_0$. The innermost loop over $j$ runs through all the rows (determined by the number of activation units in layer $s_l$ +1), while the loop over $i$ runs through all the columns (determined by the number of activation units of that layer, layer $s_l$). In other words: the distance between $h_\theta(x)$ and the true values is summed over every sample and every class output, and the parameters are **regularized** by the sum of the squares of all parameters after excluding the **bias** term.
+The regularization term is simply the sum over each layer's $\theta$ matrix after excluding that layer's $\theta_0$. The innermost loop over $j$ runs through all the rows (determined by the number of activation units in layer $s_{l+1}$), while the loop over $i$ runs through all the columns (determined by the number of activation units of that layer, layer $s_l$). In other words: the distance between $h_\theta(x)$ and the true values is summed over every sample and every class output, and the parameters are **regularized** by the sum of the squares of all parameters after excluding the **bias** term.
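
To illustrate the corrected indexing (a sketch with made-up layer sizes, not code from the notes): the regularization term sums $\left(\Theta^{(l)}_{ji}\right)^2$ over $j = 1 \dots s_{l+1}$ and $i = 1 \dots s_l$, i.e. over every entry of each $\Theta^{(l)}$ except its first column of bias weights:

```octave
% Sketch: regularization term for a toy 3-4-2 network.
% Theta{l} has size s_(l+1) x (s_l + 1); the first column holds the bias weights.
lambda = 1;  m = 100;                      % example values
Theta = {rand(4, 3 + 1), rand(2, 4 + 1)};

reg = 0;
for l = 1:numel(Theta)
  T = Theta{l}(:, 2:end);                  % drop the theta_0 (bias) column
  reg = reg + sum(T(:) .^ 2);              % j runs over s_(l+1) rows, i over s_l columns
end
reg = lambda / (2 * m) * reg;
printf("regularization term = %.4f\n", reg);
```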

### 9.2 Backpropagation Algorithm

@@ -47,9 +47,9 @@ $J(\Theta) = -\frac{1}{m} \left[ \sum\limits_{i=1}^{m} \sum\limits_{k=1}^{k} {y_

The forward propagation algorithm:

-![](../images/2ea8f5ce4c3df931ee49cf8d987ef25d.jpg)
+![](../images/2ea8f5ce4c3df931ee49cf8d987ef25d.png)

-![](../images/6a0954ad41f959d7f272e8f53d4ee2de.jpg)
+For the derivation of the formulas below, see: <https://blog.csdn.net/qq_29762941/article/details/80343185>

We start by computing the error of the last layer: the error is the difference between the activation units' prediction (${a^{(4)}}$) and the actual value ($y^k$), where ($k=1:k$).
We use $\delta$ to denote the error, so: $\delta^{(4)}=a^{(4)}-y$
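
A short sketch (with made-up layer sizes and random parameters, not code from the notes) of one forward pass through a 4-layer network followed by this output-layer error:

```octave
% Sketch: forward propagation through a 4-layer network, then delta4 = a4 - y.
g = @(z) 1 ./ (1 + exp(-z));

Theta = {rand(5, 3 + 1), rand(5, 5 + 1), rand(4, 5 + 1)};  % toy parameters
x = [0.5; -1.2; 3.0];        % one training example (3 features)
y = [0; 0; 1; 0];            % its label as a K = 4 dimensional vector

a1 = [1; x];                 % input layer, bias unit prepended
a2 = [1; g(Theta{1} * a1)];  % hidden layer 2
a3 = [1; g(Theta{2} * a2)];  % hidden layer 3
a4 = g(Theta{3} * a3);       % output layer: h_theta(x)

delta4 = a4 - y;             % error of the output layer
disp(delta4);
```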
@@ -119,6 +119,12 @@ Theta1 = reshape(thetaVec(221:231, 1, 11);

![](../images/1542307ad9033e39093e7f28d0c7146c.png)

+**Insight**: the expression $\delta^{(l)}_{j}=\text{"error" of cost for } a^{(l)}_{j} \ (\text{unit } j \text{ in layer } l)$ in the figure above can be understood as follows:
+
+$\delta^{(l)}_{j}$ is, in effect, the "error" of the activation obtained at unit $j$ of layer $l$, i.e. the difference between the "correct" $a^{(l)}_{j}$ and the computed $a^{(l)}_{j}$.
+
+Since $a^{(l)}_{j}=g(z^{(l)})$ (where $g$ is the sigmoid function), we can picture $\delta^{(l)}_{j}$ as the tiny differential step taken when differentiating, so more precisely $\delta^{(l)}_{j}=\frac{\partial}{\partial z^{(l)}_{j}}\text{cost}(i)$.
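
As a quick check of that last identity (added here, assuming a single output unit with sigmoid activation and the cross-entropy cost used above):

$$
\begin{aligned}
\text{cost} &= -\big[\, y\log a + (1-y)\log(1-a) \,\big], \qquad a=g(z)=\frac{1}{1+e^{-z}}, \quad g'(z)=a(1-a) \\
\frac{\partial \,\text{cost}}{\partial z} &= \left( -\frac{y}{a} + \frac{1-y}{1-a} \right) a(1-a) = -y(1-a) + (1-y)\,a = a - y
\end{aligned}
$$

which recovers $\delta^{(4)} = a^{(4)} - y$ for the output layer.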

### 9.4 Implementation Note: Unrolling Parameters

Reference video: 9 - 4 - Implementation Note\_ Unrolling Parameters (8 min).mkv
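
For context (a sketch following the idea of this section; the 10/10/1 layer sizes are assumed for illustration rather than taken verbatim from the notes): unrolling stacks every $\Theta$ matrix into one long vector for the optimizer, and `reshape` recovers the matrices afterwards.

```octave
% Sketch: unroll three parameter matrices into one vector, then recover them.
Theta1 = rand(10, 11);  Theta2 = rand(10, 11);  Theta3 = rand(1, 11);

thetaVec = [Theta1(:); Theta2(:); Theta3(:)];   % unrolled 231 x 1 vector

% reshape back into the original matrices
Theta1r = reshape(thetaVec(1:110),   10, 11);
Theta2r = reshape(thetaVec(111:220), 10, 11);
Theta3r = reshape(thetaVec(221:231), 1, 11);
```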
