Merge pull request BVLC#1100 from cNikolaou/issue1099
Polish mnist + cifar10 examples.
shelhamer committed Sep 18, 2014
2 parents a77ca76 + 9a7f0a0 commit c3a69b7
Showing 2 changed files with 26 additions and 6 deletions.
6 changes: 3 additions & 3 deletions examples/cifar10/readme.md
@@ -11,7 +11,7 @@ Alex's CIFAR-10 tutorial, Caffe style

Alex Krizhevsky's [cuda-convnet](https://code.google.com/p/cuda-convnet/) details the model definitions, parameters, and training procedure for good performance on CIFAR-10. This example reproduces his results in Caffe.

-We will assume that you have Caffe successfully compiled. If not, please refer to the [Installation page](installation.html). In this tutorial, we will assume that your caffe installation is located at `CAFFE_ROOT`.
+We will assume that you have Caffe successfully compiled. If not, please refer to the [Installation page](/installation.html). In this tutorial, we will assume that your Caffe installation is located at `CAFFE_ROOT`.

We thank @chyojn for the pull request that defined the model schemas and solver configurations.

@@ -32,12 +32,12 @@ If it complains that `wget` or `gunzip` are not installed, you need to install t
The Model
---------

-The CIFAR-10 model is a CNN that composes layers of convolution, pooling, rectified linear unit (ReLU) nonlinearities, and local contrast normalization with a linear classifier on top of it all. We have defined the model in the `CAFFE_ROOT/examples/cifar10` directory's `cifar10_quick_train.prototxt`.
+The CIFAR-10 model is a CNN that composes layers of convolution, pooling, rectified linear unit (ReLU) nonlinearities, and local contrast normalization with a linear classifier on top of it all. We have defined the model in the `CAFFE_ROOT/examples/cifar10` directory's `cifar10_quick_train_test.prototxt`.
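To make the schema concrete, here is a rough sketch of a single conv/pool/ReLU stage in that prototxt format; the layer names and parameters are illustrative rather than copied from the file, so see `cifar10_quick_train_test.prototxt` for the actual definition:

    # convolution: 32 filters of size 5x5, padded so the output keeps its spatial size
    layers {
      name: "conv1"
      type: CONVOLUTION
      bottom: "data"
      top: "conv1"
      convolution_param {
        num_output: 32
        kernel_size: 5
        pad: 2
        stride: 1
      }
    }
    # max-pooling over 3x3 windows with stride 2
    layers {
      name: "pool1"
      type: POOLING
      bottom: "conv1"
      top: "pool1"
      pooling_param {
        pool: MAX
        kernel_size: 3
        stride: 2
      }
    }
    # ReLU nonlinearity, computed in place
    layers {
      name: "relu1"
      type: RELU
      bottom: "pool1"
      top: "pool1"
    }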

Training and Testing the "Quick" Model
--------------------------------------

-Training the model is simple after you have written the network definition protobuf and solver protobuf files. Simply run `train_quick.sh`, or the following command directly:
+Training the model is simple after you have written the network definition protobuf and solver protobuf files (refer to [MNIST Tutorial](../examples/mnist.html)). Simply run `train_quick.sh`, or the following command directly:

    cd $CAFFE_ROOT/examples/cifar10
    ./train_quick.sh
26 changes: 23 additions & 3 deletions examples/mnist/readme.md
@@ -8,7 +8,7 @@ priority: 1

# Training MNIST with Caffe

-We will assume that you have caffe successfully compiled. If not, please refer to the [Installation page](installation.html). In this tutorial, we will assume that your caffe installation is located at `CAFFE_ROOT`.
+We will assume that you have Caffe successfully compiled. If not, please refer to the [Installation page](/installation.html). In this tutorial, we will assume that your Caffe installation is located at `CAFFE_ROOT`.

## Prepare Datasets

@@ -29,7 +29,7 @@ The design of LeNet contains the essence of CNNs that are still used in larger m

## Define the MNIST Network

-This section explains the prototxt file `lenet_train.prototxt` used in the MNIST demo. We assume that you are familiar with [Google Protobuf](https://developers.google.com/protocol-buffers/docs/overview), and assume that you have read the protobuf definitions used by Caffe, which can be found at `$CAFFE_ROOT/src/caffe/proto/caffe.proto`.
+This section explains the `lenet_train_test.prototxt` model definition that specifies the LeNet model for MNIST handwritten digit classification. We assume that you are familiar with [Google Protobuf](https://developers.google.com/protocol-buffers/docs/overview), and assume that you have read the protobuf definitions used by Caffe, which can be found at `$CAFFE_ROOT/src/caffe/proto/caffe.proto`.

Specifically, we will write a `caffe::NetParameter` (or in python, `caffe.proto.caffe_pb2.NetParameter`) protobuf. We will start by giving the network a name:
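In the shipped `lenet_train_test.prototxt`, that opening statement is a single field of the `NetParameter` message:

    name: "LeNet"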

@@ -173,6 +173,25 @@ Finally, we will write the loss!

The `softmax_loss` layer implements both the softmax and the multinomial logistic loss (which saves time and improves numerical stability). It takes two blobs, the first one being the prediction and the second one being the `label` provided by the data layer (remember it?). It does not produce any outputs; all it does is compute the loss function value, report it when backpropagation starts, and initiate the gradient with respect to `ip2`. This is where all the magic starts.
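As a sketch (blob names as used elsewhere in this tutorial), the loss layer definition looks like this; note that it has no `top`, since it produces no output blob:

    layers {
      name: "loss"
      type: SOFTMAX_LOSS
      bottom: "ip2"    # predictions from the last inner product layer
      bottom: "label"  # ground-truth labels from the data layer
    }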


+### Additional Notes: Writing Layer Rules
+
+Layer definitions can include rules for whether and when they are included in the network, like the one below:
+
+    layers {
+      # ...layer definition...
+      include: { phase: TRAIN }
+    }
+
+This is a rule that controls layer inclusion in the network based on the current network state.
+You can refer to `$CAFFE_ROOT/src/caffe/proto/caffe.proto` for more information about layer rules and the model schema.
+
+In the above example, this layer will be included only in the `TRAIN` phase.
+If we change `TRAIN` to `TEST`, then this layer will be used only in the test phase.
+By default, that is, without layer rules, a layer is always included in the network.
+Thus, `lenet_train_test.prototxt` has two `DATA` layers defined (with different `batch_size`), one for the training phase and one for the testing phase.
+Also, there is an `ACCURACY` layer, included only in the `TEST` phase, that reports the model accuracy every 100 iterations, as defined in `lenet_solver.prototxt`.
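As a concrete sketch of these rules in action, the two data layers might look like the following; the sources and batch sizes follow the shipped example, but treat the details as illustrative and check `lenet_train_test.prototxt` itself:

    # training-phase data layer: reads the training LMDB, 64 images per batch
    layers {
      name: "mnist"
      type: DATA
      top: "data"
      top: "label"
      data_param {
        source: "mnist_train_lmdb"
        backend: LMDB
        batch_size: 64
      }
      include: { phase: TRAIN }
    }
    # test-phase data layer: reads the test LMDB, 100 images per batch
    layers {
      name: "mnist"
      type: DATA
      top: "data"
      top: "label"
      data_param {
        source: "mnist_test_lmdb"
        backend: LMDB
        batch_size: 100
      }
      include: { phase: TEST }
    }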

## Define the MNIST Solver

Check out the comments explaining each line in the prototxt `$CAFFE_ROOT/examples/mnist/lenet_solver.prototxt`:
@@ -203,9 +222,10 @@ Check out the comments explaining each line in the prototxt `$CAFFE_ROOT/example
    # solver mode: CPU or GPU
    solver_mode: GPU


## Training and Testing the Model

-Training the model is simple after you have written the network definition protobuf and solver protobuf files. Simply run `train_mnist.sh`, or the following command directly:
+Training the model is simple after you have written the network definition protobuf and solver protobuf files. Simply run `train_lenet.sh`, or the following command directly:

    cd $CAFFE_ROOT/examples/mnist
    ./train_lenet.sh
