Merge pull request #258 from mdkessler/mdkessler-patch-6

[ci skip] This build is based on 881a98c. This commit was created by the following CI build and job: https://github.com/Benjamin-Lee/deep-rules/commit/881a98c3d37cd00c0a52f00f89576889db90b203/checks https://github.com/Benjamin-Lee/deep-rules/runs/309999788
Benjamin-Lee · Oct 16, 2020 · 1856fc3 · 1856fc3
1 parent c7aa620
commit 1856fc3
Show file tree

Hide file tree

Showing 20 changed files with 21,937 additions and 235 deletions.
diff --git a/README.md b/README.md
@@ -1,7 +1,7 @@
 # Output directory containing the formatted manuscript
 
 The [`gh-pages`](https://github.com/Benjamin-Lee/deep-rules/tree/gh-pages) branch hosts the contents of this directory at <https://Benjamin-Lee.github.io/deep-rules/>.
-The permalink for this webpage version is <https://Benjamin-Lee.github.io/deep-rules/v/d32411669c82800d16e694a5d123c6367f1de5aa/>.
+The permalink for this webpage version is <https://Benjamin-Lee.github.io/deep-rules/v/881a98c3d37cd00c0a52f00f89576889db90b203/>.
 To redirect to the permalink for the latest manuscript version at anytime, use the link <https://Benjamin-Lee.github.io/deep-rules/v/freeze/>.
 
 ## Files
@@ -35,4 +35,4 @@ Verifying timestamps with the `ots verify` command requires running a local bitc
 ## Source
 
 The manuscripts in this directory were built from
-[`d32411669c82800d16e694a5d123c6367f1de5aa`](https://github.com/Benjamin-Lee/deep-rules/commit/d32411669c82800d16e694a5d123c6367f1de5aa).
+[`881a98c3d37cd00c0a52f00f89576889db90b203`](https://github.com/Benjamin-Lee/deep-rules/commit/881a98c3d37cd00c0a52f00f89576889db90b203).
diff --git a/index.html b/index.html
diff --git a/manuscript.pdf b/manuscript.pdf
diff --git a/v/881a98c3d37cd00c0a52f00f89576889db90b203/images/github.svg b/v/881a98c3d37cd00c0a52f00f89576889db90b203/images/github.svg
diff --git a/v/881a98c3d37cd00c0a52f00f89576889db90b203/images/orcid.svg b/v/881a98c3d37cd00c0a52f00f89576889db90b203/images/orcid.svg
diff --git a/v/881a98c3d37cd00c0a52f00f89576889db90b203/images/overfitting.ipynb b/v/881a98c3d37cd00c0a52f00f89576889db90b203/images/overfitting.ipynb
@@ -0,0 +1,172 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Overfitting Figure Generation\n",
+    "We're going to generate `n_points` points distributed along a line, remembering that the formula for a line is $y = mx+b$. Modified (slightly) from [here](https://stackoverflow.com/a/35730618/8068638)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import numpy as np\n",
+    "%matplotlib inline\n",
+    "import matplotlib.pyplot as plt"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "n_points = 12\n",
+    "m = 1\n",
+    "b = 0\n",
+    "delta_range = 6\n",
+    "np.random.seed(14)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Now, we need to generate the testing and training \"data\""
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "points_x = np.arange(n_points)\n",
+    "training_delta = np.random.uniform(delta_range / -2.0, delta_range / 2.0, size=(n_points))\n",
+    "training_points_y = m*points_x + b + training_delta\n",
+    "\n",
+    "testing_points_x = points_x + n_points\n",
+    "testing_delta = np.random.uniform(delta_range / -2.0, delta_range / 2.0, size=(n_points))\n",
+    "testing_points_y = m*testing_points_x + b + testing_delta"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We'll overfit by generating a $n$-dimensional polynomial"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "overfitted = np.poly1d(np.polyfit(points_x, training_points_y, n_points - 1))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "x_space = np.linspace(-(n_points/5), 2*n_points+(n_points/5), n_points*100)\n",
+    "overfitted_x_space = np.linspace(-(n_points/5), 2*n_points+(n_points/5), n_points*100)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "y_overfitted = overfitted(x_space)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Plot it\n",
+    "Colors chosen from [Wong, B. (2011). Points of view: Color blindness. *Nature Methods, 8*(6), 441–441. doi:10.1038/nmeth.1618](doi.org/10.1038/nmeth.1618). I had to do some magic to make the arrays colors play nicely with matplotlib"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def rgb_to_np_rgb(r, g, b):\n",
+    "    return (r / 255, g / 255, b / 255) \n",
+    "\n",
+    "orange = rgb_to_np_rgb(230, 159, 0)\n",
+    "blueish_green = rgb_to_np_rgb(0, 158, 115)\n",
+    "vermillion = rgb_to_np_rgb(213, 94, 0)\n",
+    "blue = rgb_to_np_rgb(0, 114, 178)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# configure the plot\n",
+    "plt.rcParams[\"figure.figsize\"] = (12.8 * 0.75, 9.6 * 0.75)\n",
+    "plt.rcParams['svg.fonttype'] = 'path'\n",
+    "plt.rcParams['axes.spines.left'] = True\n",
+    "plt.rcParams['axes.spines.right'] = False\n",
+    "plt.rcParams['axes.spines.top'] = False\n",
+    "plt.rcParams['axes.spines.bottom'] = True\n",
+    "plt.rcParams[\"xtick.labelbottom\"] = False\n",
+    "plt.rcParams[\"xtick.bottom\"] = False\n",
+    "plt.rcParams[\"ytick.left\"] = False\n",
+    "plt.rcParams[\"ytick.labelleft\"] = False\n",
+    "plt.xkcd() # for fun (see https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1003858#s12)\n",
+    "\n",
+    "# plot the data\n",
+    "plt.scatter(points_x, training_points_y, zorder=3,label=\"Training data\", s=100, c=[blue])\n",
+    "plt.scatter(testing_points_x, testing_points_y, zorder=3,label=\"Out-of-range data\", s=100, c=[vermillion])\n",
+    "\n",
+    "plt.plot(x_space, m*x_space + b, zorder=2, label=\"Properly fit model\", c=blueish_green)\n",
+    "plt.plot(x_space, y_overfitted, zorder=1, label=\"Overfit model\", c=orange)\n",
+    "\n",
+    "plt.xlim(-(n_points/5) - 1, max(testing_points_x) + 1)\n",
+    "plt.ylim(-(n_points/5) - 1, max(testing_points_y)+(n_points/5) + 1)\n",
+    "\n",
+    "# plt.rcParams[\"figure.figsize\"] = [6.4*2, 4.8*2]\n",
+    "plt.legend(loc=4)\n",
+    "plt.savefig('overfitting.svg', bbox_inches='tight')\n",
+    "plt.savefig('overfitting.png', dpi=150, bbox_inches='tight')"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.7.7"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
diff --git a/v/881a98c3d37cd00c0a52f00f89576889db90b203/images/overfitting.png b/v/881a98c3d37cd00c0a52f00f89576889db90b203/images/overfitting.png