
Correct typos, make Markdown cells more markdowny (#231)
habi authored Nov 17, 2022
1 parent c2aa3de commit 702f19d
Showing 1 changed file with 15 additions and 5 deletions.
20 changes: 15 additions & 5 deletions machine-learning/parallel-prediction.ipynb
@@ -12,7 +12,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Sometimes you'll train on a smaller dataset that fits in memory, but need to predict or score for a much larger (possibly larger than memory) dataset. Perhaps your [learning curve](http://scikit-learn.org/stable/modules/learning_curve.html) has leveled off, or you only have labels for a subset of the data.\n",
"Sometimes you'll train on a smaller dataset that fits in memory, but need to predict or score for a much larger (possibly larger than memory) dataset.\n",
"Perhaps your [learning curve](http://scikit-learn.org/stable/modules/learning_curve.html) has leveled off, or you only have labels for a subset of the data.\n",
"\n",
"In this situation, you can use [ParallelPostFit](http://ml.dask.org/modules/generated/dask_ml.wrappers.ParallelPostFit.html) to parallelize and distribute the scoring or prediction steps."
]
@@ -25,7 +26,7 @@
"source": [
"from dask.distributed import Client, progress\n",
"\n",
"# Scale up: connect to your own cluster with bmore resources\n",
"# Scale up: connect to your own cluster with more resources\n",
"# see http://dask.pydata.org/en/latest/setup.html\n",
"client = Client(processes=False, threads_per_worker=4,\n",
" n_workers=1, memory_limit='2GB')\n",
@@ -155,9 +156,11 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"y_pred is Dask arary. Workers can write the predicted values to a shared file system, without ever having to collect the data on a single machine.\n",
"`y_pred` is a Dask array.\n",
"Workers can write the predicted values to a shared file system, without ever having to collect the data on a single machine.\n",
"\n",
"Or we can check the models score on the entire large dataset. The computation will be done in parallel, and no single machine will have to hold all the data."
"Or we can check the models score on the entire large dataset.\n",
"The computation will be done in parallel, and no single machine will have to hold all the data."
]
},
{
@@ -168,6 +171,13 @@
"source": [
"clf.score(X_large, y_large)"
]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": []
}
],
"metadata": {
@@ -186,7 +196,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.12"
"version": "3.10.6"
}
},
"nbformat": 4,
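The notebook this commit edits trains on a small in-memory dataset and uses `dask_ml.wrappers.ParallelPostFit` to parallelize prediction over a large Dask array. Below is a minimal sketch of the block-wise pattern that wrapper provides, written with plain dask and scikit-learn; the `LogisticRegression` model, array sizes, and chunk sizes are illustrative assumptions, not the notebook's actual values.

```python
import numpy as np
import dask.array as da
from sklearn.linear_model import LogisticRegression

# Train on a small dataset that fits in memory on one machine.
rng = np.random.RandomState(0)
X_small = rng.normal(size=(1_000, 4))
y_small = (X_small[:, 0] > 0).astype(int)
clf = LogisticRegression().fit(X_small, y_small)

# A (potentially larger-than-memory) Dask array, split into chunks.
X_large = da.random.normal(size=(100_000, 4), chunks=(10_000, 4))

# Predict block-wise: each chunk is passed to clf.predict independently,
# so no single worker ever has to hold all of X_large at once.
# drop_axis=1 because predict maps a (rows, 4) block to a (rows,) vector.
y_pred = X_large.map_blocks(clf.predict, drop_axis=1, dtype=np.int64)

print(y_pred.shape)  # still lazy: one prediction per row of X_large
print(y_pred[:5].compute())
```

Nothing is computed until `.compute()` (or a write, e.g. `y_pred.to_zarr(...)`) is called, which is why workers can stream predictions straight to a shared file system instead of collecting them on one machine.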
