From aad41cc4e252f7b35afc24af7d6734fdc4cd53e6 Mon Sep 17 00:00:00 2001
From: Sarah M Brown <brownsarahm@uri.edu>
Date: Fri, 23 Oct 2020 15:15:55 -0400
Subject: [PATCH] note on numpy array indexing

---
 notes/2020-10-23.md | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/notes/2020-10-23.md b/notes/2020-10-23.md
index 97517efd..acee74d7 100644
--- a/notes/2020-10-23.md
+++ b/notes/2020-10-23.md
@@ -20,14 +20,14 @@ kernelspec:
 1. Accept assignment 7
 ```
 
-<!-- annotate: Assignment 7  --> 
-## Assignment 7 
+<!-- annotate: Assignment 7  -->
+## Assignment 7
 
 Make a plan with a group:
 - what methods do you need to use in part 1?
 - try to outline with psuedocode what you'll do for part 2 & 3
 
-Share any questions you have. 
+Share any questions you have.
 
 Followup:
 1. assignment clarified to require 3 values for the parameter in part 2
@@ -35,7 +35,7 @@ Followup:
 
 +++
 
-<!-- annotate: Complexity of Decision Trees --> 
+<!-- annotate: Complexity of Decision Trees -->
 ## Complexity of Decision Trees
 
 ```{code-cell} ipython3
@@ -54,6 +54,13 @@ df6= pd.read_csv(d6_url,usecols=[1,2,3])
 df6.head()
 ```
 
+````{margin}
+```{note}
+`df6.values` is a numpy array, which is a good data structure for storing matrices of data. We can index into numpy arrays using `[rows, columns]`. Here, `df6.values[:,:2]` takes all the rows (`:`) and the columns up to, but not including, index 2 (`:2`) for the features (X), and `df6.values[:,2]` takes all the rows of the column at index 2 for the target (y).
+```
+````
+
+
 ```{code-cell} ipython3
 X_train, X_test, y_train,  y_test = train_test_split(df6.values[:,:2],df6.values[:,2],
                                                      train_size=.8)
@@ -89,7 +96,7 @@ dt.score(X_test,y_test)
 df6.shape
 ```
 
-<!-- annotate: Training, Test set size and Cross Validation --> 
+<!-- annotate: Training, Test set size and Cross Validation -->
 ## Training, Test set size and Cross Validation
 
 ```{code-cell} ipython3