note on numpy array indexing
brownsarahm committed Oct 23, 2020
1 parent 6eba0a3 commit aad41cc
Showing 1 changed file with 12 additions and 5 deletions.
17 changes: 12 additions & 5 deletions notes/2020-10-23.md
@@ -20,22 +20,22 @@ kernelspec:
1. Accept assignment 7
```

<!-- annotate: Assignment 7 -->
## Assignment 7

Make a plan with a group:
- what methods do you need to use in part 1?
- try to outline with pseudocode what you'll do for parts 2 & 3

Share any questions you have.

Followup:
1. assignment clarified to require 3 values for the parameter in part 2
1. more tips on finding data sets added to assignment text

+++

<!-- annotate: Complexity of Decision Trees -->
## Complexity of Decision Trees

```{code-cell} ipython3
@@ -54,6 +54,13 @@ df6= pd.read_csv(d6_url,usecols=[1,2,3])
df6.head()
```

````{margin}
```{note}
`df6.values` is a numpy array, which is a good data structure for storing matrices of data. We can index into numpy arrays using `[rows, columns]`. Here, with `df6.values[:,:2]` we take all of the rows (`:`) and the columns up to, but not including, index 2 (`:2`) for the features (X). With `df6.values[:,2]` we take all of the rows and only the column at index 2 for the target (y).
```
````
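
As a minimal sketch of the `[rows, columns]` indexing described in the note, the example below uses a small made-up array in place of `df6.values`. The name `data` and its values are assumptions for illustration only, not the course dataset.

```python
import numpy as np

# Hypothetical stand-in for df6.values:
# 4 samples, 2 feature columns, and 1 target column
data = np.array([
    [0.1, 1.5, 0],
    [0.4, 1.1, 1],
    [0.9, 0.3, 1],
    [0.2, 1.8, 0],
])

# all rows (:), columns 0 and 1 (:2 stops before index 2)
X = data[:, :2]
# all rows (:), only the column at index 2
y = data[:, 2]

print(X.shape)  # (4, 2)
print(y.shape)  # (4,)
```

Note that slicing with `:2` keeps the result two-dimensional, while indexing with the single integer `2` drops that axis, so `y` is a one-dimensional array.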


```{code-cell} ipython3
X_train, X_test, y_train, y_test = train_test_split(df6.values[:,:2], df6.values[:,2],
                                                    train_size=.8)
@@ -89,7 +96,7 @@ dt.score(X_test,y_test)
df6.shape
```

<!-- annotate: Training, Test set size and Cross Validation -->
## Training, Test set size and Cross Validation

```{code-cell} ipython3
Expand Down
