Added stringsAsFactors = TRUE to several read commands. #353

Closed
wants to merge 1 commit into from
5 changes: 3 additions & 2 deletions _episodes_rmd/06-vector-open-shapefile-in-r.Rmd
@@ -165,9 +165,10 @@ ggplot() +
>
> > ## Answers
> >
-> > First we import the data:
+> > First we import the data. The HARV_roads object has some fields that we want to
+> > ensure are read as categorical (factor) data:
> > ```{r import-point-line, echo=TRUE}
-> > lines_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HARV_roads.shp")
+> > lines_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HARV_roads.shp", stringsAsFactors = TRUE)
> > point_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HARVtower_UTM18N.shp")
> > ```
> >
21 changes: 12 additions & 9 deletions _episodes_rmd/07-vector-shapefile-attributes-in-r.Rmd
@@ -28,14 +28,6 @@ library(dplyr)
library(sf)
```

-```{r load-data, echo=FALSE, results='hide'}
-# learners will have this data loaded from previous episodes
-point_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HARVtower_UTM18N.shp")
-lines_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HARV_roads.shp")
-aoi_boundary_HARV <- st_read(
-"data/NEON-DS-Site-Layout-Files/HARV/HarClip_UTMZ18.shp")
-```
-
> ## Things You’ll Need To Complete This Episode
> See the [lesson homepage]({{ site.baseurl }}) for detailed information about the software,
> data, and other prerequisites you will need to work through the examples in this episode.
@@ -53,6 +45,16 @@ We will continue using the `sf`, `raster` and `ggplot2` packages in this episode
continue to work with the three shapefiles (vector layers) that we loaded in the
[Open and Plot Shapefiles in R]({{site.baseurl}}/06-vector-open-shapefile-in-r/) episode.

+```{r load-data, results='hide'}
+point_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HARVtower_UTM18N.shp")
+lines_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HARV_roads.shp",
+stringsAsFactors = TRUE)
+aoi_boundary_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HarClip_UTMZ18.shp")
+```
+
+To ensure that all of the strings in the HARV_roads object are read as categorical data,
+we include the `stringsAsFactors = TRUE` argument in our `st_read` command.
+
## Query Vector Feature Metadata

As we discussed in the
@@ -462,7 +464,8 @@ ggplot() +
> > in the `region` column:
> > ``` {r}
> > state_boundary_US <-
-> > st_read("data/NEON-DS-Site-Layout-Files/US-Boundary-Layers/US-State-Boundaries-Census-2014.shp")
+> > st_read("data/NEON-DS-Site-Layout-Files/US-Boundary-Layers/US-State-Boundaries-Census-2014.shp",
+> > stringsAsFactors = TRUE)
> >
> > levels(state_boundary_US$region)
> > ```
6 changes: 4 additions & 2 deletions _episodes_rmd/08-vector-plot-shapefiles-custom-legend.Rmd
@@ -34,7 +34,8 @@ library(sf)
```{r load-data, echo = FALSE, results='hide', warning=FALSE}
# learners will have this data loaded from an earlier episode
aoi_boundary_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HarClip_UTMZ18.shp")
-lines_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HARV_roads.shp")
+lines_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HARV_roads.shp",
+stringsAsFactors = TRUE)
point_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HARVtower_UTM18N.shp")
CHM_HARV <- raster("data/NEON-DS-Airborne-Remote-Sensing/HARV/CHM/HARV_chmCrop.tif")
CHM_HARV_df <- as.data.frame(CHM_HARV, xy = TRUE)
@@ -150,7 +151,8 @@ symbol of `shape` value.
> > unique soils are represented in the `soilTypeOr` attribute.
> >
> > ```{r}
-> > plot_locations <- st_read("data/NEON-DS-Site-Layout-Files/HARV/PlotLocations_HARV.shp")
+> > plot_locations <- st_read("data/NEON-DS-Site-Layout-Files/HARV/PlotLocations_HARV.shp",
+> > stringsAsFactors = TRUE)
> >
> > levels(plot_locations$soilTypeOr)
> > ```
3 changes: 2 additions & 1 deletion _episodes_rmd/09-vector-when-data-dont-line-up-crs.Rmd
@@ -47,7 +47,8 @@ We will continue to work with the three shapefiles that we loaded in the
```{r load-data, echo = FALSE, results = 'hide', warning = FALSE, message = FALSE}
# learners will have this data loaded from previous episodes
aoi_boundary_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HarClip_UTMZ18.shp")
-lines_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HARV_roads.shp")
+lines_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HARV_roads.shp",
+stringsAsFactors = TRUE)
point_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HARVtower_UTM18N.shp")
CHM_HARV <- raster("data/NEON-DS-Airborne-Remote-Sensing/HARV/CHM/HARV_chmCrop.tif")
CHM_HARV_df <- as.data.frame(CHM_HARV, xy = TRUE)
11 changes: 7 additions & 4 deletions _episodes_rmd/10-vector-csv-to-shapefile-in-r.Rmd
@@ -31,7 +31,8 @@ library(sf)

```{r load-data, echo = FALSE, results='hide'}
# Learners will have this data loaded from earlier episodes
-lines_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HARV_roads.shp")
+lines_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HARV_roads.shp",
+stringsAsFactors = TRUE)
aoi_boundary_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HarClip_UTMZ18.shp")
country_boundary_US <- st_read("data/NEON-DS-Site-Layout-Files/US-Boundary-Layers/US-Boundary-Dissolved-States.shp")
point_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HARVtower_UTM18N.shp")
@@ -71,12 +72,13 @@ that new object:

```{r read-csv }
plot_locations_HARV <-
-read.csv("data/NEON-DS-Site-Layout-Files/HARV/HARV_PlotLocations.csv")
+read.csv("data/NEON-DS-Site-Layout-Files/HARV/HARV_PlotLocations.csv",
+stringsAsFactors = TRUE)

str(plot_locations_HARV)
```

-We now have a data frame that contains 21 locations (rows) and 16 variables (attributes). Note that all of our character data was imported into R as factor (categorical) data. Next, let's explore the dataframe to determine whether it contains columns with coordinate values. If we are lucky, our `.csv` will contain columns labeled:
+We now have a data frame that contains 21 locations (rows) and 16 variables (attributes). The `stringsAsFactors = TRUE` argument ensures that all of our character data is imported into R as factor (categorical) data. Next, let's explore the dataframe to determine whether it contains columns with coordinate values. If we are lucky, our `.csv` will contain columns labeled:

* "X" and "Y" OR
* Latitude and Longitude OR
@@ -227,7 +229,8 @@ That's really handy!
> >
> > ```{r}
> > newplot_locations_HARV <-
-> > read.csv("data/NEON-DS-Site-Layout-Files/HARV/HARV_2NewPhenPlots.csv")
+> > read.csv("data/NEON-DS-Site-Layout-Files/HARV/HARV_2NewPhenPlots.csv",
+> > stringsAsFactors = TRUE)
> > str(newplot_locations_HARV)
> > ```
> >
9 changes: 6 additions & 3 deletions _episodes_rmd/11-vector-raster-integration.Rmd
@@ -33,7 +33,8 @@ library(dplyr)
# Learners will have this data loaded from earlier episodes
# shapefiles
point_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HARVtower_UTM18N.shp")
-lines_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HARV_roads.shp")
+lines_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HARV_roads.shp",
+stringsAsFactors = TRUE)
aoi_boundary_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HarClip_UTMZ18.shp")

# CHM
@@ -44,7 +45,8 @@ CHM_HARV_df <- as.data.frame(CHM_HARV, xy = TRUE)

# plot locations
plot_locations_HARV <-
-read.csv("data/NEON-DS-Site-Layout-Files/HARV/HARV_PlotLocations.csv")
+read.csv("data/NEON-DS-Site-Layout-Files/HARV/HARV_PlotLocations.csv",
+stringsAsFactors = TRUE)
utm18nCRS <- st_crs(point_HARV)
plot_locations_sp_HARV <- st_as_sf(plot_locations_HARV,
coords = c("easting", "northing"),
@@ -86,7 +88,8 @@ we have worked with in this workshop:
CHM_HARV_sp <- st_as_sf(CHM_HARV_df, coords = c("x", "y"), crs = utm18nCRS)
# approximate the boundary box with a random sample of raster points
CHM_rand_sample <- sample_n(CHM_HARV_sp, 10000)
-lines_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HARV_roads.shp")
+lines_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HARV_roads.shp",
+stringsAsFactors = TRUE)
plots_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/PlotLocations_HARV.shp")
```

3 changes: 2 additions & 1 deletion _episodes_rmd/12-time-series-raster.Rmd
@@ -254,7 +254,8 @@ of that dataframe:

```{r view-temp-data}
har_met_daily <-
-read.csv("data/NEON-DS-Met-Time-Series/HARV/FisherTower-Met/hf001-06-daily-m.csv")
+read.csv("data/NEON-DS-Met-Time-Series/HARV/FisherTower-Met/hf001-06-daily-m.csv",
+stringsAsFactors = TRUE)

str(har_met_daily)
```
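
For readers following the diff, the effect of the added argument can be sketched in base R without any of the lesson data. This is only an illustration, not code from the PR: the two-row CSV, its plot IDs, and its soil names are hypothetical values inlined for the example. It relies on the fact that `read.csv` has defaulted to `stringsAsFactors = FALSE` since R 4.0.0, so calls such as `levels()` return `NULL` unless the argument is set explicitly.

```r
# Hypothetical two-row CSV, inlined so the sketch needs no data files.
csv_text <- "plotID,soilTypeOr\nHARV_015,Inceptisols\nHARV_033,Histosols"

# Explicit FALSE mirrors the default behaviour of R >= 4.0.0:
# text columns stay as plain character vectors.
as_chr <- read.csv(text = csv_text, stringsAsFactors = FALSE)
class(as_chr$soilTypeOr)   # "character"
levels(as_chr$soilTypeOr)  # NULL, because character vectors have no levels

# With the argument used throughout this PR, text columns become factors,
# so levels() lists the distinct categories.
as_fct <- read.csv(text = csv_text, stringsAsFactors = TRUE)
class(as_fct$soilTypeOr)   # "factor"
levels(as_fct$soilTypeOr)  # "Histosols" "Inceptisols"
```

The same reasoning applies to the `st_read` calls changed above, which accept the same `stringsAsFactors` argument; the episodes that call `levels()` on an attribute column depend on that column being a factor.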