forked from geocompx/geocompr
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy path06-read-write-plot.Rmd
494 lines (389 loc) · 29.4 KB
/
06-read-write-plot.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
# Geographic data I/O {#read-write}
## Prerequisites {-}
- This chapter requires the following packages:
```{r, message=FALSE}
library(sf)
library(raster)
library(tidyverse)
library(spData)
```
## Introduction
This chapter is about reading and writing geographic data.
Geographic data *import* is an essential part of geocomputational software because without data real-world applications are impossible.
The skills taught in this book will enable you to *add value* to data meaning that, for others to benefit from the results, data *output* is also vital.
These two processes go hand-in-hand and are referred to as I/O --- short for input/output --- in Computer Science [@gillespie_efficient_2016].
Hence the title of this chapter.
Geographic data I/O is almost always part of a wider process.
It depends on knowing which datasets are *available*, where they can be *found* and how to *retrieve* them, topics covered in section \@ref(retrieving-data).
This section demonstrates how to access open access *geoportals* which collectively contain many terrabytes of data.
There is a wide range of geographic file formats, each of which has pros and cons.
These are described in section \@ref(file-formats).
The process of actually reading and writing such file formats efficiently is not covered until sections \@ref(data-input) and \@ref(data-output) respectively.
The final section (\@ref(visual-outputs)) demonstrates methods for saving visual outputs (maps), in preparation for a subsequent chapter dedicated to visualization.
<!-- Todo: Add reference to vis chapter (RL) -->
## Retrieving open data {#retrieving-data}
Nowadays, a vast amount of spatial data is available on the internet.
Best of all, much of it is freely available.
You just have to know where to look.
While we cannot provide a comprehensive guide to all available geodata, we point to a few of the most important sources.
Various 'geoportals' (web services providing geographic data such as the geospatial datasets in [Data.gov](https://catalog.data.gov/dataset?metadata_type=geospatial)) are a good place to start, providing a wide range of geographic data.
Geoportals are a very useful data source but often only contain data for a specific locations (see the [Wikipedia page on geoportals](https://en.wikipedia.org/wiki/Geoportal) for a list of geoportals covering many areas of the world).
To overcome this limitation some global geoportals have been developed.
The [GEOSS portal](http://www.geoportal.org/), for example, provides global remote sensing data.
Additional geoportals with global coverage and an emphasis on raster data include the [EarthExplorer](https://earthexplorer.usgs.gov/) and the [Copernicus Open Access Hub](https://scihub.copernicus.eu/).
A wealth of European data is available from the [INSPIRE geoportal](http://inspire-geoportal.ec.europa.eu/).
Typically, geoportals provide an interface that lets you query interactively the existing data (spatial and temporal extent, and product).
However, in this book we encourage you to create reproducible workflows.
In many cases data downloads can be scripted via download calls to static URLs or APIs (see the [Sentinel API](https://scihub.copernicus.eu/twiki/do/view/SciHubWebPortal/APIHubDescription) for example), saving time and enabling others to repeat and update the unglamorous data download process.
Traditionally, files have been stored on servers.
<!-- that's probably not the best example - replace it with something better -->
<!-- btw naturalearth website has some problems today - the link will probably change in the future -->
You can easily download such files with the `download.file()` command.
For example, to download National Park Service units in the United States, run:
```{r, eval=FALSE}
url = file.path("http://www.naturalearthdata.com/http//www.naturalearthdata.com",
"download/10m/cultural/ne_10m_parks_and_protected_lands.zip")
download.file(url = url,
destfile = "USA_parks.zip")
unzip(zipfile = "USA_parks.zip")
usa_parks = st_read("ne_10m_parks_and_protected_lands_area.shp")
```
Specific R packages providing an interface to spatial libraries or geoportals are even more user-friendly (Table \@ref(tab:datapackages)).
<!-- add sentinel2 package as soon as it is published on CRAN https://github.com/IVFL-BOKU/sentinel2-->
```{r datapackages, echo=FALSE}
datapackages = tibble::tribble(~`Package name`, ~Description,
"osmdata", "Download and import of OpenStreetMap data.",
"raster", "The `getData()` function downloads and imports administrative country, SRTM/ASTER elevation, WorldClim data.",
"rnaturalearth", "Functions to download Natural Earth vector and raster data, including world country borders.",
"rnoaa", "An R interface to National Oceanic and Atmospheric Administration (NOAA) climate data.",
"rWBclimate", "An access to the World Bank climate data."
)
knitr::kable(datapackages, caption = "Selected R packages for spatial data retrieval.")
```
<!-- https://cdn.rawgit.com/Nowosad/Intro_to_spatial_analysis/05676e29/Intro_to_spatial_analysis.html#39 -->
<!-- Maybe add a section to Data I/O on where and how to retrieve data (with a focus on free data): osmdata (OpenStreetMap; maybe mention TomTom, HERE), Landsat (wrspathrow), Sentinel (mention Python API), AVHRR, RapidEye rgbif, letsR, etc. Of course, point to Transforming science through open data project (https://www.ropensci.org) -->
<!-- https://github.com/ropensci/GSODR -->
<!-- https://github.com/lbusett/MODIStsp -->
<!-- https://github.com/walkerke/tigris -->
<!-- https://github.com/ropensci/hddtools/ -->
For example, you can get the borders of any country with the `ne_countries()` function from the **rnaturalearth** package:
```{r}
library(rnaturalearth)
usa = ne_countries(country = "United States of America") # United States borders
class(usa)
# you can do the same with raster::getData()
# getData("GADM", country = "USA", level = 0)
```
As a default, **rnaturalearth** returns the output as a `Spatial*` class.
You can easily convert it to the `sf` class with `st_as_sf()`:
```{r}
usa_sf = st_as_sf(usa)
```
As a second example, we will download a raster dataset.
The code below downloads a series of rasters that contains global monthly precipitation sums.
The spatial resolution is ten minutes.
The result is a multilayer object of class `RasterStack`.
```{r}
library(raster)
worldclim_prec = getData(name = "worldclim", var = "prec", res = 10)
class(worldclim_prec)
```
A third example uses the recently developed package **osmdata** [@R-osmdata] to find parks from the OpenStreetMap (OSM) database.
As illustrated in the code-chunk below, queries begin with the function `opq()` (short for OpenStreetMap query), the first argument of which is bounding box, or text string representing a bounding box (the city of Leeds in this case).
The result is passed to a function for selecting which OSM elements we're interested in (parks in this case), represented by *key-value pairs*, which in turn is passed to the function `osmdata_sf()` which does the work of downloading the data and converting it into a list of `sf` objects (see `vignette('osmdata')` for further details):
```{r, eval=FALSE}
library(osmdata)
parks = opq(bbox = "leeds uk") %>%
add_osm_feature(key = "leisure", value = "park") %>%
osmdata_sf()
```
OpenStreetMap is a vast global database of crowd-sourced data and it is growing by the minute.
Although the quality is not as spatially consistent as many official datasets, OSM data have many advantages: they are globally available free of charge and using crowd-source data can encourage 'citizen science' and contributions back to the digital commons.
Further examples of **osmdata** in action are provided in Chapters \@ref(transport) and \@ref(location).
<!-- Todo: Replace the above with "Chapters \@ref(location) and \@ref(gis)." when gis is online (RL) -->
Finally, R packages might contain or just consist of spatial data, such as **spData** which provides example datasets used in this book.
You can access such data with the `data()` function.
For example, you can get a list of dataset in a package, `data(package = "spData")`.
To attach the dataset to the global environment specify the name of a dataset (`data("cycle_hire", package = "spData")`).
Sometimes, packages come also with the original files.^[Data loaded with `data()` often is a R dataset (`.Rdata` ord `.rda`).]
To load such a file from the package, you need to specify the package name and the relative path to the dataset, for example:
```{r}
world_raw_filepath = system.file("shapes/world.gpkg", package = "spData")
world_raw = st_read(world_raw_filepath)
```
Find more information on getting data using R packages in [section 5.5](https://csgillespie.github.io/efficientR/input-output.html#download) and [section 5.6](https://csgillespie.github.io/efficientR/input-output.html#accessing-data-stored-in-packages) of @gillespie_efficient_2016.
## File formats
Spatial datasets are usually stored as files or in spatial databases.
File formats can either store vector or raster data, while spatial databases such as [PostGIS](https://trac.osgeo.org/postgis/wiki/WKTRaster) can store both.
Today file formats may seem bewildering but there has been much consolidation and standardisation since the beginnings of GIS software in the 1960s when the first widely distributed program ([SYMAP](https://news.harvard.edu/gazette/story/2011/10/the-invention-of-gis/)) for spatial analysis was created at Harvard University [@coppock_history_1991].
GDAL,^[[GDAL](http://www.gdal.org/) should be prounounced "goo-dal", with the double o making a reference to object-orientation.] the Geospatial Data Abstraction Library, has resolved many issues associated with incompatibility between file formats since its release in 2000.
GDAL provides a unified and high-performance interface for reading and writing of many raster and vector data formats.
Many open and proprietary GIS programs, including GRASS, ArcGIS and QGIS, use GDAL behind their GUIs for doing the legwork of ingesting and spitting-out geographic data in appropriate formats.
<!-- GDAL (it's great - you can read, convert, and very often (though not always) write) -->
<!-- GDAL info "it is possible to have smaller number of supported formats than there are on the GDAL webpage; you may need to recompile..." -->
An important development ensuring greater standardisation and open-sourcing of file formats was the founding of the Open Geospatial Consortium (OGC) in 1994.^[See [opengeospatial.org](http://www.opengeospatial.org).]
The OGC coordinates the development of open standards for geospatial content including file formats such as KML and GeoPackage.
As described in Chapter \@ref(spatial-class) the OGC publishes the simple feature data model, which underlies the vector data classes provided by **sf** and used in this book.
Open file formats of the kind endorsed by the OGC have several advantages over proprietary formats: the standards are published, ensuring transparency and enabling innovation to improve the file formats.
There are more than 100 spatial data formats exist available to R users via GDAL.
<!-- In the same time, they could differ in many ways. -->
<!-- Spatial data could be stored as a single file (e.g. GeoPackage), multiple files (e.g. ESRI Shapefile), or folders (ESRI ArcInfo Coverages). -->
<!-- way of storage (single file, multiple files, folders) -->
Table \@ref(tab:formats) presents some basic information about selected and often used spatial file formats.
<!-- simple features are missing from this table-->
```{r formats, echo=FALSE}
file_formats = tibble::tribble(~Name, ~Extension, ~Info, ~Type, ~Model,
"ESRI Shapefile", ".shp (the main file)", "One of the most popular vector file formats. Consists of at least three files. The main files size cannot exceed 2 GB. It lacks support for mixed type. Column names are limited to 10 characters, and number of columns are limited at 255. It has poor support for Unicode standard. ", "Vector", "Partially open",
"GeoJSON", ".geojson", "Extends the JSON exchange format by including a subset of the simple feature representation.", "Vector", "Open",
"KML", ".kml", "XML-based format for spatial visualization, developed for use with Google Earth. Zipped KML file forms the KMZ format.", "Vector", "Open",
"GPX", ".gpx", "XML schema created for exchange of GPS data.", "Vector", "Open",
"GeoTIFF", ".tiff", "GeoTIFF is one of the most popular raster formats. Its structure is similar to the regular `.tif` format, however, additionally stores the raster header.", "Raster", "Open",
"Arc ASCII", ".asc", "Text format where the first six lines represent the raster header, followed by the raster cell values arranged in rows and columns.", "Raster", "Open",
# "JPEG??"
"R-raster", ".gri, .grd", "Native raster format of the R-package raster.", "Raster", "Open",
# "NCDF??"
# "HDF??"
"SQLite/SpatiaLite", ".sqlite", "SQLite is a standalone, relational database management system. It is used as a default database driver in GRASS GIS 7. SpatiaLite is the spatial extension of SQLite providing support for simple features.", "Vector and raster", "Open",
# "WKT/WKB??",
"ESRI FileGDB", ".gdb", "Collection of spatial and nonspatial objects created in the ArcGIS software. It allows storage of multiple feature classes and enables use of topological definitions. Limited access to this format is provided by GDAL with the use of the OpenFileGDB and FileGDB drivers.", "Vector and raster", "Proprietary",
"GeoPackage", ".gpkg", "Lightweight database container based on SQLite allowing an easy and platform-independent exchange of geodata", "Vector and raster", "Open"
# "WKT/WKB"??
)
knitr::kable(file_formats, caption = "Selected spatial file formats.")
```
<!-- http://switchfromshapefile.org/ -->
<!-- 3. JPEG - (possibly mention SAGA's sdat, Erdas Imagine) -->
<!-- 1. SQLite/SpatialLite + mention GRASS (uses SQLite) -->
<!-- 3. WKT/WKB for transfering and storing geometry data on databases. PostGIS (has even its own raster WKT (https://trac.osgeo.org/postgis/wiki/WKTRasterTutorial01); WKT also supported by Spatiallite, Oracle, MySQL, etc. (https://en.wikipedia.org/wiki/Well-known_text#RDBMS_Engines_that_provide_support) -->
<!-- 4. ESRI geodatabase, Oracle spatial database (mention + gdal support?) -->
## Data Input (I) {#data-input}
Executing commands such as `sf::st_read()` (the main function we use for loading vector data) or `raster::raster()` (the main function used for loading raster data) silently sets off a chain of events that reads data from files.
<!-- transition is unclear, not sure what you would like to say -->
Moreover, there are many R packages containing a wide range of spatial data or providing simple access to different data sources.
All of them load the data into R or, more precisely, assign objects to your workspace, stored in RAM accessible from the `.GlobalEnv`^[See http://adv-r.had.co.nz/Environments.html for more information on the environment] of your current R session.
### Vector data
Spatial vector data comes in a wide variety of file formats, most of which can be read-in via the **sf** function `st_read()`.
Behind the scenes this calls GDAL.
To find out which data formats **sf** supports, run `st_drivers()`.
Here, we show only the first five drivers (see Table \@ref(tab:drivers)):
```{r, eval=FALSE}
sf_drivers = st_drivers()
head(sf_drivers, n = 5)
```
```{r drivers, echo=FALSE}
sf_drivers = st_drivers() %>% dplyr::filter(name %in% c("ESRI Shapefile", "GeoJSON", "KML", "GPX", "GPKG"))
knitr::kable(head(sf_drivers, n = 5), caption = "Sample of available drivers for reading/writing vector data (it could vary between different GDAL versions).")
```
<!-- One of the major advantages of **sf** is that it is fast. -->
<!-- reference to the vignette -->
The first argument of `st_read()` is `dsn`, which should be a text string or an object containing a single text string.
The content of a text string could vary between different drivers.
In most cases, as with the ESRI Shapefile (`.shp`) or the `GeoPackage` format (`.gpkg`), the `dsn` would be a file name.
`st_read()` guesses the driver based on the file extension, as illustrated for a `.gpkg` file below:
```{r}
vector_filepath = system.file("shapes/world.gpkg", package = "spData")
world = st_read(vector_filepath)
```
For some drivers, `dsn` could be provided as a folder name, access credentials for a database, or a GeoJSON string representation (see the examples of the `st_read()` help page for more details).
Some vector driver formats can store multiple data layers.
By default, `st_read` automatically reads the first layer of the file specified in `dsn`, however, using the `layer` argument you can specify any other layer.
Naturally, some options are specific to certain drivers.^[A list of supported vector formats and theirs options can be found at http://www.gdal.org/ogr_formats.html.]
For example, think of coordinates stored in a spreadsheet format (`.csv`).
To read in such files as spatial objects, we naturally have to specify the names of the columns (`X` and `Y` in our example below) representing the coordinates.
We can do this with the help of the `options` parameter.
To find out about possible options, please refer to the 'Open Options' section of the corresponding GDAL driver description.
For the comma-separated value (csv) format, visit http://www.gdal.org/drv_csv.html.
```{r, results='hide'}
cycle_hire_txt = system.file("misc/cycle_hire_xy.csv", package = "spData")
cycle_hire_xy = st_read(cycle_hire_txt, options = c("X_POSSIBLE_NAMES=X",
"Y_POSSIBLE_NAMES=Y"))
```
Instead of columns describing xy-coordinates, a single column can also contain the geometry information.
Well-known text (WKT), well-known binary (WKB), and the GeoJSON formats are examples of this.
For instance, the `world_wkt.csv` file has a column named `WKT` representing polygons of the world's countries.
We will again use the `options` parameter to indicate this.
Here, we will use `read_sf()` which does exactly the same as `st_read()` except it does not print the driver name to the console and stores strings as characters instead of factors.
```{r, results='hide'}
world_txt = system.file("misc/world_wkt.csv", package = "spData")
world_wkt = read_sf(world_txt, options = "GEOM_POSSIBLE_NAMES=WKT")
# the same as
world_wkt = st_read(world_txt, options = "GEOM_POSSIBLE_NAMES=WKT",
quiet = TRUE, stringsAsFactors = FALSE)
```
```{block2 type='rmdnote'}
Not all of the supported vector file formats store information about their coordinate reference system.
In these situations, it is possible to add the missing information using the `st_set_crs()` function.
Please refer also to section \@ref(crs-intro) for more information.
```
As a final example, we will show how `st_read()` also reads KML files.
A KML file stores geographic information in XML format - a data format for the creation of web pages and the transfer of data in an application-independent way [@nolan_xml_2014].
Here, we access a KML file from the web.
This file contains more than one layer.
`st_layers()` lists all available layers.
We choose the first layer `Placemarks` and say so with the help of the `layer` parameter in `read_sf()`.
```{r}
url = "https://developers.google.com/kml/documentation/KML_Samples.kml"
st_layers(url)
kml = read_sf(url, layer = "Placemarks")
```
### Raster data
Similar to vector data, raster data comes in many file formats with some of them supporting even multilayer files.
**raster**'s `raster()` command reads in a single layer.
```{r, message=FALSE}
raster_filepath = system.file("raster/srtm.tif", package = "spDataLarge")
single_layer = raster(raster_filepath)
```
In case you want to read in a single band from a multilayer file use the `band` parameter to indicate a specific layer.
```{r}
multilayer_filepath = system.file("raster/landsat.tif", package = "spDataLarge")
band3 = raster(multilayer_filepath, band = 3)
```
If you want to read in all bands, use `brick()` or `stack()`.
```{r}
multilayer_brick = brick(multilayer_filepath)
multilayer_stack = stack(multilayer_filepath)
```
Please refer to section \@ref(raster-classes) for information on the difference between raster stacks and bricks.
<!-- ### Databases -->
<!-- postgis input example -->
## Data output (O) {#data-output}
<!--maybe we can come up with an intro which is a bit more compelling-->
Writing spatial data allows you to convert from one format to another and to save newly created objects.
Depending on the data type (vector or raster), object class (e.g `multipoint` or `RasterLayer`), and type and amount of stored information (e.g. object size, range of values) - it is important to know how to store spatial files in the most efficient way.
The next two sections will demonstrate how to do this.
<!-- should we add a note about recommended way to decide on a file name, for example "don't use spaces in the name", "create descriptive names" -->
### Vector data
```{r, echo=FALSE, results='hide'}
world_files = list.files(pattern = "world\\.")
file.remove(world_files)
```
The counterpart of `st_read()` is `st_write()`.
It allows you to write **sf** objects to a wide range of geographic vector file formats, including the most common such as `.geojson`, `.shp` and `.gpkg`.
Based on the file name, `st_write()` decides automatically which driver to use.
The speed of the writing process depends also on the driver.
<!-- ref to the vignette -->
```{r}
st_write(obj = world, dsn = "world.gpkg")
```
**Note**: if you try to write to the same data source again, the function will fail:
```{r, error=TRUE}
st_write(obj = world, dsn = "world.gpkg")
```
<!-- ## GDAL Error 1: Layer world.gpkg already exists, CreateLayer failed. -->
<!-- ## Use the layer creation option OVERWRITE=YES to replace it. -->
The error message provides some information as to why the function failed.
The `GDAL Error 1` statement makes clear that the failure occurred at the GDAL level.
Additionally, the suggestion to use `OVERWRITE=YES` provides a clue about how to fix the problem.
However, this is not a `st_write()` argument, it is a GDAL option.
Luckily, `st_write` provides a `layer_options` argument through which we can pass driver-dependent options:
```{r, results='hide'}
st_write(obj = world, dsn = "world.gpkg", layer_options = "OVERWRITE=YES")
```
Another solution is to use the `st_write()` argument `delete_layer`. Setting it to `TRUE` deletes already existing layers in the data source before the function attempts to write (note there is also a `delete_dsn` argument):
```{r, results='hide'}
st_write(obj = world, dsn = "world.gpkg", delete_layer = TRUE)
```
You can achieve the same with `write_sf()` since it is equivalent to (technically an *alias* for) `st_write()`, except that its defaults for `delete_layer` and `quiet` is `TRUE`.
```{r}
write_sf(obj = world, dsn = "world.gpkg")
```
<!-- how about saving multilayer gpkg? -->
The `layer_options` argument could be also used for many different purposes.
One of them is to write spatial data to a text file.
This can be done by specifying `GEOMETRY` inside of `layer_options`.
It could be either `AS_XY` for simple point datasets (it creates two new columns for coordinates) or `AS_WKT` for more complex spatial data (one new column is created which contains the well-known-text representation of spatial objects).
```{r, eval=FALSE}
st_write(cycle_hire_xy, "cycle_hire_xy.csv", layer_options = "GEOMETRY=AS_XY")
st_write(world_wkt, "world_wkt.csv", layer_options = "GEOMETRY=AS_WKT")
```
### Raster data
The `writeRaster()` function saves `Raster*` objects to files on disk.
The function expects input regarding output datatype and file format, but also accepts GDAL options specific to a selected file format (see `?writeRaster` for more details).
<!-- datatypes -->
The **raster** package offers nine datatypes when saving a raster: LOG1S, INT1S, INT1U, INT2S, INT2U, INT4S, INT4U, FLT4S, and FLT8S.^[Using INT4U is not recommended as R does not support 32-bit unsigned integers.<!--recheck this info-->]
The datatype determines the bit representation of the raster object written to disk (\@ref(tab:datatypes)).
Which datatype to use depends on the range of the values of your raster object.
The more values a datatype can represent, the larger the file will get on disk.
Commonly, one would use LOG1S for bitmap (binary) rasters.
Unsigned integers (INT1U, INT2U, INT4U) are suitable for categorical data, while float numbers (FLT4S and FLTS8S) usually represent continuous data.
`writeRaster()` uses FLT4S as the default.
While this works in most cases, the size of the output file will be unnecessarly large if you save binary or categorical data.
Therefore, we would recommend to use the datatype that needs the least storage space but is still able to represent all values (check the range of values with the `summary()` function).
```{r datatypes, echo=FALSE}
dT = tibble::tribble(
~Datatype, ~`Minimum value`, ~`Maximum value`,
"LOG1S", "FALSE (0)", "TRUE (1)",
"INT1S", "-127", "127",
"INT1U", "0", "255",
"INT2S", "-32,767", "32,767",
"INT2U", "0", "65,534",
"INT4S", "-2,147,483,647", "2,147,483,647",
"INT4U", "0", "4,294,967,296",
"FLT4S", "-3.4e+38", "3.4e+38",
"FLT8S", "-1.7e+308", "1.7e+308"
)
knitr::kable(dT, caption = "Datatypes supported by the raster package.")
```
The file extension determines the output file when saving a `Raster*` object to disk.
For example, the `.tif` extension will create a GeoTIFF file:
```{r, eval=FALSE}
writeRaster(x = single_layer,
filename = "my_raster.tif",
datatype = "INT2U")
```
The `raster` file format (native to the `raster` package) is used when a file extension is invalid or missing.
Some raster file formats come with additional options.
You can use them with the `options` parameter.^[Full list of supported raster formats with theirs options can be found at http://www.gdal.org/formats_list.html.]
For example, GeoTIFF allows you to compress the output raster with the `COMPRESS` option^[Find out about GeoTIFF options under http://www.gdal.org/frmt_gtiff.html.]:
```{r, eval=FALSE}
writeRaster(x = single_layer,
filename = "my_raster.tif",
datatype = "INT2U",
options = c("COMPRESS=DEFLATE"),
overwrite = TRUE)
```
Note that `writeFormats()` returns a list with all supported file formats on your computer.
<!-- ### Databases -->
<!-- postgis output example -->
## Visual outputs
R supports many different static and interactive graphics formats.
The most general method to save a static plot is to open a graphic device, create a plot, and close it, for example:
```{r, eval=FALSE}
png(filename = "lifeExp.png", width = 500, height = 350)
plot(world["lifeExp"])
dev.off()
```
Other available graphic devices include `pdf()`, `bmp()`, `jpeg()`, `png()`, and `tiff()`.
You can specify several properties of the output plot, including width, height and resolution.
Additionally, several graphic packages provide thier own functions to save a graphical output.
For example, the **tmap** package has the `tmap_save()` function.
You can save a `tmap` object to different graphic formats by specifying the object name and a file path to a new graphic file.
```{r, eval=FALSE}
library(tmap)
tmap_obj = tm_shape(world) +
tm_polygons(col = "lifeExp")
tmap_save(tm = tmap_obj, filename = "lifeExp_tmap.png")
```
<!-- Note about that the `plot` function do not create an object -->
<!-- ```{r} -->
<!-- a = plot(world["lifeExp"]) -->
<!-- ``` -->
On the other hand, you can save interactive maps created in the `mapview` package as an HTML file or image using the `mapshot()` function:
<!-- example doesn't work, problem with colors I guess -->
```{r, eval=FALSE}
library(mapview)
mapview_obj = mapview(world, zcol = "lifeExp", legend = TRUE)
mapshot(mapview_obj, file = "my_interactive_map.html")
```
## Exercises
1. List and describe three types of vector, raster, and geodatabase formats.
1. Name at least two differences between `read_sf()` and the more well-known function `st_read()`.
1. Read the `cycle_hire_xy.csv` file from the **spData** package (Hint: it is located in the `misc\` folder).
What is a geometry type of the loaded object?
1. Download the borders of Germany using **rnaturalearth**, and create a new object called `germany_borders`.
Write this new object to a file of the GeoPackage format.
1. Download the global monthly minimum temperature with a spatial resolution of five minutes using the **raster** package.
Extract the June values, and save them to a file named `tmin_june.tif` file (hint: use `raster::subset()`).
1. Create a static map of Germany's borders, and save it to a PNG file.
1. Create an interactive map using data from the `cycle_hire_xy.csv` file.
Export this map to a file called `cycle_hire.html`.