-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathREADME.Rmd
205 lines (161 loc) · 7.79 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# geslaR
Get And Manipulate the GESLA Dataset
<!-- badges: start -->
<!-- [![R-CMD-check](https://github.com/EireExtremes/geslaR/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/EireExtremes/geslaR/actions/workflows/R-CMD-check.yaml) -->
<!-- Main branch -->
<!-- [![R-CMD-check](https://github.com/EireExtremes/geslaR/actions/workflows/R-CMD-check.yaml/badge.svg?branch=dev)](https://github.com/EireExtremes/geslaR/actions/workflows/R-CMD-check.yaml) -->
<!-- Development branch -->
<!-- [![test-coverage](https://github.com/EireExtremes/geslaR/actions/workflows/test-coverage.yaml/badge.svg?branch=main)](https://github.com/EireExtremes/geslaR/actions/workflows/test-coverage.yaml) Main branch -->
<!-- [![test-coverage](https://github.com/EireExtremes/geslaR/actions/workflows/test-coverage.yaml/badge.svg?branch=dev)](https://github.com/EireExtremes/geslaR/actions/workflows/test-coverage.yaml) Development branch -->
<!-- [![Codecov test coverage](https://codecov.io/gh/EireExtremes/geslaR/branch/main/graph/badge.svg)](https://app.codecov.io/gh/EireExtremes/geslaR?branch=main) -->
<!-- badges: end -->
```{r echo=FALSE}
cmd_check <-
"[![R-CMD-check](https://github.com/EireExtremes/geslaR/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/EireExtremes/geslaR/actions/workflows/R-CMD-check.yaml)"
cover <-
"[![test-coverage](https://github.com/EireExtremes/geslaR/actions/workflows/test-coverage.yaml/badge.svg?branch=main)](https://github.com/EireExtremes/geslaR/actions/workflows/test-coverage.yaml)"
pkg_down <-
"[![pkgdown](https://github.com/EireExtremes/geslaR/actions/workflows/pkgdown.yaml/badge.svg?branch=main)](https://github.com/EireExtremes/geslaR/actions/workflows/pkgdown.yaml)"
code_cov <-
"[![Codecov test coverage](https://codecov.io/gh/EireExtremes/geslaR/branch/main/graph/badge.svg)](https://app.codecov.io/gh/EireExtremes/geslaR?branch=main)"
```
| **Action** | `R-CMD-check` | `test-coverage` | `pkgdown` | `codecov` |
|:----------:|---------------|-----------------|--------------|--------------|
| **Status** | `r cmd_check` | `r cover` | `r pkg_down` | `r code_cov` |
The **geslaR** package was developed to deal with the
[GESLA](https://gesla787883612.wordpress.com) (Global Extreme Sea Level
Analysis) dataset.
![](https://www.r-pkg.org/badges/version-last-release/geslaR)
The GESLA (Global Extreme Sea Level Analysis) project aims to provide a
global database of higher-frequency sea-level records for researchers to
study tides, storm surges, extreme sea levels, and related processes.
Three versions of the GESLA dataset are available for download,
including a zip file containing the entire dataset, a CSV file
containing metadata, and a KML file for plotting the location of all
station records in Google Earth. The **geslaR** R package developed here
aims to facilitate the access to the GESLA dataset by providing
functions to download it entirely, or query subsets of it directly into
R, without the need of downloading the full dataset. Also, it provides a
built-in web-application, so that users can apply basic filters to
select the data of interest, generating informative plots, and showing
the selected sites all over the world. Users can download the selected
subset of data in CSV or Parquet file formats, with the latter being
recommended due to its smaller size and the ability to handle it in many
programming languages through the Apache Arrow language for in-memory
analytics. The web interface was developed using the Shiny R package,
with the CSV files from the GESLA dataset converted to the Parquet
format and stored in an Amazon AWS bucket.
To get started with the package, please see the vignette [Dealing with
the GESLA dataset in R][], where you will find a besic introduction to
all the functions available and how to use each one of them. To learn
how to use the Apache Arrow framework to deal with the dataset in R, see
the vignette [Introduction to Apache Arrow framework][].
## Installation
You can install the latest **stable** version of geslaR from
[CRAN](https://cran.r-project.org/package=geslaR) with:
```{r install-cran, eval=FALSE}
install.packages("geslaR")
```
To be able to use the built-in web-application, all the package
dependencies should also be installed with:
```{r install-cran2, eval=FALSE}
install.packages("geslaR", dependencies = TRUE)
```
To install the **development** version from
[GitHub](https://github.com/EireExtremes/geslaR) use:
```{r install, eval=FALSE}
## install.packages("devtools")
devtools::install_github("EireExtremes/geslaR")
## Or
devtools::install_github("EireExtremes/geslaR", dependencies = TRUE)
```
## Examples
```{r lib, eval=FALSE}
library(geslaR)
```
To read files from the GESLA dataset, use the `read_gesla()` function.
```{r ex-read, eval=FALSE}
##------------------------------------------------------------------
## Import an internal example Parquet file
tmp <- tempdir()
file.copy(system.file(
"extdata", "ireland.parquet", package = "geslaR"), tmp)
da <- read_gesla(paste0(tmp, "/ireland.parquet"))
## Check size in memory
object.size(da)
##------------------------------------------------------------------
## Import an internal example CSV file
tmp <- tempdir()
file.copy(system.file(
"extdata", "ireland.csv", package = "geslaR"), tmp)
da <- read_gesla(paste0(tmp, "/ireland.csv"))
## Check size in memory
object.size(da)
##------------------------------------------------------------------
## Import an internal example Parquet file as data.frame
tmp <- tempdir()
file.copy(system.file(
"extdata", "ireland.parquet", package = "geslaR"), tmp)
da <- read_gesla(paste0(tmp, "/ireland.parquet"),
as_data_frame = TRUE)
## Check size in memory
object.size(da)
##------------------------------------------------------------------
## Import an internal example CSV file as data.frame
tmp <- tempdir()
file.copy(system.file(
"extdata", "ireland.csv", package = "geslaR"), tmp)
da <- read_gesla(paste0(tmp, "/ireland.csv"),
as_data_frame = TRUE)
## Check size in memory
object.size(da)
```
To make a query to the GESLA dataset and load it directly into R, one
can use the `query_gesla()` function.
```{r ex-query, eval=FALSE}
## Query a subset of the GESLA dataset, without the need of downloading
## all the dataset
de <- query_gesla(country = "IRL", year = 2020:2021, as_data_frame = FALSE)
class(de)
```
To download the full dataset locally, use the `download_gesla()` function.
```{r ex-down, eval=FALSE}
## Download the whole dataset (parquet files) into a specific location
download_gesla(dest = "./gesla_dataset")
## ℹ The total size of the dataset is about 7GB, and the download time will depend on
## your internet connection
## Do you want to download the whole dataset?
## 1: Yes
## 2: No
## Selection: 1
## ℹ Wait while the dataset is downloaded...
```
To open the built-in web-application, use the `run_gesla_app()` function
(note that this will need the installation of **geslaR** with all of its
dependencies).
```{r ex-run, eval=FALSE}
## This function will download the whole dataset (if not yet done), and
## open the geslar-app web interface locally on your browser
run_gesla_app()
```
# Acknowledgements
This work has emanated from research conducted with the financial
support of Science Foundation Ireland and co-funded by GSI under Grant
number 20/FFP-P/8610.
```{r fig, echo=FALSE}
knitr::include_graphics("man/figures/logos2.png")
```
[Introduction to Apache Arrow framework]: https://eireextremes.github.io/geslaR/articles/intro-to-arrow.html
[Dealing with the GESLA dataset in R]: https://eireextremes.github.io/geslaR/articles/intro-to-geslaR.html