README.Rmd

---
output: github_document
---

<!-- README.md is generated from README.Rmd. Please edit that file -->

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
set.seed(42)
```

# hpa.spatial <a href='https://healthpolicyanalysis.github.io/hpa.spatial/'><img src='man/figures/hex.png' align="right" style="height:150px"/></a>

<!-- badges: start -->
[![R-CMD-check](https://github.com/healthpolicyanalysis/hpa.spatial/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/healthpolicyanalysis/hpa.spatial/actions/workflows/R-CMD-check.yaml)
<!-- badges: end -->

The goal of hpa.spatial is to make relevant shape files and data easily available and include helpful functions for the analysis of spatial data, focusing on the Australian (health) context.

## Notes on other packages

Most shape files are available within [`{absmapsdata}`](https://github.com/wfmackey/absmapsdata) and can be loaded using [`{strayr}`](https://github.com/runapp-aus/strayr).

The way that data are accessed with hpa.spatial both uses these packages as well as replicating their approach to access data from [`{hpa.spatial.data}`](https://github.com/healthpolicyanalysis/hpa.spatial.data).


## Installation

hpa.spatial is only available here on GitHub. You can install the development or release versions of it using the code below:

``` r
# install.packages("remotes")
remotes::install_github("healthpolicyanalysis/hpa.spatial") # development version
remotes::install_github("healthpolicyanalysis/hpa.spatial@*release") # latest release version
```

```{r, message=FALSE, warning=FALSE}
library(hpa.spatial)
library(sf)
library(dplyr)
library(ggplot2)
```

## Getting shapefiles

`get_polygon()` is used to get shape files from the abs.

```{r, warning=FALSE}
sa2_2016 <- get_polygon(area = "sa2", year = 2016)
head(sa2_2016)

sa2_2016 |>
  ggplot() +
  geom_sf() +
  theme_bw() +
  ggtitle("SA2 (2016)")

lga_2016 <- get_polygon(area = "lga", year = 2016)
head(lga_2016)

lga_2016 |>
  ggplot() +
  geom_sf() +
  theme_bw() +
  ggtitle("LGA (2016)")
```

This is used in the same way as `strayr::read_absmap()` except it also includes a `simplify_keep` argument for simplifying the polygon.

```{r, eval=FALSE}
sa2_2016_simple <- get_polygon(area = "sa2", year = 2016, simplify_keep = 0.1)

sa2_2016 |>
  filter(gcc_name_2016 == "Greater Brisbane") |>
  ggplot() +
  geom_sf() +
  scale_x_continuous(limits = c(152.9, 153.1)) +
  scale_y_continuous(limits = c(-27.4, -27.6)) +
  theme_bw()

sa2_2016_simple |>
  filter(gcc_name_2016 == "Greater Brisbane") |>
  ggplot() +
  geom_sf() +
  scale_x_continuous(limits = c(152.9, 153.1)) +
  scale_y_continuous(limits = c(-27.4, -27.6)) +
  theme_bw()
```


```{r, echo=FALSE, message=FALSE, warning=FALSE}
sa2_2016_simple <- get_polygon(area = "sa2", year = 2016, simplify_keep = 0.1)

p1 <- sa2_2016 |>
  filter(gcc_name_2016 == "Greater Brisbane") |>
  ggplot() +
  geom_sf() +
  scale_x_continuous(limits = c(152.9, 153.1)) +
  scale_y_continuous(limits = c(-27.4, -27.6)) +
  theme_bw()

p2 <- sa2_2016_simple |>
  filter(gcc_name_2016 == "Greater Brisbane") |>
  ggplot() +
  geom_sf() +
  scale_x_continuous(limits = c(152.9, 153.1)) +
  scale_y_continuous(limits = c(-27.4, -27.6)) +
  theme_bw()

gridExtra::grid.arrange(p1, p2, ncol = 2)
```

### Other shapefiles

Aside from the built in shapefiles that are hosted by `{absmapsdata}`, `get_polygon()` can also access shapefiles for local hospital networks (LHNs) and Primary Health Networks (PHNs).

For example, these can be accessed by the `"area"` or `"name"` arguments as "LHN".

```{r, warning=FALSE}
qld_hhs <- get_polygon(area = "LHN") |> filter(state == "QLD")
head(qld_hhs)

qld_hhs |>
  ggplot() +
  geom_sf() +
  theme_bw()
```


## Mapping data between ASGS editions

`map_data_with_correspondence()` is used to map data across ASGS editions.

When used with unit level data, it will randomly allocate the value to the code of the updated edition based on the population-weighted proportions (as probabilities) on the relevant correspondence table.

```{r}
map_data_with_correspondence(
  codes = c(107011130, 107041149),
  values = c(10, 10),
  from_area = "sa2",
  from_year = 2011,
  to_area = "sa2",
  to_year = 2016,
  value_type = "units"
)
```

When used with aggregate data, it will split the value among the codes of the updated edition based on the population-weighted proportions on the relevant correspondence table.

```{r}
map_data_with_correspondence(
  codes = c(107011130, 107041149),
  values = c(10, 10),
  from_area = "sa2",
  from_year = 2011,
  to_area = "sa2",
  to_year = 2016,
  value_type = "aggs"
)
```

### Example

Suppose we have counts of patients within SA2s (2011) and we want to aggregate these up into SA3s (2016 edition). Here is how we could do this with `map_data_correspondence()` mapping up both to a higher level of ASGS and to a more recent edition.

```{r, warning=FALSE, message=FALSE}
sa2_2011 <- get_polygon("sa22011")
sa2_2011$patient_counts <- rpois(n = nrow(sa2_2011), lambda = 30)

sa3_counts <- map_data_with_correspondence(
  codes = sa2_2011$sa2_code_2011,
  values = sa2_2011$patient_counts,
  from_area = "sa2",
  from_year = 2011,
  to_area = "sa3",
  to_year = 2016,
  value_type = "aggs"
) |>
  rename(patient_counts = values)

sa3_2016 <- get_polygon("sa32016") |>
  left_join(sa3_counts)
```

```{r, eval=FALSE}
sa2_2011 |>
  ggplot() +
  geom_sf(aes(fill = patient_counts)) +
  ggtitle("Patient counts by SA2 (2011)") +
  theme_bw()

sa3_2016 |>
  ggplot() +
  geom_sf(aes(fill = patient_counts)) +
  ggtitle("Patient counts by SA3 (2016)") +
  theme_bw()
```

```{r, echo=FALSE, message=FALSE, warning=FALSE}
p1 <- sa2_2011 |>
  ggplot() +
  geom_sf(aes(fill = patient_counts)) +
  ggtitle("Patient counts by SA2 (2011)") +
  theme_bw()

p2 <- sa3_2016 |>
  ggplot() +
  geom_sf(aes(fill = patient_counts)) +
  ggtitle("Patient counts by SA3 (2016)") +
  theme_bw()

gridExtra::grid.arrange(p1, p2, ncol = 1)
```