Commit

Merge pull request #103 from peteowen1/anzsic-2006-fix
update anzsic2006 to download from abs source.
wfmackey authored Aug 17, 2023
2 parents cb0f281 + 79f3a7a commit 8b2bf3d
Showing 9 changed files with 95 additions and 58 deletions.
3 changes: 2 additions & 1 deletion DESCRIPTION
@@ -11,7 +11,8 @@ Authors@R: c(person("Will", "Mackey", email = "[email protected]", role = c("au
  person("Benjamin", "Wee", role = c("aut")),
  person("Carlos", "Yanez", role = "ctb"),
  person("Bas", "Latcham", role = "ctb"),
- person("Rex", "Parsons", role = "ctb", comment = c(ORCID = "0000-0002-6053-8174"))
+ person("Rex", "Parsons", role = "ctb", comment = c(ORCID = "0000-0002-6053-8174")),
+ person("Pete", "Owen", role = "ctb")
  )
  Maintainer: Will Mackey <[email protected]>
  License: GPL-3
1 change: 1 addition & 0 deletions NEWS.md
@@ -1,6 +1,7 @@
  # strayr (development version)
  * `create read_correspondence_tbl()` reads correspondence tables from
  `absmapsdata` similarly to `read_absmap()`
+ * updated `anzsco2006` to include leading zeros in codes (see ). This is a backwards incompatible change that may cause issues (not enough for a major version progression)

  # strayr 0.2.2
  * `anzsco2022` updated to reflect changes made by the ABS
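The leading-zero change noted in NEWS.md mainly affects code that still stores or joins on integer codes. A minimal base R sketch of one way to adapt, assuming a hypothetical `legacy` data frame whose `anzsic_class_code` column holds integers (the data frame, its values and the `employees` column are illustrative and not part of `strayr`):

```r
# Hypothetical legacy data keyed by integer class codes (the pre-change format).
legacy <- data.frame(
  anzsic_class_code = c(111L, 112L, 1110L),
  employees = c(10, 20, 30)
)

# Zero-pad to four characters so the codes match the new character format.
legacy$anzsic_class_code <- sprintf("%04d", legacy$anzsic_class_code)

print(legacy$anzsic_class_code)
```

After padding, the column can be joined against a character code column without type or width mismatches.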
29 changes: 14 additions & 15 deletions R/data_descriptions.R
@@ -21,10 +21,10 @@
  "anzsco2009"


- #' ANZSCO 2019
+ #' ANZSCO 2013
  #'
  #' Wide table containing all levels of the Australian and New Zealand Standard
- #' Classification of Occupations (ANZSCO), Version 1.3, 2019
+ #' Classification of Occupations (ANZSCO), Version 1.2, 2013
  #'
  #' @format A \code{tibble} with 11 variables:
  #' \describe{
@@ -41,14 +41,13 @@
  #' \item{\code{skill_level}}{Skill level required for occupation, determined by the ABS (1 is highest, 5 is lowest).
  #' See \url{https://www.abs.gov.au/ausstats/abs@.nsf/Previousproducts/C4BECE1704987586CA257089001A9181 } for details.}
  #' }
- "anzsco2019"
-
+ "anzsco2013"


- #' ANZSCO 2013
+ #' ANZSCO 2019
  #'
  #' Wide table containing all levels of the Australian and New Zealand Standard
- #' Classification of Occupations (ANZSCO), Version 1.2, 2013
+ #' Classification of Occupations (ANZSCO), Version 1.3, 2019
  #'
  #' @format A \code{tibble} with 11 variables:
  #' \describe{
@@ -65,7 +64,8 @@
  #' \item{\code{skill_level}}{Skill level required for occupation, determined by the ABS (1 is highest, 5 is lowest).
  #' See \url{https://www.abs.gov.au/ausstats/abs@.nsf/Previousproducts/C4BECE1704987586CA257089001A9181 } for details.}
  #' }
- "anzsco2013"
+ "anzsco2019"
+

  #' ANZSCO 2021
  #'
@@ -89,6 +89,7 @@
  #' }
  "anzsco2021"

+
  #' ANZSCO 2022
  #'
  #' Wide table containing all levels of the Australian and New Zealand Standard
@@ -113,28 +114,26 @@
  "anzsco2022"


- #' ANZSIC
+ #' ANZSIC 2006
  #'
  #' Wide table containing all levels of the Australian and New Zealand Standard
- #' Industrial Classification (ANZSIC), 2006 (Revision 1.0). Cat. 1292.0.
+ #' Industrial Classification (ANZSIC), 2006 (Revision 2.0). Cat. 1292.0.
  #'
  #' @format A \code{tibble} with 8 variables:
  #' \describe{
  #' \item{\code{anzsic_division_code}}{ANZSIC division codes character, e.g. "A", "B"}
  #' \item{\code{anzsic_division}}{ANZSIC division title, e.g. "Agriculture, Forestry and Fishing"}
- #' \item{\code{anzsic_subdivision_code}}{ANZSIC subdivision codes integer, e.g. 1, 2}
+ #' \item{\code{anzsic_subdivision_code}}{ANZSIC subdivision codes 2-digit character, e.g. 01, 02}
  #' \item{\code{anzsic_subdivision}}{ANZSIC subdivision title, e.g. "Agriculture"}
- #' \item{\code{anzsic_group_code}}{ANZSIC group codes integer, e.g. 11, 12}
+ #' \item{\code{anzsic_group_code}}{ANZSIC group codes 3-digit character, e.g. 011, 012}
  #' \item{\code{anzsic_group}}{ANZSIC group title, e.g. "Mushroom and Vegetable Growing"}
- #' \item{\code{anzsic_class_code}}{ANZSIC class codes integer, e.g. 111, 112}
+ #' \item{\code{anzsic_class_code}}{ANZSIC class codes 4-digit character, e.g. 0111, 0112}
  #' \item{\code{anzsic_class}}{ANZSIC class title, e.g. "Vegetable Growing (Under Cover)"}
  #' }
+ #' @source \url{https://www.abs.gov.au/statistics/classifications/australian-and-new-zealand-standard-industrial-classification-anzsic/2006-revision-2-0/numbering-system-and-titles/division-subdivision-group-and-class-codes-and-titles}
  "anzsic2006"


-
-
-
  #' ASCED Field of Education
  #'
  #' Wide table containing all levels of fields of education in the Australian
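Storing the ANZSIC codes as fixed-width character strings, as the updated documentation describes, keeps the hierarchy recoverable by prefix: the first two characters of a class code give the subdivision and the first three give the group. A small base R sketch using illustrative codes only (the `hierarchy` data frame is not part of `strayr`):

```r
# Illustrative four-character class codes.
class_codes <- c("0111", "0112", "0121")

# Derive the parent levels by taking prefixes of the class code.
hierarchy <- data.frame(
  anzsic_subdivision_code = substr(class_codes, 1, 2),
  anzsic_group_code       = substr(class_codes, 1, 3),
  anzsic_class_code       = class_codes
)

# The same prefix trick fails once the leading zero is lost to integer storage.
lossy_prefix <- substr(as.character(as.integer("0111")), 1, 2)

print(hierarchy)
print(lossy_prefix)
```

This is one reason fixed-width character codes are easier to work with than integers, even though the switch breaks existing integer joins.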
2 changes: 1 addition & 1 deletion README.Rmd
@@ -69,7 +69,7 @@ Current structures stored in `strayr` are:
  - `anzsco2013`: occupation levels of ANZSCO, [2013, Version 1.2](https://www.abs.gov.au/AUSSTATS/abs@.nsf/allprimarymainfeatures/4AF138F6DB4FFD4BCA2571E200096BAD?opendocument).
  - `anzsco2009`: occupation levels ANZSCO, [First Edition, Revision 1, 2009](https://www.abs.gov.au/AUSSTATS/abs@.nsf/DetailsPage/1220.0First%20Edition,%20Revision%201?OpenDocument).
  - Australian and New Zealand Standard Industrial Classification (**ANZSIC**), Cat. 1292.0:
- - `anzsic2006`: industry levels of ANZSIC, [2006 (Revision 1.0)](https://www.abs.gov.au/ausstats/abs@.nsf/0/20C5B5A4F46DF95BCA25711F00146D75?opendocument).
+ - `anzsic2006`: industry levels of ANZSIC, [2006 (Revision 2.0)](https://www.abs.gov.au/statistics/classifications/australian-and-new-zealand-standard-industrial-classification-anzsic/2006-revision-2-0).
  - Australian Standard Classification of Education (**ASCED**), Cat. 1272.0:
  - `asced_foe2001`: field of education levels of ASCED, [2001](https://www.abs.gov.au/ausstats/abs@.nsf/mf/1272.0).
  - `asced_qual2001`: qualification levels of ASCED, [2001](https://www.abs.gov.au/ausstats/abs@.nsf/mf/1272.0).
2 changes: 1 addition & 1 deletion README.md
@@ -60,7 +60,7 @@ Current structures stored in `strayr` are:
  - Australian and New Zealand Standard Industrial Classification
  (**ANZSIC**), Cat. 1292.0:
  - `anzsic2006`: industry levels of ANZSIC, [2006 (Revision
- 1.0)](https://www.abs.gov.au/ausstats/abs@.nsf/0/20C5B5A4F46DF95BCA25711F00146D75?opendocument).
+ 2.0)](https://www.abs.gov.au/statistics/classifications/australian-and-new-zealand-standard-industrial-classification-anzsic/2006-revision-2-0).
  - Australian Standard Classification of Education (**ASCED**), Cat.
  1272.0:
  - `asced_foe2001`: field of education levels of ASCED,
101 changes: 66 additions & 35 deletions data-raw/create_anzsic2006.R
@@ -2,47 +2,78 @@
  # Reading and cleaning ANZSIC correspondence

  library(tidyverse)
- library(glue)
  library(rvest)

  # include factor variants or nah?
  include_factor_variants <- FALSE

- # ty asiripanich
- anzsic_url <- "https://raw.githubusercontent.com/asiripanich/anzsic/master/anzsic_2006.csv"
-
- # Read
- anzsic_raw <- read_csv(anzsic_url) %>%
- rename_all(~ glue("anzsic_{.}")) %>%
- mutate_if(is.double, as.integer) %>%
- as_tibble()
-
- # Add layers of nfd
- class_nfd <- anzsic_raw %>%
- distinct(anzsic_division_title, anzsic_division_code,
- anzsic_subdivision_title, anzsic_subdivision_code,
- anzsic_group_title, anzsic_group_code) %>%
- mutate(anzsic_class_code = anzsic_group_code * 10,
- anzsic_class_title = glue("{anzsic_group_title}, nfd"))
-
- group_nfd <- anzsic_raw %>%
- distinct(anzsic_division_title, anzsic_division_code,
- anzsic_subdivision_title, anzsic_subdivision_code) %>%
- mutate(anzsic_group_title = glue("{anzsic_subdivision_title}, nfd"),
- anzsic_group_code = anzsic_subdivision_code * 10,
- anzsic_class_title = anzsic_group_title,
- anzsic_class_code = anzsic_group_code * 10)
-
- subdivision_nfd <- anzsic_raw %>%
- group_by(anzsic_division_code, anzsic_division_title) %>%
- summarise(anzsic_subdivision_code = min(anzsic_subdivision_code)) %>%
- mutate(anzsic_subdivision_title = glue("{anzsic_division_title}, nfd"),
- anzsic_group_title = anzsic_subdivision_title,
- anzsic_group_code = anzsic_subdivision_code * 10,
- anzsic_class_title = anzsic_group_title,
- anzsic_class_code = anzsic_group_code * 10)
+ # fetch from abs website
+ url <- "https://www.abs.gov.au/statistics/classifications/australian-and-new-zealand-standard-industrial-classification-anzsic/2006-revision-2-0/numbering-system-and-titles/division-subdivision-group-and-class-codes-and-titles"
+
+ df <- url %>%
+ rvest::read_html() %>%
+ rvest::html_table()
+
+ # bind tables together
+ anzsic_2006_temp <-
+ purrr::list_rbind(df)
+
+ # fix columns names and bind together
+ colnames(anzsic_2006_temp) <- c("anzsic_division_code", "anzsic_subdivision_code", "anzsic_group_code", "anzsic_class_code", "title")
+
+ first_row <-
+ as.data.frame(t(colnames(df[[1]])))
+
+ colnames(first_row) <- c("anzsic_division_code", "anzsic_subdivision_code", "anzsic_group_code", "anzsic_class_code", "title")
+
+ anzsic_2006_total <-
+ dplyr::bind_rows(first_row, anzsic_2006_temp)
+
+ # replace blanks with NAs
+ anzsic_2006_total[anzsic_2006_total == ""] <- NA
+
+ # fill NAs down from above
+ anzsic_2006_fill <-
+ anzsic_2006_total %>%
+ tidyr::fill(colnames(anzsic_2006_total), .direction = c("down"))
+
+ # get each grouping type individually
+ anzsic_2006_class <-
+ anzsic_2006_total %>%
+ dplyr::filter(stringr::str_detect(anzsic_class_code, "^[:digit:]+$")) %>%
+ dplyr::select(anzsic_class_code, anzsic_class_title = title)
+
+ anzsic_2006_group <-
+ anzsic_2006_total %>%
+ dplyr::filter(stringr::str_detect(anzsic_group_code, "^[:digit:]+$")) %>%
+ dplyr::select(anzsic_group_code, anzsic_group_title = title)
+
+ anzsic_2006_subdivision <-
+ anzsic_2006_total %>%
+ dplyr::filter(stringr::str_detect(anzsic_subdivision_code, "^[:digit:]+$")) %>%
+ dplyr::select(anzsic_subdivision_code, anzsic_subdivision_title = title)
+
+ anzsic_2006_division <-
+ anzsic_2006_total %>%
+ dplyr::filter(stringr::str_detect(anzsic_division_code, "^[:alpha:]+$")) %>%
+ dplyr::select(anzsic_division_code, anzsic_division_title = title)
+
+ # combine grouping types into final table
+ anzsic_2006_final <-
+ anzsic_2006_fill %>%
+ dplyr::left_join(anzsic_2006_division) %>%
+ dplyr::left_join(anzsic_2006_subdivision) %>%
+ dplyr::left_join(anzsic_2006_group) %>%
+ dplyr::left_join(anzsic_2006_class) %>%
+ dplyr::filter(!is.na(anzsic_class_title)) %>%
+ dplyr::select(
+ anzsic_division_code, anzsic_division_title, anzsic_subdivision_code, anzsic_subdivision_title,
+ anzsic_group_code, anzsic_group_title, anzsic_class_code, anzsic_class_title
+ ) %>%
+ dplyr::as_tibble()
+
  # Finalise data frame; noting that we are avoiding the nfd complication for now
- anzsic2006 <- anzsic_raw %>%
+ anzsic2006 <- anzsic_2006_final %>%
  arrange(anzsic_division_code,
  anzsic_subdivision_code,
  anzsic_group_code,
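The new script reads every HTML table on the ABS page and binds them into one data frame. A minimal offline sketch of that step, assuming the `rvest` and `purrr` packages are installed and using a made-up HTML snippet in place of the real page (the markup, column names and titles here are illustrative, not the ABS page's actual structure). The sketch passes `convert = FALSE` so that codes stay as character strings; that is a choice made for this example, not something the script itself does:

```r
library(rvest)
library(purrr)

# Illustrative HTML standing in for the ABS page (not its real markup).
html <- '
<table>
  <tr><th>code</th><th>title</th></tr>
  <tr><td>011</td><td>First group</td></tr>
  <tr><td>012</td><td>Second group</td></tr>
</table>
<table>
  <tr><th>code</th><th>title</th></tr>
  <tr><td>013</td><td>Third group</td></tr>
</table>'

# html_table() on a whole document returns a list with one tibble per <table>.
# convert = FALSE skips type guessing, so "011" is not turned into the integer 11.
tables <- html_table(read_html(html), convert = FALSE)

# Stack the per-table tibbles into a single tibble.
combined <- list_rbind(tables)

print(combined)
```

Without `convert = FALSE`, automatic type conversion may read all-digit columns as integers and drop the leading zeros.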
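The script turns a staggered table, where each row carries a code at only one level, into a wide table by filling codes downwards and joining the titles back on. A minimal sketch of that idea on toy data, assuming `dplyr` and `tidyr` are installed; the column names, codes and titles are placeholders. Unlike the script, this sketch fills only the parent-level columns and keeps rows that originally held a class code, which is one simple way to end up with one row per class:

```r
library(dplyr)
library(tidyr)

# Toy stand-in for the scraped table: one code per row, at a single level.
raw <- tibble(
  division_code    = c("A", NA, NA, NA, NA, NA, NA),
  subdivision_code = c(NA, "01", NA, NA, NA, NA, NA),
  group_code       = c(NA, NA, "011", NA, NA, "012", NA),
  class_code       = c(NA, NA, NA, "0111", "0112", NA, "0121"),
  title            = c("Division A", "Subdivision 01", "Group 011",
                       "Class 0111", "Class 0112", "Group 012", "Class 0121")
)

# Titles for the group level, taken from the rows where a group code appears.
group_titles <- raw %>%
  filter(!is.na(group_code)) %>%
  select(group_code, group_title = title)

wide <- raw %>%
  # Carry each parent code down to the rows beneath it.
  fill(division_code, subdivision_code, group_code, .direction = "down") %>%
  # Keep only the rows that were class rows in the original table.
  filter(!is.na(class_code)) %>%
  rename(class_title = title) %>%
  # Attach the parent title by its code.
  left_join(group_titles, by = "group_code")

print(wide)
```

Division and subdivision titles can be attached the same way, with one lookup table and one `left_join()` per level.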
Binary file modified data/anzsic2006.rda
Binary file not shown.
13 changes: 8 additions & 5 deletions man/anzsic2006.Rd

Some generated files are not rendered by default.

2 changes: 2 additions & 0 deletions pkgdown/_pkgdown.yml
@@ -27,13 +27,15 @@ reference:
- starts_with("asced")
- starts_with("asc")
- auholidays
- school_terms
- title: "Importing ABS Data"
desc: >
Functions for retrieving ABS data
contents:
- read_absmap
- get_seifa
- get_seifa_index_sheet
- read_correspondence_tbl
- title: "Helper functions"
desc: >
Functions for cleaning data and working with datasets
