Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update to income metric. #456

Open
wants to merge 3 commits into
base: version2025
Choose a base branch
from
Open

Update to income metric. #456

wants to merge 3 commits into from

Conversation

jwalsh28
Copy link
Collaborator

@jwalsh28 jwalsh28 commented Feb 4, 2025

This PR updates the income metric. Includes a back-update for years going back to 2014 and brings in 2023 census data + adds in a gender subgroup.

  1. Big changes of note:
    A) Addition of the gender subgroup is notable, please have a look at this in the county and place sub files.
    B) Similar to your comment Manu, this codes major changes reflect other ACS metrics in the use of the aws.s3 system. Checking if the application is consistent and looking out for potential bugs/errors would be great.

…e AWS method for data storage and check data on final evaluation forms
@malcalakovalski
Copy link
Contributor

I have similar overall comments to the PR's for issues #450 and #451. However, one thing I noticed is missing here is running the the files through the final evaluation function. The subgroup evaluation form csvs are also missing from 10a_final-evaluation

---
title: "Run employment"
date: today
format:html
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
format:html
format: html

Comment on lines +36 to +76
quarto_render_move <- function(
input,
output_file = NULL,
output_dir = NULL,
...
) {

# Get all the input / output file names and paths
x <- quarto::quarto_inspect(input)
output_format <- names(x$formats)
output <- x$formats[[output_format]]$pandoc$`output-file`
if (is.null(output_file)) { output_file <- output }
input_dir <- dirname(input)
if (is.null(output_dir)) { output_dir <- input_dir }
output_path_from <- file.path(input_dir, output)
output_path_to <- file.path(output_dir, output_file)

# Render qmd file to input_dir
quarto::quarto_render(input = input, ... = ...)

# If output_dir is different from input_dir, copy the rendered output
# there and delete the original file
if (input_dir != output_dir) {

# Try to make the folder if it doesn't yet exist
if (!dir.exists(output_dir)) { dir.create(output_dir) }

# Now move the output to the output_dir and remove the original output
file.copy(
from = output_path_from,
to = output_path_to,
overwrite = TRUE
)
file.remove(output_path_from)

# If the output_dir is the same as input_dir, but the output_file
# has a different name from the input file, then just rename it
} else if (output_file != output) {
file.rename(from = output_path_from, to = output_path_to)
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do really like this parametrized report approach and think in the next round of updates we should encourage it + add guidance. This quarto_render_move function is excellent and I think it should belong in the functions folder so all metrics can use it. With the current approach it will be duplicated in many folders in the repo!

This comment is more food for thought for 2026, no action needed currently


+ The ACS micro data used to create this metric is large and will take time to read in. It is strongly recommended that you use a server with large computing power to run this.
+ Please check that no folders in the filepath that you have cloned this repostiory to include the acronym "UMF" - this will throw an error from the extract_ipums function. To check your file path you can use the function here::here().
*User warning* The ACS micro data used to create this metric is large and will take time to read in. It is strongly recommended that you use a server with significant computing power to run this.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a recommended server or listing one that was used successfully to update the metric would be helpful for posterity!

@@ -0,0 +1,99 @@
---
title: "Run employment"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
title: "Run employment"
title: "Run income"

@jwalsh28
Copy link
Collaborator Author

@malcalakovalski Apologize this intitial PR was a bit half-baked and did not include subgroups - they have been added. As well as the final evaluations. You should be good to proceed with your review.

Copy link
Contributor

@malcalakovalski malcalakovalski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything looks great except that I would incorporate the logic of functions/testing/evaluate_final_data_srv_quantile.R into the evaluate_final_data function used across the project. The new function is the same except that it allows for NA values for lb and ub in confidence intervals. However, this could be handled by adding an argument for whether the data is survey data or not in the function and do if else logic based on that. That's okay for this round, but I'm flagging this for the future!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants