Add 2023 examination data from CBO, TSY, and JCT #10

Merged
martinholmer merged 5 commits into PSLmodels:master from fy-to-cy
Feb 26, 2024

martinholmer
Collaborator

Add fiscal-year (FY) data from CBO, TSY, and JCT that can be converted to calendar-year (CY) 2023 amounts, against which microdata estimates will be compared in the examination phase of the project.
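
For readers unfamiliar with the FY-to-CY conversion, here is a minimal sketch of one common approach, assuming a simple time-weighted interpolation. U.S. federal fiscal year N runs from October 1 of year N-1 through September 30 of year N, so calendar year N overlaps nine months of FY N and three months of FY N+1. The function name `fy_to_cy`, the 0.75/0.25 weights, and the sample amounts are illustrative assumptions; the AWK script in this PR may do the conversion differently.

```python
def fy_to_cy(fy_amounts, cy):
    """Approximate a calendar-year amount from fiscal-year amounts.

    Assumes amounts accrue evenly over each fiscal year, so CY `cy`
    gets 9/12 of FY `cy` (Jan-Sep) plus 3/12 of FY `cy + 1` (Oct-Dec).
    """
    return 0.75 * fy_amounts[cy] + 0.25 * fy_amounts[cy + 1]


# Made-up fiscal-year amounts, for illustration only (not real data).
fy = {2023: 100.0, 2024: 120.0}
cy2023 = fy_to_cy(fy, 2023)
print(cy2023)
```

The even-accrual assumption is the weakest part of this sketch; amounts with strong seasonality would need different weights.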

Collaborator

@nikhilwoodruff nikhilwoodruff left a comment

Thanks Martin- but do we need to have these as .text and .awk files? It'd be more manageable if everything was a CSV (or even JSON/YAML).

@martinholmer
Collaborator Author

martinholmer commented Feb 16, 2024

@nikhilwoodruff asked in PR #10:

Thanks Martin- but do we need to have these as .text and .awk files? It'd be more manageable if everything was a CSV (or even JSON/YAML).

Why would it "be more manageable"?

The .txt files are not rectangular, so CSV-format is not a good choice.
JSON and YAML files would need Python scripts to do the interpolation/extrapolation work, which is far more complex and time consuming.
So, the current setup is, from my point of view, quite manageable and economical for the task at hand.

@nikhilwoodruff
Collaborator

@nikhilwoodruff asked in PR #10:

Thanks Martin- but do we need to have these as .text and .awk files? It'd be more manageable if everything was a CSV (or even JSON/YAML).

Why would it "be more manageable"?

The .txt files are not rectangular, so CSV-format is not a good choice. JSON and YAML files would need Python scripts to do the interpolation/extrapolation work, which is far more complex and time consuming. So, the current setup is, from my point of view, quite manageable and economical for the task at hand.

Not sure I follow really. The data seems rectangular to me, or at least we could make it so. And there's a script already doing the interpolation in shell language- why is Python much more complicated?

@martinholmer
Collaborator Author

@nikhilwoodruff said:

Not sure I follow really. The data seems rectangular to me, or at least we could make it so. And there's a script already doing the interpolation in shell language- why is Python much more complicated?

So, you have not answered my question:

Why would it "be more manageable"?

And @nikhilwoodruff also said:

there's a script already doing the interpolation in shell language

What shell language and which script?

@nikhilwoodruff
Collaborator

So, you have not answered my question:
Why would it "be more manageable"?

Yes, well, my thinking is that CSV files are easiest to manipulate/show in documentation/pull into programs across all languages in this project, and especially Python. Second to that, consistency probably helps us, so if everything is CSV/Python, that'd be the easiest combination and would avoid having to build logic to parse text files or copy the values.
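
To illustrate the point about pulling CSV into Python, here is a minimal sketch using only the standard-library `csv` module. The column names (`year`, `source`, `amount`), the helper `read_amounts`, and the values are hypothetical and are not taken from the files in this PR.

```python
import csv
import io

# Hypothetical CSV content; column names and values are made up.
CSV_TEXT = """year,source,amount
2023,CBO,100.0
2024,CBO,120.0
"""


def read_amounts(csv_file):
    """Return {year: amount} from a CSV file object with year and amount columns."""
    return {
        int(row["year"]): float(row["amount"])
        for row in csv.DictReader(csv_file)
    }


# io.StringIO stands in for an open file so the example is self-contained.
amounts = read_amounts(io.StringIO(CSV_TEXT))
print(amounts)
```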

@nikhilwoodruff
Collaborator

What shell language and which script?

Sorry- was referring to the AWK script.

@martinholmer
Collaborator Author

@nikhilwoodruff, I changed the space-delimited text files to CSV files so that they "are easie[r] to manipulate/show in documentation/pull into programs across all languages in this project."
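
A conversion of this kind could be sketched as follows. This is only an illustration, assuming rows of whitespace-separated fields where short rows are padded with empty cells to make the table rectangular. The helper `text_to_csv`, the header names, and the sample rows are hypothetical, and the actual files in this PR may have been converted another way.

```python
import csv
import io


def text_to_csv(text, header):
    """Convert space-delimited lines to CSV text.

    Rows with fewer fields than the header are padded with empty
    fields so that the result is rectangular; longer rows are rejected.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) > len(header):
            raise ValueError(f"too many fields in line: {line!r}")
        writer.writerow(fields + [""] * (len(header) - len(fields)))
    return out.getvalue()


# Made-up, non-rectangular sample input (second row lacks the last field).
SAMPLE = "2023 100.0 5.0\n2024 120.0\n"
csv_text = text_to_csv(SAMPLE, ["year", "amount", "extra"])
print(csv_text)
```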

@martinholmer martinholmer merged commit 31733a2 into PSLmodels:master Feb 26, 2024
2 checks passed
@martinholmer martinholmer deleted the fy-to-cy branch February 26, 2024 14:11