Request to retain dedicated plot column for TMP L1 #169

Closed
peterregier opened this issue May 7, 2024 · 7 comments

@peterregier
Collaborator

I just pulled L1 data, and it appears the plot column included in prior releases has been removed, with plot now encoded only in the filename. I recognize and appreciate the effort to reduce redundancy and am happy to parse from filenames, but since plot is critical metadata that all TEMPEST analyses will need, I think it warrants a dedicated column for ease of use. I intend this comment to apply only to TEMPEST.

@bpbond
Member

bpbond commented May 7, 2024

Thanks @peterregier

Why should this "only apply to TEMPEST"? Plot is a crucial piece of information everywhere else too, right? What about Site? Should that have its own dedicated column? Curious about your thoughts here.

"for ease of use"

That's definitely a priority, so I'm very open to this. Just would like to understand better why something like this seems onerous:

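library(dplyr)  # bind_rows()
library(readr)  # read_csv()

# Read in all the 2024 TEMPEST control plot data
f <- list.files("TMP_2024/", pattern = "TMP_C_", full.names = TRUE)
dat <- bind_rows(lapply(f, read_csv))
# Every file matched above is a TEMPEST control plot file, so assign constants
dat$Site <- "TMP"
dat$Plot <- "C"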

@bpbond
Member

bpbond commented May 7, 2024

Just making a note: adding "Site" and "Plot" columns to one of the 2024 TEMPEST files (I tested TMP_C_20240301-20240331_L1_v1-0.csv) increased its size from 44 to 47.3 MB, about 7.5%. Not bad.
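For anyone who wants to repeat this kind of size check, here is a minimal base-R sketch. It uses a synthetic data frame rather than a real L1 file, so the column names (`TIMESTAMP`, `Value`) and the resulting percentage are illustrative assumptions only and will not match the numbers above.

```r
# Synthetic stand-in for an L1 file; column names are illustrative only
set.seed(1)
n <- 10000
dat <- data.frame(
  TIMESTAMP = format(as.POSIXct("2024-03-01", tz = "UTC") + 900 * seq_len(n)),
  Value = round(rnorm(n), 3)
)

# Same data with constant Site and Plot columns prepended
with_cols <- data.frame(Site = "TMP", Plot = "C", dat)

# Write both versions to temporary CSV files
f_without <- tempfile(fileext = ".csv")
f_with <- tempfile(fileext = ".csv")
write.csv(dat, f_without, row.names = FALSE)
write.csv(with_cols, f_with, row.names = FALSE)

# Compare on-disk sizes in bytes
size_without <- file.size(f_without)
size_with <- file.size(f_with)
pct_increase <- 100 * (size_with - size_without) / size_without
cat(sprintf("Size increase: %.1f%%\n", pct_increase))
```

The relative overhead depends on how wide each row already is, so a narrow synthetic table like this one will show a larger percentage than a real file with many columns.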

@peterregier
Collaborator Author

@bpbond my instinct on "easier" comes from my solution of parsing filenames with stringr::str_extract(name, "(?<=_)[^_]+(?=_)") when reading all CSVs from a given L1 folder (e.g., TMP_2023). I absolutely agree your solution is not onerous; I'm just hoping to keep the data as easy as possible for all folks to use. If folks don't have to parse strings or assign columns, that's one less barrier in my mind.

I think you're absolutely right, having transect location as an equivalent variable would make sense for synoptics! Adding site as a column makes sense to me too. I recognize wanting to keep things lightweight, just putting in my 2 cents on that balance.
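To make the filename-parsing workaround concrete, here is a base-R sketch of the same idea. It assumes filenames follow the `SITE_PLOT_DATES_L1_VERSION.csv` pattern seen in `TMP_C_20240301-20240331_L1_v1-0.csv`; the helper name `parse_l1_filename()` and the second example filename are hypothetical, not part of the L1 release.

```r
# Hypothetical helper: pull Site and Plot out of L1 filenames.
# Assumes the first two underscore-separated tokens are site and plot.
parse_l1_filename <- function(paths) {
  fname <- basename(paths)                      # drop any directory prefix
  parts <- strsplit(fname, "_", fixed = TRUE)   # split on literal underscores
  data.frame(
    File = fname,
    Site = vapply(parts, `[`, character(1), 1), # first token, e.g. "TMP"
    Plot = vapply(parts, `[`, character(1), 2), # second token, e.g. "C"
    stringsAsFactors = FALSE
  )
}

# Example filenames (the second one is made up for illustration)
files <- c("TMP_2024/TMP_C_20240301-20240331_L1_v1-0.csv",
           "TMP_2024/TMP_F_20240301-20240331_L1_v1-0.csv")

meta <- parse_l1_filename(files)
print(meta)
```

Every user reading a folder of files would need some version of this step, which is the barrier a dedicated column removes.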

@bpbond
Member

bpbond commented May 7, 2024

It does seem penny-wise, pound-foolish to save less than 10% in file size but force everyone to parse filenames again and again to re-create those Site and Plot columns.

Thoughts @selinalcheng @roylrich @wilsonsj100 ?

bpbond added a commit that referenced this issue May 7, 2024
bpbond added a commit that referenced this issue May 8, 2024
* Add metadata required to map GCReW met to GCW-W
* Remove GCW mappings as no met station or sonde
* Fix solar total/flux units for 15min data #160
* Clean up reset() function
* Add site and plot back in; see #169
* Remove option to remove input files
@bpbond
Member

bpbond commented May 8, 2024

Addressed in #167

@bpbond bpbond closed this as completed May 8, 2024
@wilsonsj100

Sorry for the late input, but I am pro adding the zone back in! Thank you!

@roylrich

roylrich commented May 8, 2024 via email
