[WIP] Refactoring / improvements of datasets #38
Hi Romain,
Hello @finlaycampbell,
No problem 😄 we all have the same trouble: many things to do and not much time. I will have a look at the other datasets.
The tradeoff between compatibility and duplication is not an easy one. I was not very comfortable with this topic; as far as I know, there is no clear way to do that in R (see this question on SO).
Conservative approach: the one I've drafted in this PR.
Radical approach: the one described on SO.
Intermediate approach: keep the original dataset but do not export it anymore.
What is your opinion?
* Implemented another way of deprecating datasets * Added some information in NEWS.md about changes
Hello,
I've implemented a more sustainable approach: deprecating datasets by moving them to
Please give me your opinion.
Best
I've drafted what I think is a far better way to deal with that.
# Default version
> head(dengue_fais_2011)
# A tibble: 6 x 2
# date_of_onset incidence
# <date> <dbl>
# 1 2011-09-15 0
# 2 2011-09-22 0
# 3 2011-09-29 0
# Activate the compatibility mode with a proper deprecation warning
> legacy_mode()
# Loading objects:
# dengue_fais_2011
# Warning message:
# This function replaces datasets with the previous version for compatibility reason
# Back to previous version that have been simply moved to /data-raw
> head(dengue_fais_2011)
# A tibble: 6 x 3
# onset_date nr value
# <date> <dbl> <dbl>
# 1 2011-09-15 7 0
# 2 2011-09-22 14 0
# 3 2011-09-29 21 0
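For readers following the thread, here is a minimal sketch of how a `legacy_mode()` helper like the one shown above could work. It is not the code from this PR: the `legacy_datasets` list, the `envir` argument and the exact message wording are assumptions made for illustration, and a plain `data.frame` stands in for the tibble.

```r
# Hypothetical store of previous dataset versions. In a real package these
# could be loaded from files kept outside the exported data (assumption).
legacy_datasets <- list(
  dengue_fais_2011 = data.frame(
    onset_date = as.Date(c("2011-09-15", "2011-09-22", "2011-09-29")),
    nr = c(7, 14, 21),
    value = c(0, 0, 0)
  )
)

# Copy every legacy dataset into `envir`, masking the refactored version,
# and warn the user that the old versions are now in use.
legacy_mode <- function(envir = parent.frame()) {
  message("Loading objects:")
  for (name in names(legacy_datasets)) {
    message("  ", name)
    assign(name, legacy_datasets[[name]], envir = envir)
  }
  warning(
    "This function replaces datasets with the previous version ",
    "for compatibility reasons.",
    call. = FALSE
  )
  invisible(names(legacy_datasets))
}
```

Called at the top level, `legacy_mode()` assigns into the calling environment, so `head(dengue_fais_2011)` afterwards shows the three-column legacy layout.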
* Added tsibble capabilities * Fixed ebola_kikwit by turning not reported data into `NA`
Hi Romain,
Many thanks for all the work put into this! I'm a big fan of the legacy_mode() approach. I'll just have to discuss with the others first whether they're happy breaking backwards compatibility with the default outbreaks load, and I'll get back to you. We should at least post a message when the package is loaded to indicate that some of the data has been refactored.
Regarding to
Thanks again, let me know if you have any comments or questions.
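A message on package load, as suggested above, is usually emitted from an `.onAttach()` hook with `packageStartupMessage()`. The sketch below shows the general shape only; the message wording is hypothetical and not taken from the package.

```r
# Sketch of an attach hook, typically placed in R/zzz.R of the package.
# The text of the message is an assumption made for illustration.
.onAttach <- function(libname, pkgname) {
  packageStartupMessage(
    "Some datasets in this package have been refactored. ",
    "Call legacy_mode() to restore the previous versions."
  )
}
```

Using `packageStartupMessage()` rather than `message()` lets users silence the note with `suppressPackageStartupMessages(library(outbreaks))`.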
Hello,
Understood regarding the function related to the
For the rest, just tell me if it's worth continuing the refactoring because it's a lot of work and I'm still not sure it's useful. Maybe people are happy with the datasets as they are and I will move to something else. In this case I will keep my branch as is and I will let you decide if you want to pick something.
Please let me know. Many thanks.
* Removed everything related to TS * Status up to date in `NEWS.md`
Hey,
I'll check out the tidyverts initiative now, it looks nice! Regarding the refactoring work, I definitely think this is a project worth completing and that it improves the package functionality overall, but please only continue with it if you are happy to do so, especially if it's a lot of work.
After discussion we've decided the best approach is to default to legacy_mode() for the next 1-2 months, with a message upon package load stating that users can use the updated datasets by entering
OK, understood, that seems reasonable. I will take a break and then check how to enable the
Best.
Hello,
This version is a refactoring of some existing datasets.
Previous datasets have been kept as is for compatibility reasons.
New datasets hold the same name with a suffix `_td` for tidy.

Global changes / rules
* `tibble` in order to get consistent behaviour regardless of whether or not `tibble` is attached.
* `Makefile` for common tasks.
* `date_of_onset` everywhere, `gender` instead of `sex`, etc.

Dengue & Zika datasets Funk et al. (2016)
* `nr` column that can be computed).
* `data-raw/` as stated in R Packages book.

I've done this work to try to improve this package, and also to learn a little bit more about R data packages. Does it make sense? Are you interested? If so, I can continue with some of the other datasets and apply the same principles. 😊
Best
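To illustrate the renaming rules and the remark that `nr` can be computed, here is a small base R sketch built on the three sample rows shown elsewhere in this thread. It uses a plain `data.frame` instead of a tibble, and it assumes that `nr` counts days from one week before the first onset date, which is only inferred from those three rows.

```r
# Legacy layout, using the three sample rows from the thread.
legacy <- data.frame(
  onset_date = as.Date(c("2011-09-15", "2011-09-22", "2011-09-29")),
  nr = c(7, 14, 21),
  value = c(0, 0, 0)
)

# Refactored layout: consistent column names, derivable `nr` column dropped.
tidy <- data.frame(
  date_of_onset = legacy$onset_date,
  incidence = legacy$value
)

# Recompute `nr` on demand (assumed definition: days elapsed since one week
# before the first date of onset).
origin <- min(tidy$date_of_onset) - 7
nr <- as.numeric(tidy$date_of_onset - origin)
```

Under that assumption the recomputed `nr` matches the legacy column for these rows, so dropping it loses no information.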