Modified pipeline to work with right censoring #42

davidsantiagoquevedo · 2024-02-12T18:26:32Z

This PR refactors the package to work with right censoring. Main additions:

Updated the example dataset cohortdata with simulated censoring dates (death_other_causes) and unique IDs (id) to match the data
Tests to check the correctness of the updated data
Use of censoring_date_col in get_immunization_date and get_time_to_event. Set to NULL by default
Tests for censoring_date_col when provided

chartgerink

Overall looks good to me from reading the changes. I am still trying to understand what censoring data means from a practical standpoint: why would I want to do this to begin with? It might be worthwhile to have a brief vignette on this to help users along as well (that does not diminish the value of the current PR).

One question that remains for me is: Do we know whether the cases in cohortdata cover all the options for censoring that we are testing? That is, you're testing whether the functionality succeeds correctly but do we know whether it also fails correctly?

Another point I noticed while testing that I would like to check (not related to censoring, though): is it normal that the immunization delay is added on top of the first vaccination if two vaccination dates are provided? I want to check just to be sure.

R/coh_data_wrangling.R

davidsantiagoquevedo · 2024-02-19T14:23:40Z

Hi @chartgerink, thanks for your review. Censoring affects the estimation of the time-to-event for individuals whose event status is unknown at the end of the follow-up period. This occurs when they drop out, are lost to follow-up, or experience a different event (e.g., death from other causes). Therefore, depending on the available information, a censoring date can be associated with the cohort. Despite not experiencing the event of interest, these individuals still contribute to the analysis with their exposure history. This information is used to estimate the hazard ratio and relative risks by providing details on exposure status and the duration of follow-up before censoring.
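To make this concrete, here is a minimal base R sketch of a right-censored time-to-event calculation. It is not the package's implementation: the data frame, its column names (t0, event_date, censoring_date) and the end_cohort date are assumptions made up for illustration. The end of follow-up tf is taken as the earliest of the event date, the censoring date and the end of the study, and time_to_event = tf - t0.

```r
# Hypothetical cohort; column names and dates are illustrative only
cohort <- data.frame(
  id = 1:3,
  t0 = as.Date(c("2021-01-01", "2021-01-01", "2021-01-01")),
  event_date = as.Date(c("2021-03-02", NA, NA)),
  censoring_date = as.Date(c(NA, "2021-02-10", NA))
)
end_cohort <- as.Date("2021-12-31")  # assumed end of follow-up

# tf: earliest of event, censoring and end of follow-up (NAs are ignored)
cohort$tf <- pmin(cohort$event_date, cohort$censoring_date, end_cohort,
                  na.rm = TRUE)

# Follow-up time in days, returned as a plain numeric vector
cohort$time_to_event <- as.numeric(
  difftime(cohort$tf, cohort$t0, units = "days")
)

# status = 1 only when follow-up ended with the event of interest
cohort$status <- as.integer(
  !is.na(cohort$event_date) & cohort$event_date == cohort$tf
)

print(cohort[, c("id", "time_to_event", "status")])
```

In this sketch the censored individual (id 2) still contributes follow-up time up to the censoring date, with status 0.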

In addition, censoring will also be important in the matching routine, because both members of a matched pair must be censored if either individual is censored.
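One possible reading of this rule, sketched in base R: the earliest censoring date within a pair is propagated to both members. The matched data frame, the pair_id column and the pair_censoring_date column are hypothetical and not part of the package; the actual matching routine may handle this differently.

```r
# Hypothetical matched cohort: two pairs, one individual censored in pair 1
matched <- data.frame(
  id = c("a", "b", "c", "d"),
  pair_id = c(1, 1, 2, 2),
  censoring_date = as.Date(c("2021-02-10", NA, NA, NA))
)

# Earliest censoring date within each pair (NA if nobody is censored).
# Dates are handled as day counts to keep the grouping step simple.
cens_num <- as.numeric(matched$censoring_date)
pair_min <- ave(
  cens_num, matched$pair_id,
  FUN = function(x) if (all(is.na(x))) NA_real_ else min(x, na.rm = TRUE)
)

# Both members of a pair now share the same censoring date
matched$pair_censoring_date <- as.Date(pair_min, origin = "1970-01-01")

print(matched)
```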

We are working on a vignette to explain this because it is one of the core ideas of the package.

davidsantiagoquevedo · 2024-02-19T14:38:32Z

@chartgerink can you please expand a bit on the other two points?

Regarding the last one, immunization_delay is added at the end of the function to data$imm_limit - data$delta_imm, where delta_imm represents the time distance to the vaccine selected for analysis. If take_first = TRUE, the function uses the first vaccine; if take_first = FALSE, it uses the one closest to imm_limit.
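The selection logic described above can be sketched as follows. This is an illustration under stated assumptions, not the code of get_immunization_date: the function name sketch_immunization_date and its arguments are hypothetical, and it assumes that only doses on or before imm_limit are eligible.

```r
# Hypothetical helper illustrating the take_first logic for one individual
sketch_immunization_date <- function(vaccine_dates, imm_limit,
                                     immunization_delay, take_first = TRUE) {
  # delta: days between each dose and imm_limit
  delta <- as.numeric(difftime(imm_limit, vaccine_dates, units = "days"))
  # Assumption: doses after imm_limit (negative delta) are not eligible
  delta <- delta[!is.na(delta) & delta >= 0]
  if (length(delta) == 0) {
    return(as.Date(NA))
  }
  # Largest delta = first dose; smallest delta = dose closest to imm_limit
  delta_imm <- if (take_first) max(delta) else min(delta)
  # Date of the selected dose plus the immunization delay
  imm_limit - delta_imm + immunization_delay
}

doses <- as.Date(c("2021-03-01", "2021-04-01"))
limit <- as.Date("2021-06-01")

first_dose <- sketch_immunization_date(doses, limit, 14, take_first = TRUE)
closest_dose <- sketch_immunization_date(doses, limit, 14, take_first = FALSE)
print(first_dose)
print(closest_dose)
```

With two doses and take_first = TRUE, the delay is counted from the first dose, which is the behavior asked about in the review.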

chartgerink · 2024-02-19T14:44:48Z

Thanks for clarifying @davidsantiagoquevedo 👍

With respect to the testing approach: it seems like the tests only check that the output is in line with what you expect. I wonder whether there are any situations where you would expect the function to fail that you could test, to strengthen the overall test suite for the censoring functionality.

davidsantiagoquevedo · 2024-02-19T14:53:44Z

You're right, @chartgerink. I believe that to address this we should simulate an example dataset for testing, so that we can capture unexpected behaviors in all the functions, because cohortdata was simulated specifically for demonstration purposes.

chartgerink

All comments resolved - thanks @davidsantiagoquevedo 🙏

davidsantiagoquevedo and others added 22 commits February 2, 2024 13:30

refac: checks for start_from_immunization added to the same if statement

abc1c06

refac: simplified logic in get_time_to_event. All the cases are now cosidered by hierarchically defining t0 and tf to calculate time_to_event = tf-t0

32e03ce

fix: conditional informed outcome dates t0 -> tf

678aa24

fix: return numeric outcome in get_time_to_event

1c0e127

feat: added variable to control censoring date

06a9424

fix: time_to_event definition start_from_immunization must be declared explicitely

d3f178a

fix: declared explicitely start_from_immunization in examples

bcc8f9b

fix: lintr changed lenght(levels()) -> nlevels()

94143dd

roxygen2: updated documentation

42b632e

replaced example cohordata. Added date of death by other causes and anonimous ID. Removed subsidy

8d05a9f

updated snaps for tests using new data set

1c3ed76

refac: test with two dates, replaced vaccine_date_2 by death_date. Added coherence test between disjoint variables death_date and death_other_causes

80fd98c

compressed dataset and regenerated duplicated IDs

8a34af9

Remove extra parenthesis

c5da3b3

Remove ORCID domain

ca24c41

feat: added censoring date to get_immunization_date

be6e1e5

feat: test for censoring_date_col

d9fe2a6

refac: avoid defining additional dataset. All tests are done with cohortdata

c3cde60

feat: test for provided censored_date_col

d1dc6b6

roxygen: update param description

2e58a2f

style: indentation

1bc364d

refac: documentation inheritParams from get_immunization_date

db44778

davidsantiagoquevedo requested a review from chartgerink February 12, 2024 18:26

chartgerink reviewed Feb 19, 2024

refac: simplified logic in ifelse for t0

ac0aba4

chartgerink self-requested a review February 20, 2024 15:55

chartgerink approved these changes Feb 20, 2024

Merge branch 'main' into feat-censoring

6aa89df

davidsantiagoquevedo merged commit 5910429 into main Feb 20, 2024
7 checks passed

davidsantiagoquevedo deleted the feat-censoring branch February 20, 2024 16:23
