-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Modified pipeline to work with right censoring #42
Conversation
…osidered by hierarchically defining t0 and tf to calculate time_to_event = tf-t0
…nonimous ID. Removed subsidy
…ded coherence test between disjoint variables death_date and death_other_causes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good to me from reading the changes. I am still trying to understand what censoring data means from a practical standpoint - why would I want to do this to begin with? It might be worthwhile to have a brief vignette on this to help users along as well (that does not mitigate the value of the current PR).
One question that remains for me is: Do we know whether the cases in cohortdata
cover all the options for censoring that we are testing? That is, you're testing whether the functionality succeeds correctly but do we know whether it also fails correctly?
Another point I noticed while testing I would like to check in on (not related to censoring though): Is it normal that the immunization delay is added on top of the first vaccination, if two vaccination dates are provided? I want to check just to be sure.
Hi @chartgerink, thanks for your review. The censoring impacts the estimation of the time-to-event for individuals whose event status is unknown at the end of the follow-up period. This occurs when they either drop out, are lost to follow-up, or experience a different event (e.g., death from other causes). Therefore, depending on the available information, a censoring date can be associated with the cohort. Despite not experiencing the event of interest, these individuals still contribute to the analysis with their exposure history. This information is used to estimate the hazard ratio and relative risks by providing details on exposure status and the duration of follow-up before censoring. In addition, censoring will be also important in the matching routine because matched couples must be completely censored if one of the individuals is censored. We are working on a vignette to explain this because it is one of the core ideas of the package. |
@chartgerink can you please expand a bit on the other two points? Regarding the last one, immunization_delay is added at the end of the function to |
Thanks for clarifying @davidsantiagoquevedo 👍 With respect to the testing approach - it seems like the tests only investigate that the output is in line with what you expect. I wonder whether there are any situations where you would expect the function to not work that you could test to strengthen the overall test suite for the censoring functionality. |
You're right, @chartgerink. I believe that to address this, we should simulate an example dataset for testing and to capture unexpected behaviors in all the functions. Because |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All comments resolved - thanks @davidsantiagoquevedo 🙏
This PR refactorizes the package to work with right censoring. Main additions:
cohortdata
with simulated censoring datesdeath_other_causes
and unique IDsid
to match the datacensoring_date_col
inget_immunization_date
andget_time_to_event
. SetNULL
by defaultcensoring_date_col
when provided