Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix n.risk at time == 0 in fortify.survfit(*, surv.connect = TRUE) + fix n.censor for survfitms objects #230

Merged
merged 8 commits into from
Jun 24, 2024

Conversation

ThomasSoeiro
Copy link
Contributor

@ThomasSoeiro ThomasSoeiro commented Jun 16, 2024

Hi @terrytangyuan,

Here is a proposal for #229.

I kept the original behavior for survfitms objects because I am not sure if a fix is needed here too.

Some tests: (models are from ?survfit; fortify.survfit2 is the patched function)

library(survival)

fit <- survfit(Surv(time, status) ~ x, data = aml)
old <- ggfortify:::fortify.survfit(fit, surv.connect = TRUE)
new <- fortify.survfit2(fit, surv.connect = TRUE)
dplyr::setdiff(new, old)
#   time n.risk n.event n.censor surv std.err upper lower        strata
# 1    0     12       0        0    1       0     1     1 Nonmaintained

fitMS <- survfit(Surv(start, stop, event) ~ 1, id = id, data = mgus1)
old <- ggfortify:::fortify.survfit(fitMS, surv.connect = TRUE)
new <- fortify.survfit2(fitMS, surv.connect = TRUE)
identical(new, old)
# [1] TRUE

Copy link
Collaborator

@terrytangyuan terrytangyuan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Would you like to add some test cases for this?

@ThomasSoeiro
Copy link
Contributor Author

ThomasSoeiro commented Jun 18, 2024

Thanks. Would you like to add some test cases for this?

Done in 92bc773.

I kept the original behavior for survfitms objects because I am not sure if a fix is needed here too.

Should we use d[d$time == ave(d$time, d$event, FUN = min), ] for survfitms objects?

fitMS <- survfit(Surv(start, stop, event) ~ 1, id = id, data = mgus1)
ggfortify:::fortify.survfit(fitMS) |> by(~ event, head)
# event: (s0)
#   time n.risk n.event n.censor    pstate     std.err     upper     lower event
# 1    6    241       0        0 0.9958506 0.004140760 1.0000000 0.9877679  (s0)
# 2    7    240       0        0 0.9917012 0.005843706 1.0000000 0.9803137  (s0)
# 3   31    239       0        0 0.9875519 0.007142061 1.0000000 0.9736524  (s0)
# 4   32    238       0        0 0.9834025 0.008229598 0.9996652 0.9674043  (s0)
# 5   39    237       0        0 0.9792531 0.009181538 0.9974150 0.9614220  (s0)
# 6   60    236       0        0 0.9751037 0.010036539 0.9949748 0.9556296  (s0)
# --------------------------------------------------------------------------- 
# event: pcm
#     time n.risk n.event n.censor pstate std.err upper lower event
# 301    6      0       0        0      0       0    NA    NA   pcm
# 302    7      0       0        0      0       0    NA    NA   pcm
# 303   31      0       0        0      0       0    NA    NA   pcm
# 304   32      0       0        0      0       0    NA    NA   pcm
# 305   39      0       0        0      0       0    NA    NA   pcm
# 306   60      0       0        0      0       0    NA    NA   pcm
# --------------------------------------------------------------------------- 
# event: death
#     time n.risk n.event n.censor      pstate     std.err      upper        lower event
# 601    6      0       1        0 0.004149378 0.004140760 0.02933707 0.0005868799 death
# 602    7      0       1        0 0.008298755 0.005843706 0.03299139 0.0020874940 death
# 603   31      0       1        0 0.012448133 0.007142061 0.03832457 0.0040432548 death
# 604   32      0       1        0 0.016597510 0.008229598 0.04386286 0.0062804232 death
# 605   39      0       1        0 0.020746888 0.009181538 0.04939151 0.0087147233 death
# 606   60      0       1        0 0.024896266 0.010036539 0.05486341 0.0112975857 death

@ThomasSoeiro
Copy link
Contributor Author

ThomasSoeiro commented Jun 19, 2024

BTW, regarding subset() in:

d <- suppressWarnings(subset(d, select = -c(id)))

From ?subset (Warning section):

This is a convenience function intended for use interactively. For programming it is better to use the standard subsetting functions like [, and in particular the non-standard evaluation of argument subset can have unanticipated consequences.

Would you like to change that? Why is subset() wrapped in suppressWarnings()?

@terrytangyuan
Copy link
Collaborator

Sure, can you help update that piece of code and make sure everything still works as expected?

@ThomasSoeiro
Copy link
Contributor Author

ThomasSoeiro commented Jun 20, 2024

Should we use d[d$time == ave(d$time, d$event, FUN = min), ] for survfitms objects?

I have implemented that in 7277152 (and a test in 973b124).

Sure, can you help update that piece of code and make sure everything still works as expected?

Done in 906ad32 with some more (basic) clean up.

Some tests: (models are from ?survfit; fortify.survfit2 is the patched function)

library(survival)

fit <- survfit(Surv(time, status) ~ x, data = aml)
old <- ggfortify:::fortify.survfit(fit, surv.connect = TRUE)
new <- fortify.survfit2(fit, surv.connect = TRUE)
dplyr::setdiff(new, old)
#   time n.risk n.event n.censor surv std.err upper lower        strata
# 1    0     12       0        0    1       0     1     1 Nonmaintained

new2 <- fortify.survfit2(fit)
new2$strata <- as.character(new2$strata)
tdy <- broom::tidy(fit)
tdy$strata <- sub("x=", "", tdy$strata)
names(tdy) <- names(new2)
tdy <- as.data.frame(tdy)
identical(new2, tdy)
# [1] TRUE

fitMS <- survfit(Surv(start, stop, event) ~ 1, id = id, data = mgus1)
old <- ggfortify:::fortify.survfit(fitMS, surv.connect = TRUE)
new <- fortify.survfit2(fitMS, surv.connect = TRUE)
dplyr::setdiff(new, old)
#   time n.risk n.event n.censor pstate std.err upper lower event
# 1    0      0       0        0      0       0     0     0   pcm
# 2    0      0       0        0      0       0     0     0 death

new2 <- fortify.survfit2(fitMS)
new2$event <- as.character(new2$event)
tdy <- broom::tidy(fitMS)
mapply(identical, new2, tdy)
#     time   n.risk  n.event n.censor   pstate  std.err    upper    lower    event 
#     TRUE     TRUE     TRUE    FALSE     TRUE     TRUE     TRUE     TRUE     TRUE 

So there may be an issue in n.censor for survfitms objects.

@ThomasSoeiro ThomasSoeiro changed the title fix n.risk at time == 0 when nlevels(strata) > 1 fix n.risk at time == 0 in fortify.survfit(* surv.connect = TRUE) Jun 20, 2024
@ThomasSoeiro ThomasSoeiro changed the title fix n.risk at time == 0 in fortify.survfit(* surv.connect = TRUE) fix n.risk at time == 0 in fortify.survfit(*, surv.connect = TRUE) Jun 20, 2024
@ThomasSoeiro
Copy link
Contributor Author

ThomasSoeiro commented Jun 21, 2024

So is there an issue in n.censor for survfitms objects?

I proposed a fix for that in b3f60c8 (inspired by broom:::tidy.survfit()).

Some tests: (models are from ?survfit; fortify.survfit2 is the patched function; tests run with survival version 3.5-8 and 3.7-0)

library(survival)

fit <- survfit(Surv(time, status) ~ x, data = aml)
old <- ggfortify:::fortify.survfit(fit, surv.connect = TRUE)
new <- fortify.survfit2(fit, surv.connect = TRUE)
dplyr::setdiff(new, old)
#   time n.risk n.event n.censor surv std.err upper lower        strata
# 1    0     12       0        0    1       0     1     1 Nonmaintained

new2 <- fortify.survfit2(fit)
new2$strata <- as.character(new2$strata)
tdy <- broom::tidy(fit)
tdy$strata <- sub("x=", "", tdy$strata)
names(tdy) <- names(new2)
tdy <- as.data.frame(tdy)
identical(new2, tdy)
# [1] TRUE

fitMS <- survfit(Surv(start, stop, event) ~ 1, id = id, data = mgus1)
old <- ggfortify:::fortify.survfit(fitMS, surv.connect = TRUE)
old <- subset(old, select = -n.censor)
new <- fortify.survfit2(fitMS, surv.connect = TRUE)
new <- subset(new, select = -n.censor)
dplyr::setdiff(new, old)
#   time n.risk n.event pstate std.err upper lower event
# 1    0      0       0      0       0     0     0   pcm
# 2    0      0       0      0       0     0     0 death

new2 <- fortify.survfit2(fitMS)
new2$event <- as.character(new2$event)
tdy <- broom::tidy(fitMS)
names(tdy) <- names(new2)
tdy <- as.data.frame(tdy)
identical(new2, tdy)
# [1] TRUE

The logic in fortify.survfit() could be further simplified, but we probably should not spend too much time on this since fortify() may be deprecated in favor of {broom}. I think we should mostly focus on correcting the output.

So I think we are done now. What do you think?

(Please note that the proposed changes need careful review. I am not used to multi-state survival model and I am far from an expert in survival analysis.)

@ThomasSoeiro ThomasSoeiro changed the title fix n.risk at time == 0 in fortify.survfit(*, surv.connect = TRUE) fix n.risk at time == 0 in fortify.survfit(*, surv.connect = TRUE) + fix n.censor for survfitms objects Jun 21, 2024
Copy link
Collaborator

@terrytangyuan terrytangyuan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! The changes look good to me. Feel free to start using and testing it further separately.

@terrytangyuan terrytangyuan merged commit 5de6565 into sinhrks:master Jun 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants