`n.risk` at `time == 0` is not correct when `nlevels(strata) > 1` in `fortify.survfit(*, surv.connect = TRUE)` #229

ThomasSoeiro · 2024-03-08T13:37:41Z

System information

OS Platform and Distribution: Red Hat Enterprise Linux Server 7.8 (Maipo)
ggfortify installed from: CRAN
ggfortify version: ggfortify_0.4.14
Exact command to reproduce: see below

Describe the problem

n.risk at time == 0 is not correct when nlevels(strata) > 1 in fortify.survfit(*, surv.connect = TRUE).

Source code / logs / plots

library(survival)

fit <- survfit(Surv(time, status) ~ x, data = aml)
summary(fit)
# Call: survfit(formula = Surv(time, status) ~ x, data = aml)
# 
#                 x=Maintained 
#  time n.risk n.event survival std.err lower 95% CI upper 95% CI
#     9     11       1    0.909  0.0867       0.7541        1.000
#    13     10       1    0.818  0.1163       0.6192        1.000
#    18      8       1    0.716  0.1397       0.4884        1.000
#    23      7       1    0.614  0.1526       0.3769        0.999
#    31      5       1    0.491  0.1642       0.2549        0.946
#    34      4       1    0.368  0.1627       0.1549        0.875
#    48      2       1    0.184  0.1535       0.0359        0.944
# 
#                 x=Nonmaintained 
#  time n.risk n.event survival std.err lower 95% CI upper 95% CI
#     5     12       2   0.8333  0.1076       0.6470        1.000
#     8     10       2   0.6667  0.1361       0.4468        0.995
#    12      8       1   0.5833  0.1423       0.3616        0.941
#    23      6       1   0.4861  0.1481       0.2675        0.883
#    27      5       1   0.3889  0.1470       0.1854        0.816
#    30      4       1   0.2917  0.1387       0.1148        0.741
#    33      3       1   0.1944  0.1219       0.0569        0.664
#    43      2       1   0.0972  0.0919       0.0153        0.620
#    45      1       1   0.0000     NaN           NA           NA

ggfortify:::fortify.survfit(fit, surv.connect = TRUE) |> head()
#   time n.risk n.event n.censor      surv    std.err     upper     lower        strata
# 1    0     11       0        0 1.0000000 0.00000000 1.0000000 1.0000000    Maintained
# 2    0     11       0        0 1.0000000 0.00000000 1.0000000 1.0000000 Nonmaintained  # <- n.risk should be 12
# 3    9     11       1        0 0.9090909 0.09534626 1.0000000 0.7541338    Maintained
# 4   13     10       1        1 0.8181818 0.14213381 1.0000000 0.6192490    Maintained
# 5   18      8       1        0 0.7159091 0.19508758 1.0000000 0.4884263    Maintained
# 6   23      7       1        0 0.6136364 0.24873417 0.9991576 0.3768671    Maintained

The issue is from:

ggfortify/R/fortify_surv.R

Line 65 in 5ed9524

base <- d[1, ]

Maybe it can be changed to d[d$time == ave(d$time, d$strata, FUN = min), ].

And then this will need to be simplified:

ggfortify/R/fortify_surv.R

Lines 73 to 79 in 5ed9524

    
           if ('strata' %in% colnames(d)) { 
        
             strata <- levels(d$strata) 
        
             base <- base[rep(seq_len(nrow(base)), length(strata)), ] 
        
             rownames(base) <- NULL 
        
             base$strata <- strata 
        
             base$strata <- factor(base$strata, levels = base$strata) 
        
           }

The text was updated successfully, but these errors were encountered:

ThomasSoeiro · 2024-06-07T21:38:28Z

Hi @terrytangyuan,
Do you confirm the issue? Would you consider a patch based on what I proposed in #229 (comment)?
Thanks!

terrytangyuan · 2024-06-11T00:24:50Z

Please submit a PR with additional test cases. Thanks!

ThomasSoeiro · 2025-01-24T23:36:07Z

FYI, I've just realized that a much better fix would be to use summary(times = 0):

library(survival)
fit <- survfit(Surv(time, status) ~ x, aml)
fit |> summary(times = 0) |> str()

ThomasSoeiro changed the title ~~n.risk value for time == 0 is not correct when length(strata) > 1 in fortify.survfit().~~ n.risk value for time == 0 is not correct when nlevels(strata) > 1 in fortify.survfit(). Mar 8, 2024

ThomasSoeiro changed the title ~~n.risk value for time == 0 is not correct when nlevels(strata) > 1 in fortify.survfit().~~ n.risk at time == 0 is not correct when nlevels(strata) > 1 in fortify.survfit(*, surv.connect = TRUE) Mar 8, 2024

ThomasSoeiro mentioned this issue Jun 16, 2024

fix n.risk at time == 0 in fortify.survfit(*, surv.connect = TRUE) + fix n.censor for survfitms objects #230

Merged

ThomasSoeiro closed this as completed Jun 29, 2024

ThomasSoeiro reopened this Jan 24, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`n.risk` at `time == 0` is not correct when `nlevels(strata) > 1` in `fortify.survfit(*, surv.connect = TRUE)` #229

`n.risk` at `time == 0` is not correct when `nlevels(strata) > 1` in `fortify.survfit(*, surv.connect = TRUE)` #229

ThomasSoeiro commented Mar 8, 2024 •

edited

Loading

ThomasSoeiro commented Jun 7, 2024

terrytangyuan commented Jun 11, 2024

ThomasSoeiro commented Jan 24, 2025

n.risk at time == 0 is not correct when nlevels(strata) > 1 in fortify.survfit(*, surv.connect = TRUE) #229

n.risk at time == 0 is not correct when nlevels(strata) > 1 in fortify.survfit(*, surv.connect = TRUE) #229

Comments

ThomasSoeiro commented Mar 8, 2024 • edited Loading

System information

Describe the problem

Source code / logs / plots

ThomasSoeiro commented Jun 7, 2024

terrytangyuan commented Jun 11, 2024

ThomasSoeiro commented Jan 24, 2025

`n.risk` at `time == 0` is not correct when `nlevels(strata) > 1` in `fortify.survfit(*, surv.connect = TRUE)` #229

`n.risk` at `time == 0` is not correct when `nlevels(strata) > 1` in `fortify.survfit(*, surv.connect = TRUE)` #229

ThomasSoeiro commented Mar 8, 2024 •

edited

Loading