WIP: Cumulative incidence #4

ararslan · 2017-10-09T06:39:20Z

Still to do:

Tests
Add to the documentation
Refactor to avoid near-total code duplication with Kaplan-Meier

I started on this at the same time I started the package but never bothered to push it to GitHub. Maybe having it up here will remind me to continue.

nalimilan

Nice to have! Just a few comments based on a quick look.

nalimilan · 2017-10-09T13:01:44Z

src/cuminc.jl

+    fit(CumulativeIncidence, times, events, competing)
+
+Given a vector of times to events, a vector of indicators for the event of interest,
+and a vector of indicators for the competing event, compute an estimate of the


nalimilan · 2017-10-09T13:05:24Z

src/cuminc.jl

+
+* `times`: Distinct event times
+* `nevents`: Number of observed events of interest at each time
+* `ncensor`: Number of right censorings for the event of interest at each time


Maybe call that ncensored? That's only two characters more.

Below, could also be ncompeting, though it's a bit longer so I'm not sure.

nalimilan · 2017-10-09T13:07:37Z

src/cuminc.jl

+In the presence of competing risks, `1-KM`, where `KM` is the Kaplan-Meier estimate,
+is uninterpretable and is a biased estimate of the failure probability. The
+cumulative incidence estimator of Kalbfleisch and Prentice (1980) is a function
+of the hazards of both the event of interest and the competing event, and provides


"events"? Same below.

nalimilan · 2017-10-09T13:13:40Z

src/cuminc.jl

+                       compete::AbstractVector{<:Integer}) where {T<:Real}
+    nobs = length(times)
+    if !(nobs == length(status) == length(compete))
+        throw(DimensionMismatch("the input vectors must have the same length"))


Would be nice to print the sizes which were actually passed.
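
A minimal sketch of one way to do that: interpolate the three lengths into the error message. The helper name `check_lengths` is hypothetical and not part of the PR; `times`, `status`, and `compete` are the argument names from the hunk above.

```julia
# Hypothetical helper (not in the PR): check that the inputs agree in length
# and report the lengths that were actually passed when they do not.
function check_lengths(times::AbstractVector, status::AbstractVector,
                       compete::AbstractVector)
    nt, ns, nc = length(times), length(status), length(compete)
    if !(nt == ns == nc)
        throw(DimensionMismatch(
            "the input vectors must have the same length; got " *
            "length(times) = $nt, length(status) = $ns, length(compete) = $nc"))
    end
    return nt  # the common number of observations
end
```

The returned value could be used directly as `nobs` in the method body.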

nalimilan · 2017-10-09T13:17:19Z

src/cuminc.jl

+    end
+    p = sortperm(times)
+    t = times[p]
+    s = BitVector(status[p])


I didn't know that constructor even existed. Would be more efficient and clearer as broadcast(i -> status[i] != 0, p).
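
For illustration, a small sketch of the two forms side by side on made-up data. Both give the same sorted indicator here; the broadcast version skips the intermediate `status[p]` copy. As far as I know, the two differ for values other than 0 and 1: `!= 0` treats any nonzero value as an event, while the constructor rejects values that cannot be converted to `Bool`.

```julia
# Made-up example data; `times` and `status` mirror the argument names in the hunk.
times  = [5.0, 2.0, 9.0, 2.5]
status = [1, 0, 1, 1]

p = sortperm(times)                          # permutation that sorts the times

s_ctor  = BitVector(status[p])               # copies status[p], then converts to bits
s_bcast = broadcast(i -> status[i] != 0, p)  # builds the indicator directly

@assert s_ctor == s_bcast
```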

nalimilan · 2017-10-09T13:18:20Z

src/cuminc.jl

+end
+
+function StatsBase.fit(::Type{CumulativeIncidence},
+                       times::AbstractVector{T},


Any plans for a generalized CompetingEventTime which would hold both the time and type of (non-)event?

Hm, hadn't thought of that, that's an interesting idea. We may want to consider adding that.

piever · 2017-10-14T12:24:46Z

Nice! I think it's very important to also refactor the code, because there is one more measure, the cumulative hazard, that is computed in more or less the same way. It really is the same computation as Kaplan-Meier, but instead of:

km *= 1 - d_i/n_i

you have:

cum_haz += d_i/n_i

It's important to have the cumulative hazard because it generalizes easily to the cumulative baseline hazard of a Cox model, from which the other estimates can then be recovered (for example, the survival function is the exponential of minus the cumulative hazard).
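
To make the shared structure concrete, here is a rough sketch that computes both quantities in a single pass. It assumes the per-time counts are already available as vectors `d` (events at each distinct time) and `n` (number at risk at each distinct time); the function name `km_and_cumhaz` is hypothetical and does not come from the package. The additive update is what is usually called the Nelson-Aalen estimator.

```julia
# Hypothetical helper: given, for each distinct event time, the number of
# events d[i] and the number at risk n[i], compute both estimators in one pass.
function km_and_cumhaz(d::AbstractVector{<:Integer}, n::AbstractVector{<:Integer})
    length(d) == length(n) ||
        throw(DimensionMismatch("d and n must have the same length"))
    km = 1.0        # running Kaplan-Meier survival estimate
    cum_haz = 0.0   # running cumulative hazard estimate
    surv = Vector{Float64}(undef, length(d))
    haz  = Vector{Float64}(undef, length(d))
    for i in eachindex(d, n)
        km *= 1 - d[i] / n[i]     # multiplicative update (Kaplan-Meier)
        cum_haz += d[i] / n[i]    # additive update (cumulative hazard)
        surv[i] = km
        haz[i] = cum_haz
    end
    return surv, haz
end

surv, haz = km_and_cumhaz([1, 2, 1], [10, 8, 4])
```

Only the update rule differs between the two, which is the argument for factoring the loop out. Note that `exp.(-haz)` gives a survival estimate that is close to `surv` but not identical to it.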

tbeason · 2020-07-25T18:45:00Z

I've picked this up and am committed to getting a refreshed PR completed soon, using this PR as a starting point.

Question: How should ties in times be handled? I know there are lots of different ways, but perhaps random jittering might be the easiest? In my application this would be fine because I have millions of observations, so how I break ties won't matter much I think. This was not addressed in the original PR.

Plans:

I only have one competing risk, so I will only be coding it up for that case (like this existing PR).
I will make a general CompetingEventTime that handles times and status.
I will make a CumulativeIncidence type with estimator_update function to be consistent with current implementation of KaplanMeier, and define _estimator(::Type{CumulativeIncidence},tte::AbstractVector{CompetingEventTime{T}}) where {T}

Sound ok?
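
For readers following along, a rough sketch of what such a type could look like. This is an assumption for illustration only, not the design that ended up in the follow-up PR: the field names, the integer status coding, and the helper functions are all hypothetical.

```julia
# Hypothetical sketch, not the type from the follow-up PR: an observed time
# together with a code for what happened at that time.
struct CompetingEventTime{T<:Real}
    time::T
    status::Int   # assumed coding: 0 = censored, 1 = event of interest, 2 = competing event
end

# Hypothetical accessors for the assumed status coding.
iscensored(x::CompetingEventTime)  = x.status == 0
isevent(x::CompetingEventTime)     = x.status == 1
iscompeting(x::CompetingEventTime) = x.status == 2

# Order observations by time so that `sort` works on a vector of them.
Base.isless(a::CompetingEventTime, b::CompetingEventTime) = isless(a.time, b.time)

tte = [CompetingEventTime(3.0, 1), CompetingEventTime(1.5, 2), CompetingEventTime(2.0, 0)]
sorted = sort(tte)
```

Bundling time and status in one element type means an estimator method can dispatch on `AbstractVector{CompetingEventTime{T}}` instead of taking several parallel vectors.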

ararslan · 2020-07-27T16:38:56Z

> How should ties in times be handled?

My approach here was to use the cumulative incidence estimator described in the linked paper from Kalbfleisch and Prentice. I can't recall exactly how it handles ties (it's been years since I've done any survival analysis), but it should be described there. If not, there's likely a paper from John Crowley that describes an appropriate method. Sorry I don't have better information at the moment. I'll try to dig back through some papers saved on an old computer to see whether I have anything helpful soon.

tbeason · 2020-07-27T18:47:24Z

So I guess it turns out that ties do not need to be handled for the basic CIF computation; you just need to aggregate across them, as you had done. Only when you move to modelling the CIF with covariates and computing likelihoods do you need to start worrying about how ties are broken. That is my understanding now.
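
A minimal sketch of that aggregation, assuming a single competing risk and a status vector coded 0 = censored, 1 = event of interest, 2 = competing event. The function name `cuminc_sketch` and the coding are assumptions for illustration; this is not the PR's implementation. Observations with the same time are simply counted together at that time point.

```julia
# Hypothetical helper, not the PR's implementation. `status` is assumed to be
# coded 0 = censored, 1 = event of interest, 2 = competing event.
function cuminc_sketch(times::AbstractVector{<:Real}, status::AbstractVector{<:Integer})
    length(times) == length(status) ||
        throw(DimensionMismatch("times and status must have the same length"))
    ts = sort(unique(times))   # tied observations collapse to one time point
    cif = zeros(Float64, length(ts))
    surv = 1.0                 # all-cause Kaplan-Meier estimate just before the current time
    total = 0.0                # running cumulative incidence of the event of interest
    for (j, t) in enumerate(ts)
        natrisk = count(x -> x >= t, times)
        d1 = count(i -> times[i] == t && status[i] == 1, eachindex(times))
        d2 = count(i -> times[i] == t && status[i] == 2, eachindex(times))
        total += surv * d1 / natrisk        # increment uses survival before this time
        surv *= 1 - (d1 + d2) / natrisk     # then update all-cause survival
        cif[j] = total
    end
    return ts, cif
end

# Made-up data with a tie at time 2 (one event of interest, one competing event).
ts, cif = cuminc_sketch([1, 2, 2, 3, 4], [1, 1, 2, 0, 1])
```

The counting here rescans the data at every distinct time, so it is quadratic in the worst case; it is written for clarity, not for millions of observations.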

EDIT to add: None of my comparisons against SAS/Stata had any ties in the data, so a further check would be to make sure we report the same as they do in the presence of ties.

ararslan · 2020-07-27T19:44:14Z

The company I worked for at the time I started working on this (before it even lived on GitHub) used SAS exclusively, but we had to implement cumulative incidence ourselves because at the time (SAS 9.3 or 9.4) an appropriate estimator was not implemented in SAS. If I recall correctly, it was a fairly straightforward computation in SAS using the KM-estimated survivor function output from proc lifetest.

WIP: Cumulative incidence

c041505

nalimilan reviewed Oct 9, 2017

tbeason mentioned this pull request Jul 27, 2020

CIF and CompetingEventTimes #24

Open
