Commit

Add better example, add citation for random truncation
AshesITR committed Oct 11, 2023
1 parent 07fa858 commit 835da5f
Showing 2 changed files with 23 additions and 11 deletions.
15 changes: 13 additions & 2 deletions jss-paper/bibliography.bib
@@ -173,8 +173,7 @@ @Article{nloptr-lbfgs
journal = {Mathematical Programming}
}

@Book{
GroeneboomWellner1992,
@Book{GroeneboomWellner1992,
author = {Piet Groeneboom and Jon A. Wellner},
title = {Information Bounds and Nonparametric Maximum Likelihood Estimation},
doi = {10.1007/978-3-0348-8621-5},
@@ -185,6 +184,18 @@ @Book{
isbn = {978-3-0348-8621-5}
}

@Book{DoerreEmura2019,
author = {Achim D\"orre and Takeshi Emura},
title = {Analysis of Doubly Truncated Data},
subtitle = {An Introduction},
doi = {10.1007/978-981-13-6241-5},
year = {2019},
publisher = {Springer Singapore},
series = {SpringerBriefs in Statistics},
issn = {2191-544X},
isbn = {978-981-13-6241-5}
}

@Article{ll2012,
title = {Modeling Dependent Risks with Multivariate Erlang Mixtures},
journal = {ASTIN Bulletin},
19 changes: 10 additions & 9 deletions jss-paper/reservr.Rmd
@@ -58,7 +58,7 @@ Its coordinates have the following interpretation: the last two coordinates, whi
The first two coordinates, which satisfy $L \le M \le V \le U$ and which may be $\mp \infty$, encode censoring: the observation $(M,V)=(m,v)$ means that the target variable $Y$ satisfies $Y \in (m,v]$ if $m<v$ and $Y=m$ if $m=v$; the latter corresponds to an uncensored observation of $Y$.

Suppose we observe an independent sample $\mathcal J = \{(m_i, v_i, l_i, u_i): i=1, \ldots, n\}$ of $(M,V,L,U)$.
Suggested by standard maximum (conditional) likelihood approaches for truncated \citep{???} and censored observations \citep{Zhang2005}, we suggest to estimate $\theta$ by maximizing the objective function
Suggested by standard maximum (conditional) likelihood approaches for truncated \citep{DoerreEmura2019} and censored observations \citep{Zhang2005}, we suggest to estimate $\theta$ by maximizing the objective function
<!-- -->
\begin{align}
\ell(\theta) = \sum_{(m, v, l, u) \in \mathcal J} \Big\{ \log f_\theta(m) \mathbf{1}(m=v) + \log F_\theta((m,v]) \mathbf{1}(m<v) \Big\} - \log F_\theta((l,u]), \label{eq:cml-likelihood-nw}
@@ -73,15 +73,16 @@ Denoting the resulting sample by $\mathfrak{I} = \{(m_i, v_i, l_i, u_i, w_i)\}$,
\ell(\theta) = \sum_{(m, v, l, u, w) \in \mathfrak{I}} w \cdot \Big[ \Big\{ \log f_\theta(m) \mathbf{1}(m=v) + \log F_\theta((m, v]) \mathbf{1}(m<v) \Big\} - \log F_\theta((l, u]) \Big]. \label{eq:cml-likelihood}
\end{align}
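
The weighted objective above can be sketched numerically. The following is a minimal illustration only, assuming an exponential family $F_\theta$ with rate parameter $\theta$; the function names (`exp_cdf`, `interval_prob`, `weighted_cml_loglik`) and the sample values are hypothetical and are not part of the package described in this paper.

```python
import math


def exp_cdf(x, rate):
    """CDF of an exponential distribution (assumed family, for illustration only)."""
    if x <= 0:
        return 0.0
    if math.isinf(x):
        return 1.0
    return 1.0 - math.exp(-rate * x)


def exp_logpdf(x, rate):
    """Log-density of the exponential distribution at x > 0."""
    return math.log(rate) - rate * x


def interval_prob(a, b, rate):
    """F_theta((a, b]) = F_theta(b) - F_theta(a)."""
    return exp_cdf(b, rate) - exp_cdf(a, rate)


def weighted_cml_loglik(sample, rate):
    """Weighted conditional log-likelihood for tuples (m, v, l, u, w).

    m == v: exact observation, contributes the log-density at m.
    m <  v: interval-censored observation, contributes log F((m, v]).
    Each term is corrected by -log F((l, u]) for truncation and scaled by w.
    """
    total = 0.0
    for m, v, l, u, w in sample:
        if m == v:
            contrib = exp_logpdf(m, rate)
        else:
            contrib = math.log(interval_prob(m, v, rate))
        total += w * (contrib - math.log(interval_prob(l, u, rate)))
    return total


# Hypothetical sample: one exact observation truncated to (0, 5],
# and one interval-censored, untruncated observation with weight 2.
sample = [
    (1.0, 1.0, 0.0, 5.0, 1.0),
    (2.0, 4.0, 0.0, math.inf, 2.0),
]
print(weighted_cml_loglik(sample, rate=0.5))
```

Maximizing this function over `rate` with any one-dimensional optimizer would give the estimate of $\theta$ under the assumed family.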

A practical example of random truncation and interval censorship arises in the presence of inaccurate accident date specifications when reporting claims in general insurance.

The target variable $Y$ is the reporting delayof an accident happening at accident time $T_0$, which is hence reported to the insurer at calendar time $Y+T_0$.
A practical example of random truncation arises when modelling the reporting delay of claims in general insurance.
The target variable $Y$ is the reporting delay of an accident happening at accident time $T_0$, which is hence reported to the insurer at calendar time $Y+T_0$.
The truncation bounds $(L, U)$ for $Y$ will be equal to $(0, \tau - T_0)$ with $\tau$ the current calendar time.
Censoring bounds could, for example, be $(M, V) = (Y+T_0 - \lceil T_0 \rceil, Y + T_0 - \lfloor T_0 \rfloor)$ representing an inaccurate observation of $T_0$ at reporting time $Y+T_0$.
Technically, $U$ is not (fully) observable in this situation, but can be approximated by $\tau - \lfloor T_0 \rfloor$.
It should be noted that the type of censoring described in this example is not non-informative because $(M, V)$ is not independent of $Y$.
\textcolor{red}{[AB: An example in which both independence assumptions are reasonable would be great.]}


Random truncation combined with interval censoring can occur when modelling failure times if the survival status is recorded only at two (or more) maintenance appointments some time after purchase, and only for items that were sold.
The target variable $Y$ is the failure time of an item.
Item condition (failed / functional) can be observed at maintenance times $M_0$ and $M_1$, which may vary for each item.
For each maintained item, the production time $P_0$ and the purchase time $P_1$ are also known.
Only items that are functional at purchase time $P_1$ are observed at the maintenance times.
This gives rise to truncation bounds $(L, U) = (P_1 - P_0, \infty)$ and censoring interval bounds $(M, V) \in \{ (P_1 - P_0, M_0 - P_0), (M_0 - P_0, M_1 - P_0), (M_1 - P_0, \infty) \}$, depending on the item condition at times $M_0$ and $M_1$.
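
As a hypothetical numeric illustration (the values and the exponential model are assumptions made here, not taken from the paper), consider an item with $P_0 = 0$, $P_1 = 2$, $M_0 = 5$ and $M_1 = 8$ that is functional at $M_0$ but has failed by $M_1$. Then $(L, U) = (2, \infty)$ and $(M, V) = (5, 8)$. If $Y$ is assumed to be exponentially distributed with rate $\lambda$, the contribution of this item to the objective function in Equation \ref{eq:cml-likelihood-nw} is
\begin{align*}
\log F_\lambda((5, 8]) - \log F_\lambda((2, \infty))
  &= \log\big(e^{-5\lambda} - e^{-8\lambda}\big) - \log\big(e^{-2\lambda}\big) \\
  &= \log\big(e^{-3\lambda} - e^{-6\lambda}\big),
\end{align*}
which is the log-probability of failing between 3 and 6 time units after purchase, as expected from the memoryless property of the exponential distribution.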

## Motivation of Equation \ref{eq:cml-likelihood-nw}. {short-title="Motivation of the conditional likelihood" #motivation-cml-likelihood}

