# Two-arm study with a binary primary outcome for assessing threshold stopping
```{r}
#| echo: false
#| warning: false
library(ggplot2)
library(tidyverse)
library(splitstackshape)
library(extraDistr)
```
This vignette was motivated by a study of CTU Bern (1938). It is an example of a Bayesian phase-II trial design with interim threshold stopping using predictive probabilities (@Lee2008, @Sambucini2021). The primary endpoint is a binary endpoint, say, treatment discontinuation (TD).
The trial is designed with early stopping decision rules based on the predictive probability that the intervention is non-safe, that is, a high probability of an excess TD proportion given the interim data. Using a simulation approach, we quantify the operating characteristics of the trial: the probability of stopping because of non-safety (stopping the trial before the maximal sample size is reached), the probability of treatment discontinuation, and the expected number of patients in the experimental arm.
## Methods
### Sample size
The sample size was fixed at $N_{max}=36$ patients (24 patients in the experimental arm, 12 patients in the control arm) for feasibility reasons.
### Study design
The following figure summarises the trial design. Suppose we randomise patients with a 2:1 allocation ratio to the experimental arm and the control arm. Let $p_i$, $i=0,1$, be the TD proportion in the control arm ($i=0$) and the experimental arm ($i=1$), drawn from a design prior $\pi^d_i = Beta(a^d_i,b^d_i)$. The design prior represents the uncertainty about $p_i$ at the design stage and might differ from the analysis prior. One interim analysis will be performed after the recruitment of $N^{I}=n_{0}+n_{1}=18$ patients, which corresponds to 50% of the fixed sample size ($n_{0}$ denotes the number of patients in the control arm and $n_{1}$ the number of patients in the experimental arm; the index $I$ stands for 'interim').
```{mermaid}
flowchart TD
A[2:1 randomisation] --> B[Experimental group with TD proportion \np1 coming from design prior]
A --> C[Control group with TD proportion \np0 coming from design prior]
B ~~~ D[Interim analysis\n\nStop early:\nIf experimental group has a high predictive probability of excessive discontinuation\notherwise randomise new patients until N_max is reached]
C ~~~ D
style A fill:white,color:black
style B fill:white,color:black
style C fill:white,color:black
style D fill:white,color:black,stroke-dasharray: 5 5
```
### Stopping criterion
The threshold for excessive discontinuation, above which the trial is stopped, was set to $p_{max}=0.2$. This number was based on the design prior choice (see subsection 'Design prior choice'). In brief, based on a discussion with clinical experts and the available evidence, the assumed mean TD proportion was set to $0.1$. Doubling this number gave $p_{max}=0.2$.
### Statistical methods
Suppose that at the interim analysis $r_{1}$ discontinuation events are observed **in the experimental arm**. If $N_{max}=n_{max, 0}+n_{max, 1}$ is the planned sample size of the trial, then a further $m_{1}=n_{max, 1}-n_{1}$ patients will be recruited in the experimental arm, with possible future events $S_1\in\{0,\cdots, m_1\}$. Note that at the interim analysis the posterior distribution is a Beta distribution, $p_i|r_i,n_i \sim Beta(a_i+r_i, b_i+n_i-r_i)$, and $S_1|r_1,n_1$ follows a Beta-binomial distribution, $S_1|r_1,n_1 \sim Betabinomial(m_1, a_1+r_1, b_1+n_1-r_1)$.
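The conjugate update and the predictive distribution can be checked numerically. The following is a minimal sketch in base R, assuming the skeptical analysis prior ($a_1=2.4$, $b_1=9.6$) and the illustrative interim data $r_1=2$, $n_1=12$ used in this vignette. The helper `dbetabinom_base` and the variable names are hypothetical and only serve this illustration.

```{r}
#| echo: true
# Beta-binomial pmf written with base R only (hypothetical helper name)
dbetabinom_base <- function(s, m, a, b) {
  choose(m, s) * beta(a + s, b + m - s) / beta(a, b)
}

a_prior <- 2.4; b_prior <- 9.6   # skeptical analysis prior
r_int <- 2; n_int <- 12          # interim events and patients (experimental arm)
m_future <- 24 - n_int           # patients still to be recruited

# Posterior Beta parameters after the interim analysis
a_post <- a_prior + r_int
b_post <- b_prior + n_int - r_int

# Predictive distribution of the future events S_1 = 0, ..., m_1
pred <- dbetabinom_base(0:m_future, m_future, a_post, b_post)
sum(pred)                        # the pmf sums to 1
sum(0:m_future * pred)           # predictive mean, m_1 * a_post / (a_post + b_post)
```

The predictive mean equals the number of future patients times the posterior mean, which is a quick sanity check on the parameterisation.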
The trial is stopped early if the predictive probability $PP$ exceeds $\theta_{S}$, where, for an upper discontinuation proportion threshold $p_{max}$,
$$
\begin{aligned}
PP&=E\left[I\left\{ P\left(p_1>p_{max}|r_{1}, n_{1}, S_{1}\right)>\theta_{T_{Non-safe}}\right\}|r_{1}, n_{1}\right] \\
&=\sum_{s_1=0}^{m_1} I(P(p_1 > p_{max}|r_{1},n_1,s_1)>\theta_{T_{Non-safe}}) P(S_{1}=s_1|r_{1}, n_1).
\end{aligned}
$$
That is, we stop the trial early if the predictive probability that the discontinuation proportion in the experimental arm exceeds the upper threshold $p_{max}$, given the interim data and future events, is larger than $\theta_S$; otherwise the remaining 18 patients are randomised until the fixed sample size is reached.
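The sum defining $PP$ can be written as a short function. This is a sketch under stated assumptions rather than the vignette's own implementation: it uses base R only, the helper name `pred_prob_nonsafe` is hypothetical, and the parameter values are the skeptical-prior settings used in this vignette.

```{r}
#| echo: true
# Predictive probability of non-safety (hypothetical helper, base R only)
pred_prob_nonsafe <- function(r, n, n_max, a, b, p_max, theta_T) {
  m <- n_max - n                       # future patients in the experimental arm
  s <- 0:m                             # possible numbers of future events
  # P(S_1 = s | r, n): beta-binomial predictive probabilities
  w <- choose(m, s) * beta(a + r + s, b + n_max - r - s) / beta(a + r, b + n - r)
  # P(p_1 > p_max | r, n, s): posterior tail probability at the end of the trial
  post_tail <- 1 - pbeta(p_max, a + r + s, b + n_max - r - s)
  sum(w * (post_tail > theta_T))
}

# Interim analysis with n_1 = 12 of 24 patients, skeptical prior, r_1 = 0, ..., 12
pp_sketch <- sapply(0:12, pred_prob_nonsafe, n = 12, n_max = 24,
                    a = 2.4, b = 9.6, p_max = 0.2, theta_T = 0.6)
round(pp_sketch, 3)
which(pp_sketch > 0.8) - 1   # values of r_1 at which the trial would stop (theta_S = 0.8)
```

Because $PP$ is non-decreasing in $r_1$, the stopping rule can be summarised by a single boundary, the smallest $r_1$ with $PP>\theta_S$.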
Stopping boundaries are calculated for each combination of $r_1$ and $n_1$ (and thus $m_1$) given the maximal sample size $N_{max}$ and the interim sample size $N^I$. The tibble below shows the example of $r_1=2$ and $n_1=12$ for the skeptical prior. We set $\theta_{T_{Non-safe}}=0.6$.
```{r}
#| echo: true
#| warning: false
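p_max <- c(0.2)                       # upper TD proportion threshold
# Prior information (skeptical analysis prior, experimental arm)
prior_type <- "Skeptical"
a_1 <- 2.4
b_1 <- 9.6
x <- seq(0,1,0.001)
n_initial <- 12                       # experimental-arm patients at the interim analysis
n_increase <- 12                      # patients added between analyses
n_max <- 24                           # maximal experimental-arm sample size
n <- seq(n_initial, n_max, n_increase)
r <- 0:n_max                          # possible numbers of observed events
#### Scenario
theta_T <- 0.6
# One row per analysis (n) and observed event count (r); column 3 holds the
# number of possible future outcomes, m + 1 = n_max - n + 1
data <- expand.grid(n=n, r=r) %>% arrange(n) %>%
  filter(n>=r) %>% mutate(i=n_max-n+1)
# Replicate each row m + 1 times (count=3 refers to column 3)
data <- expandRows(data, count=3)
# Index the future events i = 0, ..., m within each (n, r) combination
data <- data %>% group_by(n, r) %>%
  mutate(m=n_max-n, ind=1, i=cumsum(ind)-1, ind=NULL)
# PS1: predictive probability of i future events (beta-binomial)
# Ti:  posterior probability that p_1 > p_max after n_max patients
# ind: indicator that Ti exceeds theta_T
data <- data %>%
  mutate(PS1=dbbinom(i, size=m, alpha=a_1+r, beta=b_1+n-r),
         Ti=1-pbeta(p_max, a_1+r+i, b_1+n_max-r-i),
         ind=ifelse(Ti>theta_T, 1, 0), n_max)
data <- data %>% select(n_max, n, r, m, i, Ti, PS1, ind)
data %>% filter(n==12, r==2)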
```
The predictive probabilities for each $r_1$ are (with $\theta_{S}=0.8$):
```{r}
#| echo: true
#| warning: false
## Early stopping using predictive distribution
threshold_safety <- 0.8
data_pp <- data %>% group_by(n_max, n, r) %>% summarise(pp=round(sum(PS1*ind),6),
stop_pp=ifelse(pp>threshold_safety, 1, 0))
data_pp %>% filter(n==12)
```
The same calculation applies to the neutral prior, and we obtain the following stopping boundaries:
```{r}
#| echo: true
#| warning: false
stop_boundaries0 <- data_pp %>% filter(stop_pp==1) %>%
group_by(n_max, n) %>% summarise(r=min(r), stop=NULL)
stop_boundaries0$p_max <- p_max
stop_boundaries0$theta_T <- theta_T
stop_boundaries0$threshold_safety <- threshold_safety
stop_boundaries0$prior_type <- prior_type
stop_boundaries <- stop_boundaries0
# Prior information
prior_type <- "Neutral"
a_1 <- 0.6
b_1 <- 5.4
x <- seq(0,1,0.001)
n_initial <- 12
r <- 0:n_initial
n_increase <- 12
n_max <- 24
n <- seq(n_initial, n_max, n_increase)
r <- 0:n_max
m <- n_max-r
#### Scenario
theta_T <- 0.6
threshold_safety <- 0.8
data <- expand.grid(n=n, r=r) %>% arrange(n) %>% filter(n>=r) %>% mutate(i=n_max-n+1)
data <- expandRows(data, count=3)
data <- data %>% group_by(n, r) %>% mutate(m=n_max-n, ind=1, i=cumsum(ind)-1, ind=NULL)
data <- data %>% mutate(cond_postprob=dbbinom(i, size=m, alpha=a_1+r, beta=b_1+n-r),
Ti=1-pbeta(p_max, a_1+r+i, b_1+n_max-r-i),
ind=ifelse(Ti>theta_T, 1, 0), n_max)
data <- data %>% select(n_max, n, r, m, i, cond_postprob, Ti, ind)
## Early stopping using predictive distribution
data_pp <- data %>% group_by(n_max, n, r) %>% summarise(pp=round(sum(cond_postprob*ind),6),
stop_pp=ifelse(pp>threshold_safety, 1, 0))
stop_boundaries0 <- data_pp %>% filter(stop_pp==1) %>%
group_by(n_max, n) %>% summarise(r=min(r), stop=NULL)
stop_boundaries0$p_max <- p_max
stop_boundaries0$theta_T <- theta_T
stop_boundaries0$threshold_safety <- threshold_safety
stop_boundaries0$prior_type <- prior_type
stop_boundaries <- bind_rows(stop_boundaries, stop_boundaries0)
stop_boundaries %>% filter(n==12)
```
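At the final analysis no future patients remain ($m_1=0$), so the predictive probability reduces to an indicator of the posterior check $P(p_1>p_{max}|r_1,n_{max,1})>\theta_{T_{Non-safe}}$. The boundary at the end of the trial is then the smallest number of events that passes this check. A minimal sketch in base R, with a hypothetical helper name `final_boundary`:

```{r}
#| echo: true
# Smallest number of events r at the final analysis (n = n_max) for which
# P(p_1 > p_max | r, n_max) exceeds theta_T (hypothetical helper, base R only)
final_boundary <- function(n_max, a, b, p_max, theta_T) {
  r <- 0:n_max
  post_tail <- 1 - pbeta(p_max, a + r, b + n_max - r)
  min(r[post_tail > theta_T])
}

bound_skeptical <- final_boundary(24, a = 2.4, b = 9.6, p_max = 0.2, theta_T = 0.6)
bound_neutral   <- final_boundary(24, a = 0.6, b = 5.4, p_max = 0.2, theta_T = 0.6)
c(skeptical = bound_skeptical, neutral = bound_neutral)
```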
#### Design prior choice
The design prior for the primary endpoint was chosen based on clinical expert knowledge and available evidence. We assumed that the "true" TD proportion comes from a $Beta(1.2,10.8)$ design prior, which is centered at a TD proportion of 10% with a 90% uncertainty range from 0.9% to 26.6%. The probability that the discontinuation proportion is larger than 20% is 12.2%. The design prior has a weight of 12 patients, that is, half of the planned patients in the experimental arm.
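The summaries quoted for the design prior can be reproduced with the base R Beta functions. The sketch below assumes the $Beta(1.2,10.8)$ parameters stated above; the variable names are illustrative.

```{r}
#| echo: true
a_design <- 1.2
b_design <- 10.8
prior_mean   <- a_design / (a_design + b_design)          # prior mean
prior_weight <- a_design + b_design                       # prior weight in patients
prior_range  <- qbeta(c(0.05, 0.95), a_design, b_design)  # 90% uncertainty range
prior_excess <- 1 - pbeta(0.20, a_design, b_design)       # P(TD proportion > 20%)
list(mean = prior_mean, weight = prior_weight,
     range_90 = prior_range, prob_above_20 = prior_excess)
```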
```{r}
#| echo: false
#| warning: false
# Prior information control
a_0 <- 1.2
b_0 <- 10.8
x <- seq(0.001,0.999,0.001)
# qbeta(0.05, a_0, b_0)
# qbeta(0.95, a_0, b_0)
# pbeta(0.03, a_0, b_0)
# 1-pbeta(0.15, a_0, b_0)
# 1-pbeta(0.20, a_0, b_0)
data_prior <- data.frame(x, y=dbeta(x, a_0, b_0), type=1)
# Prior information experimental
# a_0 <- 5/4
# b_0 <- 45/4
data_prior$type <- factor(data_prior$type, levels=1, labels=c("a=1.2, b=10.8 (both arms)"))
fig <- ggplot(data_prior, aes(x=x, y=y, linetype=type, group=type))+
geom_line()+theme_bw()+scale_x_continuous(breaks=seq(0,1,0.1))+
theme(panel.grid = element_blank(), legend.position = "bottom")+
scale_linetype("Prior parameters")+
geom_vline(xintercept=0.1, colour="red", linetype=2)+
ylab("Density")+xlab("")+ggtitle("Beta design prior")+
labs(caption=c("Dashed red vertical line indicates the mean of the design prior."))
fig
```
#### Analysis prior choice
We considered two scenarios for the analysis prior:
- A skeptical Beta(2.4, 9.6) analysis prior for the experimental arm, which is centered on a TD proportion of 20% with a probability of 5.6% of exceeding 40%.
- A neutral Beta(0.6, 5.4) analysis prior for the experimental arm, which is centered on a discontinuation proportion of 10%.
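The two analysis priors can be compared side by side. This is a small sketch with a hypothetical helper `beta_prior_summary`; the 40% cutoff is the one quoted for the skeptical prior and is applied to the neutral prior only for comparison.

```{r}
#| echo: true
# Summarise a Beta(a, b) prior: mean, weight in patients, tail probability
# above a cutoff (hypothetical helper, base R only)
beta_prior_summary <- function(a, b, cutoff) {
  c(mean = a / (a + b), weight = a + b,
    prob_above_cutoff = 1 - pbeta(cutoff, a, b))
}
summary_skeptical <- beta_prior_summary(2.4, 9.6, cutoff = 0.4)
summary_neutral   <- beta_prior_summary(0.6, 5.4, cutoff = 0.4)
rbind(skeptical = summary_skeptical, neutral = summary_neutral)
```

The skeptical prior carries the weight of 12 patients and the neutral prior the weight of 6 patients, so the neutral prior is overridden by the data more quickly.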
```{r}
#| echo: false
#| warning: false
# Skeptical prior
a_0 <- 2.4
b_0 <- 9.6
x <- seq(0.001,0.999,0.001)
data_prior <- data.frame(x, y=dbeta(x, a_0, b_0), type=1)
# Neutral prior
a_0 <- 0.6
b_0 <- 5.4
data_prior <- bind_rows(data_prior, data.frame(x, y=dbeta(x, a_0, b_0), type=2))
data_prior$type <- factor(data_prior$type, levels=1:2,
labels=c("a=2.4, b=9.6 (Skeptical prior)", "a=0.6, b=5.4 (Neutral prior)"))
fig <- ggplot(data_prior %>% filter(y<5), aes(x=x, y=y, linetype=type, group=type))+
geom_line()+theme_bw()+scale_x_continuous(breaks=seq(0,1,0.1))+
theme(panel.grid = element_blank(), legend.position = "bottom", legend.direction = "vertical")+
scale_linetype("Prior parameters")+geom_vline(xintercept=0.2, colour="red", linetype=1)+
geom_vline(xintercept=0.1, colour="red", linetype=2)+
ylab("Density")+xlab("")+ggtitle("Beta analysis prior")+
labs(caption=c("Red vertical lines indicate mean values of priors."))
fig
```
### Simulation study
We set up a simulation study to quantify the following operating characteristics:
- Probability of stopping because of non-safety (stopping the trial before $N_{max}$ is reached),
- Probability of treatment discontinuation (declaring treatment discontinuation including $N_{max}$),
- Expected number of patients in the experimental arm,

with the following parameters:
- Beta design priors: Both arms: a=1.5, b=13.5. True $p_i$ are drawn from the design priors,
- Beta analysis priors: Both arms: Skeptical: a=2.4, b=9.6. Neutral: a=0.6, b=5.4,
- Maximal discontinuation proportion threshold $p_{max}$: 0.2,
- Number of randomised patients in the experimental arm at first interim analysis $n_1$: 12,
- Number of randomised patients in the control arm at first interim analysis $n_0$: 6,
- Excessive discontinuation probability for posterior distribution $\theta_{T}$: 0.6,
- Safety threshold for predictive distribution $\theta_{S}$: 0.8,
- Number of simulation runs: 10'000,
- Maximal sample size: 12 patients in the control arm and 24 patients in the experimental arm ($N_{max}=36$),
- Increase of sample size $n_{increase}$: 6 patients in the control arm and 12 patients in the experimental arm.
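The simulation described by the parameters above can be sketched as follows. This is not the code that produced the reported results; it is a minimal base R illustration under several assumptions: only the experimental arm is simulated (the stopping rule depends on that arm alone), the skeptical analysis prior is used, a trial stopped at the interim analysis is counted as declaring excessive discontinuation, and the seed and all variable names are arbitrary.

```{r}
#| echo: true
set.seed(2024)                      # arbitrary seed for reproducibility
n_sim <- 10000
a_d <- 1.5; b_d <- 13.5             # design prior as listed above
a_an <- 2.4; b_an <- 9.6            # skeptical analysis prior
p_max_sim <- 0.2; theta_T_sim <- 0.6; theta_S_sim <- 0.8
n_int_sim <- 12; n_end_sim <- 24    # experimental arm: interim and maximal size

# Predictive probability of non-safety at the interim analysis for r events
pp_interim <- function(r) {
  m <- n_end_sim - n_int_sim
  s <- 0:m
  w <- choose(m, s) * beta(a_an + r + s, b_an + n_end_sim - r - s) /
    beta(a_an + r, b_an + n_int_sim - r)
  post_tail <- 1 - pbeta(p_max_sim, a_an + r + s, b_an + n_end_sim - r - s)
  sum(w * (post_tail > theta_T_sim))
}
stop_interim <- sapply(0:n_int_sim, pp_interim) > theta_S_sim  # indexed by r + 1

p_true  <- rbeta(n_sim, a_d, b_d)                   # true TD proportions
r_int_s <- rbinom(n_sim, n_int_sim, p_true)         # events at the interim analysis
stopped <- stop_interim[r_int_s + 1]                # early stopping decision
r_end_s <- r_int_s + rbinom(n_sim, n_end_sim - n_int_sim, p_true)  # events at N_max
excess_end <- (1 - pbeta(p_max_sim, a_an + r_end_s, b_an + n_end_sim - r_end_s)) > theta_T_sim
declared <- stopped | excess_end                    # assumption: early stop counts as declared

oc_sim <- c(prob_early_stop = mean(stopped),
            prob_declared   = mean(declared),
            expected_n_exp  = mean(ifelse(stopped, n_int_sim, n_end_sim)))
oc_sim
```

With a single interim analysis, the expected number of patients in the experimental arm is the maximal arm size minus the interim increment times the early stopping probability, which gives a simple consistency check between the simulated quantities.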
## Results
The results from our simulation study are:
| Operating characteristics                                   | $N_{max}=36$   |
|-------------------------------------------------------------|----------------|
| Probability of intolerable trial (skeptical prior)          | 15.8%          |
| Probability of intolerable trial (neutral prior)            | 8.2%           |
| Probability of early stopping because of non-safety         | 7.7%           |
| Expected number of patients in experimental arm             | 23.1           |

| Stopping boundaries                                         | $N_{max}=36$   |
|-------------------------------------------------------------|----------------|
| Stopping boundary interim analysis (skeptical prior)        | 4+             |
| Intolerable trial at trial end (skeptical prior)            | 6+             |
| Intolerable trial at trial end (neutral prior)              | 7+             |
## Author {-}
André Moser, Senior Statistician\
Department of Clinical Research\
University of Bern\
3012 Bern, Switzerland