Finding the Value of a Mult-arm Policy Tree and Gain (HTE) of Each Node #175

shafayetShafee · 2025-01-09T14:40:48Z

Hello, GRF labs team, thank you for building and maintaining ML tools for causal inference. I have recently started using the packages - {grf}, {policytree} and {maq} for a multi-arm treatment data. However, I need guidance on,

How can I measure the benefit of a multi-arm policy tree, that is, what is the expected uplift if I allocate the treatments as suggested by the policy tree?
How can I calculate the HTE (or uplift) for each suggested leaf of the multi-arm policytree?

Consider the following reprex from the {policytree} docs,

library(grf)
library(DiagrammeR)
library(policytree)

set.seed(1344)

n <- 5000 # number of subjects
p <- 10 # number of pre-treatment covariates
data <- gen_data_mapl(n, p)
head(data.frame(data)[1:6])

  action          Y        X.1        X.2          X.3        X.4
1      1  2.0905685 0.46864412 0.22784132 0.6269076453 0.71743214
2      1 -0.3635110 0.42637207 0.74486955 0.2168520605 0.25096641
3      2  1.1952524 0.70800587 0.03636718 0.0007285422 0.01940693
4      2  1.8925130 0.05426782 0.97814957 0.5647130643 0.45693730
5      2 -0.4199647 0.95205213 0.53340813 0.3522940478 0.44700733
6      0  2.6126626 0.69570475 0.77983017 0.9636860283 0.72605917

X <- data$X
Y <- data$Y
W <- as.factor(data$action)

multi.forest <- multi_arm_causal_forest(X, Y, W, seed = 1344)

Gamma.matrix <- double_robust_scores(multi.forest)
head(Gamma.matrix)

             0          1          2
[1,]  2.640066  2.1543402  1.2385549
[2,]  1.365158 -2.1944712  1.4777855
[3,] -1.276021  0.6792919  0.8427064
[4,]  2.511125  1.8532989  4.4776106
[5,]  2.708066  2.1123636 -6.4939299
[6,]  9.127422  1.8865874  1.4614259

train <- sample(1:n, 3500)
opt.tree <- policy_tree(X[train, ], Gamma.matrix[train, ], depth = 2)
opt.tree

policy_tree object 
Tree depth:  2 
Actions:  1: 0 2: 1 3: 2 
Variable splits: 
(1) split_variable: X5  split_value: 0.600704 
  (2) split_variable: X7  split_value: 0.334204 
    (4) * action: 3 
    (5) * action: 1 
  (3) split_variable: X7  split_value: 0.784358 
    (6) * action: 2 
    (7) * action: 3

Get the policy for the held out data,

X.test <- X[-train, ]
pi_hat <- predict(opt.tree, X.test)
head(pi_hat)

[1] 3 1 2 2 2 2

Determining the value of the Policy Tree

Following the eqn. (11) from this paper, value of the policy, $\widehat V(\pi) = \frac{1}{n} \sum_{i=1}^{n}\langle \hat \pi(X_i), \widehat \Gamma_i \rangle$ where the policy $\pi(X_i)$ is a $K$-dimensional vector where the $k^{th}$ element is equal to 1 if arm $k$ is assigned, and 0 otherwise and $\widehat \Gamma_i$ is the estimated DR score (eqn. 13).

# creating the pi hat matrix
pi_mat <- matrix(
  data = 0, 
  nrow = length(pi_hat), 
  ncol = length(unique(pi_hat)) - 1
)

pi_mat[cbind(1:length(pi_hat), pi_hat - 1)] <- 1
colnames(pi_mat) <- c("action-1", "action-2")
head(pi_mat)

     action-1 action-2
[1,]        0        1
[2,]        0        0
[3,]        1        0
[4,]        1        0
[5,]        1        0
[6,]        1        0

# getting the DR scores
gamma_diff <- grf::get_scores(multi.forest)[, ,]
head(gamma_diff)

          1 - 0      2 - 0
[1,] -0.4857258 -1.4015111
[2,] -3.5596292  0.1126275
[3,]  1.9553132  2.1187276
[4,] -0.6578262  1.9664855
[5,] -0.5957024 -9.2019960
[6,] -7.2408342 -7.6659957

# calculating value
mean(pi_mat * gamma_diff[-train, ])

[1] 0.5233645

So can I say, “Expected benefit per subject of the above treatment allocation would be 0.52 units and total benefit would be 1000 * 0.52 = 520 units” ??
Am I on the right track to answering my first question?
Also, How can I reach the answer to my second question?

The text was updated successfully, but these errors were encountered:

erikcs · 2025-01-12T06:50:49Z

Hi @shafayetShafee, yes you are right, that is one way to calculate an estimate of a gain defined as the policy value over assigning everyone the control: $E[\pi_1(X_i)(Y_i(1) - Y_i(0)) + \pi_2(X_i)(Y_i(2) - Y(0))]$. But, note that you need to do sum(pi_mat * gamma_diff[-train, ]) / nrow(pi_mat) to do the 1/n averaging correctly, as the mean of a matrix does something else.

If you prefer the binary matrix representation of $\pi$ you could also construct it like this, which might be simpler:

pi.mat <- model.matrix(~ as.factor(pi_hat) - 1)
pi.mat[, -1] # arm 2 and 3

(That gain is also the same as sum(Gamma.matrix[-train, ] * pi.mat) / nrow(pi.mat) - mean(Gamma.matrix[-train, 1]))

For calculating leaf statistics, you can call predict(opt.tree, X.test, type = "node.id") which returns the integer value of the leaf the sample belongs to.

shafayetShafee · 2025-01-12T11:34:02Z

Thanks a lot @erikcs for responding.

To answer the second question, I have tried the following as per your suggestion and from the last code chunk of this vignette

Leaf Statistics

library(dplyr)

node.id <- predict(opt.tree, X.test, type = "node.id")

gamma_node_df <- data.frame(
  node_id = node.id,
  Gamma.matrix[-train, ],
  allocated_arm = pi_hat - 1
)

colnames(gamma_node_df) <- c("node.id", "dr_score_0", "dr_score_1", "dr_score_2", "allocated_arm")

dr_score_per_leaf <- gamma_node_df %>% 
  group_by(node.id, allocated_arm) %>% 
  summarise(
    across(
      .cols = everything(), 
      .fns = list(mean = mean)
    ),
    .groups = "drop"
  )

dr_score_per_leaf

# A tibble: 4 × 5
  node.id allocated_arm dr_score_0_mean dr_score_1_mean dr_score_2_mean
    <dbl>         <dbl>           <dbl>           <dbl>           <dbl>
1       4             2          -0.807           0.914            1.52
2       5             0           2.77            1.90             1.23
3       6             1           0.934           1.85             1.27
4       7             2          -1.74            1.07             1.63

Calculating Gain in each Leaf (i.e Node)

dr_score_per_leaf %>% 
  dplyr::transmute(
    node.id,
    allocated_arm,
    gain_01 = dr_score_1_mean - dr_score_0_mean,
    gain_02 = dr_score_2_mean - dr_score_0_mean
  )

# A tibble: 4 × 4
  node.id allocated_arm gain_01 gain_02
    <dbl>         <dbl>   <dbl>   <dbl>
1       4             2   1.72    2.32 
2       5             0  -0.873  -1.54 
3       6             1   0.913   0.331
4       7             2   2.81    3.37

dr_score_per_leaf %>% 
  dplyr::transmute(
    node.id,
    allocated_arm,
    gain_01 = dr_score_1_mean - dr_score_0_mean,
    gain_02 = dr_score_2_mean - dr_score_0_mean
  ) %>% 
  tidyr::pivot_longer(
    cols = -c(node.id, allocated_arm), values_to = "gain", names_prefix = "gain_0"
  ) %>% 
  filter(allocated_arm == name | allocated_arm == 0) %>% 
  mutate(
    gain = if_else(allocated_arm == 0, 0, gain)
    # setting 0 gain for group who will get nothing (i.e. will be allocated as control)
  ) %>% 
  group_by(node.id, allocated_arm) %>% 
  summarise(
    gain = mean(gain),
    .groups = "drop"
  )

# A tibble: 4 × 3
  node.id allocated_arm  gain
    <dbl>         <dbl> <dbl>
1       4             2 2.32 
2       5             0 0    
3       6             1 0.913
4       7             2 3.37

Now my question is,

Are the gains I am getting above correct?
Should the sum of the above gains be equal to the total gain of the predicted policy?

erikcs added the question Further information is requested label Jan 12, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Finding the Value of a Mult-arm Policy Tree and Gain (HTE) of Each Node #175

Finding the Value of a Mult-arm Policy Tree and Gain (HTE) of Each Node #175

shafayetShafee commented Jan 9, 2025

erikcs commented Jan 12, 2025 •

edited

Loading

shafayetShafee commented Jan 12, 2025

Finding the Value of a Mult-arm Policy Tree and Gain (HTE) of Each Node #175

Finding the Value of a Mult-arm Policy Tree and Gain (HTE) of Each Node #175

Comments

shafayetShafee commented Jan 9, 2025

Determining the value of the Policy Tree

erikcs commented Jan 12, 2025 • edited Loading

shafayetShafee commented Jan 12, 2025

Leaf Statistics

Calculating Gain in each Leaf (i.e Node)

erikcs commented Jan 12, 2025 •

edited

Loading