Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Finding the Value of a Mult-arm Policy Tree and Gain (HTE) of Each Node #175

Open
shafayetShafee opened this issue Jan 9, 2025 · 2 comments
Labels
question Further information is requested

Comments

@shafayetShafee
Copy link

Hello, GRF labs team, thank you for building and maintaining ML tools for causal inference. I have recently started using the packages - {grf}, {policytree} and {maq} for a multi-arm treatment data. However, I need guidance on,

  1. How can I measure the benefit of a multi-arm policy tree, that is, what is the expected uplift if I allocate the treatments as suggested by the policy tree?

  2. How can I calculate the HTE (or uplift) for each suggested leaf of the multi-arm policytree?

Consider the following reprex from the {policytree} docs,

library(grf)
library(DiagrammeR)
library(policytree)

set.seed(1344)

n <- 5000 # number of subjects
p <- 10 # number of pre-treatment covariates
data <- gen_data_mapl(n, p)
head(data.frame(data)[1:6])
  action          Y        X.1        X.2          X.3        X.4
1      1  2.0905685 0.46864412 0.22784132 0.6269076453 0.71743214
2      1 -0.3635110 0.42637207 0.74486955 0.2168520605 0.25096641
3      2  1.1952524 0.70800587 0.03636718 0.0007285422 0.01940693
4      2  1.8925130 0.05426782 0.97814957 0.5647130643 0.45693730
5      2 -0.4199647 0.95205213 0.53340813 0.3522940478 0.44700733
6      0  2.6126626 0.69570475 0.77983017 0.9636860283 0.72605917
X <- data$X
Y <- data$Y
W <- as.factor(data$action)

multi.forest <- multi_arm_causal_forest(X, Y, W, seed = 1344)

Gamma.matrix <- double_robust_scores(multi.forest)
head(Gamma.matrix)
             0          1          2
[1,]  2.640066  2.1543402  1.2385549
[2,]  1.365158 -2.1944712  1.4777855
[3,] -1.276021  0.6792919  0.8427064
[4,]  2.511125  1.8532989  4.4776106
[5,]  2.708066  2.1123636 -6.4939299
[6,]  9.127422  1.8865874  1.4614259
train <- sample(1:n, 3500)
opt.tree <- policy_tree(X[train, ], Gamma.matrix[train, ], depth = 2)
opt.tree
policy_tree object 
Tree depth:  2 
Actions:  1: 0 2: 1 3: 2 
Variable splits: 
(1) split_variable: X5  split_value: 0.600704 
  (2) split_variable: X7  split_value: 0.334204 
    (4) * action: 3 
    (5) * action: 1 
  (3) split_variable: X7  split_value: 0.784358 
    (6) * action: 2 
    (7) * action: 3 

Get the policy for the held out data,

X.test <- X[-train, ]
pi_hat <- predict(opt.tree, X.test)
head(pi_hat)
[1] 3 1 2 2 2 2

Determining the value of the Policy Tree

Following the eqn. (11) from this paper, value of the policy, $\widehat V(\pi) = \frac{1}{n} \sum_{i=1}^{n}\langle \hat \pi(X_i), \widehat \Gamma_i \rangle$ where the policy $\pi(X_i)$ is a $K$-dimensional vector where the $k^{th}$ element is equal to 1 if arm $k$ is assigned, and 0 otherwise and $\widehat \Gamma_i$ is the estimated DR score (eqn. 13).

# creating the pi hat matrix
pi_mat <- matrix(
  data = 0, 
  nrow = length(pi_hat), 
  ncol = length(unique(pi_hat)) - 1
)

pi_mat[cbind(1:length(pi_hat), pi_hat - 1)] <- 1
colnames(pi_mat) <- c("action-1", "action-2")
head(pi_mat)
     action-1 action-2
[1,]        0        1
[2,]        0        0
[3,]        1        0
[4,]        1        0
[5,]        1        0
[6,]        1        0
# getting the DR scores
gamma_diff <- grf::get_scores(multi.forest)[, ,]
head(gamma_diff)
          1 - 0      2 - 0
[1,] -0.4857258 -1.4015111
[2,] -3.5596292  0.1126275
[3,]  1.9553132  2.1187276
[4,] -0.6578262  1.9664855
[5,] -0.5957024 -9.2019960
[6,] -7.2408342 -7.6659957
# calculating value
mean(pi_mat * gamma_diff[-train, ])
[1] 0.5233645
  • So can I say, “Expected benefit per subject of the above treatment allocation would be 0.52 units and total benefit would be 1000 * 0.52 = 520 units” ??

  • Am I on the right track to answering my first question?

  • Also, How can I reach the answer to my second question?

@erikcs
Copy link
Member

erikcs commented Jan 12, 2025

Hi @shafayetShafee, yes you are right, that is one way to calculate an estimate of a gain defined as the policy value over assigning everyone the control: $E[\pi_1(X_i)(Y_i(1) - Y_i(0)) + \pi_2(X_i)(Y_i(2) - Y(0))]$. But, note that you need to do sum(pi_mat * gamma_diff[-train, ]) / nrow(pi_mat) to do the 1/n averaging correctly, as the mean of a matrix does something else.

If you prefer the binary matrix representation of $\pi$ you could also construct it like this, which might be simpler:

pi.mat <- model.matrix(~ as.factor(pi_hat) - 1)
pi.mat[, -1] # arm 2 and 3

(That gain is also the same as sum(Gamma.matrix[-train, ] * pi.mat) / nrow(pi.mat) - mean(Gamma.matrix[-train, 1]))

For calculating leaf statistics, you can call predict(opt.tree, X.test, type = "node.id") which returns the integer value of the leaf the sample belongs to.

@erikcs erikcs added the question Further information is requested label Jan 12, 2025
@shafayetShafee
Copy link
Author

Thanks a lot @erikcs for responding.

To answer the second question, I have tried the following as per your suggestion and from the last code chunk of this vignette

Leaf Statistics

library(dplyr)
node.id <- predict(opt.tree, X.test, type = "node.id")

gamma_node_df <- data.frame(
  node_id = node.id,
  Gamma.matrix[-train, ],
  allocated_arm = pi_hat - 1
)

colnames(gamma_node_df) <- c("node.id", "dr_score_0", "dr_score_1", "dr_score_2", "allocated_arm")

dr_score_per_leaf <- gamma_node_df %>% 
  group_by(node.id, allocated_arm) %>% 
  summarise(
    across(
      .cols = everything(), 
      .fns = list(mean = mean)
    ),
    .groups = "drop"
  )

dr_score_per_leaf
# A tibble: 4 × 5
  node.id allocated_arm dr_score_0_mean dr_score_1_mean dr_score_2_mean
    <dbl>         <dbl>           <dbl>           <dbl>           <dbl>
1       4             2          -0.807           0.914            1.52
2       5             0           2.77            1.90             1.23
3       6             1           0.934           1.85             1.27
4       7             2          -1.74            1.07             1.63

Calculating Gain in each Leaf (i.e Node)

dr_score_per_leaf %>% 
  dplyr::transmute(
    node.id,
    allocated_arm,
    gain_01 = dr_score_1_mean - dr_score_0_mean,
    gain_02 = dr_score_2_mean - dr_score_0_mean
  )
# A tibble: 4 × 4
  node.id allocated_arm gain_01 gain_02
    <dbl>         <dbl>   <dbl>   <dbl>
1       4             2   1.72    2.32 
2       5             0  -0.873  -1.54 
3       6             1   0.913   0.331
4       7             2   2.81    3.37 
dr_score_per_leaf %>% 
  dplyr::transmute(
    node.id,
    allocated_arm,
    gain_01 = dr_score_1_mean - dr_score_0_mean,
    gain_02 = dr_score_2_mean - dr_score_0_mean
  ) %>% 
  tidyr::pivot_longer(
    cols = -c(node.id, allocated_arm), values_to = "gain", names_prefix = "gain_0"
  ) %>% 
  filter(allocated_arm == name | allocated_arm == 0) %>% 
  mutate(
    gain = if_else(allocated_arm == 0, 0, gain)
    # setting 0 gain for group who will get nothing (i.e. will be allocated as control)
  ) %>% 
  group_by(node.id, allocated_arm) %>% 
  summarise(
    gain = mean(gain),
    .groups = "drop"
  )
# A tibble: 4 × 3
  node.id allocated_arm  gain
    <dbl>         <dbl> <dbl>
1       4             2 2.32 
2       5             0 0    
3       6             1 0.913
4       7             2 3.37 

Now my question is,

  • Are the gains I am getting above correct?

  • Should the sum of the above gains be equal to the total gain of the predicted policy?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

2 participants