
Evaluating propensity scores for balancing the joint distribution #288

Open
ehudkr opened this issue Oct 23, 2024 · 1 comment
ehudkr commented Oct 23, 2024

Hi everyone, thanks for the book. I only skimmed bits and pieces but it reads well and is skillfully presented.
Chapter 9 on propensity score evaluation is great. The thematic progression from sample-mean to full distribution and from linear to non-linear modeling is important and often overlooked.

I have a suggestion to take it even further, which I'll do my best to describe briefly.

We start with the observation (and motivation) that the SMD only evaluates covariates marginally: there can be pathologies where two covariates are well balanced separately but their product is not.
Somewhat similar to the case below (from here):
[figure: an example where the marginal distributions of two covariates match across groups but their joint distribution differs]
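
For intuition, here is a minimal sketch of that pathology (simulated data, not from the book or the linked post): each covariate's SMD is near zero, yet the SMD of their product clearly is not.

```r
# Simulated illustration: x1 and x2 have matching marginal distributions
# across treatment groups, but their joint distribution differs.
set.seed(1)
n  <- 5000
a  <- rbinom(n, 1, 0.5)                                   # treatment indicator
x1 <- rnorm(n)
# Treated: x2 correlated with x1; controls: x2 independent, same mean/variance
x2 <- ifelse(a == 1, x1 + rnorm(n), rnorm(n, sd = sqrt(2)))

# Standardized mean difference
smd <- function(x, a) {
  (mean(x[a == 1]) - mean(x[a == 0])) /
    sqrt((var(x[a == 1]) + var(x[a == 0])) / 2)
}

smd(x1, a)       # ~ 0: balanced marginally
smd(x2, a)       # ~ 0: balanced marginally
smd(x1 * x2, a)  # far from 0: the product (joint structure) is imbalanced
```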

To account for that, we first apply the existing solution: computing the SMD for the interaction x_1:x_2.
However, examining all possible pairs (not to mention triplets, etc.) can be impractical.
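
Continuing the sketch above (same simulated data and `smd()` helper; the function and column names are illustrative), checking every pairwise interaction quickly adds up:

```r
# SMD of every pairwise interaction in a covariate matrix X.
# With p covariates there are choose(p, 2) pairs to inspect, before even
# considering triplets or other transformations.
pairwise_interaction_smds <- function(X, a) {
  pairs <- combn(colnames(X), 2)
  setNames(
    apply(pairs, 2, function(p) smd(X[, p[1]] * X[, p[2]], a)),
    apply(pairs, 2, paste, collapse = ":")
  )
}

X <- cbind(x1 = x1, x2 = x2, x3 = rnorm(n))
choose(ncol(X), 2)               # number of pairs to inspect
pairwise_interaction_smds(X, a)  # only x1:x2 should stand out here
```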

Then, the proposed solution is post-adjustment two-sample discrimination tests.
Briefly, if your propensity scores do well in balancing the covariate distribution between groups, making the groups indistinguishable, then you shouldn't be able to predict the treatment assignment from the (weighted) covariates. If you were to use a statistical classifier to separate the two groups, its accuracy should be no better than random. And the longer this holds as the discriminators become more flexible (models that probe the joint distribution of the data, like random forests, additive trees, etc.), the more trustworthy your propensity scores are in balancing the joint distribution of covariates between groups.
In practice, there's no need to fit an additional model; it can be enough to simply calculate post-adjustment discrimination metrics like the (area under the) ROC curve and the like, weighted by the inverse propensity scores or on the matched sample. All very doable in R.
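
As a rough, self-contained sketch of that diagnostic (fresh simulated data; the variable names and the hand-rolled weighted AUC are illustrative, not an API from the book), using the fitted propensity score itself as the classifier score:

```r
# Simulated confounded data: treatment depends on x1 and x2.
set.seed(2)
n  <- 2000
x1 <- rnorm(n)
x2 <- rnorm(n)
a  <- rbinom(n, 1, plogis(0.8 * x1 - 0.5 * x2))

# Propensity score model and inverse-probability (ATE) weights.
ps_fit <- glm(a ~ x1 + x2, family = binomial())
ps <- fitted(ps_fit)
w  <- ifelse(a == 1, 1 / ps, 1 / (1 - ps))

# Weighted AUC: weighted probability that a treated unit's score exceeds a
# control unit's score (ties count as 1/2).
weighted_auc <- function(score, a, w) {
  s1 <- score[a == 1]; w1 <- w[a == 1]
  s0 <- score[a == 0]; w0 <- w[a == 0]
  concordant <- outer(s1, s0, ">") + 0.5 * outer(s1, s0, "==")
  sum(outer(w1, w0) * concordant) / (sum(w1) * sum(w0))
}

weighted_auc(ps, a, rep(1, n))  # unweighted: clearly above 0.5 (groups differ)
weighted_auc(ps, a, w)          # IP-weighted: close to 0.5 if balance is good
```

A more flexible discriminator (say, a random forest predicting treatment from the weighted covariates) would probe the joint distribution more aggressively, at the cost of actually fitting another model.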

That's the gist, very briefly; I hope it's clear enough (there are more details, though still relatively high-level, here).

If you think that's within the scope of your chapter and not too far beyond the background of your readers (which I don't think it is, given your table of contents), then I think it could be a great addition, making the chapter more complete while extending the arc of the existing story. I know Malcolm prefers issues rather than PRs here, but I can try to sketch a draft in Quarto if needed.

Best,

@malcolmbarrett (Collaborator)

@LucyMcGowan what do you think about this idea? I haven't done something like this, but I think I like the logic of it.
