# Appendix {.unnumbered}
## Formulate, Approximate, Explain
@merrick_explanation_2020 introduced the Formulate, Approximate, Explain
(FAE) framework, which is built around the idea that the value function must be
chosen explicitly with a specific contrastive explanation in mind. They provide
a critique, similar to that of @sundararajan_many_2020, of methods (IME and
SHAP) that define the value function using an observational conditional
expectation. In particular, they provide a motivating example showing how these
methods can lead to attributions that violate both the dummy and symmetry
Shapley value axioms. The first step (formulate) in their framework is to
generate a contrastive question, which they argue leads to a specific
reference, or distribution of references, tailored to the use case. They show
how different Shapley-based methods can be unified by considering the choice of
_reference distribution_ used to simulate feature removal. Their notion of a
reference distribution is roughly analogous to RBShap. They also introduce
_single-reference games_, which simulate feature absence by replacing values
with those of a specific reference input (identical to BShap). In the second
step (approximate), the Shapley value is computed in two stages: references are
sampled from the reference distribution, and the Shapley value for each feature
is then approximated by averaging over the resulting set of single-reference
games. This technique allows them to compute confidence intervals (assuming the
approximation method is unbiased), which quantify the uncertainty in the
Shapley value estimates as part of the final "explain" stage.
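
To make the approximate step concrete, the following is a minimal sketch, not
the authors' implementation: it averages exact single-reference Shapley values
over references drawn from a reference distribution and forms
normal-approximation confidence intervals. The names `model`, `x`, and
`sample_reference`, along with the example data-generating choices, are our own
illustrative assumptions.

```python
import itertools
import math
import numpy as np


def single_reference_shapley(model, x, ref):
    """Exact Shapley values for the single-reference game v(S) = model(x_S, ref_{-S})."""
    n = len(x)
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):  # subsets S of the other features, |S| = 0 .. n-1
            for S in itertools.combinations(others, size):
                S = list(S)
                weight = (math.factorial(len(S)) * math.factorial(n - len(S) - 1)
                          / math.factorial(n))
                with_i = ref.copy()                 # features in S and i take x's values
                with_i[S + [i]] = x[S + [i]]
                without_i = ref.copy()              # only features in S take x's values
                without_i[S] = x[S]
                phi[i] += weight * (model(with_i) - model(without_i))
    return phi


def fae_approximate(model, x, sample_reference, n_refs=200):
    """Average single-reference Shapley values over sampled references, with ~95% CIs."""
    draws = np.array([single_reference_shapley(model, x, sample_reference())
                      for _ in range(n_refs)])
    mean = draws.mean(axis=0)
    half_width = 1.96 * draws.std(axis=0, ddof=1) / np.sqrt(n_refs)
    return mean, mean - half_width, mean + half_width


# Example: explain f(x) = 3*x1 + x2 at x = (1, 1) against references drawn from N(0, I).
rng = np.random.default_rng(0)
f = lambda z: 3 * z[0] + z[1]
phi, lo, hi = fae_approximate(f, np.array([1.0, 1.0]), lambda: rng.normal(size=2))
print("Shapley estimates:", phi.round(2), "95% CI half-widths:", (hi - phi).round(2))
```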
## Causal Reasoning and Model Fairness
@kilbertus_avoiding_2018 were the first to make the case that causal
reasoning is required to assess model fairness^[More specifically, they
are interested in unresolved discrimination.]. In the Shapley value literature,
@heskes_causal_2020 demonstrate how different causal structures, even
if they yield the same observational distribution, can lead to different
Shapley values. While this is true in general, they provide a concrete example
that directly addresses the indirect influence debate. They compare different
Shapley-based methods across four causal building blocks (chain, fork,
confounder, and cycle) in a scenario with two features $X_1$ and $X_2$ and a
model that depends only on $X_2$ (as in our earlier example), demonstrating
that many Shapley-based methods do not yield attributions with a proper
world-causal explanation. For example, marginal SHAP methods can only estimate
direct effects and therefore yield attributions without a proper world-causal
explanation when the underlying structure is a chain (see @fig-chain).
Observational conditional SHAP methods, on the other hand, account for the
dependence between features but cannot account for the fact that an
intervention on the same variable has different distributional implications in
a fork than in a confounder, and may therefore yield attributions that do not
have a proper world-causal interpretation. In our view, the marginal SHAP
attributions have a proper model-causal interpretation even if interpreting
them in a world-causal way is improper. Distinguishing between model and world
causality is therefore necessary to resolve the indirect influence debate
through causal reasoning.
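
To illustrate the chain case concretely, the sketch below is our own small
simulation, not taken from the cited papers: $X_1 \to X_2$ with a model that
depends only on $X_2$. The data-generating process, evaluation point, and
sample size are assumptions chosen for clarity. The marginal value function
gives $X_1$ no credit (it captures only direct effects), while the
observational conditional value function spreads credit across both features,
reflecting the indirect influence of $X_1$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Chain: X1 ~ N(0, 1), X2 = X1 + small noise; the model f ignores x1 entirely.
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)
f = lambda a, b: b

# Explain the point (x1*, x2*) = (2.0, 2.0) using the two-feature closed form:
# phi_i = 1/2 [v({i}) - v({})] + 1/2 [v({1,2}) - v({j})].
p1, p2 = 2.0, 2.0
v_12 = f(p1, p2)

# Marginal value function: absent features are drawn from their marginals,
# breaking the dependence between X1 and X2.
v_empty = f(x1, x2).mean()
v_1 = f(np.full(n, p1), x2).mean()
v_2 = f(x1, np.full(n, p2)).mean()
phi1_marg = 0.5 * (v_1 - v_empty) + 0.5 * (v_12 - v_2)
phi2_marg = 0.5 * (v_2 - v_empty) + 0.5 * (v_12 - v_1)

# Observational conditional value function: condition on the known features.
# In this chain, conditioning on X1 = p1 pins X2 down almost exactly.
v_1c = x2[np.abs(x1 - p1) < 0.05].mean()   # crude estimate of E[f(X) | X1 = p1]
v_2c = p2                                  # E[f(X) | X2 = p2] = p2 since f ignores x1
phi1_cond = 0.5 * (v_1c - v_empty) + 0.5 * (v_12 - v_2c)
phi2_cond = 0.5 * (v_2c - v_empty) + 0.5 * (v_12 - v_1c)

print("marginal:   ", round(phi1_marg, 2), round(phi2_marg, 2))   # X1 gets ~0
print("conditional:", round(phi1_cond, 2), round(phi2_cond, 2))   # credit is shared
```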