---
classoptions: 
  - sn-nature      
  - referee         # Optional: Use double line spacing 
  - lineno        # Optional: Add line numbers
  # - iicol         # Optional: Double column layout
title: "Meaningfully reducing consumption of meat and animal products is an unsolved problem: \\newline A meta-analysis"
titlerunning: MAP-reduction-meta
authors: 
  - firstname: Seth Ariel
    lastname: Green
    email: setgree@stanford.edu
    affiliation: 1
    corresponding: TRUE
  - firstname: Benny
    lastname: Smith 
    affiliation: 2
  - firstname: Maya B.
    lastname: Mathur
    affiliation: 1
affiliations:
  - number: 1
    info:
      orgdiv: Quantitative Sciences Unit, Department of Medicine
      orgname: Stanford University
  - number: 2
    info:
      orgname: Allied Scholars for Animal Protection 
keywords:
  - meta-analysis
  - meat
  - plant-based
  - randomized controlled trial
  - climate change
  - sustainability
abstract: |
  Which interventions produce the largest and most enduring reductions in consumption of meat and animal products (MAP)? We address this question with a theoretical review and meta-analysis of randomized controlled trials that measured MAP consumption at least one day after intervention. We meta-analyze 35 papers comprising 41 studies, 112 interventions, and approximately 87,000 subjects. We find that these papers employ four major strategies to change behavior: choice architecture, persuasion, psychology, and a combination of persuasion and psychology. The pooled effect of all 112 interventions on MAP consumption is quite small (standardized mean difference (SMD) = 0.07; 95\% CI: [0.02, 0.12]), indicating an unsolved problem. Interventions aiming to reduce only consumption of red and processed meat were more effective (SMD = 0.25; 95\% CI: [0.11, 0.38]), but it remains unclear whether such interventions increase consumption of other forms of MAP. We conclude that while existing approaches do not provide a proven remedy to MAP consumption, designs and measurement strategies have generally been improving over time, and many promising interventions await rigorous evaluation.
date: "`r Sys.Date()`"
output: 
  rticles::springer_article:
    keep_tex: true
bibliography: "./vegan-refs.bib"
editor_options: 
  chunk_output_type: console
header-includes:
  - \usepackage{comment}
  - \usepackage{anyfontsize}
  - \usepackage[style=default]{caption}
  - \usepackage{float}      # For precise float placement
  - \usepackage{placeins}   # Provides \FloatBarrier command
---

```{r setup, include=FALSE}

# so that knitr labels figures
library(knitr)
opts_chunk$set(fig.path = "./figures/",
               echo = TRUE,
               out.extra = "",
               fig.pos = "ht")

options(tinytex.clean = TRUE) # switch to FALSE to get the bbl file for overleaf

# libraries, functions, data, and key results
source('./scripts/libraries.R')
source('./scripts/functions.R')
source('./scripts/load-data.R')
source('./scripts/models-and-constants.R')

# cache the choice-architecture estimate now: `choice_results` is redefined in
# the `mean_subjects` chunk without its `Delta` column
choice_delta <- choice_results$Delta
```

# Introduction {#sec1}

Global consumption of meat and animal products (MAP) is increasing [@godfray2018] and is expected to continue doing so [@whitton2021].
Abating this trend is vital to reducing chronic diseases caused by excessive MAP consumption and the risk of zoonotic pandemics [@willett2019; @landry2023; @hafez2020], mitigating environmental degradation and climate change [@poore2018; @koneswaran2008; @greger2010], and improving animal welfare [@kuruc2023; @scherer2019].
However, eating MAP is widely regarded as normal, ethical, and necessary [@piazza2022; @milford2019].

There is a vast and diverse literature investigating potential means to reduce MAP consumption.
Example approaches include providing free access to meat substitutes [@katare2023], changing the price [@horgen2002] or perceptions [@kunst2016] of meat, and attempting to persuade people to change their diets [@bianchi2018conscious].
Some interventions are associated with large impacts [@lentz2020; @boronowsky2022; @reinders2017], and prior reviews have concluded that some frequently studied approaches, such as using persuasive messaging that appeals to animal welfare [@mathur2021meta], may be consistently effective.
A particularly high-profile strand of this literature employs choice architecture, i.e. altering the contexts in which MAP is selected [@bianchi2018restructuring], for instance by changing menu layouts [@bacon2018; @gravert2021], placing vegetarian items more prominently in dining halls [@ginn2024], or making plant-based options the default at catered meals [@hansen2021].
Choice architecture could be a cheap, effective way of altering dietary behavior [@colgan2024], and governments, universities, and other institutions are increasingly implementing these approaches in such settings as dining halls [@pollicino2024] and hospital cafeterias [@morgenstern2024].

However, recurring design and measurement limitations compromise the literature on MAP reduction.
Many interventions are either not randomized [@garnett2020] or underpowered [@delichatsios2001].
Measured outcomes are often imperfect proxies of MAP consumption, such as attitudes, intentions, and hypothetical choices [@raghoebar2020; @vermeer2010], yet behaviors often do not track with these psychological processes [@mathur2021effectiveness; @porat2024] and reported preferences [@hensher2010].
Additionally, many studies with comparatively large effects specifically aim to reduce consumption of red and processed meat (RPM).
However, because these studies exclusively measure changes in RPM, it is unknown whether they induce substitution to other forms of MAP, such as chicken or fish [@grummon2023].
Thus, treating RPM consumption as a proxy of net MAP reduction, as prior reviews have done [@bianchi2018conscious; @chang2023; @kwasny2022], may cause bias.
Finally, many studies measure only immediate rather than long-term effects [@hansen2021; @griesoph2021].
This is of special concern if subjects who are encouraged to have a single vegetarian meal later
compensate by consuming more MAP, which would make an immediate outcome measurement a biased estimate of overall effects. 
Such compensatory effects are common in dietary studies [@yeomans2001; @robinson2013; @lowe2007].

In the past few years, a new wave of MAP reduction research has made commendable methodological advances in design, measurement validity, and statistical power.
Historically, in some scientific fields, strong effects detected in early studies with methodological limitations were ultimately overturned by more rigorous follow-ups [@wykes2008; @paluck2019; @scheel2021].
Does this phenomenon hold in the MAP reduction literature as well?

To address this question, we conducted a meta-analysis of randomized controlled trials (RCTs) that aim to reduce MAP consumption and that meet basic methodological standards [@andersson2021; @kanchanachitra2020; @abrahamse2007; @acharya2004; @banerjee2019; @bianchi2022; @bochmann2017; @bschaden2020; @carfora2023; @cooney2014; @cooney2016; @feltz2022; @haile2021; @hatami2018; @hennessy2016; @jalil2023; @mathur2021effectiveness; @merrill2009; @norris2014; @peacock2017; @polanco2022; @sparkman2021; @weingarten2022; @piester2020; @aberman2018; @aldoh2023; @allen2002; @camp2019; @coker2022; @sparkman2020; @berndsen2005; @bertolaso2015; @fehrenbach2015; @mattson2020; @shreedhar2021].
Specifically, we restricted eligibility to RCTs that measured consumption outcomes at least a single day after treatment was first administered and that had at least 25 subjects in both treatment and control (or, for cluster-assigned studies, at least ten clusters in total).

Studies in our meta-analysis pursued one of four theoretical approaches: choice architecture, psychological appeals (typically manipulations of perceived norms around eating meat), explicit persuasion (centered around animal welfare, the environment, and/or health), or a combination of psychological and persuasion messages.
Interventions varied in delivery method, for example, documentary films [@mathur2021effectiveness], leaflets [@peacock2017], university lectures [@jalil2023], op-eds [@haile2021], and changes to menus in cafeterias [@andersson2021] and restaurants [@coker2022; @sparkman2021].
We estimated overall effect sizes as well as effect sizes associated with different theoretical approaches and delivery mechanisms.
Although we find some heterogeneity across theories and mechanisms, we find consistently smaller effects on MAP consumption than previous reviews that placed fewer (if any) restrictions on studies' outcomes and
methodological rigor [@bianchi2018restructuring; @byerly2018; @chang2023; @harguess2020; @kwasny2022; @mathur2021meta; @meier2022]. 
When we included studies whose methodology fell short of our inclusion criteria [@alblas2023; @beresford2006; @betterfoodfoundation2023; @celis2017; @dannenberg2023; @delichatsios2001eatsmart; @epperson2021; @frie2022; @garnett2020; @griesoph2021; @hansen2021; @johansen2009; @kaiser2020; @lentz2020; @loy2016; @matthews2019; @piazza2022; @reinders2017; @sparkman2017], this considerably increased the pooled estimate. 
In addition, studies that only aimed to reduce RPM consumption [@anderson2017; @carfora2017correlational; @carfora2017randomised; @carfora2019; @carfora2019informational; @delichatsios2001talking; @dijkstra2022; @emmons2005cancer; @emmons2005project; @jaacks2014; @james2015; @lee2018; @lindstrom2015; @perino2022; @schatzkin2000; @sorensen2005; @wolstenholme2020] reported consistently stronger effects on behavior than
studies aimed at reducing net MAP consumption.
Overall, in contrast to previous reviews, we conclude that meaningfully reducing net MAP
consumption is an unsolved problem, although many promising approaches still await rigorous evaluation. 

# Results {#sec2}

## Results across all studies {#sec2.1}

Our meta-analysis included `r num_papers` papers comprising `r num_studies` studies and `r num_interventions` separate point estimates.
Each point estimate corresponded to a distinct intervention.
The total sample size was approximately `r n_total` subjects.

```{r mean_subjects, include=F}
total_pops <- dat |> filter(cluster_assigned == 'N') |> mutate(total_pop = as.numeric(n_c_post + n_t_post)) |> pull(total_pop)

median_total_pop <- round(median(total_pops, na.rm = TRUE))
percentile_25 <- round(quantile(total_pops, 0.25, na.rm = TRUE))
percentile_75 <- round(quantile(total_pops, 0.75, na.rm = TRUE))
# nums per approach
choice_results <- dat |> filter(theory == 'Choice Architecture') |> extract_model_results() |> select(N_estimates, N_studies)
twenty_twenty_n <- decade_tab |> filter(decade == "2020s") |> pull(n)
```

Because methodological quality is rapidly improving in this literature, the majority of eligible papers (`r twenty_twenty_n` of `r num_papers`) were published from 2020 onwards, although the earliest was published in 2002 [@allen2002].
Among studies where treatment was assigned to individuals rather than to clusters (e.g., school classes), the median analyzed sample size per study was `r median_total_pop` subjects (25\textsuperscript{th}–75\textsuperscript{th} percentiles: `r paste0(percentile_25, ", ", percentile_75)`).

We found that studies' theoretical approaches could be grouped into four categories.
**Choice architecture** studies [@andersson2021; @kanchanachitra2020] `r paste0("(n = ", choice_results$N_studies, " studies with ", choice_results$N_estimates, " estimates)")` manipulate aspects of physical environments to reduce MAP consumption, such as by placing the vegetarian option at eye level on a cafeteria's billboard menu [@andersson2021]. **Persuasion** studies [@kanchanachitra2020; @aberman2018; @abrahamse2007; @acharya2004; @banerjee2019; @bianchi2022; @bochmann2017; @bschaden2020; @carfora2023; @hennessy2016; @piester2020; @cooney2014; @cooney2016; @feltz2022; @haile2021; @hatami2018; @jalil2023; @mathur2021effectiveness; @merrill2009; @norris2014; @peacock2017; @polanco2022; @sparkman2021; @weingarten2022] `r paste0("(n = ", persuasion_results$N_studies, " studies with ", persuasion_results$N_estimates, " estimates)")` focus on health, environmental (usually climate change), and animal welfare reasons to reduce MAP consumption. Such messages are often delivered through printed materials, such as leaflets [@haile2021; @polanco2022], booklets [@bianchi2022], and articles and op-eds [@sparkman2021; @feltz2022], and through videos [@sparkman2021; @cooney2016; @mathur2021effectiveness]. Less common delivery methods included in-person dietary consultations [@merrill2009], emails [@banerjee2019], and text messages [@carfora2023]. **Psychology** studies [@aldoh2023; @allen2002; @camp2019; @coker2022; @piester2020; @sparkman2020] `r paste0("(n = ", psych_results$N_studies, " studies with ", psych_results$N_estimates, " estimates)")` manipulate the interpersonal, cognitive, or affective factors associated with eating MAP. The most common psychological intervention centers on social norms, seeking to alter the perceived popularity of non-MAP dishes [@sparkman2020; @sparkman2021]. In one study, a restaurant put up signs stating that "[m]ore and more [retail store name] customers are choosing our veggie options" [@coker2022].
In another, a university cafeteria put up signs stating that "[i]n a taste test we did at the [name of cafe], 95% of people said that the veggie burger tasted good or very good!"
[@piester2020].
One study told participants that people who eat meat are more likely to endorse social hierarchy and embrace human dominance over nature [@allen2002].
Other psychological interventions include response inhibition training, where subjects are trained to avoid responding impulsively to stimuli such as unhealthy food [@camp2019], and implementation intentions, where participants list potential challenges and solutions to changing their own behavior [@aberman2018; @shreedhar2021].
Finally, some studies combine **persuasion** approaches with **psychological** appeals to reduce MAP consumption [@aberman2018; @berndsen2005; @bertolaso2015; @carfora2023; @fehrenbach2015; @hennessy2016; @mathur2021effectiveness; @mattson2020; @piester2020; @shreedhar2021] `r paste0("(n = ", psych_persuasion_results$N_studies, " studies with ", psych_persuasion_results$N_estimates, " estimates)")`. These studies typically combine a persuasive message with a norms-based appeal [@piester2020; @mattson2020] or an opportunity to pledge to reduce one's MAP consumption [@mathur2021effectiveness; @shreedhar2021].

```{r needed_vars, include=F}
# estimated share of true effects above SMD = 0.1
low_prop_test <- prop_stronger(
  q = 0.1, M = overall_results$Delta,
  t2 = overall_results$tau^2,
  se.M = overall_results$SE, tail = "above",
  estimate.method = "calibrated",
  ci.method = "calibrated", dat = dat,
  yi.name = "d", vi.name = "var_d",
  bootstrap = "ifneeded", R = 200
) |>
  mutate(across(1:6, \(x) round(x, 3)))
low_prop_test

# estimated share of true effects above SMD = 0.2
high_prop_test <- prop_stronger(
  q = 0.2, M = overall_results$Delta,
  t2 = overall_results$tau^2,
  se.M = overall_results$SE, tail = "above",
  estimate.method = "calibrated",
  ci.method = "calibrated", dat = dat,
  yi.name = "d", vi.name = "var_d",
  bootstrap = "ifneeded", R = 200
) |>
  mutate(across(1:6, \(x) round(x, 3)))
high_prop_test
```

```{r meta_table, echo=FALSE, message=FALSE, results='asis'}
source('./scripts/table-one-meta.R')
meta_table
```

In our dataset, the pooled effect of all interventions is standardized mean difference (SMD) = `r overall_results$Delta` (95% CI: `r overall_results$CI`), p = `r overall_results$p_val`, with some heterogeneity (standard deviation of population effects $\tau$ = 0.082).
Given the pooled effect size and the estimated heterogeneity, we estimate that `r round(low_prop_test$est * 100)`% of true effects are above SMD = 0.1, and just `r round(high_prop_test$est * 100)`% are above SMD = 0.2 [@mathur2019; @mathur2020robust].
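
To give a sense of how a pooled mean and a heterogeneity estimate translate into a share of effects above a threshold, the sketch below uses a simple parametric shortcut: if true effects were normally distributed, then $P(\theta > q) = 1 - \Phi\left((q - \hat{\mu}) / \hat{\tau}\right)$. This is an assumption made for illustration only and is not the calibrated estimator used in our analysis code, so its output need not match the percentages reported above. The function name `prop_above` is hypothetical; the inputs are the pooled SMD from the abstract and the $\tau$ reported above.

```r
# Parametric sketch: share of true effects above a threshold q, assuming
# true effects are normally distributed around the pooled mean.
# NOTE: this is not the calibrated estimator used in the analysis code.
prop_above <- function(q, pooled_mean, tau) {
  1 - pnorm((q - pooled_mean) / tau)
}

pooled_mean <- 0.07   # pooled SMD reported in the abstract
tau         <- 0.082  # heterogeneity SD reported in the text

p_above_01 <- prop_above(0.1, pooled_mean, tau)
p_above_02 <- prop_above(0.2, pooled_mean, tau)
round(c(above_0.1 = p_above_01, above_0.2 = p_above_02), 3)
```

Under this normality assumption, the share of effects above a threshold falls quickly as the threshold moves away from the pooled mean, because $\tau$ is small relative to the distance between 0.07 and 0.2.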

## Subset and moderator analyses {#Sec2.2}
Stratifying by theoretical approach, pooled estimates were similar across psychology, persuasion, and persuasion and psychology (SMDs from `r persuasion_results$Delta` to `r psych_results$Delta`; Table 1).
Estimates may have been somewhat larger among the choice architecture studies (SMD = `r choice_delta`), but the sample size was much smaller (`r choice_results$N_estimates` estimates).
Among studies with a persuasion component, pooled estimates are similar for environmental appeals (SMD = `r environment_results$Delta`, `r environment_results$N_studies` studies with `r environment_results$N_estimates` estimates) and health appeals (SMD = `r health_results$Delta`, `r health_results$N_studies` studies with `r health_results$N_estimates` estimates), but are smaller for appeals to animal welfare (SMD = `r animal_welfare_results$Delta`, `r animal_welfare_results$N_studies` studies with `r animal_welfare_results$N_estimates` estimates). We did not conduct meta-regression for theoretical approach or type of persuasion because studies with multiple interventions could occupy multiple categories, and many persuasion interventions combined multiple types of message, e.g. presenting students with both environmental and health reasons to reduce MAP consumption [@jalil2023].

```{r forest_plot, echo=FALSE, message=FALSE, fig.align='center', fig.pos='H', fig.cap="Forest plot of all meta-analyzed studies. For papers contributing multiple point estimates, the plotted point corresponds to a fixed effects meta-analysis for each paper for visual clarity. Papers employing multiple theoretical approaches are represented once per theory. Point size is inversely proportional to variance. Points are sorted within theory by estimate size. The vertical black line demarcates an effect size of zero, and the dotted line is the observed overall effect.", out.width="120mm", out.extra=""}
source('./scripts/forest-plot.R')
forest_plot 
# ggsave('./results/figures/forest_plot-1.pdf', plot = forest_plot,
#        width = 8.5, height = 11, units = "in")

```

```{r moderator_table, echo=F, message=F}
source('./scripts/table-two-moderator.R')
moderator_table

NA_delta <- region_results |> filter(Moderator == "North America") |> pull(Delta)
two_thousand_delta <- decade_results |> filter(Moderator == "2000s") |> pull(Delta)
twenty_twenty_delta <- decade_results |> filter(Moderator == "2020s") |> pull(Delta)

edu_delta <- delivery_results |> filter(Moderator == 'Educational materials') |> pull(Delta)
delivery_delta <- delivery_results |> filter(Moderator == 'Dietary consultation') |> pull(Delta)
delivery_n <- delivery_results |> filter(Moderator == 'Dietary consultation') |> pull(N_Studies)
```

```{r rpmc_prop, include=F}

# estimated share of true RPM effects above SMD = 0.2
red_high_prop_test <- prop_stronger(
  q = 0.2, M = rpmc_results$Delta,
  t2 = rpmc_results$tau^2,
  se.M = rpmc_results$SE, tail = "above",
  estimate.method = "calibrated",
  ci.method = "calibrated", dat = RPMC,
  yi.name = "d", vi.name = "var_d",
  bootstrap = "ifneeded", R = 200
) |>
  mutate(across(1:6, \(x) round(x, 3)))
```

The `r RPMC_studies` studies that only attempted to reduce consumption of RPM, comprising 25 point estimates, yielded a pooled effect of SMD = `r rpmc_results$Delta` (95% CI: `r rpmc_results$CI`), p = `r rpmc_results$p_val`, $\tau$ = `r rpmc_results$tau`.
Among these studies, we estimate that `r round(red_high_prop_test$est * 100, 2)`% of true RPM effects are above SMD = 0.2.
We observe consistently small effects across categories of population (all pooled SMDs \< 0.1), but more heterogeneity by region: North America, where a majority of studies took place, had an average effect of SMD = `r NA_delta` vs. 0.14 to 0.21 for other locations.
Effect sizes have broadly been declining over time, from an average of SMD = `r two_thousand_delta` in the 2000s to SMD = `r twenty_twenty_delta` in the 2020s.

## Publication bias and robustness checks {#Sec2.3}

```{r publication_bias, include=F, message=F}
rma_model <- metafor::rma.uni(yi = d, vi = var_d, data = dat)
hedges_model <- selmodel(x = rma_model, type = 'stepfun', 
                         alternative = 'greater', steps = c(0.025, 1))

pub_bias_corrected_estimate <- PublicationBias::pubbias_meta(yi = dat$d, vi = dat$var_d, cluster = dat$unique_study_id, model_type = 'robust', favor_positive = TRUE, alpha_select = .05, small = TRUE, selection_ratio = 1) 

pub_bias_estimate <- round(pub_bias_corrected_estimate$stats$estimate, 3)
pub_ci_lower <- round(pub_bias_corrected_estimate$stats$ci_lower, 2)
pub_ci_upper <- round(pub_bias_corrected_estimate$stats$ci_upper, 2)
pub_ci_p_val <- round(pub_bias_corrected_estimate$stats$p_value, 3)

nulls <- dat |> filter(neg_null_pos == 0 | neg_null_pos == -1)
worst_case <- extract_model_results(data = nulls)
```

The meta-analytic mean corrected for publication bias that favors significant, positive results was `r round(hedges_model$b, 3)` (95% CI: [`r paste0(round(hedges_model$ci.lb, 3), ",", " ", round(hedges_model$ci.ub, 3))`]), p = `r round(hedges_model$pval, 3)` [@hedges1992].
Figure 2 displays a significance funnel plot [@mathur2020]. 
A conservative estimate that accounts for the possibility of worst-case publication bias yields an estimate of SMD = `r worst_case$Delta` (95% CI: `r worst_case$CI`), p = `r worst_case$p_val` [@mathur2020; @mathur2024] (further sensitivity checks in Supplement).

```{r funnel_plot, echo=FALSE, message=F, fig.cap="Significance funnel plot displaying studies’ point estimates versus their estimated standard errors. Orange points: affirmative studies (p < 0.05 and a positive point estimate). Grey points: nonaffirmative studies (p $\\geq$ 0.05 or a negative point estimate). Diagonal grey line: the standard threshold of “statistical significance” for positive point estimates; studies lying on the line have exactly p = 0.05. Black diamond: main-analysis point estimate within all studies; grey diamond: worst-case point estimate within only the nonaffirmative studies.", fig.align='center', fig.pos='H'}
funnel_plot <- significance_funnel(yi = dat$d, vi = dat$var_d, favor_positive = TRUE, alpha_select = 0.05, plot_pooled = TRUE)
funnel_plot
```

Table 2 displays subset analyses and average differences in effect size
by study population, region, era of publication, and delivery method. Average differences were estimated via meta-regression.

```{r robustness_check, include=F}
source('./scripts/robustness-checks.R')
```

As a robustness check, we also coded and meta-analyzed a supplementary dataset of 22 marginal studies, comprising 35 point estimates. 
Marginal studies were those whose methods fell short of our inclusion criteria, but typically met all but one, e.g. the control group received some aspect of treatment [@piazza2022], or treatment was alternated weekly but not randomly [@garnett2020] (Supplement).
Expanding the meta-analytic dataset to include these marginal
studies yields a pooled effect of SMD = `r merged_results$Delta` (95% CI: `r merged_results$CI`), p $<$ `r merged_results$p_val`.
Particularly large results were found in studies that measured outcomes immediately [@hansen2021] or that had smaller samples [@lentz2020].

# Methods {#sec3}

```{r methods_nums, include=F}
reviews_count <- nrow(read.csv('./data/review-of-reviews.csv'))
excluded_count <- nrow(read.csv('./data/excluded-studies.csv'))
```

## Study selection {#sec3.1}
```{=tex}
%bm
```
Our meta-analytic sample comprises RCT evaluations of interventions intended to reduce MAP consumption that had at least 25 subjects in both treatment and control (or at least 10 clusters for studies that were cluster-assigned) and that measured MAP consumption at least a single day after treatment began.
We required that studies have a pure control group receiving no treatment.
We further restricted our search to studies that were publicly circulated in English by December 2023.
We also made three decisions regarding study inclusion after data collection began.
First, we defined a separate analytic category for studies that only targeted RPM consumption.
Second, we excluded studies that did not aim to reduce either all MAP or all RPM consumption and instead sought to induce substitution from one kind of MAP to another, e.g. that encouraged treated subjects to eat fish [@johansen2009].
Third, we excluded studies whose interventions left no room for participants to voluntarily decide their MAP consumption, e.g. interventions in institutions where subjects were simply served more vegetables on their plate.
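
The core eligibility rules above can be read as a row filter over a screening table. The sketch below is only an illustration: the data frame `candidates`, its column names, and all values are hypothetical and do not come from our screening files, and the function `is_eligible` is not part of the project code. It encodes the randomization, pure-control, sample-size, and follow-up requirements, but not the judgment-based exclusions.

```r
# Hypothetical screening table; all column names and values are illustrative.
candidates <- data.frame(
  study            = c("A", "B", "C", "D", "E"),
  randomized       = c(TRUE, TRUE, FALSE, TRUE, TRUE),
  cluster_assigned = c(FALSE, FALSE, FALSE, TRUE, TRUE),
  n_treatment      = c(120, 18, 300, 400, 250),
  n_control        = c(115, 40, 310, 390, 260),
  n_clusters       = c(NA, NA, NA, 12, 6),
  delay_days       = c(7, 30, 14, 1, 0),
  pure_control     = c(TRUE, TRUE, TRUE, TRUE, TRUE)
)

is_eligible <- function(x) {
  # The sample-size rule differs for individually and cluster-assigned studies
  big_enough <- ifelse(
    x$cluster_assigned,
    x$n_clusters >= 10,
    x$n_treatment >= 25 & x$n_control >= 25
  )
  # Outcome must be measured at least one day after treatment began
  x$randomized & x$pure_control & x$delay_days >= 1 & big_enough
}

candidates$eligible <- is_eligible(candidates)
candidates[, c("study", "eligible")]
```

In this toy table, study B fails the per-arm sample-size rule, study C is not randomized, and study E has too few clusters and measures its outcome immediately.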

Given our interdisciplinary research question and previous work indicating a large grey literature [@mathur2021meta], we designed and carried out a customized search process.
We: 1) reviewed `r reviews_count` prior reviews, nine of which yielded included articles [@mathur2021meta; @bianchi2018conscious; @bianchi2018restructuring; @ammann2023; @chang2023; @DiGennaro2024; @harguess2020; @ronto2022; @wynes2018]; 2) conducted backward and forward citation searches; 3) reviewed published articles by authors with papers in the meta-analysis; 4) crowdsourced potentially missing papers from leading researchers in the field; 5) searched Google Scholar for terms that had come up in studies repeatedly; 6) used an AI search tool to search for grey literature (\url{https://undermind.ai/}); and 7) checked two databases emerging from ongoing nonprofit projects that both seek to identify all papers on meat-reducing interventions.
All three authors contributed to the search.
Inclusion/exclusion decisions were primarily made by the first author, with all authors contributing to discussions about borderline cases.

Figure 3 is a PRISMA diagram depicting the sources of included and excluded studies, which are detailed further in the Supplement.

```{r prisma_diagram, echo=FALSE, message=FALSE, fig.align='center', fig.pos='H', fig.height=8, fig.width=6, out.width='120%', fig.cap = "PRISMA diagram."}
knitr::include_graphics('./figures/prisma-diagram.png')

```

## Data extraction {#sec3.3}

The first author extracted all data.
We extracted an effect size for one outcome per intervention: the measure of net MAP or RPM consumption that had the longest follow-up time after intervention. Additional variables coded included information about publication, details of the interventions, length of follow-ups, intervention theories, and additional details about interventions' methods, contexts, and open science practices (see accompanying code and data repository for full documentation: <https://doi.org/10.24433/CO.6020578.v3>).
When in doubt about calculating effect sizes, we consulted publicly available datasets and/or contacted authors.
To assess risk of bias, we collected data on whether outcomes were self-reported or objectively measured, publication status, and presence of a pre-analysis plan and/or open data (Supplement).

All effect size conversions were conducted by the first author with methods and R code initially developed for previous papers [@paluck2019; @paluck2021; @porat2024], following standard techniques [@cooper2019], with the exception of a difference in proportion estimator that treats discrete events as draws from a Bernoulli distribution (see the appendix to [@paluck2021] for details).
As our measure of standardized mean difference, we used Glass's $\Delta$ whenever possible, defined as $\Delta = \frac{\mu_T - \mu_C}{\sigma_C}$, where $\mu_T$ and $\mu_C$ respectively denote the treatment and control group means and $\sigma_C$ denotes the pre-treatment control group standard deviation.
If the control group SD was not available, we standardized on the pooled SD.
When means and SDs were not available, we converted effect sizes from regression coefficients, eta squared, or z-scores.
When there was insufficient information to calculate a specific SMD, but the text reported the result as a null, we recorded the outcome as an "unspecified null" and set it to 0.01.
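
As a worked illustration of the effect-size definitions above, the sketch below computes Glass's $\Delta$ with the pooled-SD fallback. It also shows one plausible reading of a Bernoulli-based difference-in-proportions estimator, in which the control-group SD is taken to be $\sqrt{p_C (1 - p_C)}$; that reading is our assumption for this example, and the cited appendix remains the authoritative description. The function names and all input values are hypothetical, and the sketch leaves aside the sign convention (whether a reduction in consumption is coded as positive).

```r
# Glass's Delta: standardize the mean difference on the control-group SD,
# falling back to a reported pooled SD when the control SD is unavailable.
glass_delta <- function(mean_t, mean_c, sd_c = NA_real_, sd_pooled = NA_real_) {
  sd_used <- if (!is.na(sd_c)) sd_c else sd_pooled
  if (is.na(sd_used)) stop("need a control-group SD or a pooled SD")
  (mean_t - mean_c) / sd_used
}

# Assumed Bernoulli variant for binary outcomes: the control-group SD of a
# 0/1 variable with success probability p_c is sqrt(p_c * (1 - p_c)).
bernoulli_delta <- function(p_t, p_c) {
  (p_t - p_c) / sqrt(p_c * (1 - p_c))
}

# Invented inputs, for illustration only
d_means    <- glass_delta(mean_t = 5.1, mean_c = 4.5, sd_c = 2)
d_fallback <- glass_delta(mean_t = 5.1, mean_c = 4.5, sd_pooled = 2.4)
d_binary   <- bernoulli_delta(p_t = 0.30, p_c = 0.25)
round(c(means = d_means, fallback = d_fallback, binary = d_binary), 3)
```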

## Statistical analysis {#sec3.4}

We used `Rmarkdown` [@xie2018] and a containerized online platform [@moreau2023; @clyburne2019] to ensure computational reproducibility [@polanin2020].
We conducted meta-analysis using robust variance estimation (RVE) methods [@hedges2010] as implemented by the `robumeta` package in `R` [@fisher2015; @Rlang].
Many studies in our sample compared multiple treatment groups to a single control group.
Therefore, we used the RVE method to allow for the resulting dependence between observations, as well as a standard small-sample correction.
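
The following sketch shows what an intercept-only RVE model of this kind can look like with `robumeta::robu()`. It is a minimal illustration under stated assumptions, not our exact model specification: the toy data are invented, and the correlated-effects weights (`modelweights = "CORR"`) and the within-study correlation `rho = 0.8` are assumptions chosen for this example. The column names `d`, `var_d`, and `unique_study_id` mirror those used in our analysis code.

```r
# Toy data: several studies contribute more than one estimate because
# multiple treatment arms share a control group. All values are invented.
toy <- data.frame(
  unique_study_id = c(1, 1, 2, 3, 3, 3, 4, 5, 5, 6),
  d     = c(0.10, 0.05, 0.00, 0.20, 0.15, 0.12, -0.03, 0.08, 0.02, 0.30),
  var_d = c(0.010, 0.012, 0.020, 0.015, 0.015, 0.016, 0.008, 0.011, 0.011, 0.040)
)

# Run only if robumeta is installed
if (requireNamespace("robumeta", quietly = TRUE)) {
  fit <- robumeta::robu(
    formula      = d ~ 1,            # intercept-only model: the pooled effect
    data         = toy,
    studynum     = unique_study_id,  # estimates within a study are dependent
    var.eff.size = var_d,
    modelweights = "CORR",           # correlated-effects weights (assumed)
    rho          = 0.8,              # assumed within-study correlation
    small        = TRUE              # small-sample correction
  )
  # Pooled estimate, robust standard error, and confidence limits
  print(fit$reg_table[, c("b.r", "SE", "CI.L", "CI.U")])
}
```

Clustering on the study identifier is what lets estimates that share a control group enter the model without being treated as independent.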

Data analyses were largely conducted with custom functions building on `tidyverse` [@wickham2019]. We assessed publication bias using selection model methods [@hedges1992; @vevea1995], sensitivity analysis methods [@mathur2024], and the significance funnel plot [@mathur2020].
These methods assume that the publication process favors “statistically significant” (i.e., p \< 0.05) and positive results over “nonsignificant” or negative results.
Our sensitivity check meta-analyzes only non-affirmative results, which creates an estimate under a hypothetical “worst-case” publication bias scenario where affirmative studies are almost infinitely more likely to be published than non-affirmative studies.
We conducted these analyses using functions in `metafor` [@viechtbauer2010] and `PublicationBias` [@mathur2020; @mathur2024].
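
To make the worst-case logic concrete, the sketch below classifies invented estimates as affirmative (positive with two-sided p < 0.05) or non-affirmative and then pools only the non-affirmative ones. For simplicity it uses a plain inverse-variance mean, which is a stand-in we chose for this example rather than the robust model used in our analysis; the data and object names are hypothetical.

```r
# Invented estimates; `d` and `var_d` mirror the column names used in the
# analysis code, but the values are purely illustrative.
toy <- data.frame(
  d     = c(0.40, 0.05, -0.10, 0.22, 0.01),
  var_d = c(0.010, 0.020, 0.015, 0.030, 0.005)
)

alpha <- 0.05
z <- toy$d / sqrt(toy$var_d)

# Affirmative: positive point estimate with two-sided p < alpha
toy$affirmative <- z > qnorm(1 - alpha / 2)
nonaffirmative  <- toy[!toy$affirmative, ]

# Inverse-variance (fixed-effect) means, standing in for the robust model
iv_mean <- function(d, v) sum(d / v) / sum(1 / v)
all_mean        <- iv_mean(toy$d, toy$var_d)
worst_case_mean <- iv_mean(nonaffirmative$d, nonaffirmative$var_d)
round(c(all = all_mean, worst_case = worst_case_mean), 3)
```

Dropping the affirmative estimates can only pull the pooled mean toward or below the non-affirmative results, which is why this estimate serves as a conservative bound.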

# Discussion

Our meta-analysis of RCTs estimated a small overall effect of SMD = `r overall_results$Delta`, with an upper confidence bound of SMD = `r round(model$reg_table$CI.U, 2)`.
Effects were also consistently small across an array of locations, study designs, and intervention categories.
Some individual studies found comparatively larger effects (e.g. five studies estimated SMD \> 0.5: [@carfora2023; @merrill2009; @kanchanachitra2020; @bianchi2022; @piester2020]).
We view these interventions as intriguing candidates for subsequent research and replication. However, these studies' heterogeneous theories, methods, and implementation details suggest that no single approach, means of delivery, or message should be considered a well-validated method of reducing MAP consumption.
Taken together, these findings suggest that reducing MAP consumption is an unsolved problem.

Perhaps surprisingly, our results diverged from the more positive findings of previous reviews [@mathur2021meta; @meier2022; @mertens2022], which are summarized in the Supplement.
Our much smaller estimate likely reflects our stricter methodological inclusion criteria.
For instance, of the ten largest effect sizes in a previous meta-analysis [@mathur2021meta], nine measured attitudes and/or intentions, and the tenth came from a non-randomized design.
Prior research has found that intentions often do not predict behavior [@mathur2021effectiveness], and reviews in other fields have found systematic differences in impacts between randomized and non-randomized evaluations [@porat2024; @stevenson2023].
Supporting this interpretation, robustness checks in which we relaxed our methodological inclusion criteria produced results similar to those of previous reviews.
This possibility will need further empirical evaluation.

Another potentially surprising result is that only two choice architecture papers met our methodological inclusion criteria.
Most potentially eligible papers either measured hypothetical outcomes or measured outcomes immediately after the intervention.
Moreover, prior reviews that found choice architecture approaches to be consistently effective at modifying diet typically focused on foods that may have weaker cultural and social attachments than MAP, such as sugary drinks and snacks [@venema2020; @adriaanse2009].
We speculate that changes to how MAP is sold and consumed, by contrast, are more likely to be noticed and to engender political and cultural backlash [@popper2019].

Likewise, as our analyses show, studies aimed at reducing RPM consumption are associated with a considerably larger effect (SMD = `r rpmc_results$Delta`) than those aimed at reducing all MAP consumption.
Many prior reviews grouped MAP and RPM studies together, treating their outcomes as aimed at a single theoretical target [@slough2023].
However, if reductions in RPM lead to consumers' substituting to other forms of MAP, then analyses that synthesize the two categories of outcome may produce inflated estimates of net MAP reduction.
We view such substitutions as likely: many health guidelines, such as the heart-healthy diet [@diab2023], encourage reducing RPM while also encouraging moderate intake of poultry and fish, both of which come with severe externalities, such as risking zoonotic outbreaks from factory farms [@hafez2020] and causing land and water pollution [@grvzinic2023].
Additionally, raising chicken and fish may lead to substantially worse outcomes for animal welfare [@mathur2022ethical].
We speculate that cutting back on RPM by substituting to other forms of MAP may be easier and more socially normative than is cutting back on all MAP.
This possibility might explain the observed difference in effect sizes.

Our analyses have limitations.
Relatively few studies met our methodological inclusion criteria, limiting statistical precision.
Additionally, as with all meta-regression analyses, ours should not be interpreted as causal estimates of study-level moderators.
That is, estimated differences in effect sizes between groups of studies do not represent the causal effects of the study characteristics (e.g., theoretical approach) on their interventions' effects because studies' characteristics are not randomly assigned.
Finally, although our methodological inclusion criteria were more stringent than those of previous reviews, the included studies still had limitations.
For example, many outcome measures in our database were coarse, such as self-reported reduction vs. non-reduction in MAP consumption as a binary variable [@aberman2018].
Other studies seek to associate eating MAP with a sense of threat [@fehrenbach2015] or with endorsing social hierarchy [@allen2002] and then collect self-reported outcomes.
These designs raise the possibility of social desirability bias.

Overall, this literature shows encouraging trends in methodology.
First, as noted, a majority of studies in our meta-analysis have been published since 2020, indicating the field's increasing attention to rigorous design and measurement.
Second, we observe many fruitful collaborations between researchers and advocacy organizations, as shown by the large number of nonprofit white papers in our sample.
Third, many promising designs and interventions still await rigorous evaluation.
For instance, no study that met our criteria evaluated extended contact with farm animals [@cerrato2022], manipulations to the price of meat [@wilde2016], activating moral and/or physical disgust [@palomo2018], watching popular media such as the *Simpsons* episode "Lisa the Vegetarian" [@byrd2010] or the movie *Babe* [@novatna2019], or many categories of choice architecture intervention [@olafsson2024].
Moreover, emerging research designs help address longstanding measurement challenges, such as the possibility that interventions implemented at one time point (e.g., choice architecture at a lunch buffet) create later compensatory behavior (e.g., eating more MAP at dinner) [@vocski2024].
Ultimately, our findings suggest that meaningfully reducing MAP consumption is an unsolved problem and point to the critical importance of the field's increasing focus on methodological rigor.

\bmhead{Acknowledgments}

*Thanks to Alex Berke, Alix Winter, Anson Berns, Dan Waldinger, Hari Dandapani, Adin Richards, Martin Gould, Matt Lerner, and Rye Geselowitz for comments on an early draft. Thanks to Jacob Peacock, Andrew Jalil, Gregg Sparkman, Joshua Tasoff, Lucius Caviola, Natalia Lawrence, and Emma Garnett for help with assembling the database and providing guidance on their studies. Thanks to Sofia Vera Verduzco for research assistance. We gratefully acknowledge funding from the NIH (grant R01LM013866), Open Philanthropy, and the Food Systems Research Fund (Grant FSR 2023-11-07).*

# Declarations {.unnumbered}

The authors declare no conflicts of interest.
\newpage

# References