Management Science, Accepted October 2025
Author(s): Xinlan Emily Hu, Mark E. Whiting, Linnea Gandhi, Duncan J. Watts, and Abdullah Almaatouq
INSTRUCTIONS: A typical README guides the reader through the available material and lays out a route to replicating the results in the research paper. Start with a brief overview of the available material and a brief guide on how to proceed from beginning to end.
Data provenance: This paper relies on data, and all data necessary to reproduce the results of the paper are included. Below is a detailed description of how the data were obtained, so that the dataset can be reproduced.
INSTRUCTIONS: The README should contain a description of the origin (provenance), location and accessibility of the data used in the article.
When secondary data are used (whether or not the data are included in the package), the description of the provenance should state the conditions under which (a) the current authors and (b) any future users can access the data. Provide details on how datasets can be obtained (including contact details of the provider, date ranges, versions, exclusions, etc.), how they need to be merged if applicable, and where they need to be placed, and under which names, for the code to run. The idea is that after a researcher follows your step-by-step instructions, they can readily run your code and (re)produce your results.
When the data were generated by the authors, e.g. collected from the web, in online, lab or field experiments, through surveys, then the description of the provenance should describe the data generating process, e.g., the survey or experimental procedures.
The information should describe ALL datasets and data sources used, regardless of whether they are provided as part of the replication package, and regardless of size or scope. For each dataset, list the file and describe the source. If there are many datasets, a list in tabular form can be useful. If some or all datasets are provided as sample, synthetic, or mock-up data, indicate this prominently, since such data typically will not reproduce your results. Describe the generating process (provide code if applicable).
INSTRUCTIONS: Include data dictionaries for all used datasets (whether included or not). Each data dictionary lists all variables with names as used in the code/dataset, with a one-line description of the variable.
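One way to start a data dictionary is to generate a skeleton from a dataset's header and then fill in the one-line descriptions by hand. The following is a minimal sketch only: the function name `dictionary_skeleton`, the sample columns, and the placeholder text are all hypothetical and do not describe the datasets of this paper.

```python
import csv
import io

def dictionary_skeleton(csv_text, descriptions):
    """Return (variable, description) rows for every column in the CSV header."""
    header = next(csv.reader(io.StringIO(csv_text)))
    # Columns without a supplied description get a visible placeholder.
    return [(name, descriptions.get(name, "TODO: describe")) for name in header]

# Hypothetical example data; the column names are illustrative only.
sample = "team_id,task_name,score\n1,example_task,0.5\n"
rows = dictionary_skeleton(sample, {"team_id": "Unique identifier of the team"})
for name, desc in rows:
    print(f"{name}\t{desc}")
```

Reading the variable names from the file itself keeps the dictionary consistent with the names used in the code and dataset.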
INSTRUCTIONS: List all software used to run your code, with version numbers. List any non-standard packages (e.g. Stata or R packages, Python libraries) and their versions, if applicable. If using non-standard software, provide instructions on how to obtain it. If the code includes scripts that install packages, document them here as well. If your code uses random numbers, use a fixed seed so that your results can be reproduced.
If the runtime of the code is longer than a few minutes on a regular computer, indicate approximate runtimes. If relevant, describe necessary hardware requirements.
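The advice on fixed seeds and reported runtimes can be illustrated with a short sketch. It uses only the Python standard library; the seed value, the function name `run_analysis`, and the stand-in computation are assumptions made for illustration, not part of the paper's code.

```python
import random
import time

SEED = 12345  # hypothetical value; any fixed integer works

def run_analysis(seed=SEED, n=5):
    """Stand-in for a stochastic analysis step, driven by its own seeded generator."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]

start = time.perf_counter()
first = run_analysis()
second = run_analysis()
elapsed = time.perf_counter() - start

# Two runs with the same seed produce the same draws.
print(f"identical draws: {first == second}; runtime: {elapsed:.3f} s")
```

Passing the seed to a dedicated generator, rather than relying on global state, keeps each script reproducible regardless of the order in which other code runs.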
INSTRUCTIONS: Provide detailed instructions on how to run your code. In particular, describe the order in which the code files must be run to reproduce all figures and tables in your paper. Indicate where in the code each table, figure, or result is produced.
Please include the code as executable files/scripts (not as PDFs). Make sure to include code to reproduce all figures, tables, and other results in the main manuscript. Code for results in the appendix is highly appreciated but not compulsory. Make sure your code is well-documented and uses only relative pathnames (it should run on any computer, not just yours). If there are many individual code files, provide a Master script if possible.
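A Master script that relies only on relative pathnames might look like the sketch below. The script names, the `code` subfolder, and the helper names `resolve_steps` and `run_all` are hypothetical placeholders, not the actual layout of this package.

```python
import subprocess
import sys
from pathlib import Path

# Hypothetical script names, listed in the order they must run.
PIPELINE = ["01_clean_data.py", "02_analysis.py", "03_figures.py"]

def resolve_steps(root, steps=PIPELINE):
    """Build script paths relative to the package root, never absolute literals."""
    return [Path(root) / "code" / step for step in steps]

def run_all(root):
    """Run every pipeline step in order with the current Python interpreter."""
    for script in resolve_steps(root):
        if not script.exists():
            raise FileNotFoundError(f"missing pipeline step: {script}")
        # check=True stops the run at the first failing step.
        subprocess.run([sys.executable, str(script.resolve())], check=True, cwd=root)

# Show the resolved order without executing anything.
for step in resolve_steps("."):
    print(step)
```

Calling `run_all(".")` from the package root would then execute the steps in sequence, so a replicator needs only one command on any computer.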