This repository contains the data and findings for the paper -
data/
: This directory contains the raw data collected from the research repositories. The all_research_repos.csv file contains the data used in this analysis, which is limited to research repositories. (Note: the files all_org_repos.csv and all_user_repos.csv include repositories labeled as research/non_research.)
analysis/
: This folder contains the plot_analysis.ipynb Jupyter notebook, which holds the analysis and the graphs used in the paper.
scripts/
: This directory contains the code used for plotting the graphs in the Jupyter notebook.
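Before opening the notebook, it can help to confirm that a data file loads and to see which columns it has. The following is a minimal sketch using only the Python standard library. The helper names `load_rows` and `summarize` are hypothetical and not part of this repository, and no column names are assumed because the CSV schema is not described here.

```python
import csv
from pathlib import Path


def load_rows(csv_path):
    """Read a CSV file into a list of dicts keyed by the header row."""
    with Path(csv_path).open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def summarize(rows):
    """Return the number of data rows and the column names from the header."""
    columns = list(rows[0].keys()) if rows else []
    return {"rows": len(rows), "columns": columns}
```

Assuming the script is run from the repository root, `summarize(load_rows("data/all_research_repos.csv"))` would report the row count and header of the file used in the analysis.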
SWORDS-template-UP v1.0.0 was used to gather GitHub profiles, repositories, and additional software development variables. It is an extended version of SWORDS-template.
To collect the necessary data for our analysis, follow these steps:
- Collect GitHub profiles of users and organizations by using the SWORDS-template-UP collect_users script.
- Collect repositories of GitHub profiles using the SWORDS-template-UP collect_repositories script.
- Collect additional variables by running specific scripts, such as:
To reproduce our analysis, follow these steps:
- Clone this repository.
- Navigate to the analysis/ directory.
- Run the analysis script.
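If you prefer to run the notebook without opening it interactively, one option is to execute it through `jupyter nbconvert`. The sketch below assumes Jupyter and nbconvert are installed and that the notebook's dependencies are available. The function names and the `_executed` output suffix are hypothetical choices for illustration, not part of this repository.

```python
import subprocess
from pathlib import Path


def build_nbconvert_command(notebook="plot_analysis.ipynb"):
    """Build the command that executes a notebook and saves an executed copy."""
    notebook = Path(notebook)
    # Hypothetical output name, so the original notebook is not overwritten.
    output_name = f"{notebook.stem}_executed.ipynb"
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute", str(notebook),
        "--output", output_name,
    ]


def run_notebook(notebook="plot_analysis.ipynb", workdir="analysis"):
    """Execute the notebook from inside workdir; raises if execution fails."""
    subprocess.run(build_nbconvert_command(notebook), cwd=workdir, check=True)
```

Calling `run_notebook()` from the repository root would then write an executed copy of the notebook into the analysis/ directory.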
The data is licensed under the Creative Commons Attribution 4.0 International License.
Please cite it as described in the CITATION.cff file.