Code and data for Extracting Research Software Installation Instructions from README files: An Initial Analysis
This repository contains the code and data produced for the paper Extracting Research Software Installation Instructions from README files: An Initial Analysis (Conference: Natural Scientific Language Processing and Research Knowledge Graphs (NSLP 2024)).
@article{readme2plan,
title = {Extracting Research Software Installation Instructions from README files: An Initial Analysis},
journal = {Natural Scientific Language Processing and Research Knowledge Graphs (NSLP 2024)},
year = {2024},
doi = {},
author = {Carlos Utrilla Guerrero and Daniel Garijo},
url = {},
}
The structure of the repository is as follows:
- RESULTS: contains all scripts used for experimental results.
- full_prompt_responses: contains all scripts used for extracting a ground truth dataset of dependency trees for each study subject.
If you are interested in replicating our results, please follow these steps:
- Step 1: Create
- Step 2:
⚠️ Please note that this repo is under construction. It does not contain the complete documentation yet.
- Python 3.10 or newer
- Select repos with different levels of difficulty/complexity -- explain why the levels are set out that way
- Annotate the different alternative installation options that exist for repositories. Run the evaluation for each.
- Define how each step is going to be annotated in a plan -- what happens if there are optional steps?
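
The open question about optional steps could be explored with a small data structure. The sketch below is only one possible way to annotate a plan and is not the annotation scheme used in the paper: the `Step` and `Plan` classes, their fields, and the example step texts are all hypothetical names introduced for illustration. It assumes each installation alternative gets its own plan and that optional steps are kept but flagged.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """One installation step (hypothetical schema, not the paper's)."""
    text: str               # the instruction as written in the README
    optional: bool = False  # True if the README marks the step as optional


@dataclass
class Plan:
    """One installation alternative for a repository (hypothetical schema)."""
    method: str  # illustrative label for the alternative, e.g. "source"
    steps: list[Step] = field(default_factory=list)

    def required_steps(self) -> list[Step]:
        # Optional steps stay in the annotation but can be filtered out
        # when only the mandatory path should be evaluated.
        return [s for s in self.steps if not s.optional]


# Made-up example plan, for illustration only.
plan = Plan(
    method="source",
    steps=[
        Step("Clone the repository"),
        Step("Create a virtual environment", optional=True),
        Step("Install the package"),
    ],
)
print([s.text for s in plan.required_steps()])
```

Keeping the flag on the step, rather than dropping optional steps, would let an evaluation be run both with and without them.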