This repo contains the replication package for the ICSE 22 technical paper titled "Automated Handling of Anaphoric Ambiguity in Requirements: A Multi-solution Study".
We organize the SupplementaryMaterial folder into three different sub-folders:
- Datasets: This sub-folder contains the three datasets used in our study: (1) the DAMIR dataset, which we constructed as part of our work, and two publicly available datasets that we adapted for our work, namely (2) CoNLL2011 and (3) ReqEval. More details on these datasets are provided in Section 4.3 of the paper.
- Application: This sub-folder contains the final solutions resulting from our study and the Jupyter notebook with the scripts for running them. The steps for running this notebook are explained in detail later in this README. Note that we do not provide a runnable script for the last solution (solution 6 in our paper), which is based on existing NLP coreference resolution tools, because it is not compatible with the Python libraries that the other solutions use. Given that solution 6 did not achieve good-enough results compared with the SpanBERT- and ML-based solutions, we dropped it from our supplementary material to simplify the application environment for the other five solutions.
- Tuning: This sub-folder contains the notebook required for fine-tuning the SpanBERT-based solutions and experimenting with different configurations for the ML-based solutions. More details are provided in Section 4.5 of our paper.
How to apply the SpanBERT- and ML-based solutions for anaphoric ambiguity handling in a simple example?
- Clone the repo and navigate to the Application folder
git clone https://github.com/SNTSVV/Anaphoric-Ambiguity.git
cd Anaphoric-Ambiguity/Application
- Download the SpanBERT_NLP and SpanBERT_RE model weights separately from this Google Drive link.
- Unzip the downloaded folder and move the content to the Application folder as follows:
Application/SpanBERT-NLPv21.8.10/
Application/SpanBERT-REv21.9.01/
Alternatively, from the command line (make sure you are in the Application folder):
pip install gdown
gdown "https://drive.google.com/uc?id=1uMZpbIMv-bP_T0x-E2FYyT8ECrF5f3z8"
unzip anaphoric-ambiguity-models.zip
rm anaphoric-ambiguity-models.zip
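After the download step, it can help to confirm that both model folders ended up where the layout above places them. The following is a minimal sketch and not part of the replication package: the helper name `missing_model_dirs` is hypothetical, and the folder names are copied from the layout listed above.

```python
from pathlib import Path

# Folder names as listed in the unzip step above.
EXPECTED_MODEL_DIRS = ("SpanBERT-NLPv21.8.10", "SpanBERT-REv21.9.01")


def missing_model_dirs(application_dir="."):
    """Return the expected model folders that are absent from application_dir.

    Hypothetical helper for illustration; it only checks that the folders
    exist, not that the weights inside them are complete.
    """
    root = Path(application_dir)
    return [name for name in EXPECTED_MODEL_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    # Run from inside the Application folder.
    missing = missing_model_dirs(".")
    if missing:
        print("Missing model folders:", ", ".join(missing))
    else:
        print("Both SpanBERT model folders are in place.")
```

An empty result means both folders were found; any name in the list points to a folder that still needs to be downloaded or moved.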
- Python 3.8 or higher: Get the latest Python version here. If you already have Python installed, make sure that the version you have is 3.8 or later. You can check the Python version installed on your computer by typing this command in the terminal window:
$ python --version
- pip: Make sure you have pip (the package installer for Python) installed. More information can be found here.
- Jupyter Notebook: Install Jupyter Notebook. The installation instructions are provided in the Jupyter documentation. It is highly recommended that you install Jupyter using Anaconda.
We provide a requirements.txt file in the Application folder that lists the Python libraries we use in our work. You can install all required libraries by typing the following command in a terminal:
$ pip install -r requirements.txt
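To double-check the environment before launching the notebook, the prerequisites above can be verified from Python. This is a rough sketch under stated assumptions, not a script shipped with the repo: the function names are hypothetical, and the parser only handles plain `name==version` style lines, skipping pip options and URL-based entries.

```python
import re
import sys
from pathlib import Path


def python_is_supported(version_info=sys.version_info, minimum=(3, 8)):
    """Return True if the interpreter meets the minimum version (3.8 here)."""
    return tuple(version_info[:2]) >= tuple(minimum)


def requirement_names(lines):
    """Extract package names from simple requirements.txt-style lines."""
    names = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments
        # Skip blanks, pip options such as "-r other.txt", and URL entries.
        if not line or line.startswith("-") or "://" in line:
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def missing_packages(names):
    """Return the names for which no installed distribution is found."""
    from importlib import metadata  # standard library from Python 3.8

    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("Python OK:", python_is_supported())
    req_file = Path("requirements.txt")
    if req_file.is_file():
        lines = req_file.read_text(encoding="utf-8").splitlines()
        print("Not installed:", missing_packages(requirement_names(lines)) or "none")
    else:
        print("requirements.txt not found; run this from the Application folder.")
```

The check only tells you whether a package is installed at all; it does not compare installed versions against the pinned ones.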
- Launch the Jupyter Notebook application using the following command:
$ jupyter notebook
- The Jupyter Notebook will be launched in a new browser window (or a new tab) showing the Notebook Dashboard. Navigate through the folder structure on the first page of Jupyter (Files tab) and go to the folder where you have downloaded the Application.
- Open the Jupyter notebook Solutions application.ipynb.
- Select the option Cell>Run All to execute all of the cells in the notebook.
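If you prefer not to use the browser, the "Run All" step can also be approximated from a script. The sketch below assumes that `nbconvert` is available alongside Jupyter; it is an alternative to the steps above, not the workflow we tested, and the function names and the output file name `Solutions_application_executed` are hypothetical.

```python
import shlex
import subprocess

NOTEBOOK = "Solutions application.ipynb"  # notebook name from the step above


def nbconvert_command(notebook=NOTEBOOK, output="Solutions_application_executed"):
    """Build a command that executes every cell and saves an executed copy."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",   # keep the result in notebook format
        "--execute",          # run all cells, top to bottom
        "--output", output,   # hypothetical name for the executed copy
        notebook,
    ]


def run_notebook(notebook=NOTEBOOK, output="Solutions_application_executed"):
    """Execute the notebook; raises CalledProcessError if nbconvert fails."""
    subprocess.run(nbconvert_command(notebook, output), check=True)


if __name__ == "__main__":
    # Only print the command here; call run_notebook() from the Application
    # folder once the models and requirements are in place.
    print(shlex.join(nbconvert_command()))
```

Passing the command as a list keeps the space in the notebook file name from being split by the shell.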