Reproduction of DeepLineDP

The aim of this project was to reproduce the following research paper:

C. Pornprasit; C. Kla Tantithamthavorn, DeepLineDP: Towards a Deep Learning Approach for Line-Level Defect Prediction (2023).

In addition, we describe the steps we took throughout the process and, finally, try to improve on the results already achieved.

Project links

A further aim of the project was to document the whole process and to plan and share responsibilities. We therefore used the following tools (links to the actual projects):

Overleaf
Trello

Authors

Kamila Sproska
Dominik Polak

Reproduction

Our approach towards extending the original repository

The original materials for the research paper are split across two repositories:

supplementary materials (scripts for training the models): the original from awsm-research/DeepLineDP was copied into the DeepLineDP folder.
database: the original from awsm-research/line-level-defect-prediction was copied into the DeepLineDP/datasets folder.

We decided to merge the two repositories to make reproduction easier.

Preparation for reproduction

The models require CUDA, which is not available on every computer, so we ran the reproduction on Google Colab.
This requires a few preparatory steps before the reproduction itself:

Download this repository using the Download ZIP option.
Upload the extracted folder to the top level of your Google Drive (in this example the folder is called M6).
Go to the uploaded folder and find the reproduction.ipynb notebook. Choose Open with > Google Colaboratory.
Change the runtime type to GPU: Change runtime type > GPU > Save.
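Before running the notebook, it can help to confirm that the upload landed where the later steps expect it. The helper below is a minimal sketch, not part of the repository: the function name `missing_project_files` is hypothetical, and the two relative paths are the ones mentioned in this README. It assumes Google Drive is mounted at the usual Colab location `/content/drive/MyDrive`.

```python
from pathlib import Path

# Paths relative to the uploaded folder, as named in this README.
EXPECTED_FILES = (
    "reproduction.ipynb",
    "DeepLineDP/script/preprocess_data.py",
)


def missing_project_files(project_root):
    """Return the expected files that are absent under project_root."""
    root = Path(project_root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    # Assumed Colab mount point plus the folder name used in this example (M6).
    missing = missing_project_files("/content/drive/MyDrive/M6")
    print("Upload looks complete" if not missing else f"Missing: {missing}")
```

An empty list means both files were found; any path in the returned list points to a folder that was uploaded under a different name or at a different level.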

Running reproduction script

Overview of the whole process:

All of these steps are described in reproduction.ipynb. The most important points to keep in mind are:

When mounting Google Drive, make sure you follow all the pop-up instructions and complete the setup correctly. At the end, the setup should look somewhat like this:
Not every cell needs to be run each time, but all pip install commands must be run at the beginning of each session.
The notebook is uploaded with its output intact, so that it is easier to tell whether a cell ran correctly (your outputs should be similar).
To check the reproduction with the changes applied, go to /content/drive/MyDrive/M6/DeepLineDP/script/preprocess_data.py and set the ignore_imports flag to True.
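The flag can be edited by hand, but when switching between several runs it may be convenient to toggle it from a notebook cell. The sketch below is an illustration only: `set_flag` is a hypothetical helper, and it assumes the flag is written in the script as a plain top-level assignment such as `ignore_imports = False`. If the script defines the flag differently, the function changes nothing and returns False.

```python
import re
from pathlib import Path


def set_flag(script_path, flag, value):
    """Rewrite `flag = True/False` in a Python script.

    Returns True if an assignment was found and rewritten, False otherwise.
    """
    path = Path(script_path)
    source = path.read_text(encoding="utf-8")
    # Assumption: the flag is a top-level assignment, e.g. `ignore_imports = False`.
    pattern = re.compile(
        rf"^({re.escape(flag)}\s*=\s*)(True|False)\b", re.MULTILINE
    )
    updated, count = pattern.subn(rf"\g<1>{bool(value)}", source, count=1)
    if count:
        path.write_text(updated, encoding="utf-8")
    return bool(count)
```

For example, `set_flag("/content/drive/MyDrive/M6/DeepLineDP/script/preprocess_data.py", "ignore_imports", True)` would enable the import replacement before the preprocessing cell is run, provided the assumption about the assignment holds.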

Results of the reproduction

For all databases

file-Effort@Top20Recall (↘)	file-Recall@Top20LOC (↗)	file-IFA (↘)

For activemq

file-Effort@Top20Recall (↘)	file-Recall@Top20LOC (↗)	file-IFA (↘)
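For readers unfamiliar with the three metrics in the table headers, the sketch below computes them for a single file, following the definitions given in the DeepLineDP paper: IFA counts the clean lines inspected before the first defective line, Recall@Top20%LOC is the share of defective lines found in the top 20% of ranked lines, and Effort@Top20%Recall is the share of lines that must be inspected to find 20% of the defective lines. The function name, the input format, and the rounding and tie-breaking choices are assumptions made for illustration; the evaluation scripts in the repository may differ in these details, and aggregation across files is not covered here.

```python
def line_level_metrics(scores, labels):
    """Compute (IFA, Recall@Top20%LOC, Effort@Top20%Recall) for one file.

    scores: predicted risk score per line (higher means more suspicious).
    labels: 1 for a defective line, 0 for a clean line.
    """
    # Rank lines from most to least suspicious (ties keep their original order).
    ranked = [label for _, label in sorted(zip(scores, labels), key=lambda p: -p[0])]
    total_lines, total_defects = len(ranked), sum(ranked)
    if total_defects == 0:
        raise ValueError("metrics are undefined for a file without defective lines")

    # IFA: number of clean lines inspected before the first defective one.
    ifa = ranked.index(1)

    # Recall@Top20%LOC: defective lines found within the top 20% of lines.
    top_k = total_lines // 5
    recall_top20_loc = sum(ranked[:top_k]) / total_defects

    # Effort@Top20%Recall: share of lines inspected to reach 20% of the defects.
    target = (total_defects + 4) // 5  # integer ceiling of 20%
    found = 0
    for inspected, label in enumerate(ranked, start=1):
        found += label
        if found >= target:
            break
    effort_top20_recall = inspected / total_lines

    return ifa, recall_top20_loc, effort_top20_recall
```

Lower is better for IFA and Effort@Top20%Recall, higher is better for Recall@Top20%LOC, which matches the arrows in the table headers.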

Improvements

Each type of change is enabled by setting a specific flag. DeepLineDP_model.py and preprocess_data.py contain the following flags:

Flag	Function
ignore_imports	Replaces every import line with a comment containing `#import`
replace_exceptions	Replaces all *Exception classes with Exception
remove_public_keyword	Removes the public keyword
remove_final_keyword	Removes the final keyword
normalize_names	Removes whitespace before and after each line
remove_duplication_line	Removes all duplicated lines (that have already appeared)
add_hidden_layer	Adds one hidden FC layer

Each flag defaults to False, so to test a certain type of change, set the chosen flag(s) to True before running the reproduction script.
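To make the effect of the preprocessing flags concrete, here is a minimal sketch of how such transformations could be applied to a list of Java source lines. It is an illustration written from the descriptions in the table, not the code in preprocess_data.py: the function name `preprocess_lines`, the regular expressions, and the order in which the steps run are all assumptions. The add_hidden_layer flag affects the model rather than the data and is not covered.

```python
import re


def preprocess_lines(lines, *, ignore_imports=False, replace_exceptions=False,
                     remove_public_keyword=False, remove_final_keyword=False,
                     normalize_names=False, remove_duplication_line=False):
    """Apply the enabled transformations to a list of Java source lines."""
    result, seen = [], set()
    for line in lines:
        if ignore_imports and line.lstrip().startswith("import "):
            # Replace the whole import statement with a placeholder comment.
            line = "#import"
        if replace_exceptions:
            # Any identifier ending in "Exception" becomes plain "Exception".
            line = re.sub(r"\b\w+Exception\b", "Exception", line)
        if remove_public_keyword:
            line = re.sub(r"\bpublic\s+", "", line)
        if remove_final_keyword:
            line = re.sub(r"\bfinal\s+", "", line)
        if normalize_names:
            # As described in the table: trim leading and trailing whitespace.
            line = line.strip()
        if remove_duplication_line:
            # Keep only the first occurrence of each (transformed) line.
            if line in seen:
                continue
            seen.add(line)
        result.append(line)
    return result
```

With every flag left at its default of False the input is returned unchanged, which mirrors the behaviour described above for the unmodified reproduction.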

Exceptions replaced

	Original	Exceptions replaced
↘
↗
↘

Imports replaced with comment

	Original	Imports replaced with comment
↘
↗
↘

Public keyword removed

	Original	Public keyword removed
↘
↗
↘

Final keyword removed

	Original	Final keyword removed
↘
↗
↘

Hidden layer added

	Original	Hidden layer added
↘
↗
↘

Duplicate lines removed

	Original	Duplicate lines removed
↘
↗
↘

Results for additional metrics


↘
↗
↘

Results for data without preprocessing


↘
↗
↘
