This is the experimental package of the paper "What is the Vocabulary of Flaky Tests? An Extended Replication", submitted for publication at ICPC 2021 - Replications and Negative Results (RENE).


Bruno Henrique Pachulski Camara 1, 2,
Marco Aurélio Graciotto Silva 3,
André T. Endo 4,
Silvia Regina Vergilio 2.

1 Centro Universitário Integrado, Campo Mourão, PR, Brazil
2 Department of Computer Science, Federal University of Paraná, Curitiba, PR, Brazil
      [email protected], [email protected]
3 Department of Computing, Federal University of Technology - Paraná, Campo Mourão, PR, Brazil
      [email protected]
4 Department of Computing, Federal University of Technology - Paraná, Cornélio Procópio, PR, Brazil
      [email protected]

This paper has been submitted for publication in ICPC 2021 - Replications and Negative Results (RENE).

This experimental package is organized by research question. For each question, files can be executed to reproduce the data presented in the paper.

Abstract

Software systems have continuously evolved and been delivered with high quality due to the widespread adoption of automated tests. A recurring issue that hurts this scenario is the presence of flaky tests: test cases that may pass or fail non-deterministically. A promising approach, though one still lacking empirical evidence, is to collect static data from automated tests and use it to predict their flakiness. In this paper, we conducted an empirical study to assess the use of code identifiers to predict test flakiness. To do so, we first replicated most parts of the previous study by Pinto et al. (MSR 2020). We extended this replication by using a different ML Python platform (Scikit-learn) and adding different learning algorithms to the analyses. Then, we validated the performance of the trained models using datasets with other flaky tests and from different projects. We successfully replicated the results of Pinto et al. (2020), with minor differences using Scikit-learn; the additional algorithms performed similarly to the ones used previously. Concerning the validation, we noticed that the recall of the trained models was smaller, and the classifiers presented a varying range of decreases. This was observed in both intra-project and inter-project test flakiness prediction.
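To illustrate the kind of pipeline the study evaluates, the sketch below trains a vocabulary-based flakiness classifier with Scikit-learn. It is a minimal illustration only: the toy token streams, the bag-of-words vectorizer settings, and the choice of random forest are placeholder assumptions, not the exact configuration or datasets used in the paper.

```python
# Minimal sketch of vocabulary-based flakiness prediction, assuming a
# dataset of (test source tokens, flaky/non-flaky label) pairs.
# All data and parameter choices here are illustrative.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Hypothetical toy data: identifier streams extracted from test bodies.
test_bodies = [
    "thread sleep await timeout assert response",
    "assert equals parse json fixture",
    "network socket connect retry assert status",
    "assert true list size add element",
]
labels = [1, 0, 1, 0]  # 1 = flaky, 0 = non-flaky

# Treat the code identifiers as a bag-of-words vocabulary.
X = CountVectorizer().fit_transform(test_bodies)
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.5, random_state=42
)

clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)

# Report precision/recall, the metrics the validation discussion refers to.
print(classification_report(y_test, clf.predict(X_test), zero_division=0))
```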

Keywords: test flakiness, regression testing, replication studies, machine learning

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
