
classifiers_comparison

Valerio Arnaboldi edited this page Apr 5, 2018 · 1 revision

This script compares the performance of different binary classifiers in terms of the precision, recall, and accuracy of their predictions. The user specifies the sets of positive and negative documents to be used for training and testing. The script takes 80% of the provided observations for training and uses the remaining 20% for testing. For each model, 10 different training and test sets are created; each model is thus trained and tested 10 times, and the precision, recall, and accuracy are averaged over the 10 test sets. To see the complete set of program arguments, run the script with the option -h.

$ python3 classifiers_comparison.py -h
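The repeated 80/20 hold-out evaluation described above can be sketched as follows. This is a minimal, stdlib-only illustration, not the script's actual code; the function names (`repeated_holdout`, `train_threshold`) and the toy data are hypothetical.

```python
import random


def precision_recall_accuracy(y_true, y_pred):
    """Compute the three metrics for binary labels (1 = positive, 0 = negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / len(y_true)
    return precision, recall, accuracy


def repeated_holdout(X, y, train_fn, n_repeats=10, test_frac=0.2, seed=0):
    """Repeat a random train/test split n_repeats times and average the metrics.

    train_fn(X_train, y_train) must return a predict function mapping one
    observation to a 0/1 label.
    """
    rng = random.Random(seed)
    sums = [0.0, 0.0, 0.0]
    for _ in range(n_repeats):
        idx = list(range(len(X)))
        rng.shuffle(idx)
        cut = int(len(idx) * (1 - test_frac))  # 80% train, 20% test
        train_idx, test_idx = idx[:cut], idx[cut:]
        predict = train_fn([X[i] for i in train_idx], [y[i] for i in train_idx])
        y_true = [y[i] for i in test_idx]
        y_pred = [predict(X[i]) for i in test_idx]
        for j, m in enumerate(precision_recall_accuracy(y_true, y_pred)):
            sums[j] += m
    return [s / n_repeats for s in sums]


if __name__ == "__main__":
    # Toy data: each "document" is reduced to a single numeric feature.
    X = list(range(20))
    y = [1 if i >= 10 else 0 for i in X]

    def train_threshold(X_tr, y_tr):
        # Hypothetical model: predict positive when the feature exceeds the training mean.
        mean = sum(X_tr) / len(X_tr)
        return lambda x: 1 if x > mean else 0

    precision, recall, accuracy = repeated_holdout(X, y, train_threshold)
    print(f"precision={precision:.2f} recall={recall:.2f} accuracy={accuracy:.2f}")
```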

Models Options

The script accepts the same arguments as tp_doc_classifier.py, apart from those for manual editing of the features, which are currently not supported.
