Skip to content

Latest commit

 

History

History
67 lines (46 loc) · 3.02 KB

README.md

File metadata and controls

67 lines (46 loc) · 3.02 KB

Frugal Algorithm Selection

Frugal algorithm selection is an active learning approach that attempts to reduce the labelling cost by using only a subset of the training data with timeout predictor and dynamic timeout configurations.

Directory Structure

Below is an overview of the main folders in this repository and their contents:

  • /DATASETS: Contains all datasets that were used in the research.
  • /EXPERIMENT_LOGS: Includes Slurm output files of each implemented approach under various configurations.
  • /EXPERIMENT_OUTPUTS: Includes output files of each implemented approach under various configurations.
  • /PLOTS: Holds all plots that are included in the published paper and its appendix.
  • Appendix.pdf: This file is supplementary material accompanying the paper. It provides additional data, analyses, and explanations that support the main text of the research.

Installation

Requirements

Ensure you have Python 3.12 or higher installed on your machine. The dependencies are listed below and can also be found in the requirements.txt file:

  • liac-arff==2.5.0
  • matplotlib==3.8.0
  • modAL-python==0.4.2.1
  • numpy==1.26.1
  • pandas==2.1.1
  • scikit-learn==1.3.1

Installing Dependencies

To install the required packages with the specified versions, run the following command:

pip install -r requirements.txt

This ensures you have the exact environment used in our studies, promoting consistency and reproducibility.

Features:

Timeout Predictor:

Timeout Predictor On/Off: It can be activated/deactivated with arameter used in the voting mechanism and instance selection

Dynamic Timeout:

Adjustable Timeout: The dynamic timeout increases during runtime based on performance on the validation set and can be adjusted through parameters to suit different experimental setups.

Query Size:

Customizable Ratio: Adjust the query size ratio through parameters to optimize the querying process in learning algorithms.

Instance Selection Approach

Uncertainty-based and Random: Supports different methods of instance selection including uncertainty-based (leveraging modAL) and random approaches.

Usage

Example Run

Uncertainty-based Instance Selection

To run an uncertainty-based instance selection, which includes passive learning approaches, use the following command:

python Algorithm_Selection.py --dataset {dataset} --prefix {prefix} --weight {Weighted, No_Weight} --timeout_predictor_usage {Yes, No} --query_size {query_size} --seed {seed} --split {split}

Random Instance Selection

For running the random instance selection with the same configurable parameters:

python Algorithm_Selection_Random_Query.py --dataset {dataset} --prefix {prefix} --weight {Weighted, No_Weight} --timeout_predictor_usage {Yes, No} --query_size {query_size} --seed {seed} --split {split}

Further parameter details and their descriptions can be found within the code comments or the supplementary documentation provided in the repository.