Test Predictors, Features
Wittawat Jitkrittum edited this page Aug 19, 2014 · 5 revisions
Given a list of predictors and a list of feature extractors, a quick way to report the AUC of every feature-extractor/predictor pair on a specified subject (e.g., Dog_1) is as follows.
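# The wildcard import is expected to provide ARFeatures and VarLagsARFeatures, which are used below.
from seizures.features.ARFeatures import *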
from seizures.features.MixFeatures import StackFeatures
from seizures.features.PLVFeatures import PLVFeatures
from seizures.prediction.ForestPredictor import ForestPredictor
from seizures.prediction.Boosting import AdaBoostTrees
from seizures.pipelines.FeaturePredictorTest import CVFeaturesPredictorsTester
from seizures.pipelines.FeaturePredictorTest import CachedCVFeaPredTester
feature_extractors = [
    ARFeatures(),
    #VarLagsARFeatures(4),
    #StackFeatures(ARFeatures(), VarLagsARFeatures(4), VarLagsARFeatures(6)),
    #StackFeatures(ARFeatures(), VarLagsARFeatures(4)),
    StackFeatures(ARFeatures(), PLVFeatures()),
    #PLVFeatures(),
    StackFeatures(ARFeatures(), VarLagsARFeatures(5), PLVFeatures()),
    StackFeatures(ARFeatures(), VarLagsARFeatures(5), VarLagsARFeatures(30), PLVFeatures())
]
predictors = [AdaBoostTrees(), ForestPredictor(n_estimators=200)]
#predictors = [ ForestPredictor(n_estimators=200)]
patient = 'Dog_1'
#patient = 'Patient_2'
tester = CachedCVFeaPredTester(feature_extractors, predictors, patient)
# Randomly subsample the segments (ictal + interictal) to make the test faster.
# Using the full data is expected to give similar results anyway.
max_segments = 500
table = tester.test_combination(fold=2, max_segments=max_segments)
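The call above evaluates every feature-extractor/predictor pair on a random subsample of segments with a 2-fold split. The following is only a rough sketch of that control flow, not the actual implementation of CachedCVFeaPredTester: the helpers `subsample`, `kfold_indices`, and `score_all_pairs`, the interleaved fold assignment, and the caller-supplied `score_fn` are all assumptions made for illustration.

```python
import itertools
import random


def subsample(segments, max_segments, seed=0):
    """Randomly keep at most max_segments items (hypothetical helper)."""
    segments = list(segments)
    if len(segments) <= max_segments:
        return segments
    return random.Random(seed).sample(segments, max_segments)


def kfold_indices(n, fold):
    """Yield (train_idx, test_idx) for a simple interleaved k-fold split."""
    for k in range(fold):
        test = [i for i in range(n) if i % fold == k]
        train = [i for i in range(n) if i % fold != k]
        yield train, test


def score_all_pairs(features, predictors, segments, score_fn,
                    fold=2, max_segments=500):
    """Collect one score per fold for every (feature, predictor) pair.

    score_fn(feature, predictor, train, test) is supplied by the caller;
    in a real test it would fit the predictor and return an AUC.
    """
    data = subsample(segments, max_segments)
    results = {}
    for feat, pred in itertools.product(features, predictors):
        results[(feat, pred)] = [
            score_fn(feat, pred, [data[i] for i in tr], [data[i] for i in te])
            for tr, te in kfold_indices(len(data), fold)
        ]
    return results


# Toy usage: the "score" is just the size of the test fold.
res = score_all_pairs(["AR", "Stack(AR, PLV)"], ["ABTrees", "Forest"],
                      range(1200),
                      score_fn=lambda f, p, train, test: len(test))
print(len(res), res[("AR", "Forest")])
```

With 1200 toy segments and max_segments=500, each of the four pairs is scored on two folds of 250 segments.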
To print a table of the results, do the following:
# the argument to print_table(..) can be
# seizure_mean_auc, seizure_std_auc, early_mean_auc, early_std_auc
table.print_table('seizure_mean_auc')
This will produce
# From FeaturesPredictsTable
Reporting seizure_mean_auc
+---------------------------------------+---------+--------+
| feat. \ pred.                         | ABTrees | Forest |
+---------------------------------------+---------+--------+
| AR                                    | 0.906   | 0.992  |
| Stack(AR, PLV)                        | 0.896   | 0.995  |
| Stack(AR, LagsAR(5), PLV)             | 0.919   | 0.994  |
| Stack(AR, LagsAR(5), LagsAR(30), PLV) | 0.913   | 0.995  |
+---------------------------------------+---------+--------+
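For readers unfamiliar with the metric: each cell is a mean of per-fold AUC values, and the `*_std_auc` options report their spread. The sketch below shows one way such a summary could be computed in plain Python. It is not taken from the seizures package; the function name `auc`, the toy labels and scores, and the use of the population standard deviation are assumptions.

```python
from statistics import mean, pstdev


def auc(labels, scores):
    """Area under the ROC curve as the fraction of (positive, negative)
    pairs ranked correctly, with ties counted as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# One AUC per cross-validation fold (toy labels and predicted scores).
fold_aucs = [
    auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]),  # 3 of 4 pairs ranked correctly
    auc([1, 0, 1, 0], [0.8, 0.2, 0.7, 0.3]),  # all 4 pairs ranked correctly
]
summary = {"mean_auc": mean(fold_aucs), "std_auc": pstdev(fold_aucs)}
print(fold_aucs, summary)
```

Here the two folds give AUCs of 0.75 and 1.0, so the mean is 0.875 and the (population) standard deviation is 0.125.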
For early seizure results, use
table.print_table('early_mean_auc')
which will produce
# From FeaturesPredictsTable
Reporting early_mean_auc
+---------------------------------------+---------+--------+
| feat. \ pred.                         | ABTrees | Forest |
+---------------------------------------+---------+--------+
| AR                                    | 0.824   | 0.956  |
| Stack(AR, PLV)                        | 0.838   | 0.962  |
| Stack(AR, LagsAR(5), PLV)             | 0.817   | 0.963  |
| Stack(AR, LagsAR(5), LagsAR(30), PLV) | 0.761   | 0.951  |
+---------------------------------------+---------+--------+
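The Stack(...) rows combine several feature types into one feature vector per segment. A minimal sketch of that idea is given below, assuming stacking simply concatenates the outputs of the individual extractors. The classes `MeanFeature`, `RangeFeature`, and `Stack` and their `extract` method are hypothetical and are not the API of seizures.features.MixFeatures.StackFeatures.

```python
class MeanFeature:
    """Toy extractor: a single feature, the mean of the signal."""

    def extract(self, signal):
        return [sum(signal) / len(signal)]


class RangeFeature:
    """Toy extractor: a single feature, max minus min of the signal."""

    def extract(self, signal):
        return [max(signal) - min(signal)]


class Stack:
    """Concatenate the feature vectors of several extractors, in order."""

    def __init__(self, *extractors):
        self.extractors = extractors

    def extract(self, signal):
        features = []
        for extractor in self.extractors:
            features.extend(extractor.extract(signal))
        return features


signal = [1.0, 3.0, 2.0, 6.0]
stacked = Stack(MeanFeature(), RangeFeature()).extract(signal)
print(stacked)  # [3.0, 5.0]
```

Because the stacked vector has the same interface as a single extractor's output, a predictor can be trained on it without any changes.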
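From these two tables, combining multiple feature types (i.e., Stack) tends to yield a slightly higher AUC than AR alone, although not in every case (e.g., the largest stack has a lower early_mean_auc than AR with both predictors).
The corresponding tables for a human patient, using only the Forest predictor, are as follows.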
Reporting seizure_mean_auc
+---------------------------------------+--------+
| feat. \ pred.                         | Forest |
+---------------------------------------+--------+
| AR                                    | 0.957  |
| Stack(AR, PLV)                        | 0.93   |
| Stack(AR, LagsAR(5), LagsAR(30), PLV) | 0.88   |
+---------------------------------------+--------+
# From FeaturesPredictsTable
Reporting early_mean_auc
+---------------------------------------+--------+
| feat. \ pred.                         | Forest |
+---------------------------------------+--------+
| AR                                    | 0.841  |
| Stack(AR, PLV)                        | 0.915  |
| Stack(AR, LagsAR(5), LagsAR(30), PLV) | 0.783  |
+---------------------------------------+--------+
Patient and dog data seem to be quite different: the AUCs on the patient data are lower for every feature set, so the patient data appears to be more difficult. LagsAR(x) with a large x (e.g., 30) does not seem to be informative.
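These observations can be checked directly from the reported numbers. The snippet below copies the Forest AUCs from the tables above for the rows that appear in both sets, and computes the change from the Dog_1 results to the second set. It assumes the second set of tables is the patient data, as the remark above suggests; the variable names are illustrative only.

```python
# Forest AUCs copied from the tables above (rows present in both sets).
BIG_STACK = "Stack(AR, LagsAR(5), LagsAR(30), PLV)"
dog = {
    "seizure_mean_auc": {"AR": 0.992, "Stack(AR, PLV)": 0.995, BIG_STACK: 0.995},
    "early_mean_auc": {"AR": 0.956, "Stack(AR, PLV)": 0.962, BIG_STACK: 0.951},
}
# Assumed to be the patient results (the second set of tables).
patient = {
    "seizure_mean_auc": {"AR": 0.957, "Stack(AR, PLV)": 0.93, BIG_STACK: 0.88},
    "early_mean_auc": {"AR": 0.841, "Stack(AR, PLV)": 0.915, BIG_STACK: 0.783},
}

change = {}
for metric, rows in dog.items():
    for feat, dog_auc in rows.items():
        # Negative values mean the patient AUC is lower than the dog AUC.
        change[(metric, feat)] = round(patient[metric][feat] - dog_auc, 3)
        print("{:<17} {:<38} {:+.3f}".format(metric, feat, change[(metric, feat)]))
```

Every difference is negative, and for both metrics the largest drop is on the stack that includes LagsAR(30).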