
Test Predictors, Features

Wittawat Jitkrittum edited this page Aug 19, 2014 · 5 revisions

Given a list of predictors and a list of feature extractors, a quick way to report the AUC of every (feature extractor, predictor) pair on a specified subject (e.g., Dog_1) is as follows.

from seizures.features.ARFeatures import ARFeatures, VarLagsARFeatures
from seizures.features.MixFeatures import StackFeatures
from seizures.features.PLVFeatures import PLVFeatures
from seizures.prediction.ForestPredictor import ForestPredictor
from seizures.prediction.Boosting import AdaBoostTrees
from seizures.pipelines.FeaturePredictorTest import CVFeaturesPredictorsTester
from seizures.pipelines.FeaturePredictorTest import CachedCVFeaPredTester

feature_extractors = [
    ARFeatures(), 
    #VarLagsARFeatures(4),
    #StackFeatures(ARFeatures(), VarLagsARFeatures(4), VarLagsARFeatures(6)),
    #StackFeatures(ARFeatures(), VarLagsARFeatures(4)),
    StackFeatures(ARFeatures(), PLVFeatures()),
    #PLVFeatures(),
    StackFeatures(ARFeatures(), VarLagsARFeatures(5), PLVFeatures()), 
    StackFeatures(ARFeatures(), VarLagsARFeatures(5), VarLagsARFeatures(30), PLVFeatures())
]
predictors = [AdaBoostTrees(), ForestPredictor(n_estimators=200)]
#predictors = [ ForestPredictor(n_estimators=200)]
patient = 'Dog_1'
#patient = 'Patient_2'
tester = CachedCVFeaPredTester(feature_extractors, predictors, patient)
# Randomly subsample the segments (ictal + interictal) to make the test faster.
# Using the full data is expected to give similar results anyway.
max_segments = 500

table = tester.test_combination(fold=2, max_segments=max_segments)
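The effect of max_segments can be pictured with a small sketch. This is not the code inside CachedCVFeaPredTester; the helper subsample_segments and the toy segment identifiers are hypothetical and only illustrate drawing at most max_segments segments uniformly at random, without replacement, from the pooled ictal and interictal segments.

```python
import random


def subsample_segments(segments, max_segments, seed=None):
    """Return at most max_segments items, drawn uniformly without replacement.

    Hypothetical helper for illustration; the real tester may subsample
    differently (for example, per class).
    """
    segments = list(segments)
    if max_segments is None or len(segments) <= max_segments:
        # Nothing to cut: keep every segment.
        return segments
    rng = random.Random(seed)
    return rng.sample(segments, max_segments)


# Toy identifiers standing in for real segment files (assumed counts).
segments = [("ictal", i) for i in range(300)]
segments += [("interictal", i) for i in range(700)]

subset = subsample_segments(segments, max_segments=500, seed=0)
print(len(subset))
```

Because the draw is uniform over the pooled segments, the ictal/interictal ratio of the subset only approximately matches that of the full data.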

To print a reporting table, do the following:

# The argument to print_table(...) can be one of
# 'seizure_mean_auc', 'seizure_std_auc', 'early_mean_auc', 'early_std_auc'.
table.print_table('seizure_mean_auc')
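To make the *_mean_auc and *_std_auc names concrete, here is a minimal sketch of how a mean and standard deviation of AUC over cross-validation folds could be computed. It is not taken from the seizures package: the functions auc and summarize_folds and the toy fold data are made up for illustration, and whether the package uses the population or the sample standard deviation is an assumption (the population version is used here).

```python
from statistics import mean, pstdev


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a random positive gets a higher score than
    a random negative; ties count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def summarize_folds(fold_results):
    """Return (mean AUC, population std of AUC) over (labels, scores) folds."""
    aucs = [auc(labels, scores) for labels, scores in fold_results]
    return mean(aucs), pstdev(aucs)


# Two toy folds, analogous to fold=2 above (the data are made up).
fold_results = [
    ([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]),
    ([1, 0, 1, 0], [0.8, 0.3, 0.7, 0.2]),
]
mean_auc, std_auc = summarize_folds(fold_results)
print(mean_auc, std_auc)
```

With only two folds the standard deviation is a very rough indicator, so small differences between cells of the table should be read with care.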

This will produce

# From FeaturesPredictsTable
Reporting seizure_mean_auc
+---------------------------------------+---------+--------+
|             feat. \ pred.             | ABTrees | Forest |
+---------------------------------------+---------+--------+
|                   AR                  |  0.906  | 0.992  |
|             Stack(AR, PLV)            |  0.896  | 0.995  |
|       Stack(AR, LagsAR(5), PLV)       |  0.919  | 0.994  |
| Stack(AR, LagsAR(5), LagsAR(30), PLV) |  0.913  | 0.995  |
+---------------------------------------+---------+--------+

For early seizure results, use

table.print_table('early_mean_auc')

which will produce

# From FeaturesPredictsTable
Reporting early_mean_auc
+---------------------------------------+---------+--------+
|             feat. \ pred.             | ABTrees | Forest |
+---------------------------------------+---------+--------+
|                   AR                  |  0.824  | 0.956  |
|             Stack(AR, PLV)            |  0.838  | 0.962  |
|       Stack(AR, LagsAR(5), PLV)       |  0.817  | 0.963  |
| Stack(AR, LagsAR(5), LagsAR(30), PLV) |  0.761  | 0.951  |
+---------------------------------------+---------+--------+

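From these two tables, combining multiple feature types (i.e., Stack) often yields a higher AUC than AR alone, but not in every case: the largest stack, Stack(AR, LagsAR(5), LagsAR(30), PLV), gives a lower early_mean_auc than AR for both predictors.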
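One way to check this reading of the Dog_1 tables is to subtract the AR row from every Stack row. The sketch below copies the numbers from the two tables above into plain dictionaries; the helper gain_over_baseline is hypothetical and is not part of FeaturesPredictsTable.

```python
# AUC values copied from the Dog_1 tables above.
seizure_mean_auc = {
    "AR": {"ABTrees": 0.906, "Forest": 0.992},
    "Stack(AR, PLV)": {"ABTrees": 0.896, "Forest": 0.995},
    "Stack(AR, LagsAR(5), PLV)": {"ABTrees": 0.919, "Forest": 0.994},
    "Stack(AR, LagsAR(5), LagsAR(30), PLV)": {"ABTrees": 0.913, "Forest": 0.995},
}
early_mean_auc = {
    "AR": {"ABTrees": 0.824, "Forest": 0.956},
    "Stack(AR, PLV)": {"ABTrees": 0.838, "Forest": 0.962},
    "Stack(AR, LagsAR(5), PLV)": {"ABTrees": 0.817, "Forest": 0.963},
    "Stack(AR, LagsAR(5), LagsAR(30), PLV)": {"ABTrees": 0.761, "Forest": 0.951},
}


def gain_over_baseline(table, baseline="AR"):
    """Return AUC differences of each non-baseline row against the baseline row."""
    base = table[baseline]
    return {
        feat: {pred: round(value - base[pred], 3) for pred, value in row.items()}
        for feat, row in table.items()
        if feat != baseline
    }


for name, table in [("seizure_mean_auc", seizure_mean_auc),
                    ("early_mean_auc", early_mean_auc)]:
    print(name)
    for feat, gains in gain_over_baseline(table).items():
        print("  ", feat, gains)
```

Positive entries mean the stacked features beat AR alone for that predictor; negative entries mean they did worse.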

Running the same test on Patient_1 (Forest predictor only) gives

Reporting seizure_mean_auc
+---------------------------------------+--------+
|             feat. \ pred.             | Forest |
+---------------------------------------+--------+
|                   AR                  | 0.957  |
|             Stack(AR, PLV)            |  0.93  |
| Stack(AR, LagsAR(5), LagsAR(30), PLV) |  0.88  |
+---------------------------------------+--------+


# From FeaturesPredictsTable
Reporting early_mean_auc
+---------------------------------------+--------+
|             feat. \ pred.             | Forest |
+---------------------------------------+--------+
|                   AR                  | 0.841  |
|             Stack(AR, PLV)            | 0.915  |
| Stack(AR, LagsAR(5), LagsAR(30), PLV) | 0.783  |
+---------------------------------------+--------+

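Patient and dog data seem to be quite different, and the patient data is more difficult: with Forest on AR features, seizure_mean_auc drops from 0.992 on Dog_1 to 0.957 on Patient_1, and early_mean_auc drops from 0.956 to 0.841. LagsAR(x) with a large x (here, x = 30) does not seem to be informative.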
