cheesinglee/bigml-feature-subsets


Feature Subset Selection using the BigML API

Usage:

usage: feature_subsets.py [-h] [-u USERNAME] [-a APIKEY] [-o OBJECTIVE_FIELD]
                          [-t TAG] [-n NFOLDS] [-k STALENESS] [-p PENALTY]
                          [-s SEQUENTIAL]
                          filename

positional arguments:
  filename              path to CSV file

optional arguments:
  -h, --help            show this help message and exit
  -u USERNAME, --username USERNAME
                        BigML username
  -a APIKEY, --apikey APIKEY
                        BigML API key
  -o OBJECTIVE_FIELD, --objective_field OBJECTIVE_FIELD
                        Index of objective field [default=last]
  -t TAG, --tag TAG     Tag for created BigML resources [default="Feature selection"]
  -n NFOLDS, --nfolds NFOLDS
                        Number of cross-validation folds [default=5]
  -k STALENESS, --staleness STALENESS
                        Staleness parameter for best-first search [default=5]
  -p PENALTY, --penalty PENALTY
                        Per-feature penalty factor [default=0.001]
  -s SEQUENTIAL, --sequential SEQUENTIAL
                        Perform model building sequentially [default=False]
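
For readers who want to see how the options above fit together, here is a minimal argparse sketch that reproduces the same flags and defaults. It is not the actual source of feature_subsets.py: the function names build_parser and str_to_bool are hypothetical, and the argument types (int, float, a string-to-boolean converter for SEQUENTIAL) are assumptions inferred from the help text.

```python
import argparse


def str_to_bool(text):
    # Hypothetical helper: the help text only says SEQUENTIAL defaults to False,
    # so the accepted spellings below are an assumption.
    return text.strip().lower() in ("1", "true", "yes", "y")


def build_parser():
    # Mirrors the flags and defaults listed in the usage message.
    parser = argparse.ArgumentParser(prog="feature_subsets.py")
    parser.add_argument("filename", help="path to CSV file")
    parser.add_argument("-u", "--username", help="BigML username")
    parser.add_argument("-a", "--apikey", help="BigML API key")
    parser.add_argument("-o", "--objective_field", type=int, default=None,
                        help="Index of objective field [default=last]")
    parser.add_argument("-t", "--tag", default="Feature selection",
                        help="Tag for created BigML resources")
    parser.add_argument("-n", "--nfolds", type=int, default=5,
                        help="Number of cross-validation folds")
    parser.add_argument("-k", "--staleness", type=int, default=5,
                        help="Staleness parameter for best-first search")
    parser.add_argument("-p", "--penalty", type=float, default=0.001,
                        help="Per-feature penalty factor")
    parser.add_argument("-s", "--sequential", type=str_to_bool, default=False,
                        help="Perform model building sequentially")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is testable.
args = build_parser().parse_args(["--nfolds", "10", "data/crx.csv"])
print(args.filename, args.nfolds, args.staleness, args.penalty)
```

Here objective_field is left as None to stand for "last field"; how the real script encodes that default is not stated in this README.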

Example:

feature_subsets.py --username="my_bigml_username" --apikey="my_bigml_key" data/crx.csv

Cleanup:

For your convenience, the BigML sources, datasets, models, and evaluations created by the script are tagged with "Feature selection" by default. This makes them easy to locate in the BigML dashboard. To quickly delete all the created resources, you can use bigmler:

bigmler --delete --all-tag "Feature selection"

Dependencies

  • Python 2.x
  • BigML Python bindings
  • scikit-learn

About

Feature Subset Selection a la Kohavi and John, using BigML
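
The staleness and penalty options suggest a best-first wrapper search in the style of Kohavi and John. The sketch below illustrates that idea on a toy problem and is not taken from feature_subsets.py. Several things are assumptions: the names best_first_search and toy_evaluate, the choice to start from the empty subset and add one feature at a time, and the scoring rule (evaluation score minus penalty times the number of features). In the real script the evaluation step would presumably be a cross-validated BigML model; here it is replaced by a local toy function.

```python
def best_first_search(n_features, evaluate, staleness=5, penalty=0.001):
    """Forward best-first search over feature subsets.

    evaluate(subset) must return a score where higher is better.
    The search stops after `staleness` consecutive expansions that
    fail to improve on the best subset found so far.
    """
    def score(subset):
        # Assumed penalty rule: subtract a fixed cost per selected feature.
        return evaluate(subset) - penalty * len(subset)

    start = frozenset()
    open_list = {start: score(start)}   # evaluated but not yet expanded
    closed = set()                      # already expanded
    best, best_score = None, float("-inf")
    stale = 0

    while open_list and stale < staleness:
        # Expand the most promising open subset.
        current = max(open_list, key=open_list.get)
        current_score = open_list.pop(current)
        closed.add(current)

        if current_score > best_score:
            best, best_score = current, current_score
            stale = 0
        else:
            stale += 1

        # Children: the current subset plus one additional feature.
        for feature in range(n_features):
            if feature in current:
                continue
            child = current | {feature}
            if child not in closed and child not in open_list:
                open_list[child] = score(child)

    return best, best_score


# Toy stand-in for a cross-validated evaluation: only features 0 and 2 help.
RELEVANT = frozenset([0, 2])


def toy_evaluate(subset):
    return 0.5 + 0.2 * len(subset & RELEVANT)


best, best_score = best_first_search(6, toy_evaluate)
print(sorted(best), round(best_score, 3))
```

On this toy problem the search settles on features 0 and 2: adding any other feature leaves the raw score unchanged, so the per-feature penalty makes the larger subset score slightly worse, and five such non-improving expansions end the search.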
