This is a benchmark suite for automated machine learning frameworks. It contains a broad range of datasets and accuracy functions with which to evaluate them. You can see an up-to-date list of all datasets and accuracy functions, as well as lightwood's performance over time, at: http://benchmarks.mindsdb.com:9107/accuracy_plots
To run the benchmarks locally and check whether a change you made to lightwood is an improvement (an end-to-end shell sketch follows the list):
- Install pip, git and git-lfs (note: when you pull new changes you have to run `git lfs pull` in addition to `git pull`, and you have to add large files to `git-lfs` instead of committing them to git directly)
- Clone this repository, add it to your `PYTHONPATH`, and install `requirements.txt` and `ploting/requirements.txt`
- cd into it and run `python3 benchmarks/run.py --use_db=0 --use_ray=0 --lightwood=#env`. Set `use_ray` to `1` if you have more than one GPU or a very good GPU (e.g. a Quadro). If you wish to benchmark fewer datasets, set the `--datasets` argument to a comma-separated list of dataset names, e.g. `--datasets=hdi,home_rentals,openml_transfusion`.
- Once the benchmarks are done running, they will generate a preliminary report (`REPORT.md`) and a local file with the full results (`REPORT.db`). These are used for the plots and reports in the next steps.
- Run `python3 ploting/server.py`
- Go to `http://localhost:9107/compare/best_of_all_time/local` or `http://localhost:9107/compare/last_{x}/local`. `best_of_all_time` chooses the best version of lightwood for each dataset + accuracy function combination, while `last_{x}` looks at the last `x` versions; we usually compare against `last_3` to decide whether to release a new version. You can also compare with a specific version or commit hash if you're only interested in that. Go to `http://localhost:9107/accuracy_plots` to see accuracy plots that include your local results (they will always be the last data point on each plot).
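For reference, the whole local workflow above might look roughly like the sketch below on a Debian/Ubuntu machine. The clone URL, package manager and paths are assumptions; adapt them to your setup.

```bash
# Install prerequisites (package names assume a Debian/Ubuntu system)
sudo apt install -y python3-pip git git-lfs

# Clone the benchmark suite (assumed URL) and fetch the large files tracked by git-lfs
git clone https://github.com/mindsdb/benchmarks.git
cd benchmarks
git lfs install
git lfs pull

# Make the package importable and install its dependencies
export PYTHONPATH="$PYTHONPATH:$(pwd)"
pip3 install -r requirements.txt
pip3 install -r ploting/requirements.txt

# Run locally, without the shared database and without ray (single or modest GPU)
python3 benchmarks/run.py --use_db=0 --use_ray=0 --lightwood=#env

# Alternatively, restrict the run to a few datasets for a quicker sanity check
python3 benchmarks/run.py --use_db=0 --use_ray=0 --lightwood=#env --datasets=hdi,home_rentals

# Serve the plots and comparison pages, then open e.g.
# http://localhost:9107/compare/last_3/local in your browser
python3 ploting/server.py
```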
If you have access to our benchmark database: same as above, but you should have a `db_info.json` file and thus be able to run with `--use_db=1` to store your results in our database. This means you can compare runs using URLs like `http://benchmarks.mindsdb.com:9107/compare/<some hash>/<hash of your branch>`, which makes sharing easier and appeases the automatic release scripts.
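For example, assuming `db_info.json` sits in the repository root (its exact expected location and contents come with the credentials you're given, so treat the path here as an assumption):

```bash
# db_info.json holds the credentials for the shared results database; ask the team for it.
# Keeping it in the repository root is an assumption.
ls db_info.json

# Same run as before, but with --use_db=1 so the results are written to our database
python3 benchmarks/run.py --use_db=1 --use_ray=0 --lightwood=#env

# Results can then be compared through the public server, e.g.
# http://benchmarks.mindsdb.com:9107/compare/<some hash>/<hash of your branch>
```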
When a PR is made into stable, you should choose a machine (ideally the benchmarking rig on EC2) and then, as sketched in the example after this list:
- Clone the latest commit being merged (let's say its commit hash is `foobar`)
- Run the benchmarks via `python3 benchmarks/run.py --use_db=1 --use_ray=1 --lightwood=#env`
- Check `http://benchmarks.mindsdb.com:9107/compare/last_3/foobar` to see whether a release can be made (be patient, it might take 3-5 hours for all benchmarks to run)
- Re-run the GitHub Actions for the latest commit (excluding the documentation bot's commits) and make sure everything is green
- Once we release a new stable version, run the benchmarks for it using `--is_dev=0` so that it gets added to the official list of released versions
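A rough sketch of that checklist as shell commands; the clone URL and the `pip3 install` step are assumptions about how the environment ends up with the `foobar` commit of lightwood:

```bash
# Get the exact lightwood commit being merged into stable ('foobar' in the example above);
# the clone URL and install step below are assumed, not prescribed
git clone https://github.com/mindsdb/lightwood.git /tmp/lightwood
git -C /tmp/lightwood checkout foobar
pip3 install /tmp/lightwood

# From the benchmarks repository on the rig, run the full suite against the shared
# database, with ray enabled since the rig has the GPUs for it
python3 benchmarks/run.py --use_db=1 --use_ray=1 --lightwood=#env

# Review http://benchmarks.mindsdb.com:9107/compare/last_3/foobar once it finishes
# (expect roughly 3-5 hours for all benchmarks to run)

# After the new stable version is released, benchmark it as an official (non-dev) release
python3 benchmarks/run.py --use_db=1 --use_ray=1 --lightwood=#env --is_dev=0
```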