There's a lot to improve in our train/test split in `data_splitting.py`:
- The data is always split. We want `test_set_size=0` to work, so that there is no split and AutoEmulate uses the full data for training. We then have to decide what `evaluate` should do, as there is no test set to evaluate on: it could accept an external test set and otherwise throw an error. Similarly, `refit` should throw an error, as the models are already refitted on the full 'training' data after cross-validation.
- The data is randomly shuffled before being split. For temporal data, spatial data, grouped data, etc. we want more sophisticated splitting techniques.
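
To make the two points above concrete, here is a minimal sketch of what the splitting logic could look like. It is not the current code in `data_splitting.py`: the names `split_indices`, `check_can_evaluate`, `shuffle`, `groups`, and `seed` are hypothetical, and it uses only the standard library so it runs on its own. It assumes that `test_set_size=0` should yield an empty test set, that unshuffled data should hold out the last samples (a simple temporal split), and that grouped data should never have a group on both sides of the split.

```python
import random
from typing import Hashable, List, Optional, Sequence, Tuple


def split_indices(
    n_samples: int,
    test_set_size: float = 0.2,
    shuffle: bool = True,
    groups: Optional[Sequence[Hashable]] = None,
    seed: Optional[int] = None,
) -> Tuple[List[int], List[int]]:
    """Return (train_idx, test_idx). Hypothetical helper, not AutoEmulate's API."""
    if not 0 <= test_set_size < 1:
        raise ValueError("test_set_size must be in [0, 1)")
    indices = list(range(n_samples))
    # No split requested: all data is training data and the test set is empty.
    if test_set_size == 0:
        return indices, []
    if n_samples < 2:
        raise ValueError("need at least 2 samples to split")
    rng = random.Random(seed)

    if groups is not None:
        if len(groups) != n_samples:
            raise ValueError("groups must have one entry per sample")
        # Grouped data: hold out whole groups so no group straddles the split.
        unique = list(dict.fromkeys(groups))  # unique groups, first-seen order
        if len(unique) < 2:
            raise ValueError("need at least 2 groups to split")
        if shuffle:
            rng.shuffle(unique)
        n_test = min(max(1, round(test_set_size * len(unique))), len(unique) - 1)
        test_groups = set(unique[-n_test:])
        train = [i for i in indices if groups[i] not in test_groups]
        test = [i for i in indices if groups[i] in test_groups]
        return train, test

    # Ungrouped data: shuffle=False keeps the original order, so the test set
    # is the last block of samples (e.g. the most recent time steps).
    if shuffle:
        rng.shuffle(indices)
    n_test = min(max(1, round(test_set_size * n_samples)), n_samples - 1)
    return indices[:-n_test], indices[-n_test:]


def check_can_evaluate(test_idx: Sequence[int], has_external_test_set: bool = False) -> None:
    """Raise if there is nothing to evaluate on (hypothetical guard for evaluate)."""
    if not test_idx and not has_external_test_set:
        raise ValueError(
            "No test set available: set test_set_size > 0 or pass an external test set."
        )


# Usage: no split, then an ordered (temporal) split of 10 samples.
train_idx, test_idx = split_indices(10, test_set_size=0)
ordered_train, ordered_test = split_indices(10, test_set_size=0.2, shuffle=False)
```

If scikit-learn is an acceptable dependency here, `GroupShuffleSplit` and `TimeSeriesSplit` from `sklearn.model_selection` already cover the grouped and temporal cases and would likely be preferable to hand-rolled logic; spatial splitting would still need its own strategy.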
mastoffel changed the title from "make it possible to have no train/test split" to "Improve train/test split" on Nov 28, 2024