added datapoint for a small dataset #1249
base: main
Conversation
The change is simple. Some performance tests will be needed to approve this PR.
Thanks for the review! I'm happy to help run performance tests if needed.
@sonichi how were the original 5 selected? I did some similar work a couple of years ago and used a greedy approximation, because I was going for a ranking, not a hard subset. Did you use a larger benchmark set and some partitioning of the space, or integer programming, to get to the subset? I can run my benchmark suite on the branch and see if it helps improve accuracy on the datasets I'm looking at. But we probably also want to run against whatever system / benchmark you originally used.
I used a new greedy algorithm in the zero-shot AutoML paper to select the portfolio from a large set of candidate configurations.
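For readers following along, here is a minimal sketch of greedy portfolio construction in that spirit. This is illustrative only, not FLAML's actual implementation; it assumes per-configuration validation scores on a set of meta-training tasks are already available and normalized to [0, 1]:

```python
# Illustrative greedy portfolio selection; not FLAML's actual code.
# scores[config][task] = validation score of a candidate config on a meta-training task.
from typing import Dict, List


def greedy_portfolio(scores: Dict[str, Dict[str, float]], k: int) -> List[str]:
    """Greedily add the config that most improves the per-task best score."""
    tasks = list(next(iter(scores.values())).keys())
    portfolio: List[str] = []
    best = {t: 0.0 for t in tasks}  # best score the portfolio achieves on each task
    for _ in range(k):
        remaining = [c for c in scores if c not in portfolio]
        chosen = max(
            remaining,
            key=lambda c: sum(max(scores[c][t] - best[t], 0.0) for t in tasks),
        )
        portfolio.append(chosen)
        best = {t: max(best[t], scores[chosen][t]) for t in tasks}
    return portfolio


# Toy usage with made-up scores:
toy_scores = {
    "cfg_a": {"t1": 0.90, "t2": 0.60},
    "cfg_b": {"t1": 0.70, "t2": 0.80},
    "cfg_c": {"t1": 0.85, "t2": 0.75},
}
print(greedy_portfolio(toy_scores, k=2))  # ['cfg_c', 'cfg_a']
```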
Can you please link the paper? I'm not sure which one that is.
With the PR, the model doesn't fail catastrophically on my benchmark (a subset of OpenML CC-18 with a 50/50 train/test split and 10-fold cross-validation) anymore, but it's still not competitive. I assume making it perform well would at least require running the greedy algorithm again on an expanded benchmark. Let me check the paper for details.
OK, looks like the portfolio building is mostly the same as in my work and in auto-sklearn 2.0, apart from some minor differences and the use of meta-features instead of just iterating through configurations. I'm somewhat surprised by how well the meta-feature-based zero-shot works, tbh; very cool!
Is the code for the portfolio mining process still in the repo? I can help with the experiment if we have more specific information, like the experiment code or which extra datasets to include.
The instructions are at: https://microsoft.github.io/FLAML/docs/Use-Cases/Zero-Shot-AutoML#how-to-prepare-offline |
The list of datasets used for the original work is in the paper. I'm not 100% sure how the datasets were selected for the AutoML benchmark vs. CC-18 (which is what I'm using). Also, I'm currently using a somewhat non-standard splitting strategy that splits the data 50/50.
Agreed on that; running over the whole benchmark is a bit too much work for this PR. It would be great if there were a simple performance test to check how this new point affects existing FLAML default usage, and then we could decide whether to keep or discard this change.
There are around 15 multi-class tasks in the benchmark, which makes it manageable to run just default.lightgbm before and after. We can merge if the performance doesn't degrade. Most likely the performance won't change, because the added dataset is not similar to them.
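A lightweight way to run that before/after check (a sketch only: the OpenML data_ids below are placeholders rather than the benchmark's actual task list, and it assumes `flaml.default.LGBMClassifier` can be used as a scikit-learn-compatible estimator):

```python
# Hypothetical before/after check: run once on main and once on this branch, then diff.
# The OpenML data_ids below are placeholders, not the benchmark's actual task list.
import numpy as np
from sklearn.datasets import fetch_openml
from sklearn.model_selection import cross_val_score

from flaml.default import LGBMClassifier  # zero-shot, data-dependent defaults

PLACEHOLDER_DATA_IDS = [12, 14, 16]

for data_id in PLACEHOLDER_DATA_IDS:
    X, y = fetch_openml(data_id=data_id, return_X_y=True, as_frame=False)
    scores = cross_val_score(LGBMClassifier(), X, y, cv=10, scoring="accuracy")
    print(f"data_id={data_id}: mean accuracy {np.mean(scores):.4f}")
```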
I've been digging into the zero-shot paper for experiment details. I managed to select all the multiclass tasks from the paper, followed the 10-fold evaluation using default LGBM, and got the following results before and after this datapoint is added:
Unfortunately, it appears that this particular datapoint does slightly diminish performance on a subset of datasets. I will investigate whether I can adjust it to prevent any negative effects on current tasks.
I'm not sure that tweaks based on such a small number of tasks will be very robust. You don't have any other datasets to confirm that additional changes generalize, right? So I feel you're likely to overfit to the three tasks you just identified.
Why are these changes needed?
Currently, the default LGBMClassifier does not have a datapoint covering the kind of small dataset generated by the code snippet below:
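(The original snippet isn't reproduced here; as an illustrative stand-in, with parameters that may differ from the dataset in #1247, a small multiclass dataset can be generated like this:)

```python
# Illustrative stand-in for the small dataset; not the exact snippet from the issue.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(
    n_samples=200,      # deliberately small
    n_features=10,
    n_informative=5,
    n_classes=3,
    random_state=42,
)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=42)
```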
Applying the default LGBMClassifier to this dataset gives lower performance than expected:
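(The original numbers aren't shown here, but the gap can be observed along these lines, continuing the illustrative snippet above and comparing the zero-shot default against plain LightGBM defaults:)

```python
# Continuing the illustrative snippet above; compare zero-shot defaults vs. plain LightGBM.
from flaml.default import LGBMClassifier as ZeroShotLGBM
from lightgbm import LGBMClassifier as PlainLGBM

for name, est in [("flaml.default LGBM", ZeroShotLGBM()), ("plain LightGBM", PlainLGBM())]:
    est.fit(X_train, y_train)
    print(f"{name}: test accuracy {est.score(X_test, y_test):.3f}")
```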
Hence, I added a datapoint with the meta-features of this dataset to the LGBMClassifier default config. But I'm not sure whether this new datapoint will break other scenarios that use the default LGBMClassifier. test/default passes on my PC.
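Conceptually, the zero-shot default matches a new dataset's meta-features to the nearest stored datapoint and reuses that datapoint's hyperparameters, so adding a datapoint changes which configuration small datasets map to. A simplified illustration with hypothetical meta-features and configs (not FLAML's actual portfolio format or distance metric):

```python
# Simplified illustration of meta-feature matching; hypothetical values throughout.
import numpy as np

# Each stored datapoint: (meta-feature vector, hyperparameter config).
# Meta-features here are just (n_rows, n_features, n_classes) for illustration.
portfolio = [
    (np.array([100_000, 50, 10]), {"n_estimators": 2000, "num_leaves": 100}),
    (np.array([5_000, 20, 2]), {"n_estimators": 500, "num_leaves": 31}),
    (np.array([200, 10, 3]), {"n_estimators": 100, "num_leaves": 8}),  # the new small-data point
]


def suggest(meta_features: np.ndarray) -> dict:
    """Return the config of the nearest stored datapoint (plain Euclidean distance)."""
    distances = [np.linalg.norm(meta_features - mf) for mf, _ in portfolio]
    return portfolio[int(np.argmin(distances))][1]


print(suggest(np.array([150, 8, 3])))  # a small dataset now matches the new datapoint
```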
According to my tests, XGBClassifier and RandomForestClassifier don't have this issue; they perform well on this dataset.
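(Those can be sanity-checked the same way on the illustrative dataset above, assuming xgboost is installed and that `flaml.default` exposes zero-shot `XGBClassifier` and `RandomForestClassifier` as well:)

```python
# Same illustrative dataset as above; check the other zero-shot defaults too.
from flaml.default import RandomForestClassifier, XGBClassifier

for name, est in [("XGBClassifier", XGBClassifier()), ("RandomForestClassifier", RandomForestClassifier())]:
    est.fit(X_train, y_train)
    print(f"flaml.default.{name}: test accuracy {est.score(X_test, y_test):.3f}")
```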
Related issue number
#1247
Checks