
Record the progress and auto-disable the obviously slow candidates (never use in future training) #328

Open
eromoe opened this issue Dec 15, 2016 · 3 comments

eromoe commented Dec 15, 2016

features:
X.shape (9516, 24956)
code:

from tpot import TPOTClassifier

tpot = TPOTClassifier(generations=5, population_size=500, verbosity=2, max_eval_time_mins=20)
tpot.fit(X_train, y_train)
print(tpot.score(X_test, y_test))

log:

Optimization Progress:   0%|          | 1/3000 [01:29<74:23:16, 89.30s/pipeline]
Optimization Progress:   1%|          | 24/3000 [50:06<143:35:50, 173.71s/pipeline]
Optimization Progress:   1%|          | 25/3000 [50:06<397:45:25, 481.32s/pipeline]

Timeout during evaluation of pipeline #25. Skipping to the next pipeline.

Optimization Progress:   1%|▏         | 44/3000 [1:21:41<40:07:44, 48.87s/pipeline]
Optimization Progress:   1%|          | 28/3000 [50:16<195:53:14, 237.28s/pipeline]          
Optimization Progress:   2%|▏         | 45/3000 [1:21:41<323:34:33, 394.20s/pipeline]

Timeout during evaluation of pipeline #45. Skipping to the next pipeline.

Optimization Progress:   2%|▏         | 57/3000 [1:45:48<39:40:12, 48.53s/pipeline]  
Optimization Progress:   2%|▏         | 48/3000 [1:22:15<163:14:27, 199.07s/pipeline]          
Optimization Progress:   2%|▏         | 58/3000 [1:45:48<125:42:37, 153.83s/pipeline]

Timeout during evaluation of pipeline #58. Skipping to the next pipeline.

Optimization Progress:   2%|▏         | 66/3000 [1:46:12<16:09:53, 19.83s/pipeline]
Optimization Progress:   2%|▏         | 61/3000 [1:46:05<63:41:23, 78.01s/pipeline] 

It looks like some specific models or model combinations are very slow, so any pipeline containing such a model hits the timeout and fails every time. TPOT could log this information and, once the failure count for a model reaches a certain threshold, stop using that model or deprioritize it. (A machine learning model could even be used to predict obviously slow candidates.)
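To make the suggestion concrete, here is a minimal sketch of per-operator timeout counting with a blacklist threshold. Everything here is hypothetical and not part of the TPOT API: `TIMEOUT_LIMIT`, `extract_operators`, `record_timeout`, and `is_blacklisted` are illustrative names, and the operator extraction is a rough parse of TPOT-style pipeline strings.

```python
from collections import Counter

# Hypothetical threshold: after this many timeouts, stop proposing
# pipelines that contain the offending operator.
TIMEOUT_LIMIT = 3
timeout_counts = Counter()

def extract_operators(pipeline_str):
    """Rough operator extraction from a TPOT-style pipeline string,
    e.g. 'SVC(PolynomialFeatures(input_matrix))' -> ['SVC', 'PolynomialFeatures']."""
    tokens = pipeline_str.replace(")", "(").split("(")
    return [t.strip() for t in tokens if t.strip() and t.strip() != "input_matrix"]

def record_timeout(pipeline_str):
    """Called when a pipeline evaluation times out: charge every operator in it."""
    for op in extract_operators(pipeline_str):
        timeout_counts[op] += 1

def is_blacklisted(pipeline_str):
    """True if any operator in the pipeline has timed out too often."""
    return any(timeout_counts[op] >= TIMEOUT_LIMIT
               for op in extract_operators(pipeline_str))
```

A genetic-programming loop could call `is_blacklisted` before evaluating each candidate and discard or regenerate the ones that fail the check.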

PS: The log output does not render well in Jupyter Notebook:
[screenshot of garbled progress-bar output]

eromoe changed the title from "Record the progress and auto-disable the very slow candidates (never use in future training)" to "Record the progress and auto-disable the obviously slow candidates (never use in future training)" on Dec 15, 2016
rhiever (Contributor) commented Mar 22, 2017

@weixuanfu2016, can you please confirm that this issue is addressed in the 0.7 release? IIRC models that failed to finish evaluating due to timeouts have their "timeout score" recorded in the lookup dictionary, and are thus overlooked on all future evaluations, right?
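The lookup-dictionary behavior described above can be sketched as a simple memoization cache keyed by the pipeline's string representation. This is an illustrative sketch, not TPOT's actual internals: `evaluate`, `eval_cache`, and `TIMEOUT_SCORE` are assumed names.

```python
# Once a pipeline times out, its penalty score is cached, so encountering
# the same pipeline in a later generation costs a dict lookup instead of
# another multi-minute evaluation.
TIMEOUT_SCORE = float("-inf")
eval_cache = {}

def evaluate(pipeline_str, run_pipeline):
    """Return a cached score if available; otherwise run the pipeline,
    recording a -inf penalty if the evaluation times out."""
    if pipeline_str in eval_cache:
        return eval_cache[pipeline_str]
    try:
        score = run_pipeline()
    except TimeoutError:
        score = TIMEOUT_SCORE
    eval_cache[pipeline_str] = score
    return score
```

Note that this only avoids re-evaluating an identical pipeline; it does not prevent crossover and mutation from producing new pipelines that contain the same slow operator, which is the gap the reply below points out.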

weixuanfu (Contributor) commented Mar 22, 2017

Nope, this is not addressed in the 0.7 release, since we do not check for specific models or model combinations during the optimization process. A slow pipeline receives a -inf fitness score due to the timeout, so it would not be selected for the next generation by the updated NSGA-II selection operator in version 0.7; however, those slow models and combinations may still be generated in later generations through crossover and mutation. I think we need an intelligent way to adapt the operator list or operator combinations during the optimization process.
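One possible shape for the "intelligent adaptation" suggested above is to keep a sampling weight per operator and decay it whenever the operator appears in a timed-out pipeline, so crossover and mutation become progressively less likely to reintroduce slow operators. This is purely a hedged sketch: `operator_weights`, `DECAY`, `penalize`, and `sample_operator` are invented names, and real TPOT operator selection works differently.

```python
import random

# Illustrative operator pool with uniform initial sampling weights.
operator_weights = {"LogisticRegression": 1.0, "SVC": 1.0, "PolynomialFeatures": 1.0}
DECAY = 0.5  # multiplicative penalty per timeout involvement

def penalize(ops_in_timed_out_pipeline):
    """Halve the weight of every operator that appeared in a timed-out pipeline."""
    for op in ops_in_timed_out_pipeline:
        operator_weights[op] *= DECAY

def sample_operator(rng=random):
    """Draw an operator for mutation/crossover in proportion to its weight."""
    ops = list(operator_weights)
    weights = [operator_weights[op] for op in ops]
    return rng.choices(ops, weights=weights, k=1)[0]
```

After a few penalties, a chronically slow operator is still sampled occasionally (so the search can recover if the slowness was data-dependent), but far less often than well-behaved operators.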

yishairasowsky commented Apr 10, 2022

Was there ever an answer to this, @spiros @bollwyvl @zarch @cottrell @mrocklin? I have a similar issue; please reply if you have a chance.
[screenshot of similar timeout output]
