Error in elitist_race -> race_wrapper -> race_wrapper_helper -> set #80

sbomsdorf · 2025-01-28T17:29:47Z

Both a colleague and I found the same internal error in iracedump.rda files which have the following traceback:

> iracedump[["irace.internal.error(msg)"]][["bt"]]
[1] "7: irace.assert(isTRUE(all.equal(configurations_id, sapply(experiments, " 
[2] "       getElement, \"id_configuration\"))))"                              
[3] "6: execute_evaluator(race_state$target_evaluator, experiments, scenario, "
[4] "       target_output, configurations[[\".ID.\"]])"                        
[5] "5: testConfigurations(configurations, scenario)"                          
[6] "4: testing_common(configurations, scenario, iraceResults)"                
[7] "3: testing_fromlog(logFile = scenario$logFile)"                           
[8] "2: irace_common(scenario = scenario, simple = FALSE)"                     
[9] "1: irace_cmdline()"

We run irace on different code written in different programming languages, but on the same hardware using slurm, and therefore use both a target_runner and a target_evaluator. The training works perfectly fine, and the output files for the test instances are also there. The output also shows up in the iracedump.rda, e.g., in iracedump[["testConfigurations(configurations, scenario)"]][["target_output"]][[1]][["outputRaw"]]. The above error did not appear in the terminal output of irace; there, the output ends with the table of elite configurations that are to be tested. Consistent with this, the irace.Rdata lacks the testing data.

Could you please tell us whether this could be caused by some parameter definition in a scenario.txt file, and/or how we can inspect the actual data that causes the assertion to be thrown?

Please let us know if you need more information/data to further investigate the issue.

Many thanks in advance!

MLopez-Ibanez · 2025-01-29T07:46:56Z

Which version of irace is this? Could you try the current development version https://github.com/MLopez-Ibanez/irace?tab=readme-ov-file#github-development-version ? If it fails with the development version, could you share the iracedump.rda file? Thanks!

sbomsdorf · 2025-01-29T11:03:38Z

It happened with irace 4.0.886dd4c. We are currently running everything again using 4.2.0.c9d441b-dirty and will keep you posted on the results with the development version.

MLopez-Ibanez · 2025-01-30T17:17:38Z

> It happened with irace 4.0.886dd4c. We are currently running everything again using 4.2.0.c9d441b-dirty and will keep you posted on the results with the development version.

Thanks. Please let me know if you detect anything wrong.

MLopez-Ibanez · 2025-02-03T14:40:15Z

Hi, @sbomsdorf any news about this?

sbomsdorf · 2025-02-03T14:48:33Z

Hi,

I am personally encountering another issue: irace gets stuck when evaluating the first batch of instances run using the target-evaluator. I suspect a problem with the file system and/or my code rather than irace, since my colleague's code runs. My colleague is currently performing the exact same runs as before, but has not reached the testing phase yet. We will keep you posted.

Regards,
Stefan

MLopez-Ibanez · 2025-02-03T16:14:46Z

Thanks.

Are you using target-evaluator just because of slurm? If you have some knowledge of R, it may be better to define a targetRunnerParallel function in your scenario.txt and use the batchtools package to implement the parallelization:

https://mllg.github.io/batchtools/reference/makeClusterFunctionsSlurm
https://mllg.github.io/batchtools/reference/btlapply.html

Or the clustermq package: https://mschubert.github.io/clustermq/

This may be more reliable than what target-evaluator is currently doing.

sbomsdorf · 2025-02-11T13:29:24Z

Hi again,

First, the original issue seems to be resolved in the updated version of irace (the development version mentioned above, to be precise). The change from training to testing in irace works now and the output is as expected, as confirmed by my colleague.

Still, my problem with irace not being able to evaluate the output files of the first run persists. Indeed, we are using target-evaluator only because of slurm, i.e., to wait for all the output files to be available. We do not have sufficient knowledge of R to use the R packages; in other words, the target-evaluator script is more accessible and clearer to us from a usability point of view.
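For illustration, the "wait for all the output files" step of such a target-evaluator could be sketched as below. This is only a minimal sketch under assumptions: the function name wait_for_outputs, its argument order, and the one-second polling interval are hypothetical and not part of irace. It gives up after a timeout so that the script can exit with an error instead of waiting forever.

```shell
# Hypothetical helper (not part of irace): wait until every expected output
# file exists and is non-empty, or give up after a timeout.
# Usage: wait_for_outputs TIMEOUT_SECONDS FILE...
wait_for_outputs() {
  wfo_timeout=$1
  shift
  # Absolute deadline in seconds since the epoch.
  wfo_deadline=$(( $(date +%s) + wfo_timeout ))
  for wfo_file in "$@"; do
    # -s is true when the file exists and has a size greater than zero.
    until [ -s "$wfo_file" ]; do
      if [ "$(date +%s)" -ge "$wfo_deadline" ]; then
        echo "timed out waiting for $wfo_file" >&2
        return 1
      fi
      sleep 1
    done
  done
  return 0
}
```

A non-empty file is not necessarily a completely written one, so on a shared file system it may be safer to have each job write an explicit end marker and wait for that instead.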

The most recent output of my irace run is the table header and there is no iracedump.rda (stuck/running forever?).

+-+-----------+-----------+-----------+----------------+-----------+--------+-----+----+------+
| |   Instance|      Alive|       Best|       Mean best| Exp so far|  W time|  rho|KenW|  Qvar|
+-+-----------+-----------+-----------+----------------+-----------+--------+-----+----+------+

How do I use the debugLevel parameter? What are the allowed input values?
(Section 11.1 General options of the user guide lacks this info)

MLopez-Ibanez · 2025-02-11T14:42:29Z

> First, the original issue seems to be resolved in the updated version of irace (the development version mentioned above, to be precise). The change from training to testing in irace works now and the output is as expected, as confirmed by my colleague.

Great!

> How do I use the debugLevel parameter? What are the allowed input values? (Section 11.1 General options of the user guide lacks this info)

You can use values 1, 2 or 3, with 3 being the most verbose (I have updated the user guide to mention this explicitly).

I'd suggest you use debugLevel=3 (or --debug-level 3 when invoking irace) and it will report what is running at that point. You may want to redirect the output of irace to a file using "irace --debug-level 3 .... &> irace-output.txt", where "..." stands for any other command-line parameters that you use.

If irace is stuck at that point, it is usually because the target-runner (or target-evaluator) is still running. A process can be running but consume no CPU.
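One way to check whether such a process still exists, even when it shows no CPU usage, is to probe its PID with signal 0. The following is a minimal sketch; the helper name is_alive is hypothetical and not something irace provides.

```shell
# Hypothetical helper: succeed if a process with the given PID still exists,
# regardless of whether it is currently using any CPU.
# "kill -0" sends no signal; it only checks that the process can be signalled.
is_alive() {
  kill -0 "$1" 2>/dev/null
}
```

The PIDs to probe could be obtained with, for example, "pgrep -f target-evaluator". Note that the check also fails for processes owned by another user, because they cannot be signalled.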

sbomsdorf · 2025-02-13T09:44:49Z

Thank you very much for the prompt update of the guide!

I've used debugLevel=3 and am now able to see the output of the target-evaluator. All the instances submitted to slurm ran and the target-evaluator also processes the results correctly, i.e., prints the cost for irace. However, the output with debugLevel=3 just stops after the last instance is processed by the target-evaluator. Apparently, there is some kind of issue at the interface between the target-evaluator output and irace. For example, these are the last lines of the output of irace with debugLevel=3:

# 2025-02-13 10:32:53 CET: /home/<user>/<project>/code/tuning/target-evaluator 33
/home/<user>/<project>/code/tuning/target-evaluator 30
/home/<user>/<project>/code/tuning/target-evaluator 1189893065
/home/<user>/<project>/code/tuning/target-evaluator /home/<user>/<project>/code/data/instances/"instance1.txt --capacity=45"
/home/<user>/<project>/code/tuning/target-evaluator 33
/home/<user>/<project>/code/tuning/target-evaluator 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33
# 2025-02-13 10:33:03 CET: DONE (33) Elapsed wall-clock seconds: 10.01
55182608

Thereafter, it stops. In the output without setting the debugLevel this corresponds to only the table header of the first irace run being printed.

I have already tried different number formats for the output costs (5.5182608123e7 vs. 55182608.123 vs. 55182608), without any change in the issue described above.
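To rule out the number format as a cause, the cost string can be checked before it is printed. The sketch below is an assumption-laden illustration: is_valid_cost is a hypothetical helper, and its pattern (plain integers, decimals, and scientific notation on a single line) is deliberately stricter than whatever irace itself accepts.

```shell
# Hypothetical helper: succeed only if the argument is a single line holding
# one number written as an integer, a decimal, or in scientific notation.
is_valid_cost() {
  # Reject anything that spans more than one line.
  [ $(printf '%s\n' "$1" | wc -l) -eq 1 ] || return 1
  printf '%s\n' "$1" |
    grep -Eq '^[+-]?([0-9]+\.?[0-9]*|\.[0-9]+)([eE][+-]?[0-9]+)?$'
}
```

All three formats tried above pass this check, which is consistent with the number format not being the cause of the problem.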

sbomsdorf · 2025-02-13T09:49:05Z

I just found an error message in the iracedump.rda (only output if debugLevel=3):

> attributes(iracedump)[["error.message"]]
[1] "Error in set(target_output, j = "configuration", value = unlist_element(experiments,  :
Supplied 31 items to be assigned to 124 items of column 'configuration'. If you wish to 'recycle' the RHS please use rep() to make this intent clear to readers of your code.
Calls: irace_cmdline ... elitist_race -> race_wrapper -> race_wrapper_helper -> set
"

Unfortunately, I do not understand what underlying problem the error message suggests. What are the items, what is the column configuration?

MLopez-Ibanez · 2025-02-13T10:45:25Z

That looks like a genuine bug, perhaps some race condition. Could you share the iracedump.rda and the full output when using debugLevel=3? If you don't want to share it on GitHub, just send me an email. It is also strange that irace does not simply stop and report the error, but fixing the error may fix that.

sbomsdorf · 2025-02-13T16:15:23Z

Okay. I have shared the output via email. Please let me know if you need any other info to support the debugging process.

MLopez-Ibanez added the bug label Feb 13, 2025

sbomsdorf changed the title ~~After running test instances: assertion error in execute_evaluator()~~ Error in Error elitist_race -> race_wrapper -> race_wrapper_helper -> set Feb 13, 2025

sbomsdorf changed the title ~~Error in Error elitist_race -> race_wrapper -> race_wrapper_helper -> set~~ Error in elitist_race -> race_wrapper -> race_wrapper_helper -> set Feb 13, 2025
