C stress test failed #149

zaikunzhang · 2024-01-29T10:36:56Z

See the branch

https://github.com/libprima/prima/tree/c_stress_test_failure

and the workflow run

https://github.com/libprima/prima/actions/runs/7934551296/job/21665774763

zaikunzhang · 2024-02-09T04:54:34Z

Could you take a look at this? There is a bug here (I have created a branch to reproduce the bug easily: https://github.com/libprima/prima/tree/c_stress_test_failure).

In the new version (which has other issues as you mentioned; I will fix them), the bug is not triggered anymore but we do not know why. There is nothing more scary than a magically disappeared bug.

Many thanks.

Best regards,
Zaikun

nbelakovski · 2024-03-24T14:36:39Z

I think I see the issue. result is not initialized, and in C an uninitialized variable can hold garbage (whatever happened to be at the memory address it was given). In the linked branch, every algorithm except cobyla fails prima_check_problem with the return code indicating a problem mismatch (because problem.calcfc is set for all of them), so prima_init_result is never called. stress.c then calls prima_free_result on uninitialized memory and hence tries to free memory that was never allocated.

I've been unable to reproduce this locally, even when using the same Intel compiler, so the above is theoretical. I still think it's a compelling case, particularly since the issue does not arise for cobyla. (Also note that the failed Windows test is due to the setup-fortran action linking the wrong libgfortran, which we have since fixed.)

Note that calcfc is no longer specified for all algorithms; it appears this was corrected in c85e631.

However, the potential issue of the user calling prima_free_result on uninitialized memory still exists. I see two options for fixing this:

1. We can call prima_init_result before we call prima_check_problem. However, if prima_check_problem fails, then we should free the memory and set result.x/result.nlconstr to 0 before returning.
2. We call memset(result, 0, sizeof(prima_result_t)); at the very beginning of prima_minimize. In this option we keep the order of prima_init_result and prima_check_problem as they are (checking the problem before initializing).

Either option allows the user to call prima_free_result afterwards with no ill effects, since it guarantees that result.x and result.nlconstr will be 0 or they will be allocated. I think option 2 is better since it helps to avoid unnecessary calls to malloc and free.
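To make option 2 concrete, here is a minimal sketch of the idea. It is not PRIMA's actual code: demo_result_t, demo_minimize, demo_free_result, and the problem_ok flag (standing in for the outcome of prima_check_problem) are all hypothetical names invented for illustration. It also assumes that all-zero bytes represent a null pointer, which holds on mainstream platforms but is not strictly guaranteed by the C standard.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for prima_result_t; only the pointer fields matter here. */
typedef struct {
    double *x;
    double *nlconstr;
    int status;
} demo_result_t;

enum { DEMO_OK = 0, DEMO_PROBLEM_MISMATCH = 1, DEMO_ALLOC_FAILED = 2 };

/* Frees whatever is allocated. free(NULL) is a no-op, so a zeroed result is safe. */
void demo_free_result(demo_result_t *result)
{
    if (result == NULL) {
        return;
    }
    free(result->x);
    free(result->nlconstr);
    result->x = NULL;
    result->nlconstr = NULL;
}

/* problem_ok is a hypothetical flag standing in for the problem check. */
int demo_minimize(int n, int problem_ok, demo_result_t *result)
{
    assert(result != NULL);

    /* Option 2: zero the struct before anything can fail. */
    memset(result, 0, sizeof(*result));

    if (!problem_ok || n <= 0) {
        /* Early return: nothing was allocated, both pointers are null. */
        result->status = DEMO_PROBLEM_MISMATCH;
        return result->status;
    }

    /* Only allocate once the problem has passed the check. */
    result->x = calloc((size_t)n, sizeof(double));
    if (result->x == NULL) {
        result->status = DEMO_ALLOC_FAILED;
        return result->status;
    }
    result->status = DEMO_OK;
    return result->status;
}
```

With this ordering, a caller can pass a result struct full of garbage, get a mismatch error back, and still call demo_free_result without freeing an invalid address.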

zaikunzhang · 2024-03-25T00:05:11Z

set result.x/result.nlconstr to 0

We should not set them to an arbitrary normal value if they are not defined yet. (why 0? Why not 1? 100?)

If they are not defined yet, then they should have a value that indicates "uninitialised" (in the mathematical sense, which is different from "uninitialised" in the programming sense). The ideal value would be NaN.

This is a very important point. Otherwise, the logic is wrong and it may lead to mysterious bugs.

However, here the situation is different. result.x should be x0, and the other should be the constraint value at x0.

BTW, I am against passing the value of x0 to minimize using result.x. The logic is wrong. Before the optimization starts, result.x should be uninitialised (in the mathematical sense).

Thank you.

zaikunzhang · 2024-03-25T00:17:04Z

We should not set them to an arbitrary normal value if they are not defined yet. (why 0? Why not 1? 100?)

BTW, I am against passing the value of x0 to minimize using result.x. The logic is wrong. Before the optimization starts, result.x should be uninitialised (in the mathematical sense).

Let me put it this way: at any moment, the content of "result" should be consistent. It cannot be partially initialised and partially not (unless the initialisation is in progress), and it cannot contain values that seem "normal" when it is in fact "uninitialised" (mathematically), giving us the wrong impression that it is initialised when it is not.
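As an illustration of the "mathematically uninitialised" idea for numeric values, the sketch below fills an array with NaN and later checks whether it is still entirely NaN. The helper names demo_mark_undefined and demo_is_undefined are hypothetical and are not part of PRIMA; this only shows one way the NaN convention suggested above could look in C99.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Marks every entry as "not yet defined" by storing NaN. */
void demo_mark_undefined(double *v, size_t n)
{
    assert(v != NULL || n == 0);
    for (size_t i = 0; i < n; ++i) {
        v[i] = NAN;
    }
}

/* Returns 1 if every entry is NaN (vacuously for n == 0), 0 otherwise. */
int demo_is_undefined(const double *v, size_t n)
{
    assert(v != NULL || n == 0);
    for (size_t i = 0; i < n; ++i) {
        if (!isnan(v[i])) {
            return 0;
        }
    }
    return 1;
}
```

Unlike 0 or 1, a NaN entry cannot be mistaken for a legitimate iterate or constraint value, so a reader of the array can tell at a glance that no value has been assigned yet.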

This resolves libprima#149.

nbelakovski · 2024-03-25T02:31:32Z

set result.x/result.nlconstr to 0

We should not set them to an arbitrary normal value if they are not defined yet. (why 0? Why not 1? 100?)

I wasn't clear here: result.x and result.nlconstr are pointers, so I meant setting them to NULL (which is also 0), not an arbitrary value.
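The pointer case can be sketched as follows. The struct and helpers (demo_ptrs_t, demo_is_empty, demo_release) are hypothetical names used only for illustration. The point is that a null pointer is a well-defined "nothing allocated" state, and the C standard specifies that free does nothing when given a null pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical struct with the same two pointer fields discussed above. */
typedef struct {
    double *x;
    double *nlconstr;
} demo_ptrs_t;

/* Returns 1 when both pointers are null, i.e. nothing is allocated yet. */
int demo_is_empty(const demo_ptrs_t *r)
{
    assert(r != NULL);
    return r->x == NULL && r->nlconstr == NULL;
}

/* Releases both buffers; safe on an empty struct because free(NULL) does nothing. */
void demo_release(demo_ptrs_t *r)
{
    assert(r != NULL);
    free(r->x);
    free(r->nlconstr);
    r->x = NULL;
    r->nlconstr = NULL;
}
```

A caller who writes `demo_ptrs_t r = {0};` gets both members initialized to null pointers, so demo_release(&r) is harmless even if no allocation ever happened.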

As for the other points, we discussed them in your office, and I've now opened #180 to implement the changes.

zaikunzhang added the c (Issues related to the C interface or implementation), cmake (Issues related to CMake), and ci (Issues related to CI) labels Jan 29, 2024

nbelakovski added a commit to nbelakovski/prima that referenced this issue Mar 25, 2024

Always initialize the result.

aefb7c9

This resolves libprima#149.

nbelakovski mentioned this issue Mar 25, 2024

Always initialize the result. #180

Merged

zaikunzhang closed this as completed in #180 Mar 25, 2024
