optimize works unexpectedly on release build (a possible solution included) #19

Open
aferust opened this issue Jan 18, 2025 · 5 comments
Labels
enhancement New feature or request

Comments


aferust commented Jan 18, 2025

I wanted to figure out why my code behaves strangely in release builds. I found that the output argument "y" of the residual delegate always starts out filled with NaN values. I then looked through the source and traced it to this line. When I removed the debug guard, things worked as expected.
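The effect of a debug guard can be reproduced with a minimal sketch. This is not mir-optim's actual code: `makeWorkspace` is a hypothetical helper, and the only assumption is that the buffer is cleared inside a `debug` block, which D compiles in only when the debug condition is enabled (`-debug` for DMD, `-d-debug` for LDC, or a dub debug build).

```d
import std.stdio : writeln;

/// Hypothetical stand-in for workspace setup; not mir-optim's actual code.
double[] makeWorkspace(size_t n)
{
    // D default-initializes floating-point elements to NaN.
    auto work = new double[n];

    // This block is compiled only when the debug condition is enabled,
    // so a release build never executes the assignment.
    debug
    {
        work[] = 0;
    }
    return work;
}

void main()
{
    writeln(makeWorkspace(2));
}
```

Compiled with the debug condition enabled, this prints `[0, 0]`; without it, it prints `[nan, nan]`, the same pattern as the first values reported for release builds in this issue.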

Member

9il commented Jan 18, 2025

Hi @aferust, could you please check whether the code works for you if only iwork is initialized?

Author

aferust commented Jan 19, 2025

No. It is the initialization of "work" that matters; in my case, it makes no difference whether "iwork" is initialized or not.

Member

9il commented Jan 20, 2025

The bug is critical.

It is very interesting. There is definitely a bug. However, we can't just initialize work to 0, because that wouldn't fix the bug itself. The bug could be in Mir or in the LAPACK implementation you are using. If I initialize work[] = T.nan;, the mir-optim tests pass on my machine. Do the mir-optim tests pass for you with work[] = T.nan;?

Could you please turn your code into a unittest that fails on your machine? That way I could find the bug.
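The idea behind filling a buffer with NaN can be sketched as follows: because NaN propagates through arithmetic, any code that reads an element before writing it leaves a NaN behind, while code that only assigns does not. All names here (`poison`, `assign`, `accumulate`, `hasNaN`) are hypothetical helpers for illustration, not part of mir-optim.

```d
import std.math : isNaN;

/// Hypothetical helper: fill a workspace with NaN so that any
/// read-before-write shows up as NaN in the result.
void poison(T)(T[] work)
{
    work[] = T.nan;
}

/// Residual that only assigns to y, so it never reads the poisoned values.
void assign(const double[] x, double[] y)
{
    y[0] = x[0];
    y[1] = 2 - x[1];
}

/// Residual that accumulates into y, i.e. it reads y before writing it.
void accumulate(const double[] x, double[] y)
{
    y[0] += x[0];
    y[1] += 2 - x[1];
}

/// Returns true if any element of the slice is NaN.
bool hasNaN(const double[] a)
{
    foreach (v; a)
        if (isNaN(v))
            return true;
    return false;
}

void main()
{
    auto x = [100.0, 100.0];
    auto y = new double[2];

    poison(y);
    assign(x, y);
    assert(!hasNaN(y)); // pure assignment overwrites the poison

    poison(y);
    accumulate(x, y);
    assert(hasNaN(y)); // NaN + value stays NaN, exposing the stale read
}
```

A test suite whose callbacks only ever assign to their outputs would pass under this kind of poisoning, which is consistent with the tests passing with work[] = T.nan;.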

Author

aferust commented Jan 20, 2025

Dear Ilya,

I tested the code below, taken from your unittest (with an extra writeln), in several environments, using LDC in each:

  • The release argument does nothing on run.dlang.org, so this output is from a debug build:
    [0, 0] // no nan values, good
    [0, 0] // no nan values, good
    [-0, -0]
    [-0, -0]
    [-0, -0]
    [2.94671e-09, -2.88778e-09]

On my Ubuntu MATE machine (OpenBLAS installed via sudo apt install libopenblas-dev):
debug:
[0, 0] // no nan values, good
[0, 0] // no nan values, good
[-0, -0]
[-0, -0]
[-0, -0]
[2.94671e-09, -2.88778e-09]

release:
[nan, nan] // oops!
[nan, nan] // oops!
[-0, -0]
[-0, -0]
[-0, -0]
[2.94671e-09, -2.88778e-09]

My Windows test uses Intel MKL, and the same problem persists in release builds. In your tests, you always assign y[0] = x[0];, so you probably never see the NaN values in y. Here we print y before it is modified. I could also use a dummy variable for x[0], but this is not what we want, right?

import std.stdio;

import mir.optim.least_squares;

void main()
{
    import mir.ndslice.allocation: slice;
    import mir.ndslice.slice: sliced;
    import mir.blas: nrm2;


    LeastSquaresSettings!double settings;
    auto x = [100.0, 100].sliced;
    auto l = x.shape.slice(-double.infinity);
    auto u = x.shape.slice(+double.infinity);
    optimize!(
        (x, y)
        {
            writeln(y); // check for NaNs here

            y[0] = x[0];
            y[1] = 2 - x[1];
        },
        (x, J)
        {
            J[0, 0] = 1;
            J[0, 1] = 0;
            J[1, 0] = 0;
            J[1, 1] = -1;
        },
    )(settings, 2, x, l, u);

    assert(nrm2((x - [0, 2].sliced).slice) < 1e-8);
}

9il added the enhancement (New feature or request) label on Jan 20, 2025
Member

9il commented Jan 20, 2025

Thank you for the example.
As far as I understand, your code assumes that y and J are preinitialized with 0.
I agree that this API behavior is better.

I will submit this feature soon.
