Fix _masterFunc2 fail flag caching and add fail flag identification to IPOPT #407

Merged: 15 commits merged into mdolab:main from fix_fail on Jun 23, 2024

Conversation

@eytanadler (Contributor) commented on Jun 21, 2024

Purpose

Some optimizers do not call _masterFunc with the objective and constraint evaluations simultaneously. In these cases, if the primal fails, the fail flag is accurate only for the call that actually evaluates the objective/constraint function; when the cache is used, the fail flag is always returned as 0. The same behavior occurs with the gradient evaluation. This PR caches the fail flag to fix this edge case.
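
A minimal sketch of the fix, using hypothetical names rather than the actual _masterFunc2 internals: the fail flag is stored in the cache alongside the function values, so a cache hit returns the flag from the evaluation that populated it.

import numpy as np

class MasterFuncSketch:
    def __init__(self, objcon):
        self.objcon = objcon  # user function returning (funcs, fail)
        self.cache = {"x": None, "funcs": None, "fail": None}

    def __call__(self, x):
        if self.cache["x"] is not None and np.array_equal(x, self.cache["x"]):
            # Cache hit: before the fix, this path always reported fail = 0
            return self.cache["funcs"], self.cache["fail"]
        funcs, fail = self.objcon(x)  # the call that actually evaluates
        self.cache = {"x": x.copy(), "funcs": funcs, "fail": fail}
        return funcs, fail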

Secondly, the pyIPOPT wrapper does nothing if _masterFunc returns a True fail flag (1): even if a function fails, IPOPT just pushes on as if nothing happened. In one CFD-based optimization case I observed, the optimizer hit a mesh failure, immediately evaluated the gradients, went straight back to a mesh failure, and so on, while the output file looked as if everything were normal. This is clearly not what should happen. I solved it by returning np.array(np.NaN) in the callback functions if fail == 1.
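
A minimal sketch of the callback-side change, with a hypothetical signature and helper (the real wrapper is in pyoptsparse/pyIPOPT/pyIPOPT.py):

import numpy as np

def eval_f(x, master_func):
    # master_func is a stand-in for the pyOptSparse _masterFunc call
    fobj, fail = master_func(x)
    if fail == 1:
        # Returning np.array(np.NaN) signals an evaluation error to IPOPT,
        # which cuts back its line-search step instead of accepting the
        # failed result
        return np.array(np.NaN)
    return fobj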

There are two changes in particular that I'd like feedback on:

  1. There is a rare logic branch in which the gradients are requested at a design variable vector for which the primal has not been evaluated. In this case, _masterFunc2 calls itself to evaluate the primal first. In the current pyOptSparse implementation, if this primal fails, that failure is ignored entirely. I modified it so that if the primal fails here, the gradient _masterFunc2 evaluation returns that failure value even if the gradient evaluation itself succeeds (a sketch of this logic follows this list). See lines 452-458 and 511-517 of the new code. Is this the right thing to do?
  2. The only way I could find to tell IPOPT that an evaluation failed is to return np.array(np.NaN) from whichever callback function is called. I don't see a more elegant approach based on their docs. Although the shape isn't guaranteed to match the function's usual array shape, it works in my test cases. Does this seem like a reasonable way to solve the problem?
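
The sketch referenced in item 1, again with hypothetical names standing in for the real _masterFunc2 logic:

import numpy as np

def master_func_grad(x, cache, master_func, user_sens):
    # If the primal was never evaluated at this x, evaluate it first and
    # remember its fail flag (master_func is assumed to refill the cache)
    fail_primal = 0
    if cache["x"] is None or not np.array_equal(x, cache["x"]):
        _, fail_primal = master_func(x)
    sens, fail_grad = user_sens(x)
    # Propagate the primal failure even when the gradient call succeeds
    return sens, max(fail_primal, fail_grad)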

P.S. Should I update the patch version?

Expected time until merged

A week

Type of change

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (non-backwards-compatible fix or feature)
  • Code style update (formatting, renaming)
  • Refactoring (no functional changes, no API changes)
  • Documentation update
  • Maintenance update
  • Other (please describe)

Testing

Run the tests I added. Also try running a simple IPOPT optimization case in which some of the functions periodically fail. It should backtrack properly (previously, it would just ignore the fail flag). For example:

import numpy as np
from pyoptsparse import Optimization, OPT

iters = 0
def objfunc(xdict):
    x = xdict["xvars"]
    funcs = {"obj": x[0]**2 + np.sin(x[0] - 0.5)}

    # Fail every fourth iteration
    global iters
    fail = iters % 4 == 2
    iters += 1

    return funcs, fail

def sensfunc(xdict, funcsDict):
    x = xdict["xvars"]  # Extract array
    funcsSens = {"obj": {"xvars": [2 * x[0] + np.cos(x[0] - 0.5)]}}
    fail = False
    return funcsSens, fail

# Instantiate Optimization Problem
optProb = Optimization("Optimization problem", objfunc)
optProb.addVarGroup("xvars", 1, "c", value=3, scale=1.0)
optProb.addObj("obj")

# Create optimizer
opt = OPT("IPOPT", options={})
sol = opt(optProb, sens=sensfunc)

The IPOPT output should look something like this (note the alpha cutback warnings, which do not appear without this fix):

iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
   0  9.5984721e+00 0.00e+00 5.20e+00   0.0 0.00e+00    -  0.00e+00 0.00e+00   0
   1  4.4065559e+00 0.00e+00 5.30e+00 -11.0 5.20e+00    -  1.00e+00 1.00e+00f  1
Warning: Cutting back alpha due to evaluation error
   2 -1.9724310e-01 0.00e+00 1.59e+00 -11.0 2.62e+00    -  1.00e+00 5.00e-01f  2
   3 -6.2890484e-01 0.00e+00 3.02e-02 -11.0 5.62e-01    -  1.00e+00 1.00e+00f  1
   4 -6.2907135e-01 0.00e+00 1.51e-03 -11.0 1.05e-02    -  1.00e+00 1.00e+00f  1
Warning: Cutting back alpha due to evaluation error
   5 -6.2907166e-01 0.00e+00 7.55e-04 -11.0 5.53e-04    -  1.00e+00 5.00e-01f  2
   6 -6.2907176e-01 0.00e+00 5.10e-08 -11.0 2.76e-04    -  1.00e+00 1.00e+00f  1
   7 -6.2907176e-01 0.00e+00 1.72e-12 -11.0 1.86e-08    -  1.00e+00 1.00e+00f  1

Checklist

  • I have run flake8 and black to make sure the Python code adheres to PEP-8 and is consistently formatted
  • I have formatted the Fortran code with fprettify or C/C++ code with clang-format as applicable
  • I have run unit and regression tests which pass locally with my changes
  • I have added new tests that prove my fix is effective or that my feature works
  • I have added necessary documentation

@eytanadler eytanadler added the bug Something isn't working label Jun 21, 2024
@eytanadler eytanadler requested a review from a team as a code owner June 21, 2024 01:52
@eytanadler eytanadler requested review from lamkina, ArshSaja, ewu63 and marcomangano and removed request for lamkina and ArshSaja June 21, 2024 01:52

codecov bot commented Jun 21, 2024

Codecov Report

Attention: Patch coverage is 84.00000% with 4 lines in your changes missing coverage. Please review.

Project coverage is 62.97%. Comparing base (da0077a) to head (fc15ad5).
Report is 4 commits behind head on main.

Files with missing lines          Patch %   Lines
pyoptsparse/pyIPOPT/pyIPOPT.py    66.66%    4 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##             main     #407       +/-   ##
===========================================
- Coverage   74.33%   62.97%   -11.36%     
===========================================
  Files          22       22               
  Lines        3300     3317       +17     
===========================================
- Hits         2453     2089      -364     
- Misses        847     1228      +381     


@eytanadler (Contributor, Author) commented:

Does the Windows action failure have something to do with my changes? It doesn't seem like it, but I'm not very familiar with it.

@ewu63 (Collaborator) left a comment:


Good stuff. I knew that the fail flag is not passed to IPOPT correctly (in fact, I think SNOPT is the only optimizer that supports it). From my cursory read of the docs, passing NaN seems reasonable (and is in fact how many other optimizers prefer to handle failed evals).
As to the implementation, it looks good; I just left a minor comment about whether we can deal with the cache once instead of running the same line in multiple blocks.

As for the Windows failure, I suspect it's related to the recent release of numpy 2.0 (which broke our meson build in multiple ways; I haven't had time to investigate). See here for a similar log. I would suggest pinning numpy<2 in the env file first.

Inline review comments (resolved) on pyoptsparse/pyOpt_optimizer.py, tests/test_optimizer.py, and pyoptsparse/pyIPOPT/pyIPOPT.py.
@eytanadler (Contributor, Author) replied:

> Good stuff. I knew that the fail flag is not passed to IPOPT correctly (in fact, I think SNOPT is the only optimizer that supports it). From my cursory read of the docs, passing NaN seems reasonable (and is in fact how many other optimizers prefer to handle failed evals). As to the implementation, it looks good; I just left a minor comment about whether we can deal with the cache once instead of running the same line in multiple blocks.
>
> As for the Windows failure, I suspect it's related to the recent release of numpy 2.0 (which broke our meson build in multiple ways; I haven't had time to investigate). See here for a similar log. I would suggest pinning numpy<2 in the env file first.

Thanks for the speedy review! I pinned the version of NumPy in the setup.py (I assume this is what you meant by env file?) to <2, but the Windows action still seems to fail. I also bumped pyOptSparse's patch version because I think this is a substantial enough change that it'd be worth versioning. @ewu63, let me know if you think otherwise.

@eytanadler eytanadler requested a review from ewu63 June 21, 2024 11:22
@ewu63 (Collaborator) commented on Jun 21, 2024:

> Thanks for the speedy review! I pinned the version of NumPy in the setup.py (I assume this is what you meant by env file?) to <2, but the Windows action still seems to fail. I also bumped pyOptSparse's patch version because I think this is a substantial enough change that it'd be worth versioning. @ewu63, let me know if you think otherwise.

Windows builds do not use setuptools but instead use conda. The env file used by GHA is here. Also fine with the patch version bump.

@eytanadler (Contributor, Author) replied:

> > Thanks for the speedy review! I pinned the version of NumPy in the setup.py (I assume this is what you meant by env file?) to <2, but the Windows action still seems to fail. I also bumped pyOptSparse's patch version because I think this is a substantial enough change that it'd be worth versioning. @ewu63, let me know if you think otherwise.
>
> Windows builds do not use setuptools but instead use conda. The env file used by GHA is here. Also fine with the patch version bump.

Yay, it worked!

@ewu63 (Collaborator) left a comment:


Good work!

@marcomangano (Contributor) left a comment:


Looking great! Thanks for adding the tests; you went beyond what I had in mind. Very elegant setup.

We should double check failure handling with other optimizers and maybe add a test, but that should be a separate PR. The changes in pyIPOPT make sense.

@marcomangano marcomangano merged commit 7376d71 into mdolab:main Jun 23, 2024
12 of 13 checks passed
@eytanadler eytanadler deleted the fix_fail branch June 23, 2024 18:58
@eytanadler (Contributor, Author) replied:

> Looking great! Thanks for adding the tests; you went beyond what I had in mind. Very elegant setup.
>
> We should double check failure handling with other optimizers and maybe add a test, but that should be a separate PR. The changes in pyIPOPT make sense.

Thanks! I agree failure handling should be checked, but I didn’t have an immediate good idea for how to test that. I’m sure there’s a way.
