Infinite Gradient Handling #582


Open · wants to merge 10 commits into main

Conversation

Aziz-Shameem

Added support for infinite gradient check as per this

Note that the method added also checks for NaN values in the gradients along with infinite values, since both lead to the same errors during optimization and are therefore related.
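As a side note on why the two cases can share one check: NumPy's `np.isfinite` is `False` for both `inf` and `nan`, so a single mask covers both failure modes (a minimal illustration, not the PR's actual code):

```python
import numpy as np

# A gradient containing both failure modes.
grad = np.array([1.0, np.inf, np.nan])

# np.isfinite is False for inf and for nan, so one mask flags both.
invalid = ~np.isfinite(grad)
print(invalid.tolist())  # [False, True, True]
```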

Let me know if any further changes/polishing is required. All the CI tests seem to pass.

Thanks

Member

@timmens timmens left a comment


Thanks for the PR!

This looks pretty good already. I have added a few review comments to your changes.

Additionally: Can you create a new test file tests/optimagic/optimization/test_invalid_function_value.py, and add two test cases where you check the cases of

  1. Infinite values and
  2. NaN values

These tests should fail without your changes, and pass with your changes; i.e., you will have to check that your code raises the correct error! We have many such tests in our test suite -- I would just search for "pytest.raises".
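For anyone unfamiliar with the pattern being referenced: `pytest.raises` asserts that a call raises a given exception, and its `match` argument checks the error message against a regex. A toy example of the shape of such a test (unrelated to optimagic; the full proposal for this PR appears further down):

```python
import pytest


def divide(a, b):
    if b == 0:
        raise ZeroDivisionError("division by zero")
    return a / b


def test_divide_raises():
    # match checks that the error message contains the given pattern
    with pytest.raises(ZeroDivisionError, match="division by zero"):
        divide(1, 0)
```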

Thanks again and let me know if you have any questions!

@timmens
Member

timmens commented Apr 28, 2025

Hey @Aziz-Shameem!

Just checking in — do you have a sense of the timeline for finishing up this PR? No rush if you're still working on it; just want to plan ahead a bit.

Thanks a lot!

@Aziz-Shameem
Author

Hey @timmens

Really sorry for the delay. I was at ICLR, which was taking up most of my time during the day.
Now that the conference has ended, I should finish this up in a day or two at most.

I also wanted to see if I could tackle the second part of this PR (which deals with the error handling). I want to take that up after I submit this one. Do you think that will be feasible? @janosg did mention it requires more knowledge of the internals, so I will have to read up for it.

Thanks.

@janosg
Member

janosg commented Apr 28, 2025

@Aziz-Shameem let's first finish this PR and then open a separate one to implement the penalty approach instead of throwing the error.

@Aziz-Shameem
Author

Aziz-Shameem commented Apr 29, 2025

Hey @timmens, @janosg
I have made the changes and also added several tests for many input combinations.
This does not currently handle numerical derivatives correctly, even though it checks for inf/NaN values in the Jacobian while evaluating numerical derivatives.

I changed the error message to make it more informative. It now reads as follows:
[screenshot of the new error message]

@Aziz-Shameem
Author

For handling such cases when using numerical derivatives:
I noticed that assert_finite_jac() already catches a portion of the cases by itself. The others seem to be like the example attached below:
[screenshot of an example case]

Maybe we could fix a certain (±)THRESHOLD value, flag the gradient as infinite when one of its elements crosses it, and throw the same error as in the cases where jac is provided explicitly.
If so, this could be added inside this check function itself.
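A rough sketch of that threshold idea (the name `THRESHOLD`, its value, and the helper function are all hypothetical, not optimagic API; the cutoff would need tuning):

```python
import numpy as np

# Hypothetical cutoff; the actual value would need tuning against real cases.
THRESHOLD = 1e15


def flag_effectively_infinite(jac, threshold=THRESHOLD):
    """Return True if any Jacobian entry is non-finite or exceeds the cutoff.

    Treats very large finite values produced by numerical differentiation
    the same as explicit inf/NaN entries.
    """
    arr = np.asarray(jac, dtype=float)
    return bool(np.any(~np.isfinite(arr)) or np.any(np.abs(arr) > threshold))


print(flag_effectively_infinite([1.0, 2.0]))     # False
print(flag_effectively_infinite([1.0, 1e20]))    # True
print(flag_effectively_infinite([np.nan, 0.0]))  # True
```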

Member

@timmens timmens left a comment


Thanks a lot for the changes, it looks much better now! There are a few things to consider before going forward though:

Feature:

  • Since _assert_finite_jac does not depend on self, I'd make this a regular function, just like _process_fun_value below.
  • For the error message to be more informative, I think we can go one step further.
    • I would add a keyword argument origin: Literal["numerical", "jac", "fun_and_jac"]. I.e. you write params: PyTree, *, origin: ....
    • For the error message you can then make the distinction whether the error occurred in a user provided derivative (jac or fun_and_jac case), or whether it occurred during a numerical derivative, in which case the user needs to check the criterion function fun.
    • The resulting errors could look like this:
      • The optimization failed because the derivative (via jac) contains infinite or NaN values. Please validate the derivative function.
      • The optimization failed because the derivative (via fun_and_jac) contains infinite or NaN values. Please validate the derivative function.
      • The optimization failed because the numerical derivative (computed using fun) contains infinite or NaN values. Please validate the criterion function.
    • Of course, you would still also print the parameter and Jacobian values.
    • Make sure to consistently capitalize Jacobian!

Tests:

You have many tests for different combinations, which is great! However, there is a lot of code duplication, and in this case one general PyTree parameter should be enough. I have added a proposal below. If you customize the error message per origin, then ideally, you should adjust the message in the tests as well, so that we can additionally test that the correct error message is thrown.

Proposal:

import numpy as np
import pytest

from optimagic.exceptions import UserFunctionRuntimeError
from optimagic.optimization.optimize import minimize


# ======================================================================================
# Test setup:
# --------------------------------------------------------------------------------------
# We test that minimize raises an error if the user function returns a jacobian
# containing invalid values (e.g. np.inf, np.nan). To test that this works not only at
# the start parameters, we create jac functions that return invalid values if the
# parameter norm becomes smaller than 1. For this test, we assume the following
# parameter structure: {"a": 1, "b": np.array([2, 3])}
# ======================================================================================

def sphere(params):
    return params["a"] ** 2 + (params["b"] ** 2).sum()


def sphere_gradient(params):
    return {"a": 2 * params["a"], "b": 2 * params["b"]}


def sphere_and_gradient(params):
    return sphere(params), sphere_gradient(params)
        

def params_norm(params):
    squared_norm = (
        params["a"] ** 2 + np.linalg.norm(params["b"]) ** 2
    )
    return np.sqrt(squared_norm)


def get_invalid_jac(invalid_jac_value):
    """Get function that returns invalid jac if the parameter norm < 1."""

    def jac(params):
        if params_norm(params) < 1:
            return invalid_jac_value
        else:
            return sphere_gradient(params)
        
    return jac

def get_invalid_fun_and_jac(invalid_jac_value):
    """Get function that returns invalid fun and jac if the parameter norm < 1."""

    def fun_and_jac(params):
        if params_norm(params) < 1:
            return sphere(params), invalid_jac_value
        else:
            return sphere_and_gradient(params)

    return fun_and_jac

# ======================================================================================
# Tests
# ======================================================================================


INVALID_JACOBIAN_VALUES = [
    {"a": np.inf, "b": 2 * np.array([1, 2])},
    {"a": 1, "b": 2 * np.array([np.inf, 2])},
    {"a": np.nan, "b": 2 * np.array([1, 2])},
    {"a": 1, "b": 2 * np.array([np.nan, 2])},
]


@pytest.mark.parametrize("invalid_jac_value", INVALID_JACOBIAN_VALUES)
def test_minimize_with_invalid_jac(invalid_jac_value):
    params = {
        "a": 1,
        "b": np.array([3, 4]),
    }

    with pytest.raises(
        UserFunctionRuntimeError,
        match="The optimization received Jacobian containing infinite"
    ):
        minimize(
            fun=sphere,
            params=params,
            algorithm="scipy_lbfgsb",
            jac=get_invalid_jac(invalid_jac_value),
        )


@pytest.mark.parametrize("invalid_jac_value", INVALID_JACOBIAN_VALUES)
def test_minimize_with_invalid_fun_and_jac(invalid_jac_value):
    params = {
        "a": 1,
        "b": np.array([3, 4]),
    }

    with pytest.raises(
        UserFunctionRuntimeError,
        match="The optimization received Jacobian containing infinite"
    ):
        minimize(
            params=params,
            algorithm="scipy_lbfgsb",
            fun_and_jac=get_invalid_fun_and_jac(invalid_jac_value),
        )

Comment on lines +14 to +17
from optimagic.exceptions import (
UserFunctionRuntimeError,
get_traceback,
)
Member


Suggested change
from optimagic.exceptions import (
UserFunctionRuntimeError,
get_traceback,
)
from optimagic.exceptions import UserFunctionRuntimeError, get_traceback

There is no change here anymore.
