Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discussion regarding errorVals-absoluteError-relativeError #741

Open
prisae opened this issue Jun 25, 2024 · 6 comments
Open

Discussion regarding errorVals-absoluteError-relativeError #741

prisae opened this issue Jun 25, 2024 · 6 comments
Assignees

Comments

@prisae
Copy link
Contributor

prisae commented Jun 25, 2024

Data error computation

This is partly an issue (problematic default behaviour), and partly a discussion (error calculation).

Issue: Problematic default behaviour

In pygimli.frameworks.Inversion().run() and pygimli.frameworks.MarquardtInversion().run() the parameter errorVals is subtly deprecated in favour of absoluteError and relativeError.

If errorVals is not given, it is calculated via:

if errorVals is None:  # use absoluteError and/or relativeError instead
    absErr = kwargs.pop("absoluteError", 0)
    relErr = kwargs.pop("relativeError",
                        0.01 if np.allclose(absErr, 0) else 0)
    errorVals = pg.abs(absErr / np.asarray(dataVals)) + relErr

In some methods, such as marine CSEM, your data can have very small values, in the range of 1e-10 and less. This will result in np.allclose(absErr, 0) = True and hence give it a relErr = 0.01, which is useless (because the default atol of np.allclose is 1e-08).

What about either:

    relErr = kwargs.pop("relativeError",
                        0.01 if np.allclose(absErr, 0, atol=0) else 0)

or (my preference), not defaulting at all, but raise an error,

    relErr = kwargs.pop("relativeError", 0)
    if np.any(np.isclose(absErr + relErr, 0, atol=0)):
        raise Error that at least one of abs/rel error must be provided

Discussion: Error calculation

If the user provides absolute and/or relative errors, the error values are computed as follows

errorVals = pg.abs(absErr / np.asarray(dataVals)) + relErr

However, given error propagation, I think it should be

errorVals = np.sqrt(pg.abs(absErr / np.asarray(dataVals))**2 + relErr**2)

This will mostly not have a big impact at all, only in the zone where the data values approach the noise level.

@prisae
Copy link
Contributor Author

prisae commented Jun 25, 2024

I thought I'd discuss both here before making a PR.

@halbmy
Copy link
Contributor

halbmy commented Jul 1, 2024

Very good point that did not pop up before. Marine csem might indeed scale badly. Throwing an error is the better choice.

However, I disagree with the error propagation. The relative and absolute errors are not independent error sources, but simple models to estimate errors, e.g. from reciprocal analysis etc. which is common practise and simple of relative or absolute errors are considered are the main source (and small values to catch near zero data). A more rigorous error class could help supporting and analysing data and error statistics.

@halbmy halbmy self-assigned this Jul 1, 2024
@prisae
Copy link
Contributor Author

prisae commented Jul 1, 2024

I'm OK to agree on disagreeing regarding noise. In this case, an error class might be indeed good, or at least not deprecating errorVals, so the user can provided directly the error values instead of absError and relError.

@halbmy
Copy link
Contributor

halbmy commented Jul 8, 2024

Well, the reason for deprecating errorVals was because it is not clear whether it is an absolute (e.g. typical for traveltime tomography) or relative (e.g. typical for ERT as it is the same for u, r and rhoa), and to make way for a bit more rigorous handling (right now internally relative errors are used but this can lead to problems for near-zero data).

@halbmy
Copy link
Contributor

halbmy commented Sep 24, 2024

I totally agree with your suggestions and implemented them (171e47e)

  • In order to keep things running, I accepted your first choice (atol=0) for the current (classic) inversion
  • for the future inversions based on InversionBase, I took your second (preferred) choice throwing an error if no error is given

@prisae
Copy link
Contributor Author

prisae commented Sep 27, 2024

Great!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants