Stack traces in (error) logs #512

Open
soxofaan opened this issue Sep 26, 2023 · 7 comments
Labels: job management incl. /result, minor (requires a minor-version, x.1.0 for example), service management
Comments

@soxofaan
Member

(Follow-up issue for #455)

As mentioned in #455, putting full error stack traces in log messages is not a good idea; we should spec out a separate field for that.

@m-mohr m-mohr added this to the 1.3.0 milestone Sep 26, 2023
@m-mohr m-mohr added job management incl. /result service management minor requires a minor-version (x.1.0 for example) labels Sep 30, 2023
@m-mohr
Member

m-mohr commented Oct 19, 2023

Hmm, while I like getting rid of the stack traces in messages, I'm not sure whether they actually need to be exposed to users?
Can't you just generate a unique ID per log entry, put it in the id field and store the stack trace internally?

@jdries

jdries commented Oct 20, 2023

I think they should be exposed. We have one user segment of advanced coders who are used to dealing with stack traces. Their main criticism of openEO is that it's a 'black box', which makes them feel out of control. Giving them the possibility to see stack traces would create a connection to the open source code, which they can then inspect and even improve.

There's also the case of UDFs: there it is in fact the user code that can generate stack traces, so users need to see them to resolve issues.

@m-mohr
Member

m-mohr commented Oct 20, 2023

Okay, I'll buy that. UDFs are a really compelling argument. Then let's find a definition for it.

@EmileSonneveld

This would be a nice example of what an error message from within a UDF looks like:

OpenEO batch job failed: UDF exception while evaluating processing graph. Please check your user defined functions.
  File "<string>", line 62, in apply_udf_data
  File "/opt/venv/lib64/python3.8/site-packages/sklearn/ensemble/_forest.py", line 865, in predict_proba
    X = self._validate_X_predict(X)
  File "/opt/venv/lib64/python3.8/site-packages/sklearn/ensemble/_forest.py", line 599, in _validate_X_predict
    X = self._validate_data(X, dtype=DTYPE, accept_sparse="csr", reset=False)
  File "/opt/venv/lib64/python3.8/site-packages/sklearn/base.py", line 580, in _validate_data
    self._check_feature_names(X, reset=reset)
  File "/opt/venv/lib64/python3.8/site-packages/sklearn/base.py", line 507, in _check_feature_names
    raise ValueError(message)
ValueError: The feature names should match those that were passed during fit.
Feature names seen at fit time, yet now missing:
- 05-10
- 05-11
- 15-10
- 15-11
- 25-10
- ...

Now the user sees this in the editor, which makes it unattractive to read:
image

@m-mohr
Member

m-mohr commented Dec 18, 2024

Should a stacktrace field be just a string field (probably not ideal?) or an array of objects with properties such as file, line, column, context (e.g. method/function), and details (e.g. code excerpt)?

@soxofaan
Member Author

soxofaan commented Jan 6, 2025

Hmm, it might not be easy to find a common base for stack traces in various tech stacks. In the case of the geopyspark driver we even have stack traces that have a part in Java/Scala and a part in Python. And with that proof of concept UDF runtime for R, I think we could even produce stack traces that are a mix of R + Python + Java/Scala.

> Should a stacktrace field be just a string field?

I think we can at least assume that it's possible to make the stack trace indeed an array of items, and not a single string dump.
As for the properties of these items, I'm fine with file, line, column and context. Maybe use src to put the code excerpt?
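
As a rough sketch of what such an array of items could look like for the Python part of a trace, the snippet below maps the frames from the standard `traceback` module onto the property names discussed here (file, line, column, context, src). The helper name `structured_stacktrace` is hypothetical, and the mapping is an assumption for illustration, not an agreed schema.

```python
import traceback


def structured_stacktrace(exc: BaseException) -> list:
    """Convert the traceback of an exception into a list of frame dicts."""
    frames = []
    for frame in traceback.extract_tb(exc.__traceback__):
        frames.append({
            "file": frame.filename,
            "line": frame.lineno,
            # Column info only exists on Python 3.11+; None otherwise.
            "column": getattr(frame, "colno", None),
            "context": frame.name,  # enclosing function or "<module>"
            "src": frame.line,      # code excerpt, if the source is available
        })
    return frames


def apply_udf_data(x):
    raise ValueError("The feature names should match")


try:
    apply_udf_data(None)
except ValueError as e:
    stack = structured_stacktrace(e)
```

The last item in `stack` describes the frame that raised the exception. Java/Scala or R frames would need their own conversion into the same item shape.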

@m-mohr
Member

m-mohr commented Jan 7, 2025

I think a small survey of the most common languages would make sense to see what the stacktrace-related functions return.

| Language | file | line | column | context | src | others? | comments / see |
|---|---|---|---|---|---|---|---|
| Python | | | | | | | |
| C++ / C | | | | | | | |
| Java / Scala | | | | | | | string only? |
| C# | | | | | | | string only? |
| JS / TS / Node | - | - | - | - | - | - | string only, not standardized, see https://tc39.es/proposal-error-stacks/ - Alternative: https://github.com/stacktracejs/stacktrace.js |
| R | | | | | | | |
| Julia | file | line | - | func / linfo | - | from_c, inlined, pointer | https://github.com/JuliaLang/julia/blob/5e9a32e7af2837e677e60543d4a15faa8d3a7297/base/stacktraces.jl#L49 |
| Go | | | | | | | |
| Rust | | | | | | | |
| PHP | file | line | - | class / function / object | - | type, args | https://www.php.net/manual/en/function.debug-backtrace.php |

Contributions are welcome.
See also https://en.wikipedia.org/wiki/Stack_trace

Looking into this more, I'm wondering whether it may be simpler to just make it an array of strings, although I like the structured approach much more. Honestly, I'm also wondering to what extent stack traces should be exposed by backends. Ideally you may want to limit them to the user-submitted code and not expose the full depth of traces...
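
A minimal sketch of that last idea, assuming the backend compiles user code under a known filename so that its frames can be told apart from backend frames. The `"<udf>"` marker and the `user_frames` helper are hypothetical; the output here is the simpler array-of-strings variant.

```python
import traceback

# Hypothetical filename under which the backend compiles user-submitted code.
USER_CODE_FILENAME = "<udf>"


def user_frames(exc: BaseException, user_filename: str = USER_CODE_FILENAME) -> list:
    """Return one string per frame, keeping only frames from user code."""
    return [
        f"{f.filename}:{f.lineno} in {f.name}"
        for f in traceback.extract_tb(exc.__traceback__)
        if f.filename == user_filename
    ]


# Stand-in for a user-submitted UDF.
udf_source = "def apply_udf_data(x):\n    return 1 / x\n"
namespace = {}
exec(compile(udf_source, USER_CODE_FILENAME, "exec"), namespace)

try:
    namespace["apply_udf_data"](0)
except ZeroDivisionError as e:
    trace = user_frames(e)
```

Here the frame of the calling backend code is dropped and only the frame inside the user function remains.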

4 participants