Improve `reqwests` error display #6377

crepererum · 2024-09-10T09:25:39Z

Originally reported in #5882 (comment) :

For #5882 we only report Generic S3 error: error decoding response body. We should try to improve the error message. It seems that the Display impl. is not super helpful and we might wanna use Debug instead, see seanmonstar/reqwest#2373

The text was updated successfully, but these errors were encountered:

tustvold · 2024-09-10T09:28:02Z

One thing to be aware of is exposing sensitive headers, or URL path segments.

We could also provide information on common causes of this error - e.g. network/runtime instability

itsjunetime · 2024-09-13T16:04:36Z

take

erratic-pattern · 2024-09-30T00:13:31Z

take

erratic-pattern · 2024-10-06T17:00:52Z

One thing to be aware of is exposing sensitive headers, or URL path segments. We could also provide information on common causes of this error - e.g. network/runtime instability

The documentation for reqwest::Error only mentions URLs as a concern. Hopefully this means that there are no headers in the output, but I do not know for certain.

There is reqwest::Error::without_url which can be used to scrub the URL from the reqwest error. Either the user could explicitly call this when they have sensitive information in URLs, or we could provide a helper method to accomplish this.

Another option is to have for each object storage API provide its own scrubbing method to eliminate specific sensitive headers from AWS, Azure, etc. I do not know enough about these object storage APIs to make this change.

For now I am just going to change to using the Debug information to see if it provides enough information.

see apache#6377 This should help in the interim to identify specific causes of, sometimes confusing, network related errors. I think this change should be temporary until we can find a better way to improve error messages.

crepererum · 2024-10-11T09:55:04Z

Regarding the argument that @tustvold brought up here: #6518 (comment) and that basically boils down to "people should walk the error chain":

I think the issue here is that object_store is somewhat inconsistent: for some errors, Display prints the entire chain e1: e2: e3 but for reqwest errors it does not. That also make the proposal to "walk the error chain" somewhat impractical, because if you walk the chain and use Display to print the chain parts, you gonna repeat some errors. So our standpoint is to really tell people that the chain should be walked, then stuff like this

arrow-rs/object_store/src/aws/client.rs

Lines 69 to 70 in 4adbeee

    
           #[snafu(display("Error performing DeleteObjects request: {}", source))] 
        
           DeleteObjectsRequest { source: crate::client::retry::Error },

is wrong and

#[snafu(display("Error bla bla: {}", source))]

should be

#[snafu(display("Error bla bla"))]

tustvold · 2024-10-11T10:38:34Z

I personally agree that walking the error chain should be the approach, it's the pattern used by the vast majority of other languages, but unfortunately support for this in the ecosystem is limited. If Rust had supported this from day one, perhaps we wouldn't be in this mess, but alas it did not. As it stands I'm honestly not even sure what the consensus is...

Moving to this particular issue, I think we need to surface enough information within the default display implementation to diagnose issues, however, this does not mean we should just expose everything, including what may be sensitive or internal details. My point w.r.t walking the error chain is that in the absence of a clearly articulated error scenario the only option is to either expose everything, which I don't agree with, or instruct people to walk the error chain to get the more detailed information they require.

Perhaps we could reframe this issue as a concrete, what would I like to be in the output that currently is not

crepererum · 2024-10-11T10:44:47Z

Perhaps we could reframe this issue as a concrete, what would I like to be in the output that currently is not

Fine with me. I think #5882 provides a good example. The reqwest error that is stringified only says error decoding response body which is somewhat misleading w/o looking at the error chain. It sounds like the body data is malformed or the data got scrambled, but if you walk the error chain you'll often see it's simply a timeout that occurred while fetching the body bytes. So I wonder if we should wrap reqwest::Error into another error type that tries to extract SOME useful information (it could itself walk the chain and find the underlying IO error for example).

tustvold · 2024-10-11T14:08:09Z

This sounds reasonable to me, it certainly would help users to make more clear that error means that the request was interrupted mid-stream, even if there isn't really anything more that can be said about why.

erratic-pattern · 2024-10-14T20:40:11Z

I feel that cramming more error sources into the Display is a step in the wrong direction. Despite the half-baked edges of Rust error handling making this a tricky decision for library maintainers, following standard practices should lead to a better outcome for both application developers and library developers.

As @crepererum pointed out, right now for a application developer, there is a confusing incompatibility in error messaging between Arrow ecosystem crates and (most?) other third-party crates. Using Display seems to "just work" for getting detailed information for Arrow crates, until the error goes into a third-party crate and then it doesn't.

"So just use an error reporting library, "as the reqwest maintainers suggest. Well, no. Even if you follow "best practice" you're still hindered as an application developer because using any of the available error reporting crates with arrow-rs will produce very redundant and ugly error reports.

To highlight this visually I've made an example repo that creates a simple "application" using object_store with a popular error reporting crate called color_eyre. It attempts to connect to an S3 bucket that doesn't exist and then displays an error.

Here is the output:

Error:
   0: Generic S3 error: Error after 10 retries in 5.723990834s, max_retries:10, retry_timeout:180s, source:error sending request for url (http://169.254.169.254/latest/api/token)
   1: Error after 10 retries in 5.723990834s, max_retries:10, retry_timeout:180s, source:error sending request for url (http://169.254.169.254/latest/api/token)
   2: error sending request for url (http://169.254.169.254/latest/api/token)
   3: client error (Connect)
   4: tcp connect error: Host is down (os error 64)
   5: Host is down (os error 64)

Location:
   examples/object_store.rs:11

Backtrace omitted. Run with RUST_BACKTRACE=1 environment variable to display it.
Run with RUST_BACKTRACE=full to include source snippets.

Maybe this is tolerable, but it's definitely ugly. You are at least getting as much information about the error chain as possible. Additionally, by cramming more information into the top-level error message, you're helping users in the default case to discover useful errors beyond just "Generic S3 error". So there is an argument to be made that the current state is acceptable, even if not ideal..

I think this discussion is beyond the scope of just "improve reqwests error display" and we should create an issue to discuss a more broad improvement to error messaging with both the default rust panic handler as well as the color_eyre (or similar crate) panic handler. If that sounds reasonable, I can write one up with points made from this discussion.

I would also like to create a similar example repo for DataFusion to bring up the discussion there, since they also use the same approach to error messaging, but I think the discussion needs to happen here primarily since DataFusion is largely required to follow whatever convention arrow-rs takes.

erratic-pattern · 2024-10-14T20:55:23Z

From our (InfluxData's) side, we are most likely going to unroll the cause chain for object_store errors manually.

crepererum mentioned this issue Sep 10, 2024

error decoding response body after upgrade to object store 0.10 #5882

Open

crepererum added object-store Object Store Interface enhancement Any new improvement worthy of a entry in the changelog good first issue Good for newcomers labels Sep 10, 2024

crepererum self-assigned this Sep 10, 2024

github-actions bot assigned itsjunetime Sep 13, 2024

crepererum removed their assignment Sep 15, 2024

github-actions bot assigned erratic-pattern Sep 30, 2024

itsjunetime removed their assignment Sep 30, 2024

erratic-pattern mentioned this issue Oct 6, 2024

use Debug impl for generic Reqwest errors to provide more info #6518

Closed

djanderson mentioned this issue Nov 23, 2024

Add example for using a separate threadpool for CPU bound work apache/datafusion#13424

Draft

7 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve `reqwests` error display #6377

Improve `reqwests` error display #6377

crepererum commented Sep 10, 2024

tustvold commented Sep 10, 2024 •

edited

Loading

itsjunetime commented Sep 13, 2024

erratic-pattern commented Sep 30, 2024

erratic-pattern commented Oct 6, 2024

crepererum commented Oct 11, 2024

tustvold commented Oct 11, 2024

crepererum commented Oct 11, 2024

tustvold commented Oct 11, 2024 •

edited

Loading

erratic-pattern commented Oct 14, 2024

erratic-pattern commented Oct 14, 2024 •

edited

Loading

Improve reqwests error display #6377

Improve reqwests error display #6377

Comments

crepererum commented Sep 10, 2024

tustvold commented Sep 10, 2024 • edited Loading

itsjunetime commented Sep 13, 2024

erratic-pattern commented Sep 30, 2024

erratic-pattern commented Oct 6, 2024

crepererum commented Oct 11, 2024

tustvold commented Oct 11, 2024

crepererum commented Oct 11, 2024

tustvold commented Oct 11, 2024 • edited Loading

erratic-pattern commented Oct 14, 2024

erratic-pattern commented Oct 14, 2024 • edited Loading

Improve `reqwests` error display #6377

Improve `reqwests` error display #6377

tustvold commented Sep 10, 2024 •

edited

Loading

tustvold commented Oct 11, 2024 •

edited

Loading

erratic-pattern commented Oct 14, 2024 •

edited

Loading