Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make relevant filesystem controller errors "user errors" and categorize them #2070

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

ogayot
Copy link
Member

@ogayot ogayot commented Aug 29, 2024

This PR comes with 3 main changes:

  • changed the way to configure whether StorageRecoverableError should produce crash reports. The main drawback of the previous implementation (using type aliasing) is that inheriting from StorageRecoverableError is resolved at class creation time, not instance creation time.
  • use subclasses of StorageRecoverableErrors to categorize storage errors
  • added title and error code to RecoverableError. This is a step towards implementing RFC9457 (Problem Details for HTTP APIs) or something somehow similar to RFC9457.

The mechanics we use to configure whether an unhandled
StorageRecoverableError should produce a crash report is not very
flexible. Because StorageRecoverableError is basically a type alias,
inheriting from it is essentially the same as inheriting from the type
it was pointing to when the inheritance was declared.

Therefore, if we want different categories of StorageRecoverableError
we need to rebuild them after changing what type
StorageRecoverableError points to.

Instead, RecoverableError now has a class attribute
"produce_crash_report" which controls whether a crash report will be
created when such exception is unhandled in a request handler.

To configure whether we want to produce crash reports, we just need to
update the class attribute. Once can also inherit from RecoverableError
and set a class attribute to a different value in the subtype. This is
what we do to make storage recoverable errors not produce crash reports.

Note than RecoverableError instances can "override" the
"produce_crash_report" setting by creating an instance attribute of the
same name.

Signed-off-by: Olivier Gayot <[email protected]>
In addition to returning a HTTP response of 422 (when configured to do
so), recoverable errors will now include two information in the HTTP
headers:

  * x-error-code: a unique identifier of the error category, e.g.,
    "storage-invalid-usage"
  * x-error-title: a description of the error category, subject to
    translation and encoded as a non-indented JSON string

This is basically one step towards implementing RFC 9457 (Problem
Details for HTTP APIs) ; which we may or may not decide to implement in
the future.

Signed-off-by: Olivier Gayot <[email protected]>
Copy link
Contributor

@Chris-Peterson444 Chris-Peterson444 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. A couple of non-blocking questions/comments. I like the direction!

Comment on lines +33 to +35
# If this exception is uncaught in a request handler, what should we do.
# By default we produce a crash report, but one can change set it to False.
produce_crash_report = True
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also controls whether the client actually sees (or should treat) this as a recoverable error or not right? Also, having RecoverableError produce crash reports by default feels a little weird without the immediate context of the --no-report-stroage-user-errors flag changing this globally. Could we add some text to the comment about this behavior?

@@ -551,7 +552,7 @@ async def middleware(self, request, handler):
resp.headers["x-error-msg"],
exc_info=exc,
)
if not isinstance(exc, NonReportableException):
if not isinstance(exc, RecoverableError) or exc.produce_crash_report:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way this is written, if we ever throw a NonReportableException in a request handler then we'll get an apport report but we won't halt the installer, which is probably fine because we want to catch those and convert those to RecoverableError anyways?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants