Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature]: During an upgrade, the logger should be forced to have a minimum of info. #22522

Open
SpicyLemon opened this issue Nov 13, 2024 · 0 comments

Comments

@SpicyLemon
Copy link
Collaborator

Summary

As a blockchain maintainer and upgrade coordinator, I want nodes configured with a log level of "error" to also output info-level messages during an upgrade so that I can more easily identify what might be going wrong and also so that node operators are less likely to kill the process in the middle of a long upgrade because they think it's stuck. If the node is configured with a higher log level (e.g. debug), then the upgrade should still happen with debug level log output.

Problem Definition

During long-running upgrades, we've found that node operators often interrupt the upgrade thinking that it's stuck and not doing anything. This usually leaves the state store unusable and they then have to restore from a backup which they often didn't make, so they have to track one down. In our upgrades, we make sure to include plenty of logging about what's happening during an upgrade, and those are always info level. Most validators run their nodes with only error logging enabled though. So they don't get to see that the upgrade is doing things. We can tell people to switch to info level during upgrades, but then they have to kill the process and switch back when done.

Further, sometimes things go wrong during an upgrade. Without those info-level messages, it's impossible to know which area to look in while trying to troubleshoot. Even knowing which module is being processed would be a major improvement. But again, those are info-level messages that won't be seen by nodes configured for error level.

The log types used by the SDK make it impossible to detect what log level a logger has and also make it impossible to either change the log level or create a copy of an existing logger, but with a different log level.

Proposed Feature

Create something like NewLoggerWithIncreasedLogLevel(logger, level) that would return itself if the provided logger has at least the provided level, or else create a copy of the logger with the provided level. Then, SDK users could call that at the start of their upgrade handlers so that it'd all run with the desired level, but revert back to the configured one once the upgrade is done and the node is running again. You (the SDK) could even use that to update the logger prior to calling the upgrade handler.

Relatedly, it'd also be nice to see the SDK add periodic log messages during its own upgrade processes. All of the long-running module upgrades I've seen from the SDK haven't had any logging. So even when running with debug level logs, nothing is printed to the logs for more than 30 minutes in some cases. At the very least, it should log when its starting/ending a module upgrade.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: 📋 Backlog
Development

No branches or pull requests

1 participant