Update incident_response.md #334

Merged
merged 3 commits into from
Feb 21, 2024

mgcolburn

some edits throughout, additional points, and linking out to some resources with example IR plans & overall guidance

some edits, additional points, and linking out to some resources for more information

@0xicingdeath 0xicingdeath left a comment

Content looks great to me. I think CI is failing because the Prettier lint check is failing.

@mgcolburn
Author

Whoops, some whitespace snuck in on me

Contributor

@traviswpeters traviswpeters left a comment

Great changes throughout! A few minor comments.

@@ -1,48 +1,58 @@
-# Incident Response Guidelines
+# Incident Response Preparation

Since this is the main header for the whole page, I'd recommend that we either keep "Guidelines" or drop the last word altogether and just title the page "Incident Response".

In the text of this page we refer to this as "guidelines," and outside of this doc we often refer to this resource as our "Incident Response Guidelines".

- **Document the deployment and upgrade process**. Deployment and upgrade processes are risky and must be thoroughly documented. This should include how to test the deployment/upgrade (ex: using fork testing) and how to validate it (ex: using a post-deployment script).
- **Document how to contact the users and external dependencies**. Define guidelines regarding which stakeholders to contact, including the timing and mode of communication in case of incidents.
- **Assemble a runbook of common actions you may need to perform**. It's not possible or practical to exhaustively detail how you'll respond to every type of incident. But you _can_ start to document procedures for some of the more important ones as well as actions that might be common across multiple scenarios (e.g., pausing, rotating owner keys, upgrading an implementation). This can also include scripts or snippets of code to facilitate performing these actions in a reproducible manner.
- **Document how to interpret abnormal events emission**. Only emitting events isn't sufficient; proper documentation is crucial, and users should be empowered to identify and decode them.

This point could be more clear:

  • The point seems to be focused on documenting how to interpret "abnormal" events.
  • The text that follows seems more generally focused on documenting events.

Consider removing the word "abnormal" from the bolded text.

Also consider starting the non-bolded text with "Emitting events isn't sufficient..." (drop the "Only", as it sounds redundant).
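
For illustration only, here is a minimal Python sketch of what "documenting events so users can identify and decode them" could look like as a small, machine-readable catalog. Everything in it is an assumption made for the example: the `EventDoc` fields, the `CATALOG` entries, and the `interpret` helper are hypothetical names and are not part of the guidelines or of any existing tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EventDoc:
    """Documentation for one emitted event (hypothetical structure)."""
    signature: str  # Solidity-style event signature, e.g. "Paused(address)"
    meaning: str    # plain-language explanation for users
    abnormal: bool  # True if the event signals a possible incident
    action: str     # what a user or operator should do when they see it


# Hypothetical catalog entries, keyed by event signature.
CATALOG = {
    doc.signature: doc
    for doc in [
        EventDoc(
            "Paused(address)",
            "The contract was paused by the given account.",
            True,
            "Check the project's announcement channels before interacting.",
        ),
        EventDoc(
            "Deposit(address,uint256)",
            "An account deposited the given amount.",
            False,
            "No action needed.",
        ),
    ]
}


def interpret(signature: str) -> str:
    """Return a human-readable interpretation of an event signature."""
    doc = CATALOG.get(signature)
    if doc is None:
        # An event nobody documented is itself worth flagging.
        return f"{signature}: undocumented event, treat as unknown and escalate"
    level = "ABNORMAL" if doc.abnormal else "normal"
    return f"{signature} [{level}]: {doc.meaning} Action: {doc.action}"


if __name__ == "__main__":
    print(interpret("Paused(address)"))
    print(interpret("Withdraw(address,uint256)"))
```

Keeping the catalog next to the contract source would let the same entries feed both user-facing documentation and monitoring scripts, though that is a design choice rather than something the guidelines prescribe.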

- **Maintain open communication lines with your dependencies' owners**. This will help you stay informed if one of your dependencies is compromised.
- **Subscribe to https://newsletter.blockthreat.io/**. BlockThreat will help you stay informed about recent incidents.
- **Identify similar protocols, and stay informed of any issues affecting them**. This could include forks, implementations on other chains, or protocols in the same general class (e.g., other lending protocols). Being aware of vulnerabilities in similar systems can help preemptively address potential threats in your own.
- **Identify your dependencies, and follow their communication channels to be alerted in case of an issue.** Follow their Twitter, Discord, Telegram, newsletter, etc. This includes both on-chain as well as off-chain (e.g., libraries, toolchain) dependencies.

Regarding #2 from the rekt test (Do you keep documentation of all the external services, contracts, and oracles you rely on?), perhaps this is a chance to plug that here, or perhaps under Documentation.

Basically, don't just identify them but also maintain clear documentation enumerating them and the best ways known to date to monitor for relevant alerts.
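
As a rough sketch of what such documentation might look like in practice, the Python snippet below models a dependency inventory and flags entries that lack an owner contact or an alert channel. The `Dependency` fields, the `missing_monitoring` helper, and all inventory entries are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Dependency:
    """One external service, contract, oracle, library, or tool the protocol relies on."""
    name: str
    kind: str  # "on-chain" or "off-chain"
    owner_contact: str = ""  # best-known way to reach the maintainers
    alert_channels: list[str] = field(default_factory=list)  # where issues get announced


def missing_monitoring(deps: list[Dependency]) -> list[str]:
    """Return names of dependencies with no documented alert channel or owner contact."""
    return [d.name for d in deps if not d.alert_channels or not d.owner_contact]


# Hypothetical inventory entries, for illustration only.
INVENTORY = [
    Dependency("ExamplePriceOracle", "on-chain", "security@oracle.example", ["discord", "newsletter"]),
    Dependency("example-math-lib", "off-chain", "", ["github-advisories"]),
    Dependency("ExampleBridge", "on-chain", "ops@bridge.example", []),
]

if __name__ == "__main__":
    for name in missing_monitoring(INVENTORY):
        print(f"incomplete entry: {name}")
```

A check like this could run in CI so that adding a dependency without recording how to monitor it fails review, assuming the team keeps the inventory in the repository.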

@montyly montyly added this pull request to the merge queue Feb 21, 2024
Merged via the queue into master with commit 37d63bb Feb 21, 2024
3 checks passed
@montyly montyly deleted the ir-updates branch February 21, 2024 10:14
4 participants