Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs(adr): ADR-007 pause unbonding period during equivocation proposal #964

Merged
merged 7 commits into from
Jun 12, 2023
89 changes: 89 additions & 0 deletions docs/docs/adrs/adr-007-pause-unbonding-on-eqv-prop.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,89 @@
---
sidebar_position: 2
title: ADR Template
---
# ADR 007: Pause validator unbonding during equivocation proposal

## Changelog
* 2023-05-16: Initial Draft

## Status

Proposed

## Context

(Copied from [#747](https://github.com/cosmos/interchain-security/issues/747))
mpoke marked this conversation as resolved.
Show resolved Hide resolved

Currently, if an equivocation slashing proposal is created after more than one
mpoke marked this conversation as resolved.
Show resolved Hide resolved
week has passed since the equivocation, it is possible that the validator in
question could unbond and get away without being slashed, since the unbonding
period is 3 weeks, and the voting period is 3 weeks. For this reason, it might
be good to pause unbondings for validators named in an equivocation slashing
proposal until the proposal's voting period is over.

## Decision

### How

Pausing the unbonding period is already possible thanks to the changes in the
`staking` module of the cosmos-sdk:
- `stakingKeeper.PutUnbondingOnHold` pauses an unbonding period
- `stakingKeeper.UnbondingCanComplete` unpauses an unbonding period

These methods use a counter under the hood, which means we can use them without
conflicts with the *Completion of Unbonding Operations* system. Giving an
unbonding period (already paused by definition because of the *Completion of
Unbonding Operations* system), an additional pause will just increase the
counter, so when this unbonding period has reached its maturity on provider
and all consumer chains, it will remain paused.

### When pause

The unbonding period (if there is any unbonding) should be paused once an
equivocation proposal enters the voting period. For that, the `gov` module's
hook `AfterProposalDeposit` can be used.

If the hook is triggered with a an equivocation proposal in voting period, then
for each equivocation of the proposal, the related unbonding operations of the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the related unbonding operations of the related validator must be paused.

This needs to be described more precisely. For example, unbonding operations that are no longer waiting for this particular consumer chain should probably not be put on hold. Also, if the infraction occurred after the unbonding operation was initiated, there is no reason to pause the unbonding.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good points, we'll add them.

Copy link
Contributor Author

@tbruyelle tbruyelle May 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, unbonding operations that are no longer waiting for this particular consumer chain should probably not be put on hold.

Actually, for this particular case, I wonder how this is possible. Considering the protocol, how an unbonding period can end on a consumer chain before the provider chain ?

Or maybe when the unbonding period is deliberately configured shorter between consumer and provider ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For

Also, if the infraction occurred after the unbonding operation was initiated,

Fixed dd353c2

related validator must be paused.

Note that even after the voting period has started, a proposal can receive
additional deposits. The hook is triggered however at arrival of a deposit, so
a check to verify that the proposal is not already in voting period is
required.

### When unpause

We can use a `gov` module's hook also here and it is
`AfterProposalVotingPeriodEnded`.

If the hook is triggered with an equivocation proposal, then for each
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may lead to unpausing unbonding operations that occurred after the AfterProposalDeposit hook was called. As a result, those unbondings would no longer wait for the unbonding period on the consumer to elapse.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, we'll fix that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed dd353c2

equivocation, unpause all unbonding operations of the related validator.

## Consequences

### Positive

- Validators subject to an equivocation proposal cannot finish unbonding
their tokens before the end of the voting period.

### Negative

- A malicious consumer chain could forge slash packets enabling submission of
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, this is the main argument against this proposal. The main reason for pausing unbonding operations is to enable the consumer chains to have shorter unbonding periods and avoid unbonding operations to be delayed. However, reducing the unbonding period on a consumer chain would impact the trusting period of all light clients of that consumer chain (even clients on other chains). This may lead to light clients expiring, which would be a major disruption for the consumer chain.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The alternative to this proposal is to enable the provider to verify the evidence of equivocation on consumer chains, i.e., #732.

Copy link
Contributor

@jtremback jtremback May 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mpoke I'm not sure I follow. Currently (without this PR, or cryptographic verification), we are in a situation where the trusting period should be (consumer unbonding - provider voting period) * ~0.75. With a 20 day consumer unbonding period, that means the trusting period should be about 4-5 days. This is already pretty tough for consumer chains to deal with. If we took the consumer unbonding period down to 14 days like we want to, we'd need a trusting period of 0.

With this PR or cryptographic equivocation, we can go back to the trusting period being consumer unbonding * 0.75, or 10-11 days. So at least in this respect, it seems that this PR or cryptographic verification both mitigate this issue equivalently.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jtremback Yeah, that’s indeed confusing. Sorry for that.

There are actually two problems we are trying to address here. First, ensuring the correctness of the slashing mechanism of PoS. Second, avoiding delays in the completion of unbonding operations on the provider (which would affect the user experience). Note that the first is a safety concern, while the second is a liveness concern.

For the first, we MUST make sure that the stake of a validator that double signed on a consumer chain is still locked (on the provider) after an equivocation proposal passes. Thus, the following condition MUST hold consumerTrustingPeriod < consumerUnbondingPeriod - providerVotingPeriod. In addition, we MUST leave time for the double signing to be detected, a SlashPacket to be relayed and an equivocation proposal to be submitted. Let’s denote this extra time period with delta1. Then, consumerTrustingPeriod < consumerUnbondingPeriod - delta1 - providerVotingPeriod.

For the second, the following condition SHOULD hold consumerUnbondingPeriod < providerUnbondingPeriod. In addition, we SHOULD leave time to relay both VSCPackets and VSCMaturedPackets, and to account for potential downtimes of the consumer chain (as it happened on Neutron’s launch). Let’s denote this extra time period with delta2. Then, consumerUnbondingPeriod + delta2 < providerUnbondingPeriod.

From these two inequalities, we get consumerTrustingPeriod < providerUnbondingPeriod - delta2 - delta1 - providerVotingPeriod. For the Hub, providerUnbondingPeriod = 21 days, providerVotingPeriod = 14 days, which means consumerTrustingPeriod < 7 - delta2 - delta1. Best case scenario, delta1 = 1 day and delta2 = 1day, which means consumerTrustingPeriod < 5 days. This trusting period must be used for all light clients of the consumer chain, not only the ones on the provider. This means that the chances of a consumer light client expiring are quite high (which would clearly be a very disruptive event). Also, note that these values for the deltas are very low. For example, the network has ~7 days to detect a light client attack on a Hub client (i.e., delta1 ~ 7 days). Also, as we seen with Neutron, having delta2=1day may lead to delays of unbonding operations.

The only way to deal with this problem, is to get rid of the providerVotingPeriod, i.e., consumerTrustingPeriod < providerUnbondingPeriod - delta2 - delta1. This ADR is one way of doing that. The downside is that the suggested approach effectively increases the consumerUnbondingPeriod (when an equivocation proposal is submitted), which means that a malicious consumer can break liveness (i.e., delay unbondings). Given that such an event would result in that consumer being removed (via a ConsumerRemovalProposal), I think the risks are minimal. So, we could go ahead with this proposal for now until #732 is done.

an equivocation proposal on the provider chain, resulting in the freezing of
validator's unbondings for an undeterminated amount of time.
- Misbehavior on a consumer chain can potentially go unpunished, if no one
submits an equivocation proposal in time, or if the proposal doesn't pass.

### Neutral

- This feature can't be used for social slashing, because an equivocation
proposal is only accepted if there's a slash log for the related
validator(s), meaning the consumer chain has reported the equivocation to
the provider chain.

## References

* https://github.com/cosmos/interchain-security/issues/747
* https://github.com/cosmos/interchain-security/pull/791