feat!: throttle with retries provider changes #1230

Merged
50 commits
7691bf5
wip, tests not fixed yet
shaspitz Jun 20, 2023
a10a239
rm packet query on provider
shaspitz Aug 22, 2023
8a557b3
rm unneeded UTs
shaspitz Aug 22, 2023
196ce38
rm tests from relay_test
shaspitz Aug 22, 2023
0f27c31
rm query and more tests
shaspitz Aug 22, 2023
cf09f5f
rm more tests
shaspitz Aug 22, 2023
87ad0f4
Merge branch 'main' into shawn/throttle-with-retries-provider-changes
shaspitz Aug 22, 2023
7e6264f
builds again and rm debug tests
shaspitz Aug 22, 2023
5e4b845
lint
shaspitz Aug 22, 2023
8350956
fix handling of slash packet and integration test
shaspitz Aug 22, 2023
6d20dd1
Fix TestMultiConsumerSlashPacketThrottling
shaspitz Aug 23, 2023
461878c
fix two more slashing integration tests
shaspitz Aug 23, 2023
56242a6
Update TestSlashRetries, cleanup neededc
shaspitz Aug 23, 2023
1d963fa
cleaned up TestSlashRetries
shaspitz Aug 23, 2023
e8acd9e
UT for TestOnRecvDowntimeSlashPacket
shaspitz Aug 23, 2023
8ed33f3
cleans
shaspitz Aug 24, 2023
ecac6a4
use helper in throttle test
shaspitz Aug 24, 2023
f6d4650
lintz
shaspitz Aug 24, 2023
956e595
Revert "rm packet query on provider"
shaspitz Aug 24, 2023
db8dc1b
cmd file too
shaspitz Aug 24, 2023
78a8269
fully restore query
shaspitz Aug 24, 2023
73db33b
Merge branch 'main' into shawn/throttle-with-retries-provider-changes
shaspitz Aug 24, 2023
5bfccc3
Revert "Merge branch 'main' into shawn/throttle-with-retries-provider…
shaspitz Aug 24, 2023
5196394
make e2e test pass, with todos
shaspitz Aug 25, 2023
37e0e93
clean
shaspitz Aug 25, 2023
b1cb354
Revert "Revert "Merge branch 'main' into shawn/throttle-with-retries-…
shaspitz Aug 25, 2023
599854a
lint
shaspitz Aug 25, 2023
8945156
Update CHANGELOG.md
shaspitz Aug 25, 2023
0544fd3
slightly longer buffer
shaspitz Aug 25, 2023
d8f5690
merge main
shaspitz Aug 25, 2023
f91cb70
build(deps): bump actions/checkout from 3 to 4 (#1257)
dependabot[bot] Sep 5, 2023
aca8362
build(deps)!: bump github.com/cosmos/ibc-go/v7 from 7.2.0 to 7.3.0 (#…
dependabot[bot] Sep 5, 2023
66adc8a
build(deps): bump github.com/cosmos/cosmos-sdk from 0.47.4 to 0.47.5 …
dependabot[bot] Sep 5, 2023
cc9064d
chore: Separate semver (#1217)
shaspitz Sep 5, 2023
6da7fef
docs: cleanup changelog (#1260)
mpoke Sep 5, 2023
840d290
fix!: validate MsgTransfer before calling Transfer() (#1244)
mpoke Sep 5, 2023
8ec7bc5
docs: Create adr-012-separate-releasing.md (#1229)
shaspitz Sep 5, 2023
b152c03
fix: remove addr validation for provider fee pool addr param (#1262)
NeverHappened Sep 6, 2023
6bdfff9
Merge branch 'main' into shawn/throttle-with-retries-provider-changes
shaspitz Sep 6, 2023
3b27006
Revert "Merge branch 'main' into shawn/throttle-with-retries-provider…
shaspitz Sep 6, 2023
afa32f4
Merge branch 'feat/throttle-complete-attempt-2' into shawn/throttle-w…
shaspitz Sep 6, 2023
6ee88e2
fmt
shaspitz Sep 6, 2023
1ed2f56
Update steps_downtime.go
shaspitz Sep 14, 2023
fcf0b92
Update tests/e2e/steps_downtime.go
shaspitz Sep 14, 2023
53c02ea
waittime instead of sleep
shaspitz Sep 14, 2023
e9d745d
Update x/ccv/provider/client/cli/query.go
shaspitz Sep 14, 2023
b685958
Update x/ccv/provider/client/cli/query.go
shaspitz Sep 14, 2023
1ceddcb
rm throttled packets from query
shaspitz Sep 14, 2023
e228953
rm provider query
shaspitz Sep 14, 2023
7d3dd64
whoopsies forgot to rm some boilerplate
shaspitz Sep 14, 2023
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -4,6 +4,7 @@

Add an entry to the unreleased provider section whenever merging a PR to main that is not targeted at a specific release. These entries will eventually be included in a provider release.

* (feat!) [#1230](https://github.com/cosmos/interchain-security/pull/1230) Throttle with retries provider changes.
* (feature!) [#1244](https://github.com/cosmos/interchain-security/pull/1244) Update the default consumer unbonding period to 2 weeks.
* (deps) [#1259](https://github.com/cosmos/interchain-security/pull/1259) Bump [cosmos-sdk](https://github.com/cosmos/cosmos-sdk) to [v0.47.5](https://github.com/cosmos/cosmos-sdk/releases/tag/v0.47.5).
* (deps!) [#1258](https://github.com/cosmos/interchain-security/pull/1258) Bump [ibc-go](https://github.com/cosmos/ibc-go) to [v7.3.0](https://github.com/cosmos/ibc-go/releases/tag/v7.3.0).
37 changes: 0 additions & 37 deletions proto/interchain_security/ccv/provider/v1/query.proto
@@ -1,4 +1,4 @@
syntax = "proto3";

Check failure on line 1 in proto/interchain_security/ccv/provider/v1/query.proto (GitHub Actions / break-check):
Previously present message "QueryThrottledConsumerPacketDataRequest" was deleted from file.
Previously present message "QueryThrottledConsumerPacketDataResponse" was deleted from file.
Previously present message "ThrottledPacketDataWrapper" was deleted from file.
Previously present message "ThrottledSlashPacket" was deleted from file.

package interchain_security.ccv.provider.v1;

option go_package = "github.com/cosmos/interchain-security/v3/x/ccv/provider/types";
@@ -10,7 +10,7 @@
import "interchain_security/ccv/v1/shared_consumer.proto";
import "interchain_security/ccv/v1/wire.proto";

service Query {

Check failure on line 13 in proto/interchain_security/ccv/provider/v1/query.proto (GitHub Actions / break-check):
Previously present RPC "QueryThrottledConsumerPacketData" on service "Query" was deleted.

// ConsumerGenesis queries the genesis state needed to start a consumer chain
// whose proposal has been accepted
rpc QueryConsumerGenesis(QueryConsumerGenesisRequest)
@@ -65,14 +65,6 @@
"/interchain_security/ccv/provider/throttle_state";
}

// QueryThrottledConsumerPacketData returns a list of pending packet data
// instances (slash packet and vsc matured) for a single consumer chain
rpc QueryThrottledConsumerPacketData(QueryThrottledConsumerPacketDataRequest)
returns (QueryThrottledConsumerPacketDataResponse) {
option (google.api.http).get =
"/interchain_security/ccv/provider/pending_consumer_packets";
}

// QueryRegisteredConsumerRewardDenoms returns a list of consumer reward
// denoms that are registered
rpc QueryRegisteredConsumerRewardDenoms(
@@ -141,7 +133,7 @@

message QueryThrottleStateRequest {}

message QueryThrottleStateResponse {

Check failure on line 136 in proto/interchain_security/ccv/provider/v1/query.proto (GitHub Actions / break-check):
Previously present field "4" with name "packets" on message "QueryThrottleStateResponse" was deleted.

// current slash_meter state
int64 slash_meter = 1;
// allowance of voting power units (int) that the slash meter is given per
@@ -151,35 +143,6 @@
// full
google.protobuf.Timestamp next_replenish_candidate = 3
[ (gogoproto.stdtime) = true, (gogoproto.nullable) = false ];
// data relevant to currently throttled slash packets
repeated ThrottledSlashPacket packets = 4;
}

message QueryThrottledConsumerPacketDataRequest { string chain_id = 1; }

message QueryThrottledConsumerPacketDataResponse {
string chain_id = 1;
uint64 size = 2;
repeated ThrottledPacketDataWrapper packetDataInstances = 3
[ (gogoproto.nullable) = false ];
}

// A query wrapper type for the global entry and data relevant to a throttled
// slash packet.
message ThrottledSlashPacket {
interchain_security.ccv.provider.v1.GlobalSlashEntry global_entry = 1
[ (gogoproto.nullable) = false ];
interchain_security.ccv.v1.SlashPacketData data = 2
[ (gogoproto.nullable) = false ];
}

// ThrottledPacketDataWrapper contains either SlashPacketData or
// VSCMaturedPacketData
message ThrottledPacketDataWrapper {
oneof data {
interchain_security.ccv.v1.SlashPacketData slash_packet = 1;
interchain_security.ccv.v1.VSCMaturedPacketData vsc_matured_packet = 2;
}
}

message QueryRegisteredConsumerRewardDenomsRequest {}
39 changes: 15 additions & 24 deletions tests/e2e/actions.go
@@ -1937,50 +1937,41 @@ func (tr TestRun) assignConsumerPubKey(action assignConsumerPubKeyAction, verbos
tr.waitBlocks(chainID("provi"), 2, 30*time.Second)
}

// slashThrottleDequeue polls slash queue sizes until nextQueueSize is achieved
type slashThrottleDequeue struct {
chain chainID
currentQueueSize int
nextQueueSize int
// slashMeterReplenishmentAction polls the slash meter on provider until value is achieved
type slashMeterReplenishmentAction struct {
targetValue int64
// panic if timeout is exceeded
timeout time.Duration
}

func (tr TestRun) waitForSlashThrottleDequeue(
action slashThrottleDequeue,
func (tr TestRun) waitForSlashMeterReplenishment(
Contributor Author:
The provider's slash packet queue size is no longer relevant, but the slash meter value is. I've replaced this action with one that waits for the slash meter to be replenished.

action slashMeterReplenishmentAction,
verbose bool,
) {
timeout := time.Now().Add(action.timeout)
initialGlobalQueueSize := int(tr.getGlobalSlashQueueSize())
initialSlashMeter := tr.getSlashMeter()

if initialGlobalQueueSize != action.currentQueueSize {
panic(fmt.Sprintf("wrong initial queue size: %d - expected global queue: %d\n", initialGlobalQueueSize, action.currentQueueSize))
if initialSlashMeter >= 0 {
panic(fmt.Sprintf("No need to wait for slash meter replenishment, current value: %d", initialSlashMeter))
Contributor:
Is it a bad idea to panic here? It makes the tests flaky: if they happen to run slowly, the slash meter might already be full, right?

Contributor Author:
Hmm, I don't see a good way to decouple the slash meter logic from time. Any test for throttling must have good, mostly deterministic control over chain time.

}

for {
globalQueueSize := int(tr.getGlobalSlashQueueSize())
chainQueueSize := int(tr.getConsumerChainPacketQueueSize(action.chain))
slashMeter := tr.getSlashMeter()
if verbose {
fmt.Printf("waiting for packed queue size to reach: %d - current: %d\n", action.nextQueueSize, globalQueueSize)
fmt.Printf("waiting for slash meter to be replenished, current value: %d\n", slashMeter)
}

// check if global queue size is equal to chain queue size
if globalQueueSize == chainQueueSize && globalQueueSize == action.nextQueueSize { //nolint:gocritic // this is the comparison that we want here.
// check if meter has reached target value
if slashMeter >= action.targetValue {
break
}

if time.Now().After(timeout) {
panic(fmt.Sprintf("\n\n\nwaitForSlashThrottleDequeuemethod has timed out after: %s\n\n", action.timeout))
panic(fmt.Sprintf("\n\nwaitForSlashMeterReplenishment has timed out after: %s\n\n", action.timeout))
}

time.Sleep(500 * time.Millisecond)
tr.WaitTime(5 * time.Second)
}
// wair for 2 blocks to be created
// allowing the jailing to be incorporated into voting power
tr.waitBlocks(action.chain, 2, time.Minute)
}

func uintPointer(i uint) *uint {
return &i
}

// GetPathNameForGorelayer returns the name of the path between two given chains used by Gorelayer.
4 changes: 2 additions & 2 deletions tests/e2e/main.go
@@ -242,8 +242,8 @@ func (tr *TestRun) runStep(step Step, verbose bool) {
tr.registerRepresentative(action, verbose)
case assignConsumerPubKeyAction:
tr.assignConsumerPubKey(action, verbose)
case slashThrottleDequeue:
tr.waitForSlashThrottleDequeue(action, verbose)
case slashMeterReplenishmentAction:
tr.waitForSlashMeterReplenishment(action, verbose)
case startRelayerAction:
tr.startRelayer(action, verbose)
case registerConsumerRewardDenomAction:
40 changes: 5 additions & 35 deletions tests/e2e/state.go
@@ -168,19 +168,6 @@ func (tr TestRun) getChainState(chain chainID, modelState ChainState) ChainState
chainState.ProviderKeys = &providerKeys
}

if modelState.GlobalSlashQueueSize != nil {
globalQueueSize := tr.getGlobalSlashQueueSize()
chainState.GlobalSlashQueueSize = &globalQueueSize
}

if modelState.ConsumerChainQueueSizes != nil {
consumerChainQueueSizes := map[chainID]uint{}
for c := range *modelState.ConsumerChainQueueSizes {
consumerChainQueueSizes[c] = tr.getConsumerChainPacketQueueSize(c)
}
chainState.ConsumerChainQueueSizes = &consumerChainQueueSizes
}

if modelState.RegisteredConsumerRewardDenoms != nil {
registeredConsumerRewardDenoms := tr.getRegisteredConsumerRewardDenoms(chain)
chainState.RegisteredConsumerRewardDenoms = &registeredConsumerRewardDenoms
@@ -667,9 +654,10 @@ func (tr TestRun) getProviderAddressFromConsumer(consumerChain chainID, validato
return addr
}

func (tr TestRun) getGlobalSlashQueueSize() uint {
func (tr TestRun) getSlashMeter() int64 {
Contributor:
Should this take the chain as an argument instead of always using provi?

Contributor Author:
IMO no, because the slash meter is only relevant to the provider. If we ever plan to have multiple providers in these tests then that's a different story, but I don't see that happening.

//#nosec G204 -- Bypass linter warning for spawning subprocess with cmd arguments.
cmd := exec.Command("docker", "exec", tr.containerConfig.instanceName, tr.chainConfigs[chainID("provi")].binaryName,
cmd := exec.Command("docker", "exec",
tr.containerConfig.instanceName, tr.chainConfigs[chainID("provi")].binaryName,

"query", "provider", "throttle-state",
`--node`, tr.getQueryNode(chainID("provi")),
@@ -680,26 +668,8 @@ func (tr TestRun) getGlobalSlashQueueSize() uint {
log.Fatal(err, "\n", string(bz))
}

packets := gjson.Get(string(bz), "packets").Array()
return uint(len(packets))
}

func (tr TestRun) getConsumerChainPacketQueueSize(consumerChain chainID) uint {
//#nosec G204 -- Bypass linter warning for spawning subprocess with cmd arguments.
cmd := exec.Command("docker", "exec", tr.containerConfig.instanceName, tr.chainConfigs[chainID("provi")].binaryName,

"query", "provider", "throttled-consumer-packet-data",
string(consumerChain),
`--node`, tr.getQueryNode(chainID("provi")),
`-o`, `json`,
)
bz, err := cmd.CombinedOutput()
if err != nil {
log.Fatal(err, "\n", string(bz))
}

size := gjson.Get(string(bz), "size").Uint()
return uint(size)
slashMeter := gjson.Get(string(bz), "slash_meter")
return slashMeter.Int()
}
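
For readers without the e2e harness at hand, the extraction step in `getSlashMeter` can be sketched with the standard library alone. This swaps `gjson` for `encoding/json`, so it is not the PR's code. `parseSlashMeter` is a hypothetical name and the sample payload is made up rather than captured from a node. The sketch accepts the field as either a JSON number or a quoted string, since the exact encoding of the CLI output is an assumption here.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// parseSlashMeter extracts the "slash_meter" field from JSON output of the
// throttle-state query. Hypothetical helper: it accepts the value encoded
// either as a JSON number (-23) or as a quoted string ("-23").
func parseSlashMeter(bz []byte) (int64, error) {
	var fields map[string]json.RawMessage
	if err := json.Unmarshal(bz, &fields); err != nil {
		return 0, fmt.Errorf("decoding throttle state: %w", err)
	}
	raw, ok := fields["slash_meter"]
	if !ok {
		return 0, fmt.Errorf("slash_meter field missing")
	}
	// Strip surrounding quotes, if any, before parsing the integer.
	return strconv.ParseInt(strings.Trim(string(raw), `"`), 10, 64)
}

func main() {
	// Made-up sample payload for illustration.
	out := []byte(`{"slash_meter":"-23","slash_meter_allowance":"10"}`)
	meter, err := parseSlashMeter(out)
	fmt.Println(meter, err)
}
```

Unlike `gjson.Get(...).Int()`, which yields 0 when the field is absent, this variant returns an error, so a missing field cannot be mistaken for a meter value of zero.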

func (tr TestRun) getRegisteredConsumerRewardDenoms(chain chainID) []string {
73 changes: 15 additions & 58 deletions tests/e2e/steps_downtime.go
@@ -56,7 +56,8 @@ func stepsDowntime(consumerName string) []Step {
chainID(consumerName): ChainState{
ValPowers: &map[validatorID]uint{
validatorID("alice"): 509,
validatorID("bob"): 500,
// Bob's stake may or may not be slashed at this point depending on comet vs cometmock
// See https://github.com/cosmos/interchain-security/issues/1304
validatorID("carol"): 501,
},
},
@@ -278,7 +279,7 @@ func stepsThrottledDowntime(consumerName string) []Step {
validator: validatorID("bob"),
},
state: State{
// slash packet queued on consumer, but powers not affected on either chain yet
// slash packet queued for bob on consumer, but powers not affected on either chain yet
chainID("provi"): ChainState{
ValPowers: &map[validatorID]uint{
validatorID("alice"): 511,
@@ -312,11 +313,6 @@ func stepsThrottledDowntime(consumerName string) []Step {
validatorID("bob"): 0, // bob is jailed
validatorID("carol"): 500,
},
// no provider throttling engaged yet
GlobalSlashQueueSize: uintPointer(0),
ConsumerChainQueueSizes: &map[chainID]uint{
chainID(consumerName): uint(0),
},
},
chainID(consumerName): ChainState{
// VSC packet applying jailing is not yet relayed to consumer
@@ -328,13 +324,13 @@ func stepsThrottledDowntime(consumerName string) []Step {
},
},
},
// Invoke carol downtime slash on consumer
{
action: downtimeSlashAction{
chain: chainID(consumerName),
validator: validatorID("carol"),
},
state: State{
// powers not affected on either chain yet
chainID("provi"): ChainState{
ValPowers: &map[validatorID]uint{
validatorID("alice"): 511,
@@ -343,10 +339,9 @@ func stepsThrottledDowntime(consumerName string) []Step {
},
},
chainID(consumerName): ChainState{
// VSC packet applying jailing is not yet relayed to consumer
ValPowers: &map[validatorID]uint{
validatorID("alice"): 511,
validatorID("bob"): 500,
validatorID("bob"): 500, // VSC packet jailing bob is not yet relayed to consumer
validatorID("carol"): 500,
},
},
@@ -364,42 +359,35 @@ func stepsThrottledDowntime(consumerName string) []Step {
ValPowers: &map[validatorID]uint{
validatorID("alice"): 511,
validatorID("bob"): 0,
validatorID("carol"): 500, // not slashed due to throttling
},
GlobalSlashQueueSize: uintPointer(1), // carol's slash request is throttled
ConsumerChainQueueSizes: &map[chainID]uint{
chainID(consumerName): uint(1),
validatorID("carol"): 500, // slash packet for carol recv by provider, carol not slashed due to throttling
},
},
chainID(consumerName): ChainState{
ValPowers: &map[validatorID]uint{
validatorID("alice"): 511,
validatorID("bob"): 0,
validatorID("bob"): 0, // VSC packet applying bob jailing is also relayed and recv by consumer
validatorID("carol"): 500,
},
},
},
},
// TODO(Shawn): Improve this test to have the consumer retry its downtime slash, and to assert queue size on consumer.
Contributor Author:
Note this TODO: the e2e test ends without retrying the slash packet. This will be addressed in a separate PR.

Contributor:
Just to confirm, we are removing the queue size from the chain state despite this TODO, right? It reads to me as if that would reintroduce it, but maybe I'm missing context.

Contributor Author:
Good question. We are removing the queues from the provider in this PR. The queues exist on the consumers as of #1024, so an upcoming e2e test will assert queue size on the consumer.

// See https://github.com/cosmos/interchain-security/issues/1103 and https://github.com/cosmos/interchain-security/issues/1233
{
action: slashThrottleDequeue{
chain: chainID(consumerName),
currentQueueSize: 1,
nextQueueSize: 0,
action: slashMeterReplenishmentAction{
targetValue: 0, // We just want slash meter to be non-negative

// Slash meter replenish fraction is set to 10%, replenish period is 20 seconds, see config.go
// Meter is initially at 10%, decremented to -23% from bob being jailed. It'll then take three replenishments
// for meter to become positive again. 3*20 = 60 seconds + buffer = 80 seconds
timeout: 80 * time.Second,
// for meter to become positive again. 3*20 = 60 seconds + buffer = 100 seconds
timeout: 100 * time.Second,
Contributor:
Just curious, was there a reason the buffer needs to be increased?

Contributor Author:
I think this is an artifact of #1304 and trying to get this to work with cometmock

},
state: State{
chainID("provi"): ChainState{
ValPowers: &map[validatorID]uint{
validatorID("alice"): 511,
validatorID("bob"): 0,
validatorID("carol"): 0, // Carol is jailed upon packet being handled on provider
},
GlobalSlashQueueSize: uintPointer(0), // slash packets dequeued
ConsumerChainQueueSizes: &map[chainID]uint{
chainID(consumerName): 0,
validatorID("carol"): 500, // Carol still not slashed, packet must be retried
},
},
chainID(consumerName): ChainState{
@@ -412,36 +400,5 @@ func stepsThrottledDowntime(consumerName string) []Step {
},
},
},
// A block is incremented each action, hence why VSC is committed on provider,
// and can now be relayed as packet to consumer
{
action: relayPacketsAction{
chainA: chainID("provi"),
chainB: chainID(consumerName),
port: "provider",
channel: 0,
},
state: State{
chainID("provi"): ChainState{
ValPowers: &map[validatorID]uint{
validatorID("alice"): 511,
validatorID("bob"): 0,
validatorID("carol"): 0,
},
GlobalSlashQueueSize: uintPointer(0),
ConsumerChainQueueSizes: &map[chainID]uint{
chainID(consumerName): 0,
},
},
chainID(consumerName): ChainState{
ValPowers: &map[validatorID]uint{
validatorID("alice"): 511,
// throttled update gets to consumer
validatorID("bob"): 0,
validatorID("carol"): 0,
},
},
},
},
}
}
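
The timeout arithmetic in the `slashMeterReplenishmentAction` comment (meter at -23%, 10% replenish fraction, 20-second period) can be checked with a small worked example. This is a simplified sketch: `replenishmentsNeeded` is a hypothetical helper, values are expressed in percent of total voting power rather than the voting-power units the provider actually tracks, and it ignores the cap on the meter and the fact that the first replenishment may arrive partway through a period.

```go
package main

import (
	"fmt"
	"time"
)

// replenishmentsNeeded returns how many replenishments of size allowance are
// required to raise meter to at least target. It returns 0 if the meter is
// already at or above target, and -1 if allowance is not positive (the
// target can never be reached). Hypothetical helper for illustration.
func replenishmentsNeeded(meter, allowance, target int64) int64 {
	if meter >= target {
		return 0
	}
	if allowance <= 0 {
		return -1
	}
	deficit := target - meter
	return (deficit + allowance - 1) / allowance // ceiling division
}

func main() {
	// Values from the test comment: meter at -23%, +10% per replenishment,
	// target 0 (non-negative), one replenishment every 20 seconds.
	n := replenishmentsNeeded(-23, 10, 0)
	wait := time.Duration(n) * 20 * time.Second
	fmt.Println(n, wait) // prints: 3 1m0s
}
```

Three replenishments take the meter from -23% to 7%, which matches the 60 seconds cited in the comment; the remaining 40 seconds of the 100-second timeout are buffer.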