Prevent duplicate event when anchoring reg or cred in multisigs #271

rodolfomiranda · 2024-07-27T19:06:05Z

This PR fixes an inconsistency that occurs when a member of a multisig joins an anchoring event (for a registry creation or a credential issuance/revocation) while its KEL already contains that event, which results in a new event with the same anchor.
The situation is easy to reproduce when the threshold of the multisig is less than the total number of members and the last member joins after a delay long enough for the event to propagate.

To fix the problem, the code now checks whether the last event already contains the anchor and, if so, avoids creating a new event in the KEL.
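
A minimal sketch of that check, assuming simplified event and seal shapes. `Seal`, `KeyEvent`, and `lastEventHasAnchor` are illustrative names and not signify-ts APIs; the sketch treats the anchor as present when a seal with the same `i`, `s`, and `d` appears in the `a` field of the last event.

```typescript
// Hypothetical, simplified shapes; the real signify-ts types carry more fields.
interface Seal {
    i: string; // identifier of the anchored registry or credential
    s: string; // sequence number of the anchored event (hex string)
    d: string; // digest of the anchored event
}

interface KeyEvent {
    t: string; // event type, e.g. "icp", "ixn", "rot"
    s: string; // sequence number in the KEL (hex string)
    a?: Seal[]; // anchored seals, if any
}

function sameSeal(x: Seal, y: Seal): boolean {
    return x.i === y.i && x.s === y.s && x.d === y.d;
}

// True when the most recent event in the KEL already carries the anchor.
function lastEventHasAnchor(events: KeyEvent[], anchor: Seal): boolean {
    if (events.length === 0) {
        return false;
    }
    const lastEvent = events[events.length - 1];
    return (lastEvent.a ?? []).some((seal) => sameSeal(seal, anchor));
}
```

A caller would skip building a new interaction event when `lastEventHasAnchor` returns `true` and proceed as before otherwise.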

codecov · 2024-07-27T23:15:26Z

Codecov Report

Attention: Patch coverage is 89.47368% with 4 lines in your changes missing coverage. Please review.

Project coverage is 83.84%. Comparing base (850af59) to head (a3e2b21).

Files	Patch %	Lines
src/keri/app/credentialing.ts	88.88%	4 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #271      +/-   ##
==========================================
+ Coverage   83.82%   83.84%   +0.02%     
==========================================
  Files          48       48              
  Lines        4229     4260      +31     
  Branches     1034     1062      +28     
==========================================
+ Hits         3545     3572      +27     
- Misses        656      684      +28     
+ Partials       28        4      -24


lenkan · 2024-08-12T10:34:07Z

In my opinion, we should try to reduce the number of API calls we make in each method. So I am wondering if there is a way to resolve this issue without adding an additional call.

What is the actual scenario here?

Member 1 creates the acdc, issuance and anchoring events.
Member 2 creates the acdc, issuance and anchoring events.
Member 3 does what exactly? And what happens when they do it?

I would expect keria to respond with HTTP 400 Bad Request if member 3 tries to create an inconsistent state.

Maybe member 3 can detect that the issuance is already anchored and it only needs to "import" the credential?

rodolfomiranda · 2024-08-12T12:57:19Z

What is currently happening (as implemented in KERIA and Signify), for example when creating a Registry:

member 1 creates an anchoring event for the registry and sends exn to other members
member 2 receives exn from member 1, creates anchoring event for the registry and sends exn to other members
member 1 receives exn from member 2. Since it's the designated member, it verifies that the signature threshold was satisfied, sends event+signatures to witnesses, collects witness receipts and distributes them among members. Finally, it creates the registry in the DB.
member 2 receives receipts, verifies threshold and creates the registry in the DB.
member 3 receives receipts (KERIA) and creates the event in the KEL. Member 3 also receives the exn to join the registry creation (Signify), so it starts creating the anchoring event on top of the last event in its KEL, which already has the anchor. It ends up creating another event with exactly the same anchor information but at sn+1. It never receives enough signatures to validate that event, so it never creates the registry. Its KEL also ends up with one more event than the other members have.
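
The sequence above can be reproduced with a toy model. This is only an illustrative sketch: `ToyEvent` and `joinNaively` are made-up names, a KEL is reduced to a list of sequence numbers and anchors, and signatures, witnesses and exn messages are left out entirely.

```typescript
// Toy model: a KEL is a list of events that only track sn and an anchor label.
interface ToyEvent {
    sn: number;
    anchor: string | null;
}

// Naive join: always build the anchoring event on top of the local KEL tip.
function joinNaively(kel: ToyEvent[], anchor: string): ToyEvent {
    const next: ToyEvent = { sn: kel.length, anchor };
    kel.push(next);
    return next;
}

const REGISTRY_ANCHOR = "registry-anchor";
const member1: ToyEvent[] = [{ sn: 0, anchor: null }];
const member2: ToyEvent[] = [{ sn: 0, anchor: null }];
const member3: ToyEvent[] = [{ sn: 0, anchor: null }];

// Members 1 and 2 each create the anchoring event at sn 1.
joinNaively(member1, REGISTRY_ANCHOR);
joinNaively(member2, REGISTRY_ANCHOR);

// Member 3 first receives the completed event (with receipts) at sn 1 ...
member3.push({ sn: 1, anchor: REGISTRY_ANCHOR });

// ... and only afterwards processes the exn and joins naively.
const duplicate = joinNaively(member3, REGISTRY_ANCHOR);

console.log(duplicate.sn, member1.length, member3.length); // 2 2 3
```

Member 3 ends up with the same anchor at sn 1 and sn 2, while members 1 and 2 stop at sn 1.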

For KERIA, the event that member 3 is trying to create is valid. It would be tricky to make it aware of those edge cases automatically, but not impossible. I added the "check" on the client side for simplicity and also because it gives more control over the decision, but I know it's not the perfect solution. Every time we automate decisions, we'll find edge cases.

Regarding the additional API call, I also agree that we should reduce those, but in this case the client needs to get the latest status of the KEL to proceed correctly. I think it would also need an extra call even if KERIA responded with 400.

iFergal · 2024-08-29T07:46:38Z

src/keri/app/credentialing.ts

+
+        // check if last event already has the anchor in it
+        // and avoid creating a new event if it does
+        const lastEvent = events[events.length - 1];


I'm not too familiar with registry events but I'm also curious if there's a way to generally prevent duplicate anchoring events in a KEL directly in KERIA, at least with the same signing keys.

The trouble here is if there are many concurrent things happening, and the duplicate event is e.g. not actually the lastEvent but the event before that (events.length - 2 or so on)
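
To illustrate the concern, a client-side check could scan the KEL backwards instead of looking only at the last event. This is a sketch under assumed, simplified shapes (`Seal`, `KeyEvent`, and `findAnchorIndex` are hypothetical names, not signify-ts APIs), and it does nothing about timing: the KEL can still change between the scan and the submission.

```typescript
// Hypothetical, simplified shapes for illustration only.
interface Seal {
    i: string;
    s: string;
    d: string;
}

interface KeyEvent {
    s: string; // sequence number (hex string)
    a?: Seal[]; // anchored seals, if any
}

// Returns the index of the most recent event that carries the anchor, or -1.
function findAnchorIndex(events: KeyEvent[], anchor: Seal): number {
    for (let idx = events.length - 1; idx >= 0; idx--) {
        const seals = events[idx].a ?? [];
        const found = seals.some(
            (seal) =>
                seal.i === anchor.i && seal.s === anchor.s && seal.d === anchor.d
        );
        if (found) {
            return idx;
        }
    }
    return -1;
}
```

With this variant, an anchor sitting at `events.length - 2` (or earlier) is still detected.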

lenkan · 2024-08-30

I also think this should be handled server side and in a way that avoids any race conditions.

iFergal · 2024-08-30

It might actually just be something we can add to keripy directly; maybe Sam or Phil have some ideas.

rodolfomiranda · 2024-08-30

I agree that something can also be done on the keripy/keria side. Those types of race conditions are always tricky, and recovering from them is not trivial. There's some code in keripy that assigns one of the members as the lead. And I remember Phil mentioning that they cover some race conditions, but probably not all cases.

Anyway, the goal of this PR is to solve one specific use case: one member of a multisig initiates the creation of a new event and the others join, but one of them joins late, after the event was already completed (thresholds satisfied) with the signatures of the other members. We want this tardy member to create the correct event, not a new one. And we can catch the error on the client side because we know that the same anchor is already in the KEL. There are no race conditions in this use case: one member starts the event, and the rest of the members join. This requires an extra call, but I think it's worth it since it prevents other problems.

iFergal · 2024-08-30

I think it would still be cleaner if the client side did the redundant signing when joining late and KERIA just accepted it without adding it to the KEL (as it's already a duplicate).

Doing it on the client side here doesn't cover the case I mentioned for events.length - 2, and it also doesn't cover the case where the final threshold signature arrives while the client is signing but after it did this check (which is a race condition).

Albeit we could perhaps have it as a stopgap solution

Though, then again, that would leave an inconsistent KEL locally since ultimately the controller of the KEL is on the client side, hmm
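
The server-side alternative discussed above can be modelled with a toy, in-memory KEL whose append is idempotent for anchors it already holds. This is only a sketch of the idea: `ToyKel` and `AppendResult` are invented for illustration, it is not how KERIA or keripy are implemented, and a real server would need the check and the write to happen in the same database transaction.

```typescript
// Hypothetical, simplified shapes for illustration only.
interface Seal {
    i: string;
    s: string;
    d: string;
}

interface StoredEvent {
    sn: number;
    a: Seal[];
}

type AppendResult = "appended" | "duplicate";

class ToyKel {
    private events: StoredEvent[] = [];

    get length(): number {
        return this.events.length;
    }

    private hasSeal(seal: Seal): boolean {
        return this.events.some((ev) =>
            ev.a.some((x) => x.i === seal.i && x.s === seal.s && x.d === seal.d)
        );
    }

    // Accepts the request either way, but only writes a new event when at
    // least one of the seals is not anchored yet. The check and the write
    // happen in one synchronous call, so nothing can interleave in this model.
    append(seals: Seal[]): AppendResult {
        const allAnchored =
            seals.length > 0 && seals.every((seal) => this.hasSeal(seal));
        if (allAnchored) {
            return "duplicate";
        }
        this.events.push({ sn: this.events.length, a: seals });
        return "appended";
    }
}
```

A late joiner that submits an already anchored seal gets `"duplicate"` back and the stored KEL keeps its length, although, as noted above, the client's own view of the KEL would still need to be reconciled.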

rodolfomiranda · 2024-08-30

You are right on both counts: this PR only covers the events.length - 1 case. As for the race condition, we need something in keripy, which I was trying to avoid or postpone for simplicity and because of the urgent need.
We also need a way to recover from the duplication in case it happens.

iFergal · 2024-08-30

I haven't reviewed the nitty-gritty details of it, but if it's urgent, perhaps it's OK so long as we open the appropriate issues now and tackle them soon (and not let them fall into the pile of issues :P)

2byrds · 2024-10-16

In light of this discussion, I'm interested to see what @rodolfomiranda and @iFergal think about @lenkan's PR solution #286

iFergal · 2024-10-16

@lenkan's PR makes sense to me and avoids my race condition concerns here

lenkan · 2024-08-30T09:55:38Z

@rodolfomiranda Sorry for the delay here. We are hitting this too. I think we could still end up in the same situation if we are unlucky with the timing. What do you think? Is there a way to move this check closer to the DB transaction to avoid race conditions? I am not as familiar as you with the internals of KERIA.

lenkan · 2024-10-03T15:31:39Z

I have had some more time to investigate and create other reproductions of similar issues. At the moment, I think it would be better if these methods accept the sequence number as an argument instead of trying to calculate it. That way you can pass it in from the exn message that you received from the other group participants. Does that make sense? We had a similar discussion about this here: #222 (comment)
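
A rough sketch of what that suggestion could look like, assuming a simplified interaction event that keeps only the `t`, `i`, `s`, `p`, and `a` fields. `buildAnchoringEvent` and `AnchoringArgs` are hypothetical names and not existing signify-ts methods; the point is only that `sn` comes from the caller (for example from the received exn) instead of being derived from the local KEL tip.

```typescript
// Hypothetical, simplified shapes for illustration only.
interface Seal {
    i: string;
    s: string;
    d: string;
}

interface AnchoringArgs {
    pre: string; // identifier prefix of the multisig group
    sn: number; // sequence number agreed on by the group, e.g. taken from the exn
    priorDigest: string; // digest of the event at sn - 1
    anchor: Seal;
}

interface AnchoringEvent {
    t: "ixn";
    i: string;
    s: string; // sequence number as a hex string
    p: string;
    a: Seal[];
}

function buildAnchoringEvent(args: AnchoringArgs): AnchoringEvent {
    if (!Number.isInteger(args.sn) || args.sn < 1) {
        throw new RangeError("sn must be a positive integer");
    }
    return {
        t: "ixn",
        i: args.pre,
        s: args.sn.toString(16),
        p: args.priorDigest,
        a: [args.anchor],
    };
}
```

Because every participant passes the same `sn`, a late joiner whose KEL has already advanced would rebuild the event at the agreed position rather than at its local tip plus one.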

rodolfomiranda added 7 commits July 27, 2024 15:54

avoid duplicating an anchor

97607aa

fix date iso format

c65ee57

conditionals

60ce790

pretty

be95cd2

more conditionals

83e15e0

fix test

bceda2c

pretty again

fd73bde

rodolfomiranda requested a review from lenkan July 27, 2024 23:27

rodolfomiranda added 3 commits July 29, 2024 12:39

+ tests

6dcf772

pretty

8261cd9

test coverage

a3e2b21

rodolfomiranda requested a review from m00sey July 30, 2024 01:36

m00sey requested a review from pfeairheller July 30, 2024 14:34

rodolfomiranda mentioned this pull request Aug 27, 2024

Feature request: a way for a multisig member to catch up updates WebOfTrust/keria#283

Open

iFergal reviewed Aug 29, 2024


lenkan mentioned this pull request Oct 16, 2024

fix: a way to prevent duplicated anchor events for multisig issuance #286

Open
