Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix error in chatroom after leaving a group #1718

Conversation

Silver-IT
Copy link
Member

@Silver-IT Silver-IT requested a review from taoeffect September 8, 2023 06:09
@Silver-IT Silver-IT self-assigned this Sep 8, 2023
@Silver-IT Silver-IT linked an issue Sep 8, 2023 that may be closed by this pull request
Copy link
Member

@taoeffect taoeffect left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review ready!

frontend/model/contracts/chatroom.js Outdated Show resolved Hide resolved
frontend/controller/actions/group.js Outdated Show resolved Hide resolved
@Silver-IT Silver-IT force-pushed the 1645-error-after-leaving-a-group-related-to-chatroom branch from 7cb122e to 4623096 Compare September 11, 2023 02:54
@Silver-IT Silver-IT changed the base branch from master to e2e-protocol September 11, 2023 02:55
Copy link
Member Author

@Silver-IT Silver-IT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@taoeffect, seems ready for your review.

Copy link
Member

@taoeffect taoeffect left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Preliminary review ready

package.json Outdated Show resolved Hide resolved
frontend/model/contracts/chatroom.js Outdated Show resolved Hide resolved
Copy link
Member

@taoeffect taoeffect left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Merge conflicts + continuing questions about operator

frontend/model/contracts/chatroom.js Outdated Show resolved Hide resolved
@Silver-IT Silver-IT requested a review from taoeffect September 17, 2023 23:27
Copy link
Member

@taoeffect taoeffect left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, interesting. This PR implements the same strategy as the comment you left on Slack, and I believe it has the same issue as what I mentioned - namely that by removing the sideEffect we now no longer leave the group on the 2nd device.

However I found additional issues in the PR while following the steps here.

In [Tab1] this happens after u1 leaves the group: (EDIT: I think Ricardo & I just fixed this in a recent PR that was merged to e2e-protocol!)

Screenshot 2023-10-20 at 9 39 53 AM

In [Tab2] u1 remains in the group and this appears in the console:

Screenshot 2023-10-20 at 9 40 02 AM

Since both you and @corrideat are working on this same thing, and since @corrideat is also very deep in the details of what is going on here, I'm going to ask him to have a look at this issue and see if he can address it fully, so don't worry about updating this PR.

@taoeffect taoeffect added the Note:Reassigned These PRs have been reassigned to someone else label Oct 20, 2023
@Silver-IT
Copy link
Member Author

This PR implements the same strategy as the comment you left on Slack, and I believe it has the same issue as what I mentioned - namely that by removing the sideEffect we now no longer leave the group on the 2nd device.

This change in this PR fixes the error of leaving group when there are several devices that user signed in on.
All of the devices get updated by syncing the contract messages. One message will leave user from all the devices. The message should be created only once if the user leaves group and chatroom once
But the previous code that calling gi.actions/chatroom/leave message in sideEffect creates several messages of gi.contracts/chatroom/leave so that's why @corrideat ignores multiple messages by ignoring throw new Error and using console.warn.

@Silver-IT
Copy link
Member Author

In [Tab1] this happens after u1 leaves the group: (EDIT: I think Ricardo & I just fixed this in a recent PR that was merged to e2e-protocol!)
In [Tab2] u1 remains in the group and this appears in the console:

I don't think these errors are related to this Issue #1718 (anything related to chatroom or leaving group).
These errors are mainly because of unstable login process.

@corrideat
Copy link
Member

@Silver-IT I've reviewed this PR and, while it looks like it'd work, I see a couple of issues:

  1. Although we already have actions sending other actions, like you're doing, I think doing that across contracts should be avoided. One of the reasons is that it isn't very fault tolerant, and another reason is that it breaks contract isolation (we need to break this sometimes, but I'd say it should be a last resort)
  2. This solutions works for this case but it doesn't solve the general issue of potentially duplicated actions.
  3. It's unclear for developers what should be a side-effect and what shouldn't be, and this is an example where this choice can have subtle implications.
  4. Not an issue, but I like that you got rid of the leavingGroup field.

I've made a fix that addresses this in a different way; mostly keeping things as they are but adding a new hook to avoid duplicate messages. It also avoids sending the leaveChatRoom action when leaving a group. See commit 3b1987e.

@taoeffect taoeffect mentioned this pull request Oct 24, 2023
@taoeffect
Copy link
Member

Closing in favor of #1779

@taoeffect taoeffect closed this Oct 24, 2023
@corrideat
Copy link
Member

corrideat commented Oct 25, 2023

I think it's true that we should avoid making contractA's actions (gi.actions/...) in contractB's sideEffect (gi.contracts/...). But I think it's fine to have actions sending other actions. This approach doesn't affect across the other contracts.

Actions sending other actions is fine when it happens for the same contract. Actions sending other actions for a different contract is problematic because it can lead to an inconsistent state or a deadlock, since now you have multiple contracts to take care of.

I can see that your solution is calling gi.actions/chatroom/leave in the gi.contracts/group/removeMember/sideEffect. My solution was to avoid this call.

I guess you mean that you avoid gi.actions/chatroom/leave multiple times? Because you're still calling it right here https://github.com/okTurtles/group-income/pull/1718/files#diff-2eacbfa3933896fc06878c4b26a2b9cae799d62e8d2086152a3e4ed37d33ca93R720.

Actually, having checked your solution, it could fix the issue (I haven't tested it yet). But I don't like one thing that trying to call gi.actions/chatroom/leave many times (even though one of them is executed and others could be ignored) in gi.contracts/group/removeMember/sideEffect throughout all the group members. Why should many times of calling (sbp call gi.actions/chatroom/leave) is necessary?

Apart from avoiding cross-contract issues by using a side-effect (which is AFAIU also an advantage for the chel command), another reason for this is to achieve eventual consistency. One of the issues with the previous solutions is / was that leaving a group required multiple actions in the group. This means that multiple other people have the option of doing this

You had:

  • /group/remove*
  • /group/leaveChatRoom
  • /chatroom/leave

Suppose that a client for some reason leaves the group but the /group/leaveChatRoom or /chatroom/leave actions are not sent. In this case, you end up in a situation where you're still a 'member' of the chatroom even when you shouldn't. By making this a side-effect, you ensure that such incorrect states can get corrected.

If you see, I'm checking before making the action as well. However, you're right that making the call multiple times is not ideal. I think, ideally, we could have some kind of key to check whether that action is queued or not (although, this is something that should be benchmarked, the answer may not be obvious).

preSendHook seems to be used to ignore all the calls except the first one. gi.actions/chatroom/leave should be called only one time if the user left the group once.

Small correction, preSendCheck isn't ignoring all calls but the first; it could be ignoring all calls or even none. Its purpose is to introduce a recall mechanism, i.e., to make sending a message conditional. But yes, in this case in most situations either all calls will be ignored or none of them will be.

Not sure what you mean.

I mean that this issue shows that where checks are done and how they are done can affect (and break) the application in ways that are non-obvious and that avoiding this as much as possible by being consistent is a good way to avoid it.

Right now, we have:

Actions in controllers

These define outgoing operations and do not affect the state

Action processors

These define changes to the state

Action side effects

These define actions / changes to a different state (e.g., the UI, another contract (indirectly), etc.)

IMHO, sending a leave action from the group to the chatroom should be a part of the sideEffects because that's an action that needs to happen in another contract.

OTOH, actions in controllers could work, but they should not be used for ensuring state correctness. I know that this is already done in some places, and I think we should get rid of this as much as possible.

leavingGroup is already removed in another PR.

@Silver-IT
Copy link
Member Author

Silver-IT commented Oct 25, 2023

Actions sending other actions is fine when it happens for the same contract. Actions sending other actions for a different contract is problematic because it can lead to an inconsistent state or a deadlock, since now you have multiple contracts to take care of.

  • I don't think actions sending other actions doesn't make lead to the deadlock. A very huge number of actions could be made in a second when there are many active users in our platform. Each action works individually, and as a result it could lead to the inconsistent state. I don't think deadlock should be happening in this case.
  • I agree with you on that it could lead to the inconsistent state.

I guess you mean that you avoid gi.actions/chatroom/leave multiple times? Because you're still calling it right here.

In my solution, it's avoided. The gi.actions/chatroom/leave is called only once for each chatroom inside the group if the user is leaved the group once.

Suppose that a client for some reason leaves the group but the /group/leaveChatRoom or /chatroom/leave actions are not sent. In this case, you end up in a situation where you're still a 'member' of the chatroom even when you shouldn't. By making this a side-effect, you ensure that such incorrect states can get corrected.

Correct. I think it's what you called inconsistent state in the above. Actually, I have already faced this kind of issue and described here about it (not same but similar). But I don't think we have solution for this at the moment. I think preSendHook couldn't be the right solution. For example in your solution, preSendHook is used to make decision to make chatroom actions (gi.actions/chatroom/leave) depending on the chatroom contract state. But the problem is that it is being called in group contract sideEffect. Those two contract states are actually individual and the process in one contract should not depend on other contract state. For example, if we call sbp('chelonia/contract/sync', contractIDs), all the contracts are being synced individually and their states are independent.

Totally

I have some agreement in your solution and also have some opinions either. What I would like to say at this moment, is that we should avoid making actions that sending several actions inside of it. But now, we don't have right solution about it and we have many actions that are working that way. I think we should discuss more and find the proper solution for all of them and start working on it. IMHO, I prefer to merge this PR, instead of the last commits for preSendHook and things related.

@corrideat
Copy link
Member

corrideat commented Oct 25, 2023

I don't think actions sending other actions doesn't make lead to the deadlock.

It can lead to a deadlock because writing to a contract may result in a sync (even if it doesn't). The reasoning for this is that you need to have the contract state and you need to be subscribed to a contract. So, if you have an action that is writing to another contract, you could end up with a cyclic dependency that deadlocks.

Upon closer inspection, it doesn't seem like it'd cause a deadlock

A very huge number of actions could be made in a second when there are many active users in our platform. Each action works individually, and as a result it could lead to the inconsistent state. I don't think deadlock should be happening in this case.

That's correct. The issue is _not_ the number of actions but rather using e.g., `await`, which can cause a deadlock.

await sbp('example/action1', {contractID: '1'})

is safe if 'example/action1' does call 'example/action2' and 'example/action3' on '1' while awaiting on them. However, it is not necessarily safe if one of those are for a different contract.

I agree with you on that it could lead to the inconsistent state.

I think that this is something we need to avoid.

But now, we don't have right solution about it and we have many actions that are working that way. I think we should discuss more and find the proper solution for all of them and start working on it. IMHO, I prefer to merge #1728, instead of the last commits for preSendHook and things related.

I still think that a side-effect here is a better approach to maintain state consistency. I'm still reviewing #1728.

@corrideat
Copy link
Member

To recap, it seems like we have two main ways of handling this:

  1. This PR, which moves the chatroom action into the group action (instead of the side-effect, as it was before). This way the leave action is sent once
  2. The other PR, which keeps the chatroom leave action as a side-effect, adding a check to prevent duplicates (it also removes the need for the leave chatroom action when leaving a group, but that's besides the point).

They are both a way of ensuring that the duplicate leave action doesn't happen. Personally, I think that this belongs into a side-effect because it's more resilient to errors or omissions. However, what you propose could also work.

@taoeffect
Copy link
Member

taoeffect commented Oct 25, 2023

@Silver-IT @corrideat

From what I've read about the discussion above, and from discussions with @corrideat on a call, here's what I've been able to discern:

  • To me, it sounds like there is not much of a difference between the approach @Silver-IT took and the one @corrideat is taking. Both approaches seem good to me. 👍
  • There is one slight advantage to having the chatroom leave actions being done in the sideEffect (@corrideat's PR) instead of in the gi.actions (@Silver-IT's PR), and that is — anyone in the chatroom can cause the person who is leaving to leave the chatroom. So because this sideEffect will be call multiple times (by each member in the room), if there was a problem with anyone's connection when sending the chatroom/leave action, then someone else might successfully call it instead.

That is a very unlikely situation, and another way to ensure that that the chatroom/leave action is called is to use the persistent action queue that @snowteamer implemented in #1768. Then the approach that @Silver-IT took would also be very robust too.

Either way - thanks to preSendCheck, I don't think there is much of a difference between sending the messages in the gi.actions or sending them in the sideEffect.

One more thing: it was my fault for not noticing sooner that both of you were working on the same issues. If I had noticed, I would have asked @Silver-IT not to work on those issues, because they are touching on low-level changes that @corrideat is also working on. My mistake.

Now that we have already merged @corrideat's PR, I think to not lose time we should just move forward with the decision we've made, because ultimately it doesn't make much of a difference whether the actions happen in the sideEffect or the gi.actions. Also, because #1728 contains the same changes as #1718, @corrideat will have to send in a separate PR to address issues #1716 and #1766. @Silver-IT, since you also worked on those issues, I would appreciate your help with reviewing @corrideat's PR when he sends in those changes. Again, my apologies for not noticing that he was working on the same thing.

@Silver-IT
Copy link
Member Author

Silver-IT commented Oct 25, 2023

@taoeffect, I don't think these issues(#1766, #1645) are related to the low-level changes. And #1716
seems not to be related to either (I have not fully fixed it yet, about 80% fixed. That's why I can't say with 100% confident). Please check my solution in PR #1728, I don't need any low-level changes and I've fixed them.
And I still have opinion in using preSendHook with this reason, and still do not approve the latest commits to be merged.

Also, I have already fixed them in PR #1728, so I don't think we would lose time to fix them.

@taoeffect
Copy link
Member

@Silver-IT

And I still have opinion in using preSendHook with #1718., and still do not approve the latest commits to be merged.

That link is broken but I think you're referring to this:

Correct. I think it's what you called inconsistent state in the above. Actually, I have already faced this kind of issue and described #1334 (comment) about it (not same but similar). But I don't think we have solution for this at the moment. I think preSendHook couldn't be the right solution. For example in your solution, preSendHook is used to make decision to make chatroom actions (gi.actions/chatroom/leave) depending on the chatroom contract state. But the problem is that it is being called in group contract sideEffect. Those two contract states are actually individual and the process in one contract should not depend on other contract state. For example, if we call sbp('chelonia/contract/sync', contractIDs), all the contracts are being synced individually and their states are independent.

I'll let @corrideat respond to that.

Please check my solution in PR #1728, I don't need any low-level changes and I've fixed them.

Also, I have already fixed them in PR #1728, so I don't think we would lose time to fix them.

From what I remember from speaking with @corrideat, there are still issues with PR #1728 and involve lower level fixes to address. See his comment here:

#1728 (comment)

@Silver-IT
Copy link
Member Author

Silver-IT commented Oct 26, 2023

From what I remember from speaking with @corrideat, there are still issues with PR #1728 and involve lower level fixes to address. See his comment here:

I am pretty sure that this PR is much stable than any other branches which are branched off of the the e2e-protocol. And the issue which @corrideat described is not related to the leaving chatroom or leaving group. It's because of unstable login process. As you can see in the PR #1728, I fixed many issues in login process. I am sure there are more, and I am currently working on debugging and fixing them. Regarding the required low-level changes, it wasn't need until now, and it might be needed later but it won't be related to the joining/leaving group. It would be related to the syncing and removing contracts.

So, two things are remaining at the moment, I think.

  • Unstable login process. Still have bugs which should be fixed regarding the updated logic of syncing/removing and joining group process
  • Low-level code might be needed. Syncing group contracts which the user has already left from are still making errors. It seems like to need low-level changes in syncing contract.

Nothing is related to the contract process, sideEffect functions. I think the best solution to fix the first stuff (Unstable login process) is to add two actions; gi.actions/identity/joinGroup and gi.actions/identity/leaveGroup. But I will keep trying to find another solution based on the current codebase, since nobody wants to add them at the moment.

@corrideat
Copy link
Member

corrideat commented Oct 26, 2023

@Silver-IT I'm responding to individual comments here, and there is an overall response at the bottom.

Correct. I think it's what you called inconsistent state in the above. Actually, I have already faced this kind of issue and described #1334 (comment) about it (not same but similar). But I don't think we have solution for this at the moment. I think preSendHook couldn't be the right solution. For example in your solution, preSendHook is used to make decision to make chatroom actions (gi.actions/chatroom/leave) depending on the chatroom contract state. But the problem is that it is being called in group contract sideEffect. Those two contract states are actually individual and the process in one contract should not depend on other contract state. For example, if we call sbp('chelonia/contract/sync', contractIDs), all the contracts are being synced individually and their states are independent.

That's a good point. I can see the potential for a race condition happening here. It depends on how things are being scheduled, and something that I'll look into.

I am pretty sure that this PR is much stable than any other branches which are branched off of the the e2e-protocol.

It could very well be, but it still doesn't work in various leaving / re-joining scenarios.

And the issue which @corrideat described is not related to the leaving chatroom or leaving group. It's because of unstable login process.

Correct.

As you can see in the PR #1728, I fixed many issues in login process.

Correct.

I am sure there are more, and I am currently working on debugging and fixing them.

That's correct, there are some further issues. Please see my response below.

I think the best solution to fix the first stuff (Unstable login process) is to add two actions; gi.actions/identity/joinGroup and gi.actions/identity/leaveGroup. But I will keep trying to find another solution based on the current codebase, since nobody wants to add them at the moment.

Sorry, this is something that has been lost in the background. Although I'm not entirely sure that these actions fix all of the issues we have regarding the login, I agree that this is probably a better way than the current login state-based approach. I also feel that for the issues it doesn't solve, it'd make it at least easier to troubleshoot. For this, I too would like this implemented.

General response

We've been having a lot of back and forth regarding these issues, and I concur that there are some unhandled situations in the login process that must be handled, as well as re. leaving and joining.

It's been unfortunate that, in addition to the issues we were already facing, I made some changes to internals.js to ensure most things are in a queue. While I hope that these changes will make things more stable in the long run, it's been a big change that may have introduced or surfaced some additional issues.

It's also been unfortunate that we've been working in parallel on the same issues, because that's time and effort that could've gone into other things. Therefore, I think that from now on we should coördinate to avoid this situation. I'm sorry on my part that communication on this hasn't been as fluid as it could've been, and that I didn't catch this sooner.

Since there are some remaining issues, I propose we do as follows:

  • I'm currently working on the login-related issues, taking some of the things from your good work in the PR as well as fixing some of the remaining issues. This should be ready soon, ideally within this week.
  • Because many of these issues are in Chelonia itself or are related to changes made in Chelonia, I'd rather take over on these issues myself.
  • If you come across any other issues in the meantime, I propose either of this:
    • You can report the issue as you've been doing with steps to reproduce.
    • If you want to also address it, you can submit a patch for that particular issue. I'll be happy to review and incorporate that.
  • I'd greatly appreciate it if, once the PR is done, you could review it thoroughly.
  • We can implement the gi.actions/identity/[join|leave]Group actions. Either of us can do it, but let's just agree on that to avoid duplicating efforts there. I'd prefer we do this after my PR is merged, because I can see it introducing some otherwise difficult to solve conflicts.
    • You mentioned that this would fix some of the login issues. Can you elaborate on this?

How does that sound?

@Silver-IT
Copy link
Member Author

Silver-IT commented Oct 26, 2023

You mentioned that this would fix some of the login issues. Can you elaborate on this?

@corrideat, I have been working in debugging and fixing login issues in the PR #1728. Having checked the console very carefully, I can say that it is currently working well as I expected, no more redundant process it is making anymore. The main problem is the login process, especially managing loginState. Now, we call gi.actions/group/join by the user's loginState when try to login, and also loginState is something that can be set by the group contractIDs from the rootState. There are two ways to set loginState. That's the main reason of the error in login process.

If we implement gi.actions/identity/joinGroup and gi.actions/identity/leaveGroup, there won't be the above issue. We can trust 100% the groupIds(meaning the joined groupIDs) in identity contract state, and we can just sync them when logging in. If it (to join and leave group) works this way, there will be only one way to set groupIDs (it's loginState now), and the login process would be much cleaner.

@Silver-IT Silver-IT deleted the 1645-error-after-leaving-a-group-related-to-chatroom branch March 7, 2024 07:05
@Silver-IT Silver-IT restored the 1645-error-after-leaving-a-group-related-to-chatroom branch March 7, 2024 07:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Note:Reassigned These PRs have been reassigned to someone else
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Error after leaving a group related to chatroom
3 participants