-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(lightPush): improve peer usage and improve readability #2155
Conversation
…ise, decouple from protocol implementations
…f peer renewal mechanism
@waku-org/js-waku I didn't include unit tests because it would increase PR too much as it requires integrating test runners into |
size-limit report 📦
|
void this.reliabilityMonitor.attemptRetriesOrRenew( | ||
connectedPeer.id, | ||
() => this.protocol.send(encoder, message, connectedPeer) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did we test this?
So seems like we are not relying on PeerManager
anymore to retrieve peers to be used for the protocols: moved from hasPeers()
which relies on PeerManager -> getConnectedPeers()
which gets all available connections, I'm curious how renewing peers would affect management. Wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is something that was there so I assume we should have tests there
for connections - yes, we move away but not from hasPeers
but form BaseProtocolSDK.connectedPeers
that proved to be out of sync quite often
the reason for it is:
- to simplify process for LightPush
- alight with
status-go
usage of LightPush that proved to be reliable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I believe hasPeers
would work with tests. Considering if it would be required to double check with getConnectedPeers
, especially as we do renewals and what not
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BaseProtocolSDK.connectedPeers()
was being inconsistent because of race conditions in shared peer management, which isn't the case with #2137
codec: string, | ||
libp2p: Libp2p |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why change this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so that we decouple protocol
specifics and operate over connections and codecs, this way we can change BaseProtocol
and not change this code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also makes me think: if we indeed are going in this direction where we don't do peer management for LightPush, which leaves us with only doing it for Filter as Store also doesn't rely on peer management.
Couple of questions:
- Should ReliabilityMonitor be re-thought? Do we still need to do any renewals if LightPush fails for a node? (renewal was simply removing the peer, instead of disconnecting with improvements in feat(filter): enhancing protocol peer management with mutex locks #2137). This is the case because often, especially with LightPush, RLN limits are hit which signify that there isn't a need for a node to disconnect, but simply be not actively used for a while and can be rotated back in. Perhaps round-robin here could also make sense between all available LP nodes.
- How does this affect peer management in general, if only one protocol is using it?
I also think so, but wasn't doing any changes not to increase PR that much
As you mentions, there are still things that can make
We still use it in |
Do you have any ideas as to how we can tackle that?
What would |
@@ -49,11 +49,6 @@ class LightPushSDK extends BaseProtocolSDK implements ILightPushSDK { | |||
message: IMessage, | |||
_options?: ProtocolUseOptions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
options are not being accounted for in send
anymore
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exactly! as I mentioned here it will be follow up
@weboko
I'm not certain how some of these tests passed on this PR, while are now failing on #2137. I'm moving my PR to draft until we are able to recognise some of these design decisions and how to approach them. Let me know your thoughts. |
agree with that as mentioned here #2137 (comment)
From design PoV I think And I want to add to it that |
do you want to follow up to address this?
Ok cool, I agree |
Problem
This PR addresses a couple of problems.
LightPush
usage of peersLightPush
currently usesrenewal
mechanics fromBaseProtocol
when we consume peers from a selected pool.But as
status-desktop
approach proved - we don't need any hustle with management of peers / connections forLightPush
and it is sufficient to use any connected peer at the moment.Another problem that exists right now(partially was solved in #2145) - a lot of errors marked as
No stream available
. After investigating it extensively I figured that what actually happens is when we try to push to a set of peers fromconnectedPeers
pool supported inBaseProtocolSDK
andconnectedPeers
is out of sync withlibp2p
connections then creation of a stream would fail and get returned.js-waku/packages/sdk/src/protocols/lightpush/index.ts
Line 87 in b93134a
waitForRemotePeer
improvementsThis is something I figured while working on it so included in the same PR as it was difficult to decouple.
In
waitForRemotePeer
we were spawning multiple promises forMetadataService
in case some connections were present. This is something that can be done and was moved intowaitForMetadata
.Some debt and refactoring
In neighboring areas I reshaped code a bit to decrease level of deepness and better readability.
Also,
LightPushSDK
was renamed toLightPush
as we discussed while ago.Solution
Implement very straightforward mechanism for
LightPush
to get peers to use.I also keep usage of
this.reliabilityMonitor.attemptRetriesOrRenew
in case a peer that was used failed and needs to be renewed or dropped.Notes
dogfooding
app and proved to be working reliably and existing E2E tests shouldn't regress.waitForRemotePeer
part of thewaku
interface #1761