-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Apply timedCancellable
across the board to control how long side-effects are allowed to complete
#450
Comments
pingNode
timeouttimedCancellable
across the board to control how long side-effects are allowed to complete
This is going to be a pretty large change. Converting everything to
|
I don't think the default timeout for the |
There are several places where Pretty much all of them concern around network usage.
For long running computations, we're talking about generators and related iterators. In these cases, the cancellation of generators hasn't been entirely worked out yet, and instead it is expected that you would stop consuming the generator (if you don't ask for the next item, it would continue computing). The only case where cancellation applies is when you do ask for the next item, but you must await for it, in which case you'd like to have the ability to do Remember similar to As for the network, pretty much all network operations should be cancellable or timed cancellable if there's a timeout. This requires a big major refactor of the Alternatively you can also not use the decorators, and just directly use So we have to decide whether to use static parameters and the decorators, or whether to swap to just using the It's important to understand that the decorators are for convenience, but they are not infinitely flexible, it trades off some flexibility for some better convenience. |
Yes, because if decorators are static, then the parameters they take are static too. This is of course only necessary for "default" deadlines. Whatever is passed into the method/function will override it. If we do use static deadlines, then it does make sense to propagate these static parameters directly from the config object. I discussed that you can also use signals and promise cancellable directly in the method to avoid static parameterisation... |
I suggest we do a priority here.
Also additionally there's the |
I'm thinking there's sort of like 3 levels of defaults here:
I'm not sure if |
I think it is possible to do all 3. This is the function head: function timedCancellable(
lazy: boolean = false,
delay: number = Infinity,
errorTimeoutConstructor: new () => Error = contextsErrors.ErrorContextsTimedTimeOut,
) It's possible to do something like: function timedCancellable(
lazy: boolean | string | symbol = false,
delay: number | string | symbol = Infinity,
errorTimeoutConstructor: new () => Error = contextsErrors.ErrorContextsTimedTimeOut,
) And then do // If it was `undefined`, it would be set to `false`
if (typeof lazy !== 'boolean') {
lazy = this[lazy];
}
// If it was `undefined`, it would be set to `Infinity`
if (typeof delay !== 'number') {
delay = this[delay];
} However this pattern may not be scalable to other decorators which have actual string/symbol parameters... So it may be better to actually have a separate set of parameters to refer to the instance. function timedCancellable(
lazy?: boolean,
delay?: number,
errorTimeoutConstructor: new () => Error = contextsErrors.ErrorContextsTimedTimeOut,
defaults?: {
lazy: 'lazyProp';
delay: 'delayProp';
}
) But I find this syntax way too verbose... We could also use a more different type like arrow functions. function timedCancellable(
lazy: boolean | (this: any) => boolean = false,
delay: number | (this: any) => number = Infinity,
errorTimeoutConstructor: new () => Error = contextsErrors.ErrorContextsTimedTimeOut,
) It'd be unlikely for a decorator parameter to be an arrow function, but if they needed to differentiate, they could also check the number of parameters... etc. Notice the Using it would be like: class X {
lazyProp = true;
@timedCancellable((this) => this.lazyProp)
method() {
}
} |
The last style is likely the more flexible. |
Full work on this might take 1 sprint and a bit. |
The todos relating to cancellability should also be done in this issue. This issue is an epic so there are multiple subissues. @tegefaulkes you can add more issues into this epic as you see fit. |
I think we can split up the work in order of:
|
It might be better to address the locking along side the other issues when I encounter any locks. |
@CMCDragonkai pointed out that we want to reduce the punch packet interval. We can set it to something like 100 or 50 ms. We can also send a whole bunch of packets at once when trying to connect. In my experience however if you commonucate too much with a closed port some firewalls will start to ignore you. With a good connection you only need to send 2 packets each way within a second to establish a connection through a NAT. For connections with packet drop you;ll need to send more to make up for the drop but you don't need a crazy amount. |
I thought I attached more issues to this epic earlier, I need to fix that. |
Need a new sub issue for Identity manager connections. This is also needed for discovery. |
As discussed before, the creation of a We can fix this by having the This is not a critical change, currently if a uncomposed forward connection exists it will be cleaned up after a TTL period of no activity. |
Let's keep it the way it is... the proxy's internal connection logic is somewhat separated by the Node Connection and GRPC connection. This is due to how we are using HTTP connect protocol to connect these 2 disparate parts. When we get around to working on QUIC or WebRTC #234 and replacing GRPC with jsonrpc in #249, then we won't need any workarounds, everything should just be tightly integrated. So if it can still work with the TTLs, let's just continue doing this. It does make sense that it's possible to create proxy connections without there being a corresponding node connection. That's the main issue here. And I think we can continue with this design. |
I'd want to introduce the ability to set |
So this means when we move to JSON RPC, we can add another parameter to our If you don't set We can have these 3 meanings:
It might be good idea to have |
I think given that we should only allow for the default timeout or a specified timeout that is less than the default. Any values larger than the default can be clipped to the default value. Or if we want to avoid the implicit behaviour we can throw an error for a bad value. A slight problem with this is that we currently don't enforce any timeouts in the RPC system. And the only way to convey timeouts is through the metadata which the RPC system doesn't care about or even perceive as it's part of the middleware or user implementation. unless we add timeout parameter to the message structure then it's up to the middleware to handle the timeout logic. But currently the middle ware can't affect any timeouts or handler cancellation. It's meant to short circuit and reject the current stream immediately. If we want to allow the middle ware to affect cancellation then we need to pass it an abort controller and have it handle it's own abortion logic. That way we can create timeout or cancellation logic inside of the middleware. while we're at it we may as well provide the middleware with other contextual information such as the |
I think it depends on a few things.
The timeout is then a proxy for the amount of work that should be accepted by the server, and the expected deadline by the agent to complete. The timeout should be negotiated but then clients can ask for a shorter timeout or for a longer timeout but only up the maximum timeout. How do timeouts work for streams? Is it the timeout for the entire stream or only for a single message of the stream? How does grpc implement deadlines for their streams? I would prefer timeouts per message because it measures liveness and can be used for infinite streaming if that's what we are trying to implement. Therefore each rpc method should have a max timeout on the server side. The caller if not specifying a timeout can default to max. But can specify a lower timeout. |
I don't think the middleware should be dealing with timeouts. The timeouts should be part of the message params just like other metadata. The timeout info must be passed into the subsequent methods. There are some methods that will cost time but won't take a timeout. These are synchronous methods... For all other async methods a timeout can be used to cancel any operation. So technically this means the timeout isn't 100 percent accurate since it won't cancel sync operations on the server side. On the client side though, the timeout would just cancel the call and ignore any data. The only way to deal with this is an exception on the server side which could cancel sync operations. |
The deadline mechanism works in the JSON RPC system now. It's just a matter of doing agent migration. But also ensuring that the subissues here are completed. |
Applying it to sigchain depends on agent migration. Applying it to identity management only requires bridging it to the underlying IO which can be just the |
This issue is very old and predates the network stack overhaul. I'm going to close it as not planned. |
This is still required. It's independent of networking. It's basically any kind of operation that is meant to be timed and cancellable. It's a general pattern. Of course new issues should be more specific. |
Specification
With the application of
timedCancellable
to functions within the nodes domain. We have encountered new connection errors randomly during tests. These error started after applying a deadline of 1500 ms topingNode
usage. We suspect the problem is coming fromopenConnectionForward
after setting such a short deadline. These may be bugs that existed previously that was never revealed by the current testing. Currently there is 1NodeConnectionManager
test where we override the deadline foropenConnectionForward
.The occur randomly and are as follows. the can be re-created by running all of the nodes domain tests concurrently.
Additional context
timedCancellable
toSigchain
such as the deadline in nodes claim process #243Tasks
The text was updated successfully, but these errors were encountered: