Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Concurrent requests without Send requirement for single threaded runtimes #386

Open
dsherret opened this issue Mar 15, 2023 · 8 comments
Open
Labels
enhancement New feature or request

Comments

@dsherret
Copy link

In #284 there is some discussion about having #[tower_lsp::async_trait(?Send)] which I presume is a way to not have the Send requirement on futures. That said, the change seems to execute the requests sequentially and have mutable methods. How feasible would it be to have the option to not have the Send requirement and still execute requests concurrently for single threaded runtimes? Then the client can decide which requests should run sequentially or on a separate thread and also the code consuming this can use Rc and RefCells instead of needing to use Arc and Mutex.

@ebkalderon
Copy link
Owner

ebkalderon commented Mar 15, 2023

Thanks for opening this ticket, @dsherret! Just to clarify: the prototype introduced in #284 does execute requests concurrently wherever possible, provided their handlers take &self. However, some methods that necessarily mutate state (especially the handlers for initialize, shutdown, and the on-opened/closed/saved/changed events) have been changed to to take &mut self, ensuring they are always executed sequentially and in a coherent order. Is this not sufficient?

EDIT: Further detail may be found in this comment in particular: #284 (comment)

@dsherret
Copy link
Author

dsherret commented Mar 15, 2023

Maybe this is just an issue with how our implementation is designed and we're doing it wrong, but it wouldn't work for us because within requests like did_open we sometimes call into the client and we never want to call into the client while being in an &mut self state (in our case, holding a .write().await with tokio's RwLock) because the client could call back into the server and cause a deadlock.

https://github.com/denoland/deno/blob/3a46a89e34aab3bea9f555ef6c80af33a7bc2194/cli/lsp/language_server.rs#L2897-L2900

All this said, having it be Send is more a nice to have.

@ebkalderon
Copy link
Owner

ebkalderon commented Mar 15, 2023

Would the client ever call back into the server while the server is held exclusively in a &mut self method, though? I haven't found any instances in the LSP specification where this is ever the case. Perhaps you could give an example?

Most, if not all, LSP state modification flows as I understand them tend to go along this direction:

  1. Client sends a textDocument/didOpen notification to the server, which mutates state.
  2. Server sends back request A to the client in between.
  3. Client responds to request A.
    • Would the client ever send another request B in between that must be responded to by the server before response A can be sent? Is this an expected flow according to the specification?
  4. Server finishes processing textDocument/didOpen and concurrent processing of requests may resume.

Is this not the case?

@ebkalderon
Copy link
Owner

FWIW, the necessary bar for marking a LanguageServer trait method as &mut self under the proposal described in #284 is very high. Specifically, it would have to be a request or notification which necessarily blocks user input and needs to be fully processed before server-side work can continue further. The did_* trait methods seemed to fit this bill IMO, though I could be wrong about this. Servers are still permitted to send requests or notifications back to the client in these method handlers, but I don't think we can reasonably expect the server to be able to respond to any concurrent requests until its internal text document state has been fully synchronized with the client.

@dsherret
Copy link
Author

dsherret commented Mar 15, 2023

(Sorry, I said we never want to call back into the client in an &mut self (rw_lock.write().async), but actually we don’t want to call into the client even in a &self (rw_lock.read().async) in case the client calls into a &mut self)

Basically, the spec can say whatever it likes to prevent this, but what happens in practice with clients is not always according to the spec. We previously had deadlocks occurring because of this scenario, so we just never do it anymore and no longer have issues related to that for what I know (that said, the code previously used an async Mutex for every request instead of an async RwLock, so that was probably the main reason).

@ebkalderon
Copy link
Owner

ebkalderon commented Mar 15, 2023

I see what you mean, but none of the methods proposed in #284 to take &mut self would ever make sense to be called again by the client in direct response to a server-sent request. Such cases cannot happen in practice unless requests are being sent to the server in a wholly nonsensical order from the client. I agree there are potential deadlock risks with other methods, but I don't think that's the case with these specifically.

Let's see how this might play out below...

Server lifecycle messages

Methods like initialize, initialized, and shutdown are pretty obvious candidates for being marked &mut self since they inherently require exclusive access to server state and are only called once during the application lifecycle. As such, it makes sense to process these sequentially and without any other &self methods running concurrently.

Considering that initialize is an inherently serial method and the server is allowed to reject all messages sent before or during its execution with error code -32002 (server not initialized) until the server has initialized, and shutdown is allowed to reject all methods sent before or during its execution with error code -32600 (invalid request), I think all three methods would benefit from taking exclusive access to self.

Text document & workspace state change notifications

Whether the did_.* methods should be marked &mut self here is less clear cut, but I think the case in favor of this is still strong. These notifications are emitted by the client in response to a text document or workspace state change that occurred on the client side (there is a separate set of requests for synchronizing state changes from the server side to the client).

The client is the one notifying the server of the state change, not the other way around. This means it should already have all the information it needs on hand already and should be able to respond immediately to any request the server sends, thereby avoiding the trap described in #386 (comment). This means it is impossible for the following example flow to ever happen:

  1. Client calls did_open(&mut self) on the server.
  2. Server sends request "foo" to the client and expects a response before processing further incoming requests.
  3. Client sends another request to the server to gather some information in order to respond to "foo".
  4. Server deadlocks, because did_open(&mut self) is still processing and therefore can't respond to request "foo."

Let's imagine that the client sends multiple did.* notifications in a row, responding to any server-to-client requests in between. This should be fine, as the handlers for these notifications take &mut self and will be processed serially on the server side.

As a bonus, this serial processing guarantees that any and all &self methods ahead in the request queue will have resolved and responded to the client by now. Any spawned background tasks feeding data into these &self methods which may have temporarily acquired exclusive access to shared state should have let go of the Arc<RwLock<T>> or Rc<RefCell<T>> at this point. If they haven't, then that's a logic error in the server implementation itself... This would lead to a deadlock regardless of whether you're using building your server on top of tower-lsp, std::thread, or something else.

Aside: For the most part, any language server which permits concurrent processing of did.* document state change notifications is very likely to encounter deadlocks in some form and/or get out of sync with the client state over time. Generally speaking, language servers want to handle these incoming textDocument/did* notifications on the main loop as early as possible before processing other incoming requests to ensure the state changes have been fully applied and in the correct order.

All the above points considered, I'm fairly confident that these methods should be fine to mark &mut self and this change would not increase the risk of deadlocks for users. In fact, I suspect it might decrease it slightly (though is claim is entirely speculation on my part and not based on anything empirical; I'd like to see actual measurements confirming or denying this either way).

Execute command request

The final method marked &mut self in the proposal linked in #284 is the workspace/executeCommand request. This one I'm really not confident about, and I'm definitely open to leaving this one &self if there are any drawbacks to doing so.

The rationale behind this change stems from the presumption that command execution is expected to be synchronous: the command being executed might update state on the server side, or it might not. If the state is updated, then processing this request serially and exclusively relative to other pending requests is important (just like the did_.* methods described above), especially since the server will need to call the workspace/applyEdit command back on the client to synchronize these state changes back before workspace/executeCommand can respond.

Other candidates?

Looking back at this helpful suggestion on Reddit by @matklad, I suspect there may be other LanguageServer trait methods that would also benefit from taking mutable reference to self. However, I'm not entirely sure what these methods should be and whether these changes would suit all tower-lsp users uniformly, so I've avoided taking any stances on that for now.

(Please do correct me if I'm fundamentally misunderstanding some core assumptions about, not just the spec, but more the typical flows one sees in an average LSP interaction!)

@ebkalderon
Copy link
Owner

ebkalderon commented Mar 15, 2023

Apologies for the massive wall of text! ❤️ With that said, I'd be happy to continue discussion of the merits and drawbacks of supporting &mut self methods over on #284, if you'd like.

In the meantime, I'd be happy to split out the work in #284 into two separate parts if that would help? That is, I could draft a release of tower-lsp which simply drops the Send bound on all futures while leaving all LanguageServer trait methods &self for now (addressing the direct concern of this ticket) and then we could revisit the issue of &mut self receivers separately.

@ebkalderon ebkalderon added the enhancement New feature or request label Mar 16, 2023
@dsherret
Copy link
Author

@ebkalderon sorry for my delay responding here and thanks for the overview! I think you're right that having &mut self (especially the initialization methods) is probably ok. That said, I think it is nice for consumers to have control over this if they'd like to and not have the Send requirement.

In the meantime, I'd be happy to split out the work in #284 into two separate parts if that would help? That is, I could draft a release of tower-lsp which simply drops the Send bound on all futures while leaving all LanguageServer trait methods &self for now (addressing the direct concern of this ticket) and then we could revisit the issue of &mut self receivers separately.

That would be extremely helpful! It would be great for us to be able to drop the Send requirement because we will then be able to get rid of a lot of needless mutexes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants