Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RPC should set connection timeout. #411

Open
JackyWoo opened this issue Apr 12, 2023 · 3 comments
Open

RPC should set connection timeout. #411

JackyWoo opened this issue Apr 12, 2023 · 3 comments

Comments

@JackyWoo
Copy link
Contributor

JackyWoo commented Apr 12, 2023

In my project thread was stuck by NuRaft auto forwarding RPC connect method.

How to reproduce?

1.Thread node n1, n2, n3 and let's assume that n3 is leader.
2.Make network partition separating n3 with n1, n2 by iptables.
3. wait new leader
4. send request to n1

PS: you can increase OS option net.ipv4.tcp_syn_retries to increase system max connect timeout.

details see JDRaftKeeper/RaftKeeper#42

Solution

I think we should add connection timeout.

@greensky00
Copy link
Contributor

greensky00 commented Apr 18, 2023

There is already a timeout logic that calls when_done with failure, so as to make peer.cxx close the RPC client (hence the socket and connection). Do you mean this logic did not work as explained?

NuRaft/src/asio_service.cxx

Lines 1028 to 1040 in 728644d

ptr<asio::steady_timer> timer =
cs_new<asio::steady_timer>(impl_->get_io_svc());
timer->expires_after
( std::chrono::duration_cast<std::chrono::nanoseconds>
( std::chrono::milliseconds( SEND_RETRY_MS ) ) );
timer->async_wait( std::bind( &asio_rpc_client::send_retry,
this,
self,
timer,
req,
when_done,
send_timeout_ms,
std::placeholders::_1 ) );

@JackyWoo
Copy link
Contributor Author

As my understanding the timer here is set to wait other thread connect and send data and when time is up it will retry send.

And I find socket connect has no timout:

NuRaft/src/asio_service.cxx

Lines 1231 to 1248 in 728644d

resolver_.async_resolve
( q,
[self, this, req, when_done, host, port, send_timeout_ms]
( std::error_code err,
asio::ip::tcp::resolver::iterator itor ) -> void
{
if (!err) {
asio::async_connect
( socket(),
itor,
std::bind( &asio_rpc_client::connected,
self,
req,
when_done,
send_timeout_ms,
std::placeholders::_1,
std::placeholders::_2 ) );
} else {

I am not familar with NuRaft RPC and please correct me if I am wrong.

@greensky00
Copy link
Contributor

Ok got it, the existing timeout logic does not work for auto forwarding as there is no retry for auto forwarding. Need more time to think about it in detail, to unify it with existing timeout logic. Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants