
[RFC] crimson/net: potential optimizations and evaluation #27430

Closed · wants to merge 6 commits

Conversation


@cyx1231st (Member) commented Apr 8, 2019

Got 34.6% better IOPS on average.

Optimization ideas (PoC implementation with protocol v1):

  • [A] gather message buffers when send:
    • Gather buffers from pending messages and send them together;
    • Minimize the attempts to check keepalive/keepalive_ack when sending;
    • Reduce seastar tasks in the write path;
  • [B] batch message reads:
    • Keep decoding outside the fast read path;
    • Batch message decodings;
    • Reduce seastar tasks in the read path;
  • [C] increase out buffer to 65536
    • Improve batching in the write path;
  • [D] set seastar::net::posix_data_source_impl with buf_size = 65536 (NOT included)
    • Improve batching in the read path;

[E] Batching heavily uses seastar::promise<>, so the promise optimizations in rzarzynski/seastar@682bd90 from @rzarzynski were applied as well.

Test command:

$ perf_crimson_msgr --round=2097152 --cbs=4096 --sbs=4096 -c 3
perf settings:
  client[>> v1:0.0.0.0:9010/0](bs=4096, rounds=2097152, jobs=1, depth=512)
  server[v1:0.0.0.0:9010/0](bs=4096, core=0)

Test result (RelWithDebInfo build):

round  [master]  [A~D]  [A~D]+E
1      10.97s    8.23s  8.09s
2      10.92s    8.20s  8.08s
3      10.82s    8.26s  8.10s
4      10.87s    8.20s  8.09s
5      10.93s    8.23s  8.15s

@rzarzynski (Contributor) left a comment:
This looks reasonable. Looking forward to the ProtocolV2 implementation! 👍 Thanks, @cyx1231st!

bufferlist bl;
size_t len = msgs.size();
while (!msgs.empty()) {
  auto msg = msgs.front();
Contributor:
Maybe there is no need to bump the reference counter of the MessageRef here. We could ::pop() after processing the current message.

Member Author:
done.

seastar::future<> ProtocolV1::write_messages(std::queue<MessageRef>& msgs)
{
  bufferlist bl;
  size_t len = msgs.size();
Contributor:
nit: unused?

Member Author:
removed

{
  bufferptr ptr(msg_len);
  memset(ptr.c_str(), 0, msg_len);
  msg_data.append(ptr);
Contributor:
See ceph::bufferlist::append_zero.

Member Author:
done.

@@ -595,6 +595,45 @@ void ProtocolV1::start_accept(SocketFRef&& sock,

// open state

seastar::future<> ProtocolV1::write_messages(std::queue<MessageRef>& msgs)
{
  bufferlist bl;
Contributor:
You might want to ::reserve() the required space for the C-string-taking ::append() calls:

  bl.reserve(msgs.size() * (sizeof(CEPH_MSGR_TAG_MSG) + sizeof(header) + sizeof(footer)));

This should prevent allocating 4 KiB on the first call and wasting memory when the queue has only a few messages.

Member Author:
Sorry, I'm not familiar with the bufferlist interface. Can I still use bl.append(...) to fill the C-strings after calling bl.reserve(...)?

Contributor:
Yup, it's OK.

if (msg == conn.out_q.front()) {
  conn.out_q.pop();
}
return write_messages(conn.out_q)
Contributor:
👍

@cyx1231st force-pushed the wip-seastar-msgr-send-all branch from 59793a2 to c05484d on April 11, 2019 15:21
@cyx1231st changed the title from "[RFC] crimson/net: gather message buffers when send" to "[RFC] crimson/net: potential optimizations and evaluation" on Apr 11, 2019
@cyx1231st force-pushed the wip-seastar-msgr-send-all branch 2 times, most recently from b1d3902 to bf4919b on April 23, 2019 05:15
Gather buffers from pending messages and send them together.
Minimize the attempts to check keepalive/keepalive_ack when sending.

Signed-off-by: Yingxin Cheng <[email protected]>
Dispatch cooked messages together, and give dispatchers a chance to
slow down buffer reads.

Signed-off-by: Yingxin Cheng <[email protected]>
@cyx1231st (Member, Author):

Closing this in favor of #27836, #27788, and ceph/seastar#4.

@cyx1231st cyx1231st closed this Apr 29, 2019
@cyx1231st cyx1231st deleted the wip-seastar-msgr-send-all branch October 9, 2019 02:46
3 participants