Process kafka requests in a separate scheduling group #24973

mmaslankaprv · 2025-01-29T13:15:06Z

Added code that allows using a separate scheduling group for Kafka
request processing. The scheduling group can be switched based on the
API type. By default, processing is done in the group returned by
`kafka::server::get_request_handler_sg`.

Backports Required

Release Notes

none

vbotbuildovich · 2025-01-29T17:15:33Z

CI test results

test results on build#61350

| test_id | test_kind | job_url | test_status | passed |
|---|---|---|---|---|
| rptest.tests.cloud_storage_timing_stress_test.CloudStorageTimingStressTest.test_cloud_storage.cleanup_policy=compact.delete | ducktape | https://buildkite.com/redpanda/redpanda/builds/61350#0194b27d-3fa4-4236-9700-4026364e9e88 | FLAKY | 1/2 |
| rptest.tests.compaction_recovery_test.CompactionRecoveryTest.test_index_recovery | ducktape | https://buildkite.com/redpanda/redpanda/builds/61350#0194b27d-3fa6-4560-9166-e42bc00c4ad4 | FLAKY | 1/3 |
| rptest.tests.datalake.simple_connect_test.RedpandaConnectIcebergTest.test_translating_avro_serialized_records.cloud_storage_type=CloudStorageType.S3 | ducktape | https://buildkite.com/redpanda/redpanda/builds/61350#0194b27d-3fa4-4236-9700-4026364e9e88 | FLAKY | 1/2 |
| storage_single_thread_rpunit.storage_single_thread_rpunit | unit | https://buildkite.com/redpanda/redpanda/builds/61350#0194b234-9348-4a86-8544-18742a9bcac9 | FLAKY | 1/2 |

StephanDollberg · 2025-01-30T09:30:35Z

src/v/kafka/server/connection_context.cc

@@ -682,6 +684,10 @@ connection_context::dispatch_method_once(request_header hdr, size_t size) {
                     ? std::make_optional<ss::sstring>(*hdr.client_id)
                     : std::nullopt,
    };
+
+    co_await ss::coroutine::switch_to(


@travisdowns related to the discussion from yesterday, does this even allocate the coroutine frame if await_ready returns true (because we are already in the right group - ignoring for a second whether we actually are here or not)? I thought not.

Which coroutine are you referring to? In any case, no coroutine frame should be allocated per `switch_to`, whether the switch happens or not. What `await_ready` returning true does is avoid an unnecessary task switch (which is probably more expensive than even a coro frame in "micro" costs, and certainly in macro costs, since it breaks out of the currently executing request, etc.).

travisdowns · 2025-01-31T13:23:43Z

src/v/kafka/server/server.cc

 coordinator_ntp_mapper& server::coordinator_mapper() {
    return _group_router.local().coordinator_mapper().local();
 }

+ss::scheduling_group server::get_request_scheduling_group(api_key key) const {
+    if (key == produce_request::api_type::key) {


For this kind of thing we usually want to add a handler method, I think, rather than having the Kafka server know anything about the different keys. E.g., a `get_request_sg()` method on the generic handler which returns the group (or none/default) to use?

Is the problem that the handlers shouldn't access config?

Good point, I will change that.
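A rough sketch of the handler-based shape suggested in this thread, assuming a virtual `get_request_sg()` that returns an optional group. Only the method name comes from the comment above; `sched_group`, `handler`, `produce_handler_model`, `metadata_handler_model` and `pick_group` are hypothetical names used for illustration and do not reflect the actual Redpanda handler interface.

```cpp
#include <optional>

// Hypothetical stand-in for ss::scheduling_group.
struct sched_group {
    int id;
};

// Generic handler: by default it expresses no preference.
struct handler {
    virtual ~handler() = default;

    // std::nullopt means "use the server's default request group".
    virtual std::optional<sched_group> get_request_sg() const {
        return std::nullopt;
    }
};

// A handler that opts into its own group (produce is used here only
// because the diff above special-cases the produce API key).
struct produce_handler_model : handler {
    explicit produce_handler_model(sched_group sg)
      : _sg(sg) {}

    std::optional<sched_group> get_request_sg() const override { return _sg; }

    sched_group _sg;
};

// A handler that keeps the default behaviour.
struct metadata_handler_model : handler {};

// The server no longer inspects API keys; it only asks the handler.
sched_group pick_group(const handler& h, sched_group server_default) {
    return h.get_request_sg().value_or(server_default);
}
```

With this layout the per-API decision lives next to the handler, and the server only supplies the fallback group.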

travisdowns · 2025-01-31T16:01:37Z

src/v/resource_mgmt/cpu_scheduling.h

+        /**
+         * Scheduling group to process requests received via the REST API of
+         * admin server.
+         */


Love to see these comments, thanks!

travisdowns

Looks good pending moving the SG definition into the handler.

mmaslankaprv added 4 commits January 29, 2025 14:06

config: added property to control kafka handler scheduling group

01b6b4d

Added property allowing us to control if `kafka` scheduling group should be used to handle parsing and handling Kafka API requests. Signed-off-by: Michał Maślanka <[email protected]>

kafka/server: pass in scheduling group for requests processing

01d654b

Signed-off-by: Michał Maślanka <[email protected]>

resources: added comment describing scheduling group purpose

44c7696

Signed-off-by: Michał Maślanka <[email protected]>

mmaslankaprv requested a review from a team as a code owner January 29, 2025 13:15

github-actions bot added the area/redpanda label Jan 29, 2025

mmaslankaprv requested review from dotnwat, travisdowns, StephanDollberg and ballard26 January 29, 2025 18:26

StephanDollberg approved these changes Jan 30, 2025

travisdowns reviewed Jan 31, 2025

travisdowns approved these changes Jan 31, 2025
