async mode between orchagent and syncd got broken by #1023
Why I did it
Async mode can be configured via the "config synchronous_mode disable" CLI command
(for more info, see https://github.com/sonic-net/SONiC/blob/master/doc/synchronous-mode/synchronous-mode-cfg.md).
However, once async mode becomes operational (after a reboot), the syncd process shuts down every couple of minutes.
The breakage was introduced by
sonic-net/sonic-sairedis#1362
How I did it
Added the newly introduced flex counter operations to the Lua script, explicitly marking them as requiring no Redis updates.
The Lua script is run by syncd as part of the ConsumerTable IPC
(for more info, see https://r12f.com/sonic-book/4-2-4-producer-consumer-table.html).
The script is invoked with ARGV[2] == "0" for sync mode and with ARGV[2] == "1" for async mode
(ARGV[2] is m_modifyRedis in the C++ code).
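The dispatch behaviour described above can be modelled in a few lines of Python. This is a simplified sketch, not the actual Lua script: the function name `handle_op`, the set `NO_REDIS_UPDATE_OPS`, the `set` branch, and the dict standing in for Redis are all hypothetical. Only the op name `set_counter_group` and the error text come from the syslog in this PR. The early return for sync mode is an assumption based on the name m_modifyRedis.

```python
# Simplified model of the op dispatch in the ConsumerTable Lua script.
# All names are illustrative; a plain dict stands in for the Redis table.

# Ops that need no Redis update. Only "set_counter_group" is confirmed by the
# syslog in this PR; the real script may list further flex counter ops.
NO_REDIS_UPDATE_OPS = {"set_counter_group"}


def handle_op(db, key, op, fields, modify_redis):
    """Apply one popped operation; modify_redis mirrors ARGV[2] == "1"."""
    if not modify_redis:
        # Assumption: in sync mode (ARGV[2] == "0") the table is left untouched.
        return
    if op == "set":
        # Hypothetical stand-in for an op the script already supported.
        db.setdefault(key, {}).update(fields)
    elif op in NO_REDIS_UPDATE_OPS:
        # The fix: recognise the op explicitly and do nothing.
        pass
    else:
        # Before the fix, flex counter ops reached this branch in async mode
        # and aborted the script, which led to the syncd shutdown.
        raise ValueError(f"unsupported operation command: {op}, FIXME")
```

In this model, an unknown op only fails when `modify_redis` is true, which matches the observation that the problem shows up in async mode only.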
How to verify it
After enabling async mode via
"config synchronous_mode disable"
"config save -y"
"reboot"
the device comes back up, the syncd container is stable,
and commands like "show interfaces status" work.
Which release branch to backport (provide reason below if selected)
Description for the changelog
Fixed constant syncd shutdown in async mode.
A picture of a cute animal (not mandatory but encouraged)
Here is an example of the syslog output when async mode is enabled (the issue is seen):
2025 Jun 4 20:56:59.067942 sonic ERR syncd#syncd: :- run: Runtime error: RedisReply catches system_error: command: *7\r\n$7\r\nEVALSHA\r\n$40\r\n06dfdd67c2f44f63b63b928611c69779957900bf\r\n$1\r\n2\r\n$29\r\nASIC_STATE_KEY_VALUE_OP_QUEUE\r\n$10\r\nASIC_STATE\r\n$3\r\n128\r\n$1\r\n1\r\n, reason: ERR user_script:105: unsupported operation command: set_counter_group, FIXME script: 06dfdd67c2f44f63b63b928611c69779957900bf, on @user_script:105.: Input/output error: Input/output error
2025 Jun 4 20:56:59.068001 sonic NOTICE syncd#syncd: :- sendShutdownRequest: sending switch_shutdown_request notification to OA for switch: oid:0x21000000000000
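To check a device for this failure signature, the log lines can be scanned for the Lua error shown above. The following is a minimal sketch; the helper name `find_unsupported_ops` is hypothetical, and the pattern is derived only from the single log line in this PR, so other variants of the message may not match.

```python
import re

# Matches the Lua error reported by syncd and captures the rejected op name.
UNSUPPORTED_OP = re.compile(r"unsupported operation command: (\w+)")


def find_unsupported_ops(lines):
    """Return the set of op names rejected by the Lua script in the given log lines."""
    ops = set()
    for line in lines:
        # Only consider messages logged by syncd.
        if "syncd" not in line:
            continue
        match = UNSUPPORTED_OP.search(line)
        if match:
            ops.add(match.group(1))
    return ops
```

The function takes any iterable of strings, so an open log file can be passed directly; the location of the syslog file depends on the device setup. An empty result after the reboot is consistent with the fix being in place, though it does not by itself prove that syncd is stable.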