modify syncd init script for supporting yml #1411
…onic-net#1402) Fixes sonic-net/sonic-buildimage#19411 During fast reboot temp view is not used and any failure will not result in SAI failure dump taken. Hence moving the handling of SAI failure dump above the ignore of non temp view.
if [ ! -z "$PLT_CONFIG_YML" ] && [ -f $PLATFORM_DIR/common_config_support ]; then
    cp -f $HWSKU_DIR/*.config.yml /tmp
    cp -f /etc/sai.d/sai.profile /tmp
    CONFIG_YML=$(find /tmp -name '*.yml')
@geans-pin What about this config.bcm file? It does not have a .yml suffix, but it is in YAML format. How should it be handled? https://github.com/sonic-net/sonic-buildimage/blob/b73d613bf581076192dd0150cb35d6d2de6645b1/device/arista/x86_64-arista_7060dx5_64s/Arista-7060DX5-64S/th4-a7060dx5-64s.config.bcm#L4
We use the .yml file name suffix to identify the file format. Can you rename sai.profile and *.config.bcm to *.yml, following the BRCM SDK rule?
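The suffix rule described in this reply can be sketched as a small shell check. This is only an illustration: `is_yml_config` is a hypothetical helper name that does not appear in syncd_init_common.sh, and `example.config.yml` is a made-up file name. Because the check looks at the name only, a YAML-format file named *.config.bcm is not recognized, which is the case the reviewer raised.

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): treat a file as a YAML config
# purely by its .yml name suffix, without inspecting its contents.
is_yml_config() {
    case "$1" in
        *.yml) return 0 ;;
        *)     return 1 ;;
    esac
}

# th4-a7060dx5-64s.config.bcm is the file from the review comment;
# example.config.yml is an illustrative name.
for f in th4-a7060dx5-64s.config.bcm example.config.yml; do
    if is_yml_config "$f"; then
        echo "$f: yml"
    else
        echo "$f: not yml"
    fi
done
```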
SAI 1.11.0 added support for bulk neighbor entries. Adding support for neighbor bulk operations to syncd. * added neighbor entry capability to bulk operations in syncd * added unit tests for neighbor bulk operations * added code coverage for neighbor bulk operations Signed-off-by: Nikola Dancejic <[email protected]>
What is the motivation here? Please provide an extended description for this PR.
I have added a comment about this in the HLD. Please check the HLD PR.
There is still no description in the PR, and it seems the build is failing.
Update the SAI submodule for 202405 to pickup the fix for sonic-net/sonic-buildimage#19972 Signed-off-by: Arvindsrinivasan Lakshmi Narasimhan <[email protected]>
…onic-net#1412) (sonic-net#1433) On Mellanox platforms, the /tmp/sai.profile file inside syncd might end up with invalid formatting after running the script modified by this PR, such as missing new lines after concatenation or redundant new lines. This PR aims to fix that. Co-authored-by: Tomer Shalvi <[email protected]>
Also, please check the following description in the HLD PR: "#Note, the Overwrite Section should be located after normal section in the common config file, otherwise the logic will overwrite all properties"
here: #1411 (comment), no description
With the new fix, we don't have this limitation, so we don't need the comment now. Geans
…1445) The counters for syncd (switch chip) were attempted to be added to gbsyncd (gearbox phys), and vice versa. This issue was introduced by sonic-net#1362.
When setting the redis attributes SAI_REDIS_SWITCH_ATTR_FLEX_COUNTER_GROUP and SAI_REDIS_SWITCH_ATTR_FLEX_COUNTER, the operation is applied to every context (both syncd and gbsyncd). However, the counters to initialize may exist in only one context. The fix is to check, for the set operation of any redis-type attribute, that the target switch id exists in the specific context; if not, the set is skipped on that context.
Before the fix, the error below appears in syslog:
ERR gbsyncd#syncd: :- processFlexCounterEvent: port VID oid:0x1000000000002, was not found (probably port was removed/splitted) and will remove from counters now
After the fix, a log like the one below is printed at info level:
INFO swss#orchagent: :- containsSwitch: context phys failed to find switch oid:0x21000000000000
This change is needed by SONiC release 202405 and later. Co-authored-by: byu343 <[email protected]>
61e80b3 to 0a5fcb7
Rename file to avoid a clone issue on case-insensitive file systems. This issue prevented auto-cherry-pick from working.
/azpw run
/AzurePipelines run
Azure Pipelines successfully started running 1 pipeline(s).
Fix checker issue : remove white space
Remove white space
remove white space
remove white space
Can you help with the failed checker? This PR does not actually change src/signaler.cpp. Geans
Assertion failed: pfd.revents & POLLIN (src/signaler.cpp:265)
…onic-net#1461) Backported PR sonic-net#1420 from master to 202405. Signed-off-by: Rajkumar P R <[email protected]>
To fix RPC build issue
…sting by replacing redis-rdb-tool with rdb-cli (sonic-net#1391) (sonic-net#1471)
Why I did it
Fix issue: sonic-net#1387
The latest redis-rdb-tools-0.1.15 doesn't support Redis 7.0. Redis 7.0 was released in 2022 and has been adopted by the latest version of SONiC, which exposed this issue. https://github.com/sripathikrishnan/redis-rdb-tools
That is, rdb-tools is far behind Redis 7.0. librdb can fix this issue. Please see the quote from https://github.com/redis/librdb.
Motivation behind this project
There is a genuine need by the Redis community for a versatile RDB file parser that can export data, perform data analysis, or merely extract raw data from RDB and RESTORE it against a live Redis server. However, available parsers have shortcomings in some aspects such as lack of long-term support, lagging far behind the latest Redis release, and usually not being optimized for memory, performance, or high-traffic streaming for production environments. Additionally, most of them are not written in C, which limits the reuse of Redis components and the potential to contribute back to Redis repo. To address these issues, it is worthwhile to develop a new parser with a modern architecture, that maybe can also challenge the current integrated RDB parser of Redis and even replace it in the future.
So, the PRs below replace rdbtools with librdb's tool rdb-cli. sonic-net/sonic-buildimage#19268
Co-authored-by: JunhongMao <[email protected]>
    done < $from_file
    echo "# End of $message" >> $to_file
    echo "Merged $from_file to $to_file"
}
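For context, a merge helper ending with the lines in this diff excerpt might look like the sketch below. The function name `merge_file`, its argument order, and the loop body are assumptions; only the final four lines appear in the excerpt. The `|| [ -n "$line" ]` test keeps a last line that has no trailing newline, which is one way to avoid the missing-newline problem after concatenation that was reported for /tmp/sai.profile.

```shell
#!/bin/sh
# Hypothetical sketch (names and loop body are assumptions, not the PR's code):
# append every line of from_file to to_file, then add an end marker.
merge_file() {
    from_file=$1
    to_file=$2
    message=$3

    # read returns non-zero on a final line without a newline;
    # the -n test still processes that line.
    while IFS= read -r line || [ -n "$line" ]; do
        printf '%s\n' "$line" >> "$to_file"
    done < "$from_file"
    echo "# End of $message" >> "$to_file"
    echo "Merged $from_file to $to_file"
}
```

Quoting "$from_file" and "$to_file" also keeps the helper safe for paths that contain spaces.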
On 7260 hwsku, syncd is not running with this change:
2024 Nov 25 10:04:26.043570 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd Merged /usr/share/sonic/device/x86_64-broadcom_common/x86_64-broadcom_b97/broadcom-sonic-th2.config.bcm to /tmp/config.bcm
2024 Nov 25 10:04:26.043962 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd Merging /etc/sai.d/config.bcm with /usr/share/sonic/device/x86_64-broadcom_common/x86_64-broadcom_b97/broadcom-sonic-th2.config.bcm, merge files stored in /tmp/config.bcm
2024 Nov 25 10:04:26.359627 str2-7260cx3-acs-9 INFO lldp#supervisord 2024-11-25 10:04:26,358 INFO success: rsyslogd entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2024 Nov 25 10:04:26.411278 str2-7260cx3-acs-9 DEBUG syncd#syncd: :> syncd_main: enter
2024 Nov 25 10:04:26.415024 str2-7260cx3-acs-9 NOTICE syncd#syncd: :- initialize: initializeing metadata log function
2024 Nov 25 10:04:26.415082 str2-7260cx3-acs-9 WARNING syncd#syncd: :- parseCommandLine: param -s is depreacated, use -z
2024 Nov 25 10:04:26.415106 str2-7260cx3-acs-9 WARNING syncd#syncd: :- parseCommandLine: unknown option B
2024 Nov 25 10:04:26.415128 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd /usr/bin/syncd: invalid option -- 'B'#015
2024 Nov 25 10:04:26.415150 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd Usage: syncd [-d] [-p profile] [-t type] [-u] [-S] [-U] [-C] [-s] [-z mode] [-l] [-g idx] [-x contextConfig] [-b breakConfig] [-h]#015
2024 Nov 25 10:04:26.415174 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd -d --diag#015
2024 Nov 25 10:04:26.415195 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd Enable diagnostic shell#015
2024 Nov 25 10:04:26.415218 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd -p --profile profile#015
2024 Nov 25 10:04:26.415240 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd Provide profile map file
2024 Nov 25 10:04:26.415261 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd #015
2024 Nov 25 10:04:26.415283 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd -t --startType type
2024 Nov 25 10:04:26.415348 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd #015
2024 Nov 25 10:04:26.415373 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd Specify start type (cold|warm|fast|fastfast|express)
2024 Nov 25 10:04:26.415395 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd #015
2024 Nov 25 10:04:26.415414 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd -u --useTempView
2024 Nov 25 10:04:26.415452 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd #015
2024 Nov 25 10:04:26.415475 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd Use temporary view between init and apply
2024 Nov 25 10:04:26.415532 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd #015
2024 Nov 25 10:04:26.415556 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd -S --disableExitSleep
2024 Nov 25 10:04:26.415578 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd #015
2024 Nov 25 10:04:26.415621 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd Disable sleep when syncd crashes
2024 Nov 25 10:04:26.415658 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd #015
2024 Nov 25 10:04:26.419067 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd -U --enableUnittests
2024 Nov 25 10:04:26.419125 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd #015
2024 Nov 25 10:04:26.419148 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd Metadata enable unittests
2024 Nov 25 10:04:26.419169 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd #015
2024 Nov 25 10:04:26.419190 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd -C --enableConsistencyCheck#015
2024 Nov 25 10:04:26.419211 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd Enable consisteny check DB vs ASIC after comparison logic#015
2024 Nov 25 10:04:26.419231 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd -s --syncMode#015
2024 Nov 25 10:04:26.419250 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd Enable synchronous mode (depreacated, use -z)#015
2024 Nov 25 10:04:26.419274 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd -z --redisCommunicationMode#015
2024 Nov 25 10:04:26.419297 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd Redis communication mode (redis_async|redis_sync|zmq_sync), default: redis_async#015
2024 Nov 25 10:04:26.419319 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd -l --enableBulk#015
2024 Nov 25 10:04:26.419341 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd Enable SAI Bulk support#015
2024 Nov 25 10:04:26.419366 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd -g --globalContext#015
2024 Nov 25 10:04:26.419389 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd Global context index to load from context config file#015
2024 Nov 25 10:04:26.419410 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd -x --contextConfig#015
2024 Nov 25 10:04:26.419431 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd Context configuration file#015
2024 Nov 25 10:04:26.419452 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd -b --breakConfig#015
2024 Nov 25 10:04:26.419474 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd Comparison logic 'break before make' configuration file#015
2024 Nov 25 10:04:26.419495 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd -w --watchdogWarnTimeSpan#015
2024 Nov 25 10:04:26.419518 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd Watchdog time span (in microseconds) to watch for execution#015
2024 Nov 25 10:04:26.419540 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd -h --help#015
2024 Nov 25 10:04:26.419561 str2-7260cx3-acs-9 INFO syncd#supervisord: syncd Print out this message#015
2024 Nov 25 10:04:26.440339 str2-7260cx3-acs-9 NOTICE syncd#dsserve: child /usr/bin/syncd exited status: 256
Zhaohui,
I found that the following commit, which came in when I merged from master, caused the issue:
c4aea50
In my previous local UT runs for this PR, I did not hit this issue. Recently, I merged my PR and rebased onto master.
The change from that PR was merged into syncd_init_common.sh and caused the issue.
Can you check that PR? I am not sure how it was merged to community master.
When I remove the change from Stephen Sun's PR, the issue is resolved.
Can you please confirm this?
Geans
Do not poll counters in bulk mode during initialization for objects t…
SUPPORTING_BULK_COUNTER_GROUPS=$(echo $SYNCD_VARS | jq -r '.supporting_bulk_counter_groups')
if [ "$SUPPORTING_BULK_COUNTER_GROUPS" != "" ]; then
CMD_ARGS+=" -B $SUPPORTING_BULK_COUNTER_GROUPS"
fi
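One defensive pattern for this kind of mismatch between script and binary is to append an option only when the binary's usage text lists it. The sketch below is an illustration, not the fix adopted in the PR: `usage_lists_option` is a hypothetical helper, and `example_group` is a placeholder value. Note that the usage line in the syslog output above omits -w even though that option is described further down, so a check like this is only a heuristic.

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): succeed only if the usage line
# lists the short option, either as "[-X]" or as "[-X arg]".
usage_lists_option() {
    case "$1" in
        *"[-$2]"*|*"[-$2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage line copied from the syslog output above.
USAGE='Usage: syncd [-d] [-p profile] [-t type] [-u] [-S] [-U] [-C] [-s] [-z mode] [-l] [-g idx] [-x contextConfig] [-b breakConfig] [-h]'

CMD_ARGS=""
SUPPORTING_BULK_COUNTER_GROUPS="example_group"   # placeholder value
if [ -n "$SUPPORTING_BULK_COUNTER_GROUPS" ] && usage_lists_option "$USAGE" B; then
    CMD_ARGS="$CMD_ARGS -B $SUPPORTING_BULK_COUNTER_GROUPS"
fi
echo "CMD_ARGS='$CMD_ARGS'"
```

With the usage line from the log, -B is not listed, so nothing is appended to CMD_ARGS.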
@geans-pin Could you please help fix the conflicts? Then I will test it again with your latest commit.
I don't see any conflicts. The issue is caused by previously merged commits. Can you check with Stephen Sun, the owner of the previous commit? I have no idea why he added the new option -B, and I don't see any UT in his PR.
BTW, I see the same issue when syncd runs with syncd_init_common.sh without my PR change.
Geans
@geans-pin To make things easier: #1437 was merged before your PR, so we have to rebase the code.
I will test the changes in your PR without the -B option, that is, based on the 202405 image, not the master image.
@ZhaohuiS, please look into the analysis @geans-pin shared earlier and help resolve the conflicts.
Zhaohui, FYI, this is a community checker test issue. It is not a conflict and is not related to the code change in my PR. Please help resolve it. Assertion failed: pfd.revents & POLLIN (src/signaler.cpp:265)
Let me rebase and update the branch to see whether this issue can be resolved.
Rebase with upstream master
I rebased onto upstream master and see the following compile error. Any idea?
sai_serialize.h:76:20: error: 'sai_prefix_compression_entry_t' does not name a type
Geans
make[5]: 'saimetadata.c' is up to date.
I am not sure whether you completed the rebase. This PR only needs syncd/scripts/syncd_init_common.sh and syncd/scripts/brcm_common_config_ut.sh, but I noticed updates to other files. Can you confirm?
Yes, only two scripts are required in this PR. I am thinking of creating a new branch from master and trying again. Let me know your thoughts. Geans
I will bring my changes to a whole new PR and try. Let's see if it makes any difference.
FYI, in the fresh PR, I no longer see the strange compile error, which looks better, but I do see this Azure Pipelines timeout error. Geans
##[error]The job running on agent Azure Pipelines 13 ran longer than the maximum time of 60 minutes. For more information, see https://go.microsoft.com/fwlink/?linkid=2077134
Job preparation parameters
If you rebased onto master, SAI moves to the latest 1.15.1, which contains the prefix compression entry, and your test_sai_*cpp files are no longer required. Are you sure your changes are meant for the master branch and not some older one?
Yes, the PR is meant for the master branch. It is just script file changes. I am not sure why so many strange compile errors appear when rebasing onto master using GitHub Desktop. Anyway, I have opened a new PR from master with the same changes.
Zhaohui, with /azpw run, all checkers pass on PR1474. Please check PR1474; I will close this old PR1411.
No description provided.