Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix SNMP output having fewer unicast queues than expected #330

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

justin-wong-ce
Copy link

@justin-wong-ce justin-wong-ce commented Sep 5, 2024

SNMP assumes the first half of the queues to be unicast and the second half to be multicast. This assumption was fine for most hwSkus but not new ones being added to support, i.e. x86_64-arista_7060x6.

This change will allow configurations where there are more than half of maximum queues set as unicast. It however does not account for the case where there are more than half of the maximum queues set as multicast - that will cause problems for the sonic-mgmt snmp/test_snmp_queue.py test.

If the SNMP is expected to only return unicast queues, then I will make further changes to the unit test and source code to reflect this expectation in functionality.
If not, sonic-mgmt's snmp/test_snmp_queue.py should not expect only unicast queues to be returned, and SNMP should return all queues, not half (https://github.com/sonic-net/sonic-mgmt/blob/ac37f7f0fd7c7857559325a61d5ca6e6c4932194/tests/snmp/test_snmp_queue.py#L92) (grep UC)

Note:
Only tested on Arista hwSkus.

- What I did
Accommodate for the case where the number of Unicast queues exceed the number of Multicast queues.

- How I did it
Allow Unicast queues to be added to response when their queue number is greater than half of the total number of queues.

- How to verify it
Passing unit tests and SNMP tests in sonic-mgmt. Note: Only ran on Arista hwSkus

- Description for the changelog

SNMP assumes the first half of the queues to be unicast and the second half to be multicast.
This assumption was fine for most hwSkus but not new ones being added to
support, i.e. x86_64-arista_7060x6

Only tested on Arista hwSkus.
@r12f
Copy link

r12f commented Sep 9, 2024

@Janetxxx , do you mind to cherry pick this change to our internal test branch?

@Janetxxx
Copy link

@Janetxxx , do you mind to cherry pick this change to our internal test branch?

@r12f Done.

@r12f
Copy link

r12f commented Sep 10, 2024

Great! Thanks Janet!

@StormLiangMS
Copy link

hi @qiluo-msft @SuvarnaMeenakshi could you help to review?

Janetxxx
Janetxxx previously approved these changes Sep 23, 2024
Copy link

@Janetxxx Janetxxx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@justin-wong-ce @r12f I have tested it in the internal setup, it worked.

---------------------------------------------------------------- generated xml file: /data/sonic-mgmt-int/tests/logs/snmp/test_snmp_queue.xml ----------------------------------------------------------------
------------------------------------------------------------------------------------------- live log sessionfinish -------------------------------------------------------------------------------------------
04:57:31 __init__.pytest_terminal_summary         L0067 INFO   | Can not get Allure report URL. Please check logs
================================================================================= 1 passed, 1 warning in 1169.64s (0:19:29) ==================================================================================

@Janetxxx
Copy link

@qiluo-msft Could you please help to merge this PR, thanks!

@SuvarnaMeenakshi
Copy link
Contributor

@justin-wong-ce , did you verify if the output of SNMP walk of queue counter from any other Arista hwsku is the same before and after the fix?
Can you share the output if possible?

@justin-wong-ce
Copy link
Author

@justin-wong-ce , did you verify if the output of SNMP walk of queue counter from any other Arista hwsku is the same before and after the fix? Can you share the output if possible?

I checked its the same for the following hwSkus:

  • Arista-7050CX3-32S C32
  • Arista-7060CX-32S D48C8
  • Arista-7260CX3 D108C8

I will follow up with the output via email.

if self.queue_type_map[namespace].get(queue_sai_oid) == 'SAI_QUEUE_TYPE_UNICAST':
pq_count = pq_count + 1

# If there are fewer unicast queues than half of max queues, we use the old assumption of second half mcast
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is best that we don't do this - instead return only unicast queues at all times.

# To simulate vendor OID, we wrap queues by max priority groups
port_max_queues = Namespace.dbs_get_all(self.db_conn, mibs.STATE_DB,
mibs.buffer_max_parm_table(self.oid_name_map[if_index]))['max_queues']
pq_count = math.ceil(int(port_max_queues) / 2)
max_queues_half = math.ceil(int(port_max_queues) / 2)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can achieve https://github.com/sonic-net/sonic-snmpagent/pull/330/files#r1786870229 by not doing this check and keeping pq_count as the number of actual unicast queues at all times.

pq_count = math.ceil(int(port_max_queues) / 2)
max_queues_half = math.ceil(int(port_max_queues) / 2)
if pq_count < max_queues_half:
pq_count = max_queues_half
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a new testcase which could fail the code before your PR?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a new sub-case to an existing test case to check.

Note: If we will commit to returning all queues, or only returning unicast queues (no matter the number), then this test case, along with other test cases will need to be updated.

Added a unit test subcase and relevant mock tablesto check for SNMP
behavior when there are more unicast queues than half of the total
number of queues.

This is to ensure SNMP does not operate on the false assumption where
there is always a 50-50 split between unicast and multicast queues.
Copy link

@Janetxxx Janetxxx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. It passed tests in my local setup.

---------------------------------------------------------------- generated xml file: /data/sonic-mgmt-int/tests/logs/snmp/test_snmp_queue.xml ----------------------------------------------------------------
------------------------------------------------------------------------------------------- live log sessionfinish -------------------------------------------------------------------------------------------
10:03:29 __init__.pytest_terminal_summary         L0067 INFO   | Can not get Allure report URL. Please check logs
================================================================================= 1 passed, 1 warning in 1103.72s (0:18:23) ==================================================================================

@Janetxxx
Copy link

@qiluo-msft Could you please help to review the new testcase? Thank you

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants