Prevent ET8001-2FR4 from doing a CMIS reinit when only the alias name is set #551

Open
wants to merge 1 commit into
base: master
Conversation

chiourung
Contributor

Description

When an ET8001-2FR4 is broken out to 2x400G, the config status is ConfigSuccess for 4 lanes and ConfigUndefined for the others.

root@ais800-64o-1:/home/admin# show interfaces transceiver status Ethernet0
Ethernet0:
        Tx fault flag on media lane 1: N/A
        Tx fault flag on media lane 2: N/A
        Tx fault flag on media lane 3: N/A
        Tx fault flag on media lane 4: N/A
        Tx fault flag on media lane 5: N/A
        Tx fault flag on media lane 6: N/A
        Tx fault flag on media lane 7: N/A
        Tx fault flag on media lane 8: N/A
        Rx loss of signal flag on media lane 1: False
        Rx loss of signal flag on media lane 2: False
        Rx loss of signal flag on media lane 3: False
        Rx loss of signal flag on media lane 4: False
        Rx loss of signal flag on media lane 5: False
        Rx loss of signal flag on media lane 6: False
        Rx loss of signal flag on media lane 7: False
        Rx loss of signal flag on media lane 8: False
        TX disable status on lane 1: False
        TX disable status on lane 2: False
        TX disable status on lane 3: False
        TX disable status on lane 4: False
        TX disable status on lane 5: False
        TX disable status on lane 6: False
        TX disable status on lane 7: False
        TX disable status on lane 8: False
        Disabled TX channels: 0
        Current module state: ModuleReady
        Reason of entering the module fault state: No Fault detected
        Datapath firmware fault: False
        Module firmware fault: False
        Module state changed: False
        Data path state indicator on host lane 1: DataPathActivated
        Data path state indicator on host lane 2: DataPathActivated
        Data path state indicator on host lane 3: DataPathActivated
        Data path state indicator on host lane 4: DataPathActivated
        Data path state indicator on host lane 5: DataPathActivated
        Data path state indicator on host lane 6: DataPathActivated
        Data path state indicator on host lane 7: DataPathActivated
        Data path state indicator on host lane 8: DataPathActivated
        Tx output status on media lane 1: True
        Tx output status on media lane 2: True
        Tx output status on media lane 3: True
        Tx output status on media lane 4: True
        Tx output status on media lane 5: True
        Tx output status on media lane 6: True
        Tx output status on media lane 7: True
        Tx output status on media lane 8: True
        Rx output status on host lane 1: True
        Rx output status on host lane 2: True
        Rx output status on host lane 3: True
        Rx output status on host lane 4: True
        Rx output status on host lane 5: True
        Rx output status on host lane 6: True
        Rx output status on host lane 7: True
        Rx output status on host lane 8: True
        Tx loss of signal flag on host lane 1: False
        Tx loss of signal flag on host lane 2: False
        Tx loss of signal flag on host lane 3: False
        Tx loss of signal flag on host lane 4: False
        Tx loss of signal flag on host lane 5: False
        Tx loss of signal flag on host lane 6: False
        Tx loss of signal flag on host lane 7: False
        Tx loss of signal flag on host lane 8: False
        Tx clock and data recovery loss of lock on host lane 1: False
        Tx clock and data recovery loss of lock on host lane 2: False
        Tx clock and data recovery loss of lock on host lane 3: False
        Tx clock and data recovery loss of lock on host lane 4: False
        Tx clock and data recovery loss of lock on host lane 5: False
        Tx clock and data recovery loss of lock on host lane 6: False
        Tx clock and data recovery loss of lock on host lane 7: False
        Tx clock and data recovery loss of lock on host lane 8: False
        Rx clock and data recovery loss of lock on media lane 1: False
        Rx clock and data recovery loss of lock on media lane 2: False
        Rx clock and data recovery loss of lock on media lane 3: False
        Rx clock and data recovery loss of lock on media lane 4: False
        Rx clock and data recovery loss of lock on media lane 5: False
        Rx clock and data recovery loss of lock on media lane 6: False
        Rx clock and data recovery loss of lock on media lane 7: False
        Rx clock and data recovery loss of lock on media lane 8: False
        Configuration status for the data path of host line 1: ConfigUndefined
        Configuration status for the data path of host line 2: ConfigUndefined
        Configuration status for the data path of host line 3: ConfigUndefined
        Configuration status for the data path of host line 4: ConfigUndefined
        Configuration status for the data path of host line 5: ConfigSuccess
        Configuration status for the data path of host line 6: ConfigSuccess
        Configuration status for the data path of host line 7: ConfigSuccess
        Configuration status for the data path of host line 8: ConfigSuccess

When the API call api.scs_apply_datapath_init(host_lanes_mask) is executed with mask 0xf0,
the config status of lanes 1...4 becomes ConfigUndefined and the config status of lanes 5...8 becomes ConfigSuccess.
If the config state is ConfigUndefined, any set on the CONFIG_DB PORT table triggers a CMIS reinit.
If the config state is ConfigUndefined, restarting pmon also triggers a CMIS reinit.
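
To make the mask-to-lane relationship above concrete, here is a minimal sketch. The helper name `lanes_from_mask` is hypothetical and not part of xcvrd; it only assumes that bit 0 of the mask corresponds to host lane 1, which matches the 0xf0 example above.

```python
def lanes_from_mask(host_lanes_mask, lane_count=8):
    """Return the 1-based host lane numbers selected by a bit mask.

    Assumption: bit 0 of the mask maps to host lane 1.
    """
    return [lane + 1 for lane in range(lane_count) if host_lanes_mask & (1 << lane)]


# The two 400G subports of a 2x400G breakout on an 8-lane module.
print(lanes_from_mask(0x0F))  # lanes used by the first subport
print(lanes_from_mask(0xF0))  # lanes used by the second subport
```

With this mapping, mask 0xf0 selects lanes 5 to 8, the lanes that report ConfigSuccess in the output above.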

Motivation and Context

is_cmis_application_update_required checks whether the application is the same and whether the datapath is DataPathActivated.
It should not happen that the config state is a failure while the datapath is activated.
I think skipping CMIS init when the config state is ConfigUndefined is safe.
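
The proposal can be sketched as follows. This is not the actual `is_cmis_application_update_required` code; the function name `needs_cmis_reinit`, its parameters, the per-lane dictionaries and the application code value are hypothetical and only illustrate the decision being discussed.

```python
def needs_cmis_reinit(lanes, expected_app, active_app, dp_state, config_status,
                      tolerate_undefined=True):
    """Decide whether the given host lanes need a CMIS reinit (illustrative only).

    All dictionaries are keyed by 1-based host lane number.
    """
    for lane in lanes:
        if active_app[lane] != expected_app:
            return True   # wrong application selected
        if dp_state[lane] != "DataPathActivated":
            return True   # datapath is not up
        status = config_status[lane]
        if status == "ConfigSuccess":
            continue
        if tolerate_undefined and status == "ConfigUndefined":
            continue      # proposed behavior: do not reinit for this state alone
        return True       # any other status still forces a reinit
    return False


lanes = [1, 2, 3, 4]
active_app = {lane: 1 for lane in lanes}               # hypothetical application code
dp_state = {lane: "DataPathActivated" for lane in lanes}
config_status = {lane: "ConfigUndefined" for lane in lanes}

proposed = needs_cmis_reinit(lanes, 1, active_app, dp_state, config_status)
strict = needs_cmis_reinit(lanes, 1, active_app, dp_state, config_status,
                           tolerate_undefined=False)
print(proposed, strict)
```

The `tolerate_undefined` switch exists only to compare the proposed check with the stricter one that treats ConfigUndefined as a reason to reinit.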

How Has This Been Tested?

Additional Information (Optional)

@prgeor prgeor requested a review from mihirpat1 December 1, 2024 14:44
@prgeor
Collaborator

prgeor commented Dec 1, 2024

@mihirpat1 can you review?

@mihirpat1
Contributor

@chiourung Can you please elaborate on the issue here? I am not able to understand why CMIS init should be skipped if the lanes of interest are in the ConfigUndefined state.

@chiourung
Contributor Author

When an ET8001-2FR4 is broken out to 2x400G, the config status is ConfigSuccess for 4 lanes and ConfigUndefined for the others.
For example, break out Ethernet0 to 2x400G.
If Ethernet0 is configured first, the configuration states are:

        Configuration status for the data path of host line 1: ConfigSuccess
        Configuration status for the data path of host line 2: ConfigSuccess
        Configuration status for the data path of host line 3: ConfigSuccess
        Configuration status for the data path of host line 4: ConfigSuccess
        Configuration status for the data path of host line 5: ConfigUndefined
        Configuration status for the data path of host line 6: ConfigUndefined
        Configuration status for the data path of host line 7: ConfigUndefined
        Configuration status for the data path of host line 8: ConfigUndefined

After Ethernet4 has been configured, the configuration states would be as follows

        Configuration status for the data path of host line 1: ConfigUndefined
        Configuration status for the data path of host line 2: ConfigUndefined
        Configuration status for the data path of host line 3: ConfigUndefined
        Configuration status for the data path of host line 4: ConfigUndefined
        Configuration status for the data path of host line 5: ConfigSuccess
        Configuration status for the data path of host line 6: ConfigSuccess
        Configuration status for the data path of host line 7: ConfigSuccess
        Configuration status for the data path of host line 8: ConfigSuccess

The configuration states of Ethernet0 are now ConfigUndefined. When the config state is ConfigUndefined, any set on the CONFIG_DB PORT table triggers a CMIS reinit.
After Ethernet0 does CMIS init again, the configuration states of Ethernet4 become ConfigUndefined.
Ethernet4 then does a CMIS reinit on any set on the CONFIG_DB PORT table.
After Ethernet4 does CMIS init again, the configuration states of Ethernet0 become ConfigUndefined.
Why is another CMIS init needed when the active application code is as expected and the datapath is activated, and only the config state is ConfigUndefined?
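
The back-and-forth described above can be reproduced with a toy model. Everything here is an assumption made for illustration: `QuirkyModule` imitates only the reported module behavior, and `on_port_table_set` stands in for the current reinit decision without being the real xcvrd code.

```python
class QuirkyModule:
    """Toy model: applying a config to some lanes leaves all other lanes
    reporting ConfigUndefined, as described in this thread."""

    def __init__(self, lane_count=8):
        self.status = {lane: "ConfigUndefined" for lane in range(1, lane_count + 1)}

    def apply(self, lanes):
        for lane in self.status:
            self.status[lane] = "ConfigSuccess" if lane in lanes else "ConfigUndefined"


PORT_LANES = {"Ethernet0": [1, 2, 3, 4], "Ethernet4": [5, 6, 7, 8]}


def on_port_table_set(module, port):
    """Reinit the port unless all of its lanes report ConfigSuccess.

    Returns True when a reinit was performed.
    """
    lanes = PORT_LANES[port]
    if all(module.status[lane] == "ConfigSuccess" for lane in lanes):
        return False
    module.apply(lanes)
    return True


module = QuirkyModule()
module.apply(PORT_LANES["Ethernet0"])  # Ethernet0 is configured first
module.apply(PORT_LANES["Ethernet4"])  # then Ethernet4

# Alternating PORT table updates, e.g. setting an alias on each port.
events = ["Ethernet0", "Ethernet4", "Ethernet0", "Ethernet4"]
reinits = [on_port_table_set(module, port) for port in events]
print(reinits)
```

In this model every update reinitializes the port it touches, because the previous reinit of the sibling port left its lanes in ConfigUndefined.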

@mihirpat1
Contributor

@chiourung In this example,

  1. Are you using a CLI to dynamically configure the port to breakout mode? If yes, which CLI is being used?
  2. The part I am not able to understand is why executing api.scs_apply_datapath_init(host_lanes_mask) for 0xf0 causes the config status of lanes 1...4 to be ConfigUndefined and the config status of lanes 5...8 to be ConfigSuccess. Shouldn't lanes 1 to 4 remain unaffected (i.e. keep the config status ConfigSuccess), since the configuration is only being applied to lanes 5-8?

@chiourung
Contributor Author

  1. config interface breakout Ethernet0 2x400G
  2. This is the behavior of this transceiver; not all transceivers behave this way.

@mihirpat1
Contributor

@prgeor for viz
@chiourung Can we work with the module vendor to fix the behavior, since unrelated lanes are changing their ConfigStatus value, which is not in line with the CMIS spec?
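
The expectation stated here, that lanes outside the applied mask keep their ConfigStatus, could be checked with a small helper when collecting evidence for the vendor. This is only a sketch: `unexpected_status_changes` is a hypothetical name, and the before/after snapshots are plain dictionaries filled in from the states quoted earlier in this thread, not values read through any real API.

```python
def unexpected_status_changes(before, after, host_lanes_mask):
    """Return lanes outside the mask whose config status changed.

    before/after map 1-based host lane numbers to status strings.
    Assumption: bit 0 of the mask maps to host lane 1.
    """
    changed = []
    for lane in sorted(before):
        targeted = bool(host_lanes_mask & (1 << (lane - 1)))
        if not targeted and before[lane] != after[lane]:
            changed.append(lane)
    return changed


# States after Ethernet0 was configured, then after applying mask 0xf0.
before = {lane: "ConfigSuccess" if lane <= 4 else "ConfigUndefined" for lane in range(1, 9)}
after = {lane: "ConfigUndefined" if lane <= 4 else "ConfigSuccess" for lane in range(1, 9)}

affected = unexpected_status_changes(before, after, 0xF0)
print(affected)
```

For the states reported in this thread, the helper flags lanes 1 to 4, the lanes that belong to Ethernet0 and were not part of the 0xf0 apply.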
