Scenario:
- INT Lab set up as a ring topology with 300 static EVCs, each with a primary_path and a backup_path, between Novi01 and Novi06
- Performed a link down on interface 11 of Novi01 (forcing the primary path to go down)
During network convergence, mef_eline makes a large number of delete requests to flow_manager and doesn't fully handle the resulting errors. This is also related to issue #483, "mef_eline high concurrency EVC deletion can leak flows (leave them installed)". In this case, the EVC also stayed in that state without converging to the backup_path, which is really problematic: it would mean a full outage, and the EVC would only recover on the next network link up event. See below for one example of a problematic EVC that didn't converge.
Increasing api_concurrency_limit is expected to make this go away; I'll cover this with a test to confirm that it behaves as expected. However, this is already a sign that the default value isn't suitable for handling 300+ EVCs, so this needs to be better documented. Finally, mef_eline should handle this error more completely: logging the error is the minimum expected in this case, but not doing anything after that is problematic. It should probably mark the EVC for a next deployment to try to recover from this failure (this isn't a consistency check, just part of the rest of the error handling).
This needs to be assessed and discussed to see whether it would work reliably, and/or to think about anything else that can still be done.
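To make the "mark it for a next deployment" idea concrete, here is a minimal, self-contained sketch. It is not mef_eline code: `FlowRequestError`, `Evc`, `RedeployTracker`, and the `send_flows` callable are all hypothetical names made up for illustration, and the real handling would need to fit into mef_eline's existing deploy and failover paths.

```python
import logging
from dataclasses import dataclass, field

log = logging.getLogger(__name__)


class FlowRequestError(Exception):
    """Hypothetical error raised when a flow_manager request is rejected."""


@dataclass
class Evc:
    """Hypothetical, minimal stand-in for an EVC."""
    evc_id: str
    active: bool = False


@dataclass
class RedeployTracker:
    """Remembers EVCs whose failover failed so a later pass can retry them."""
    pending: set = field(default_factory=set)

    def failover(self, evc, send_flows):
        """Try to move an EVC to its backup path; mark it for retry on failure."""
        try:
            send_flows(evc)
        except FlowRequestError as exc:
            # Logging is the minimum; the important part is not stopping here.
            log.error("EVC %s failover failed: %s", evc.evc_id, exc)
            evc.active = False
            self.pending.add(evc.evc_id)
            return False
        evc.active = True
        self.pending.discard(evc.evc_id)
        return True

    def retry_pending(self, evcs, send_flows):
        """Retry every marked EVC and return the ids that recovered."""
        recovered = []
        # sorted() copies the set, so failover() can safely modify it.
        for evc_id in sorted(self.pending):
            if self.failover(evcs[evc_id], send_flows):
                recovered.append(evc_id)
        return recovered
```

In this sketch, `retry_pending` would be called from some later trigger (for example, a periodic task) instead of waiting for the next link up event. Whether that is reliable enough in practice is exactly what still needs to be assessed.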