Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Function notifies losing LNS with a direct method #1577

Merged
merged 19 commits into from
Mar 4, 2022

Conversation

spygi
Copy link
Contributor

@spygi spygi commented Mar 2, 2022

PR for issue #1497

What is being addressed

When the winning LNS changes on the function side, we have to inform the losing LNS proactively that it lost the race. The reason why we do this is in order for its Edge Hub to not retry to create the connection which would result in a connection ping-pong.

How is this addressed

  • A new direct method is called from the DeduplicationExecutionItem when a connection switch happens.
  • When LNS receives it (in ModuleConnectionHost), it sets itself as the losing LNS and closes the connection.
  • Unit tests added for LNS (ModuleConnectionHostTest) and the function side (MessageDeduplicationTests), integration tests for Function (DeduplicationTestWithRedis) and a simulation test (SimulatedLoadTest) for the whole flow.

Verification

Simulation test in the CI

- LNS marks itself as the losing one and drops the connection
- Unit, integration + simulation tests.
@codecov-commenter
Copy link

codecov-commenter commented Mar 2, 2022

Codecov Report

Merging #1577 (3bb6604) into dev (5a86399) will increase coverage by 0.07%.
The diff coverage is 95.45%.

Impacted file tree graph

@@            Coverage Diff             @@
##              dev    #1577      +/-   ##
==========================================
+ Coverage   86.46%   86.54%   +0.07%     
==========================================
  Files         235      235              
  Lines        9201     9258      +57     
==========================================
+ Hits         7956     8012      +56     
- Misses       1245     1246       +1     
Impacted Files Coverage Δ
LoRaEngine/LoraKeysManagerFacade/FacadeStartup.cs 0.00% <0.00%> (ø)
...ade/FunctionsBundler/DeduplicationExecutionItem.cs 93.93% <96.55%> (+1.63%) ⬆️
LoRaEngine/LoraKeysManagerFacade/HttpUtilities.cs 80.00% <100.00%> (+2.22%) ⬆️
...ndCloudToDeviceMessage/SendCloudToDeviceMessage.cs 81.37% <100.00%> (-0.19%) ⬇️
...csStation/ModuleConnection/ModuleConnectionHost.cs 94.30% <100.00%> (+1.60%) ⬆️
...kSrvModule/LoRaWan.NetworkServer/MetricRegistry.cs 100.00% <100.00%> (ø)
...oRaWan.NetworkServer/ManagedConnectionException.cs 0.00% <0.00%> (-25.00%) ⬇️
...kPreferredGateway/PreferredGatewayExecutionItem.cs 80.70% <0.00%> (-3.51%) ⬇️
...LoraKeysManagerFacade/LoRaDeviceCacheRedisStore.cs 82.00% <0.00%> (+10.00%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5a86399...3bb6604. Read the comment docs.

@spygi spygi linked an issue Mar 2, 2022 that may be closed by this pull request
6 tasks
@spygi spygi marked this pull request as ready for review March 2, 2022 22:38
Copy link
Collaborator

@p-schuler p-schuler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logging is too verbose and I think we should add a metric tracking the ownership changes. That would allow an operator to track aggressive connection changes and alert on them.

Generally: When Mark strategy is configured, I think we could end up in a situation, where every message (worst case) does switch the connection. We could end up in a scenario, where we have a lot of direct methods going to the edge to request the closing of the connection. I'm not sure this is a concern really, as we make it clear that Mark should not be used in general, but on only for specific devices.

spygi added 3 commits March 3, 2022 11:08
- removes extra logs/scope
- renames from drop connection to close
@github-actions github-actions bot added OTAATest Indicate OTAATest passed MacTest Indicate MacTest passed ClassCTest Indicate ClassCTest passed C2DMessageTest Indicate C2DMessageTest passed MultiGatewayTest Indicate MultiGatewayTest passed MultiConcentratorTest CupsTest LnsDiscoveryTest labels Mar 3, 2022
@p-schuler p-schuler added the 1️⃣ Priority 1 (One) label Mar 4, 2022
@p-schuler p-schuler temporarily deployed to CI_AZURE_ENVIRONMENT March 4, 2022 04:13 Inactive
@p-schuler p-schuler temporarily deployed to CI_AZURE_ENVIRONMENT March 4, 2022 04:13 Inactive
@p-schuler p-schuler temporarily deployed to CI_ALL_IN_ONE_ARM_GATEWAY March 4, 2022 04:15 Inactive
@p-schuler p-schuler temporarily deployed to CI_CONCENTRATOR_EFLOW_ARM32 March 4, 2022 04:15 Inactive
@p-schuler p-schuler temporarily deployed to CI_EFLOW_AZURE_VM March 4, 2022 04:15 Inactive
@p-schuler p-schuler temporarily deployed to CI_AZURE_ENVIRONMENT March 4, 2022 04:24 Inactive
@p-schuler p-schuler added 1️⃣ Priority 1 (One) and removed 1️⃣ Priority 1 (One) labels Mar 4, 2022
@p-schuler p-schuler temporarily deployed to CI_AZURE_ENVIRONMENT March 4, 2022 05:58 Inactive
@p-schuler p-schuler temporarily deployed to CI_AZURE_ENVIRONMENT March 4, 2022 05:58 Inactive
@p-schuler p-schuler temporarily deployed to CI_EFLOW_AZURE_VM March 4, 2022 05:59 Inactive
@p-schuler p-schuler temporarily deployed to CI_ALL_IN_ONE_ARM_GATEWAY March 4, 2022 05:59 Inactive
@p-schuler p-schuler temporarily deployed to CI_CONCENTRATOR_EFLOW_ARM32 March 4, 2022 05:59 Inactive
@p-schuler p-schuler temporarily deployed to CI_AZURE_ENVIRONMENT March 4, 2022 06:08 Inactive
@github-actions github-actions bot added the ABPTest Indicate ABPTest passed label Mar 4, 2022
The setup was wrong. The real flow would never have to call the API again when the function bundler was executed. In multigw the bundler always is executed and hence, the fcnt is set (to 0 if the message was already handled). This makes sure we actually mock the behavior of the real code executing and simplifies the logic.
@p-schuler p-schuler merged commit ebfbb4f into dev Mar 4, 2022
@p-schuler p-schuler deleted the feature/1497-direct-method-to-LNS branch March 4, 2022 07:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
1️⃣ Priority 1 (One) ABPTest Indicate ABPTest passed C2DMessageTest Indicate C2DMessageTest passed ClassCTest Indicate ClassCTest passed CupsTest fullci Run full continuous integration LnsDiscoveryTest MacTest Indicate MacTest passed MultiConcentratorTest MultiGatewayTest Indicate MultiGatewayTest passed OTAAJoinTest Indicate OTAAJoinTest passed OTAATest Indicate OTAATest passed SensorDecodingTest Indicate SensorDecodingTest passed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement function based connection state change notification
5 participants