
don't process operation status duplicates #3053

Merged

merged 1 commit into thin-edge:main from Bravo555:fix/operations-status-duplicates on Aug 8, 2024

Conversation

Bravo555
Contributor

@Bravo555 Bravo555 commented Aug 6, 2024

Proposed changes

If the c8y mapper receives an operation message with the same status again, don't send the SmartREST operation status message again.

It's still not clear what should be done if an operation is cleared while it is being processed, or if some component publishes a previous status (e.g. the c8y mapper received an operation message with status SUCCESSFUL, but later received an operation message with the same ID and status EXECUTING).

Types of changes

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Improvement (general improvements like code refactoring that doesn't explicitly fix a bug or add any new functionality)
  • Documentation Update (if none of the other choices apply)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Paste Link to the issue


Checklist

  • I have read the CONTRIBUTING doc
  • I have signed the CLA (in all commits with git commit -s)
  • I ran cargo fmt as mentioned in CODING_GUIDELINES
  • I used cargo clippy as mentioned in CODING_GUIDELINES
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)

Further comments


codecov bot commented Aug 6, 2024

Codecov Report

Attention: Patch coverage is 91.40127% with 27 lines in your changes missing coverage. Please review.

Project coverage is 78.6%. Comparing base (5358ddc) to head (79fb63c).
Report is 62 commits behind head on main.

Files Patch % Lines
...xtensions/c8y_mapper_ext/src/operations/handler.rs 93.0% 6 Missing and 13 partials ⚠️
crates/extensions/c8y_mapper_ext/src/tests.rs 85.7% 0 Missing and 5 partials ⚠️
...ions/c8y_mapper_ext/src/operations/handlers/mod.rs 40.0% 3 Missing ⚠️
Additional details and impacted files
Files Coverage Δ
...ions/c8y_mapper_ext/src/operations/handlers/mod.rs 76.6% <40.0%> (-0.4%) ⬇️
crates/extensions/c8y_mapper_ext/src/tests.rs 92.6% <85.7%> (-0.1%) ⬇️
...xtensions/c8y_mapper_ext/src/operations/handler.rs 94.3% <93.0%> (-0.9%) ⬇️

... and 7 files with indirect coverage changes

@rina23q
Member

rina23q commented Aug 6, 2024

It's still not clear what should be done if an operation is cleared while it is being processed, or if some component publishes a previous status (e.g. the c8y mapper received an operation message with status SUCCESSFUL, but later received an operation message with the same ID and status EXECUTING).

Once the operation is marked as being in a final state (either SUCCESSFUL or FAILED), SmartREST cannot change the state (it is doable via the REST API, though). However, I believe we should not update the operation state after the operation reaches a final state. In short, ignore further operation status updates (ideally logging a warning for this case).

Contributor

github-actions bot commented Aug 6, 2024

Robot Results

✅ Passed ❌ Failed ⏭️ Skipped Total Pass % ⏱️ Duration
491 0 2 491 100 1h24m39.436972s

Contributor

@albinsuresh albinsuresh left a comment


Added some queries. Also noticed that the codecov coverage has gone down. Something to look at.


let running_operation = self.running_operations.remove(&topic);
let status = match GenericCommandState::from_command_message(&message.message) {
Ok(command) if command.status.is_empty() && is_clearing_message(&message.message) => {
Contributor

Suggested change
Ok(command) if command.status.is_empty() && is_clearing_message(&message.message) => {
Ok(command) if command.is_cleared() => {

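The suggested `is_cleared()` helper could be sketched roughly as below. This is a hypothetical reconstruction, not the actual thin-edge API: the real `GenericCommandState` has more fields, and the "cleared" state is modelled here simply as an empty retained payload with an empty status.

```rust
// Hypothetical sketch of the suggested helper. A command topic is
// considered cleared once an empty retained message has been published
// to it, modelled here as both status and payload being empty.
pub struct GenericCommandState {
    pub status: String,
    pub payload: String,
}

impl GenericCommandState {
    pub fn is_cleared(&self) -> bool {
        self.status.is_empty() && self.payload.is_empty()
    }
}
```

Folding both conditions into one method keeps the match guard in the handler readable, which is the point of the suggestion.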
Contributor Author

Fixed in 0a18530

operation
.handle
.await
.expect("operation task should not panic");
Contributor

Although not done in this PR, I'm having second thoughts about that expect call. Isn't it safer to just log the operation handler failure, if at all one happens, rather than crashing the whole mapper?

Contributor Author

context.update() only panics when it can't send a message to the MQTT publisher via a channel. The MQTT sender it uses is cloned from C8yMapper, so if this happens we can't send any MQTT messages at all. This is an unrecoverable failure, so it should be a panic. The actor runtime should then restart the entire actor.

The fact that we only panic on MQTT channel failure should become clearer after it's extracted.
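The failure mode argued for above — panicking when the channel to the MQTT publisher is closed — can be sketched with a std channel standing in for the actor's MQTT sender. All names here are illustrative, not the mapper's actual API:

```rust
use std::sync::mpsc;

// Illustrative stand-in for the mapper's MQTT sender: once the receiving
// side (the MQTT connection actor) is gone, no message can ever be
// delivered again, so the send failure is treated as unrecoverable and
// turned into a panic for the actor runtime to handle.
pub struct Publisher {
    pub tx: mpsc::Sender<String>,
}

impl Publisher {
    pub fn publish(&self, msg: &str) {
        self.tx
            .send(msg.to_string())
            .expect("MQTT channel closed: unrecoverable, let the actor restart");
    }
}
```

While the channel is open the send is infallible; the `expect` only fires when the receiver has been dropped, which matches the "we can't send any MQTT messages at all" case.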


// we got a new status: wait until the previous one is handled, then handle the new one
let context = Arc::clone(&self.context);
let handle = tokio::spawn(async move {
Contributor

I didn't fully understand why you moved to this model where new tasks are spawned for each state transition, as opposed to the previous model of a single task per operation.

Contributor Author

Because we need to extract the status from every message to prevent illegal state transitions, it makes sense to do it once in the handle method of the handler for all operation types. And for that we need to know the current state of the operation.
It also achieves separation of concerns: the operation handler keeps state and handles the lifecycle of the operation, and context.update() is just a function that sends a SmartREST response when given an operation message.
Ultimately, we only spawn a task so that we can eventually generate a SmartREST response (possibly after downloads and uploads) for that operation message. The logic to determine whether a state transition is valid should be lifted up.
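The model being discussed — each new status message spawns a task that first awaits the task handling the previous status — can be sketched with std threads standing in for tokio tasks. The names and the `sent` log are hypothetical; in the real mapper the worker would call `context.update()`:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

// Hypothetical sketch: one entry per operation topic. Each new status
// message spawns a worker that first joins the worker for the previous
// status, so transitions for one operation are processed strictly in order.
pub struct OperationHandler {
    pub running: HashMap<String, JoinHandle<()>>,
    // stands in for the SmartREST responses sent by context.update()
    pub sent: Arc<Mutex<Vec<String>>>,
}

impl OperationHandler {
    pub fn handle(&mut self, topic: &str, status: &str) {
        let previous = self.running.remove(topic);
        let sent = Arc::clone(&self.sent);
        let status = status.to_string();
        let handle = thread::spawn(move || {
            if let Some(prev) = previous {
                // wait until the previous transition was fully handled
                prev.join().expect("operation task should not panic");
            }
            // the real handler would send a SmartREST response here,
            // possibly after performing downloads/uploads
            sent.lock().unwrap().push(status);
        });
        self.running.insert(topic.to_string(), handle);
    }
}
```

Chaining on the previous handle is what guarantees that a clearing message or a later status is only processed after the earlier transition completes, which is the ordering behaviour the review thread settles on below.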

Comment on lines 184 to 186
// TODO(marcel):
//- what happens when clearing message is posted by other client while we're
// handling a message? E.g. if we're doing a download to handle SUCCESSFUL status, do
Contributor

I'd process it only after the previous state transition is fully processed.

Contributor Author

Already was happening, TODO removed and clarification comment in 0a18530

//- what happens when clearing message is posted by other client while we're
// handling a message? E.g. if we're doing a download to handle SUCCESSFUL status, do
// we abort or keep going?
// - what if we get a message that moves back status, i.e. SUCCESSFUL -> EXECUTING?
Contributor

That shouldn't be allowed. Log a warning and ignore.

Contributor Author

Fixed in 0a18530

#[allow(clippy::match_like_matches_macro)]
match (previous, next) {
("successful", "executing") => false,
("failed", "executing") => false,
Member

[successful -> failed] and [failed -> successful] cases should also be forbidden. To be precise, any state transition from a final state (successful/failed) is forbidden. So, something like the below is more accurate. (Not sure if this is valid Rust; however, I hope you understand what I wanted to say.)

        ("successful", _) => false,
        ("failed", _) => false,

Contributor Author

Fixed in 0ebe371

Contributor

@albinsuresh albinsuresh left a comment

The changes look fine, except for a minor issue regarding duplicate state transitions.

#[allow(clippy::match_like_matches_macro)]
match (previous, next) {
// not really a transition but true for simplicity
(prev, next) if prev == next => true,
Contributor

Suggested change
(prev, next) if prev == next => true,
(prev, next) if prev == next => false,

Allowing duplicate state transitions is problematic when we use SmartREST. For example, if the mapper processes two executing responses back-to-back, the first mapped 501 message will properly change the status of the original request. But if we send another 501 message for the duplicated executing response, it might wrongly change the status of the next operation of the same type, if multiple requests of the same type are queued together on the cloud. This is a side effect of the id-less design of SmartREST, which does not target a specific operation id but always tries to change the status of operations in the order in which they were triggered. So, we should just skip mapping duplicate state transitions. Fortunately, the check in line no. 174 is already doing that, which bypasses this case here. But it's still better to keep it consistent here as well.

Contributor Author

Yeah, it depends on what we mean by "transition". Here I meant more of an identity transition of the status on the operation itself, not really related to reporting of the transition to Cumulocity. As the TODO says, if it were declared in tedge_api it would be a bit clearer, but here it looks like the logic is duplicated, which I agree with.
Still, for consistency's sake, I changed the case to false in 46f1f2a.
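Putting both review points together — skip duplicates, and forbid leaving a final state — the transition check could end up looking like the sketch below. This is a reconstruction from the thread, not necessarily the merged code verbatim:

```rust
// Hypothetical reconstruction of the transition check discussed above:
// duplicate statuses are skipped (so no duplicate SmartREST 501/502/503
// message is sent) and no transition out of a final state is allowed.
fn is_operation_status_transition_valid(previous: &str, next: &str) -> bool {
    match (previous, next) {
        // duplicate status: skip, to avoid re-sending the SmartREST response
        (prev, next) if prev == next => false,
        // final states (successful/failed) cannot be left
        ("successful", _) | ("failed", _) => false,
        _ => true,
    }
}
```

Note that the duplicate arm must come first: `("successful", "successful")` should be rejected as a duplicate, and the wildcard final-state arms would otherwise also cover it, so ordering here only matters for which reason applies, not for the result.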

Contributor

@albinsuresh albinsuresh left a comment

LGTM.

If c8y mapper receives operation message with the same status again,
don't send smartrest operation status message again.

It's still not clear what should be done if operation is cleared while
processing or if some component publishes a previous status (e.g. c8y
mapper received operation message with status SUCCESSFUL, but later
received operation with the same ID with status EXECUTING).

Signed-off-by: Marcel Guzik <[email protected]>
@Bravo555 Bravo555 force-pushed the fix/operations-status-duplicates branch from 46f1f2a to 79fb63c Compare August 8, 2024 11:34
@Bravo555 Bravo555 temporarily deployed to Test Pull Request August 8, 2024 11:34 — with GitHub Actions Inactive
@Bravo555 Bravo555 added this pull request to the merge queue Aug 8, 2024
Merged via the queue into thin-edge:main with commit 95dd5f9 Aug 8, 2024
33 checks passed
@reubenmiller reubenmiller added the theme:c8y Theme: Cumulocity related topics label Aug 27, 2024