Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gcoap: add some client-side observe handling #20073

Merged

Conversation

MichelRottleuthner
Copy link
Contributor

@MichelRottleuthner MichelRottleuthner commented Nov 9, 2023

Contribution description

Enables some client-side observe operation of gcoap (and improve server side a bit).
Currently, when registering an observe from a gcoap client, the first notification consumes the request memo state. Any following notifications are therefore dropped silently afterwards.
When I Initially saw this behavior I thought it's a bug, but turns out gcoap never had client-side observe handling 😮
This PR tries to improve the situation like this: requests initiated with an observe option are simply kept open beyond the first response.

Testing procedure

The first commit of this PR adds some client side observe functionality to examples/gcoap to reproduce this behavior.
With two native instances, check out the first commit and follow below steps.

Step 1: Reproduce missing feature.

sudo ./dist/tools/tapsetup/tapsetup
BOARD=native PORT=tap0 make -C examples/gcoap all term
> ifconfig
(...)
# *copy node_1_IP IP* and  continue on node_2 in second term
coap get -o [<node_2_IP>] /cli/stats
gcoap_cli: sending msg ID 9556, 17 bytes
> gcoap: response Success, code 2.05, 1 bytes
0

Meanwhile in a second terminal:

BOARD=native PORT=tap1 make -C examples/gcoap all term
> ifconfig
(...)
# *copy node_2_IP IP*
coap get [<node_1_IP>] /.well-known/core
# sad, no notification arrived on node_1, try again?
coap get [<node_1_IP>] /.well-known/core
# nope, nothing happening, but wireshark says the server is doing what it should do

Step 2: hopefully make it better.

The second commit of this PR should fix this. So git checkout somethingsomething, and repeat.
This time, doing many coap get [<node_1_IP>] /.well-known/core (term2) should something like this on node_1:

(...)
coap get -o [<node_2_IP>] /cli/stats
gcoap_cli: sending msg ID 1769, 17 bytes
> gcoap: response Success, code 2.05, 1 bytes
0
gcoap: response Success, code 2.05, 1 bytes
1
gcoap: response Success, code 2.05, 1 bytes
2
gcoap: response Success, code 2.05, 1 bytes
3
gcoap: response Success, code 2.05, 1 bytes
4
gcoap: response Success, code 2.05, 1 bytes
5
(...)

Noice. Done, right? ...and if we want to get rid of that perma-request again? ...no problem, just, ahh reboot? 🤨
Jump to commit No.6 of this PR for a solution (and more problems).

Repeat step 1 and 2, then make node1 forget about the observe via the new -d (deregister) option like this:
coap get -d [<node_2_IP>] /cli/stats
According to RFC7641, 3.6 Cancellation this implementation variant of simply "forgetting about the observe" should do.
For node1 (the observed observing client) it indeed looks fine now as the response handler isn't called anymore. But wireshark shows that node2 is still sending all the notifications.
This is actually a shortcoming of the server-side handling, as gcoap currently does not handle a received RST after it sent a notification.
This is fixed with commit No.7.

The last "REMOVEME" commit shows an attempt of how explicitly sending a de-registration request could reuse the existing request memo of the observe. It also shows why we probably shouldn't do that ;)
It is much more nasty than I expected. Some not really related changes to nanocoap are also part of that droppable commit.
Luckily, with the other changes of this PR this is not strictly necessary. Maybe it helps finding a more elegant solution.

TODO: interop with some aiocoap-ish thing?

Issues/PRs references

@MichelRottleuthner MichelRottleuthner added Type: new feature The issue requests / The PR implemements a new feature for RIOT Area: CoAP Area: Constrained Application Protocol implementations Area: examples Area: Example Applications labels Nov 9, 2023
@github-actions github-actions bot added Area: network Area: Networking Area: sys Area: System labels Nov 9, 2023
@bergzand bergzand self-requested a review November 10, 2023 09:57
@benpicco benpicco requested a review from chrysn November 10, 2023 10:51
Copy link
Contributor

@Teufelchen1 Teufelchen1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🦩

examples/gcoap/client.c Outdated Show resolved Hide resolved
examples/gcoap/client.c Show resolved Hide resolved
examples/gcoap/client.c Outdated Show resolved Hide resolved
examples/gcoap/client.c Outdated Show resolved Hide resolved
sys/net/application_layer/gcoap/gcoap.c Show resolved Hide resolved
@MichelRottleuthner
Copy link
Contributor Author

Note to myself and people interested in testing: a quick test showed that the server side of aiocoap also doesn't handle an RST returned after a notification. In my understanding this is valid behavior as per RFC a server can always safely pretend it didn't receive the RST. For the client side it indicates that it might be better to indeed also add a feature for explicit deregistration (i.e. sending a GET with observe option set to 1).

@MichelRottleuthner MichelRottleuthner force-pushed the pr_gcoap_observe_improvments branch from 7315614 to 84be404 Compare November 15, 2023 10:14
@MichelRottleuthner
Copy link
Contributor Author

(Rebased to fix conflicts with upstream changes and also squashed the fixups.)

Another data point regarding the handling of silently forgetting observations on the client side: libcoap treats this well (on RST it removes stale registration).

@MichelRottleuthner MichelRottleuthner force-pushed the pr_gcoap_observe_improvments branch from 84be404 to 4256885 Compare November 15, 2023 18:42
@MrKevinWeiss MrKevinWeiss self-requested a review December 18, 2023 11:41
@MrKevinWeiss
Copy link
Contributor

Just after some IRL conversation...

I think maybe it would be better to make this an optional module (even though the server observe is not)... I also don't think there is a rush for this as @MichelRottleuthner will be busy for the next little while.

Maybe @bergzand might have some opinions on this as well.

I will start with a bit more automation on the release specs and see if I can get something that can test this... eventually.

Copy link
Contributor

@leandrolanzieri leandrolanzieri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Here are some comments. I agree that 4256885 may not be the way. Given that de-registration of observations is kind of manual, and there are two options, I think we could really use some documentation on that.

examples/gcoap/client.c Outdated Show resolved Hide resolved
examples/gcoap/client.c Outdated Show resolved Hide resolved
sys/net/application_layer/gcoap/gcoap.c Outdated Show resolved Hide resolved
sys/net/application_layer/gcoap/gcoap.c Outdated Show resolved Hide resolved
examples/gcoap/client.c Show resolved Hide resolved
@leandrolanzieri
Copy link
Contributor

I think maybe it would be better to make this an optional module (even though the server observe is not)... I also don't think there is a rush for this as @MichelRottleuthner will be busy for the next little while.

Why? And also, how? The "forget" function would be optimized away if not used.

@leandrolanzieri
Copy link
Contributor

I tested this with an aiocoap server, the observe notifications and the observation cancellation worked like a charm.

@leandrolanzieri leandrolanzieri added the CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR label Feb 1, 2024
@riot-ci
Copy link

riot-ci commented Feb 1, 2024

Murdock results

✔️ PASSED

12982a0 gcoap: update documentation on supported features

Success Failures Total Runtime
10008 0 10010 15m:12s

Artifacts

@leandrolanzieri
Copy link
Contributor

@MichelRottleuthner please squash and rebase

@MichelRottleuthner MichelRottleuthner force-pushed the pr_gcoap_observe_improvments branch from a6c5acb to b447c5c Compare February 2, 2024 12:27
This splits the _find_req_memo util function into multiple variants that match on different things. This is done in preparation of a feature that has to find a request based on a token value, without creating an artificial pdu for that. A nice side effect is that it also makes the calls to the find functions a bit more readable by not relying on an anonymous bool input.
@MichelRottleuthner MichelRottleuthner force-pushed the pr_gcoap_observe_improvments branch from b447c5c to f6c40c4 Compare February 8, 2024 14:41
@MichelRottleuthner
Copy link
Contributor Author

Whoopsie I only squashed. Rebased now.

In order to properly handle an observe cancellation of a client, the server has to keep track of the notification MIDs (to be able to match an RST to a notification), see [RFC7641, 3.6 Cancellation](https://www.rfc-editor.org/rfc/rfc7641.html#section-3.6) for mor details. An alternative to this would be to make either the client send an explicit observe deregister request, or make the server send the next notification via CON (which hten allows matching of the RST due to the CON state).
@MichelRottleuthner MichelRottleuthner force-pushed the pr_gcoap_observe_improvments branch from f6c40c4 to 12982a0 Compare February 8, 2024 14:49
Copy link
Contributor

@leandrolanzieri leandrolanzieri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK!

@leandrolanzieri leandrolanzieri added CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR Reviewed: 1-fundamentals The fundamentals of the PR were reviewed according to the maintainer guidelines Reviewed: 2-code-design The code design of the PR was reviewed according to the maintainer guidelines Reviewed: 3-testing The PR was tested according to the maintainer guidelines Reviewed: 4-code-style The adherence to coding conventions by the PR were reviewed according to the maintainer guidelines Reviewed: 5-documentation The documentation details of the PR were reviewed according to the maintainer guidelines and removed CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR labels Feb 14, 2024
@leandrolanzieri leandrolanzieri added this pull request to the merge queue Feb 14, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Feb 14, 2024
@leandrolanzieri leandrolanzieri added this pull request to the merge queue Feb 14, 2024
Merged via the queue into RIOT-OS:master with commit 7745e23 Feb 14, 2024
33 checks passed
@MrKevinWeiss MrKevinWeiss added this to the Release 2024.04 milestone Apr 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: CoAP Area: Constrained Application Protocol implementations Area: examples Area: Example Applications Area: network Area: Networking Area: sys Area: System CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR Reviewed: 1-fundamentals The fundamentals of the PR were reviewed according to the maintainer guidelines Reviewed: 2-code-design The code design of the PR was reviewed according to the maintainer guidelines Reviewed: 3-testing The PR was tested according to the maintainer guidelines Reviewed: 4-code-style The adherence to coding conventions by the PR were reviewed according to the maintainer guidelines Reviewed: 5-documentation The documentation details of the PR were reviewed according to the maintainer guidelines Type: new feature The issue requests / The PR implemements a new feature for RIOT
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants