Overhauled CoAP Stack and API #20792
Comments
Thanks for tackling this – a unified way to access CoAP in RIOT has been overdue for some time, and will also make it easier to integrate the implementation bundling EDHOC and OSCORE that's being developed in RIOT-rs (currently called coapcore, future-proof-iot/RIOT-rs#348). Some very early comments, only based on the text:
That's a very overloaded word, and I'm not sure what "akin to OSCORE" would encompass. Would "middleware" fit the bill, and can they be chained?
The request_t type seems to be able to store all the option pointers (probably not the data, requiring that the pointed values live between request_t construction time and buffer population, which sounds OK). How will it do that with the bounded size of the request_t, not having access to the buffer yet? Likewise, how does unicoap_resource_identifier_t store its inputs without already populating them into the buffer? For example, if I were to follow up on #13827 and add the CRI support you already mention, then building a unicoap_resource_identifier_t would probably look like:
unicoap_resource_identifier_t where;
cri_init(&where, CRI_COAP_TCP, "example.com");
cri_append_path("sensor");
cri_append_path("2");
cri_append_query("a=b");
Would the unicoap_resource_identifier_t spool them in some bound memory, or would building the CRI just mean creating more data structures?
I'd love to review that draft!
A transport like the aforementioned coapcore would provide multiple URI schemes, and integrate EDHOC/OSCORE – whereas other implementations (like direct libOSCORE on Gcoap) would have authentication and transports in different places. Where would credential requirements be passed in, and can that be done in a way that works both if the authentication sits on a profile/middleware and if the authentication is built in? (I have no expectation that the precise same credentials could be passed in to either, but the credentials would be built and referenced in a way that can be transparent to the application) |
My first thought is: From my experience in using RIOT to teach students the current situation with gcoap and nanocoap is already highly confusing. Adding another user facing API would make IMHO a bad situation worse. Would you consider to instead have the goal of replacing nanocoap and gcoap as user facing API? nanocoap has been extended a lot in the recent years in a way that does not bloat trivial use cases, while enabling more complex use cases at the expense of using more resources. I would love it if the same code could be used for CoAP with different transports. I would like to emphasize the original goal of CoAP:
The part with 8-bit may not have aged that well, as it has been shown that tiny 32-bit MCUs are very much feasible. Still, larger SRAM will increase both the MCU cost and the power consumption, so that there is still a lot of value in keeping things small. As a result, I would like to emphasize that a universal CoAP API should allow tiny applications for trivial use cases. Allowing the user to access more features by calling additional functions (not needed for the trivial use cases) or enabling more modules (not needed for the trivial use cases) would IMO be a good compromise. When modules are used to enable the transports selected, optimizing for the case where only a single transport is supported (e.g. by avoiding indirect function calls in this case) would be a way to cover both use cases. Note that the ability to add CoAP options in random order is IMO a bad trade-off. The convenience it adds is limited. Give a decent error message when options are added out of order, and developers will not waste a lot of time on this. The overhead of this little extra convenience is felt by all users, as it forces the options to be kept in RAM until they are written into the message buffer. |
The way I understand this PR, it aims to replace both gcoap and nanocoap_sock as the user facing parts. Only then it makes sense to add this.
I haven't seen the code, but I would expect that everything this new API does can be build-time folded into barely any different code than direct nanocoap or gcoap use would create, provided that LTO is enabled and there is only one backend (which would be a typical case). |
OSCORE is special implementation-wise as it needs to be hardcoded in. Consider any OSCORE-protected message passing through the transport layer and parser, just as any regular message would. Now, in theory, after parsing the message, the stack having inspected its OSCORE option, and decrypting the inner message, the combined plaintext message (serialized inner message) needs to be redirected back to the parser layer. This sort of behavior is unique to OSCORE and cannot be achieved with any middleware-like model. The term "profile" isn't set in stone yet. Fundamentally, directly passing the OSCORE security context as an optional argument to, e.g., request functions would suffice. The idea here was to create some sort of tagged union that ensures extensions requiring special treatment like OSCORE can be added without source-breaking changes or creating tens of request function "overloads" in the future. The cost associated with chaining middleware is too high for a constrained environment, I think. These use cases are too narrow to be built-in functionality, plus OSCORE doesn't even fit the middleware model.
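To make the tagged-union idea a bit more concrete, here is a minimal sketch. Every name in it (profile_t, profile_oscore, and so on) is made up for illustration and is not taken from the unicoap code; the OSCORE security context is kept as an opaque pointer.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not the unicoap API. */
typedef enum {
    PROFILE_NONE = 0,   /* plain CoAP, no special treatment */
    PROFILE_OSCORE      /* message is to be protected with OSCORE */
} profile_type_t;

typedef struct {
    profile_type_t type;    /* tag selecting the active union member */
    union {
        struct {
            const void *security_context;   /* opaque, owned by the caller */
        } oscore;
        /* future extensions would add members here */
    } data;
} profile_t;

static inline profile_t profile_none(void)
{
    profile_t p = { .type = PROFILE_NONE };
    return p;
}

static inline profile_t profile_oscore(const void *security_context)
{
    profile_t p = { .type = PROFILE_OSCORE };
    p.data.oscore.security_context = security_context;
    return p;
}

/* The stack would branch on the tag; true if OSCORE processing is required. */
static inline bool profile_needs_oscore(const profile_t *p)
{
    return (p != NULL) && (p->type == PROFILE_OSCORE);
}
```

A request function taking a single profile_t argument would keep its signature when a new member is added to the union, which is the point of the design.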
First, the options issue. Options are stored in their serialized form in an options buffer. The header on the other hand is disconnected from the options buffer to support various transport-dependent headers (CoAP over TCP, UDP, (GATT), ...) and only added in by the stack afterwards. The request struct stores a pointer to an options struct containing a fixed-size option array with pointers into the PDU, similar to the existing nanocoap design. Additionally, the options struct holds the number of options, the option buffer total capacity and current size. The option member in the request needs to be a pointer, otherwise each time the request is copied you'd also do a rather expensive copy of the options struct. If you really need to add options manually, it would look something like this:
uint8_t my_buf[CONFIG_UNICOAP_OPTIONS_BUFFER_DEFAULT_CAPACITY];
unicoap_options_t my_options1;
unicoap_options_init(&my_options1, my_buf, sizeof(my_buf));
/* or shorter */
UNICOAP_OPTIONS_ALLOC_CAPACITY(my_options2, 200);
/* or, with the default capacity (CONFIG_UNICOAP_OPTIONS_BUFFER_DEFAULT_CAPACITY) */
UNICOAP_OPTIONS_ALLOC(my_options3);
unicoap_options_set_content_format(&my_options2, UNICOAP_FORMAT_TEXT);
In general, supplying a URI would force you to allocate an options buffer before calling the request function.
Next April. Promise ;)
I'm not sure I fully understand. In theory, OSCORE could also be handled by a transport layer driver in the current design. I guess, that just wouldn't be that useful compared to shared OSCORE handling in the messaging stack. |
@maribu Yes, the goal is to offer a full replacement for nanocoap, nanocoap_sock and gcoap. The new library will reuse existing code as much as possible, such as code from the nanocoap cache extension. The transport drivers are currently compiled in via a module flag, so there shouldn't be any additional overhead.
Avoiding a single indirect function call doesn't seem to be possible to me without introducing too much complexity. As @chrysn said, code unused by the application should be stripped by the optimizer, so offering more functionality doesn't necessarily lead to increased binary sizes.
Provided you add options in the correct order, you'll achieve the same performance behavior as with nanocoap. As an alternative, I could gate the ability to insert options in a random order behind a compile-time flag? I just think inserting options in some predefined order requires a level of protocol knowledge I wouldn't want to demand from a first-time user. In either case I'd document that you can rearrange your option insertion calls to match the RFC order as an optimization. |
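For readers wondering why the order matters at all: in the CoAP message format (RFC 7252, section 3.1) each option carries its number as a delta to the previous option, so options can only be appended cheaply in ascending order. The following is a minimal sketch of such an append-only encoder; the function names are made up for illustration and are not from nanocoap or unicoap.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Encodes a delta or length field: values 0..12 fit in the nibble, 13 means
 * one extension byte (value - 13), 14 means two extension bytes (value - 269).
 * Returns the number of extension bytes written to ext. */
static size_t encode_field(uint16_t v, uint8_t *nibble, uint8_t ext[2])
{
    if (v < 13) {
        *nibble = (uint8_t)v;
        return 0;
    }
    if (v < 269) {
        *nibble = 13;
        ext[0] = (uint8_t)(v - 13);
        return 1;
    }
    *nibble = 14;
    ext[0] = (uint8_t)((v - 269) >> 8);
    ext[1] = (uint8_t)((v - 269) & 0xff);
    return 2;
}

/* Appends one option at buf. Returns the number of bytes written, or 0 if the
 * option number is lower than the previous one or the buffer is too small. */
static size_t option_append(uint8_t *buf, size_t cap, uint16_t *last_number,
                            uint16_t number, const void *value, uint16_t len)
{
    uint8_t dn, ln, dext[2], lext[2];

    if (number < *last_number) {
        return 0;   /* out of order: a delta cannot be negative */
    }
    size_t dlen = encode_field((uint16_t)(number - *last_number), &dn, dext);
    size_t llen = encode_field(len, &ln, lext);
    size_t total = 1 + dlen + llen + len;
    if (total > cap) {
        return 0;
    }
    buf[0] = (uint8_t)((dn << 4) | ln);
    memcpy(buf + 1, dext, dlen);
    memcpy(buf + 1 + dlen, lext, llen);
    if (len > 0) {
        memcpy(buf + 1 + dlen + llen, value, len);
    }
    *last_number = number;
    return total;
}
```

Inserting an option in the middle of such a buffer means moving everything behind it and re-encoding the delta of the following option, which is where the extra cost of out-of-order insertion comes from.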
The way libOSCORE is wrapped in Rust does exactly this: It presents a view of the plaintext message composed of both options from outside and from inside. So a sufficiently powerful middleware model (that can add and intercept properties, as would an HTTP authentication middleware that adds and strips cookie headers) can cover that. Whether that is an efficient and ergonomic thing to do in C is a different question, of course, but there's nothing fundamental in OSCORE that stops one from doing this.
The hard part about making it a transport is that transports then could be stacked – your remote can be an OSCORE transport tacked onto a UDP transport, or an OSCORE transport tacked onto an OSCORE transport tacked onto a TCP transport (while nested OSCORE is not in RFC 8613, there is ongoing work and use cases to allow it). That works best if those different transports form a linked structure, but that requires multiple stack allocations in C, putting it again in the "might not be ergonomic" category of above. So yeah, maybe it is practical to give OSCORE a special place in the handling. At the same time, the coapcore reference not only was made to illustrate what can be done there, but to point to a practical possibility: If this is to be pluggable, it will need the ability to send credentials either into a specially-placed OSCORE layer, but also into transports that handle their own security (which may be OSCORE as part of the implementation). |
Yeah, this is what I meant by saying OSCORE "cannot be achieved with any middleware-like model". Still, I don't believe middleware is worth the overhead for the tiny number of applications that'd benefit from middleware.
The current implementation tries really hard to not have statically allocated driver structs or anything like that. Transport drivers are, put simply, conditional (again, compile-time) function calls to the transport driver implementation.
For provisioning, RIOT largely relies on static configurations. I could also make the profile (or whatever it is going to be called) available to the transport driver to make these "profiles" a way of "channeling" credentials to lower library components, such as the messaging stack (potentially handling OSCORE) or even transport drivers. As with middleware, and having talked to embedded/IoT developers, use cases for stacking drivers are hard to find and too specific to be built directly into RIOT. |
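As an illustration of what "conditional, compile-time function calls" can look like without any driver structs or function pointers, here is a small sketch. The DEMO_ macros stand in for build-system module flags and all names are hypothetical; this is not the unicoap source.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for module flags that the build system would normally set. */
#define DEMO_TRANSPORT_UDP   1
#define DEMO_TRANSPORT_DTLS  0

typedef enum { DEMO_PROTO_UDP, DEMO_PROTO_DTLS } demo_proto_t;

#if DEMO_TRANSPORT_UDP
static int demo_udp_send(const uint8_t *pdu, size_t len)
{
    (void)pdu;
    return (int)len;    /* pretend the whole PDU was sent */
}
#endif

#if DEMO_TRANSPORT_DTLS
static int demo_dtls_send(const uint8_t *pdu, size_t len)
{
    (void)pdu;
    return (int)len;
}
#endif

/* Direct calls only: a transport that is not compiled in has no case label,
 * so its code cannot end up in the binary. */
int demo_transport_send(demo_proto_t proto, const uint8_t *pdu, size_t len)
{
    switch (proto) {
#if DEMO_TRANSPORT_UDP
    case DEMO_PROTO_UDP:
        return demo_udp_send(pdu, len);
#endif
#if DEMO_TRANSPORT_DTLS
    case DEMO_PROTO_DTLS:
        return demo_dtls_send(pdu, len);
#endif
    default:
        return -1;      /* transport not compiled in */
    }
}
```

With a single transport enabled, the compiler can typically reduce the switch to one comparison and a direct call.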
I didn't express that clearly: What I meant was that at the time the message is created, a chain would need to be created:
coap_transport_data_t udp = coap_remote_for_uri("coap://host.example.com");
coap_transport_data_t protected = coap_oscore_for(&udp, my_security_requirements);
unicoap_send_request(..., protected, ...);
which doesn't align nicely with the single-line C call syntax.
I didn't mean run-time pluggable. If coapcore is a backend that is selected, it is probably the only backend. |
Just some thought from my side: One goal of nanoCoAP sock was to use zero-copy network functions ( e.g. in your first example, how is So my question would be why you want to base your new API on top of GCoAP (which IMHO is way too complex and clunky) instead of nanoCoAP or your own re-write. I extended nanoCoAP sock to cover all use-cases I had so I don't have to deal with GCoAP anymore. But there are still some footguns in the 'core' nanoCoAP library that you'd inherit if you use it. (e.g. I see that you are using Do you also have some application(s) in mind to make use of the new API? e.g. what's the use-case for removing options from a CoAP header? I guess when you implement re-ordering, this pretty much comes for free though. |
The callback APIs are essentially zero-copy APIs on top of a listen buffer. By listen buffer I mean the buffer gcoap writes scattered data returned by Now, for the blocking client API, you will need to copy the payload into some sort of user-supplied buffer. For the callback-based APIs, I'm just handing you the listen buffer in different packaging. There's no copying of options or payload going on here. The pointers in the options view array (that's also how nanocoap does it, you'll need some way of keeping track of options without re-parsing the entire options blob again) point into the listen buffer, same as with Of course, that does not apply for block-wise reassembly, but that's another use case (where you explicitly want to copy).
Yeah, so this statement of building unicoap on top of gcoap might've been a little oversimplified. unicoap will have a messaging thread, like gcoap, enabling async operations. Other than that, another third of gcoap isn't usable because of the new transport modularity and I'm rewriting yet another third, because some parts are extremely hard to comprehend and debug. There's no struct or function that remains untouched ;)
I've rewritten the parser, bound checks included.
This is one of the details I haven't settled on, yet. Currently,
If you want to send chunked payload, I'm happy to hand you an
unicoap does not do reordering at all. Instead, options are always inserted at the right place right away. That's also why you can achieve nanocoap's performance by adding options in the order dictated by their option number. Yes, generally, you won't remove any options. The removal function is still there from an earlier version I tested. I can remove it of course, but if you don't call this API, it's not going to land in your binary. |
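A simplified sketch of "insert at the right place right away": it keeps an array of option views sorted by option number instead of working on the serialized buffer, so it only illustrates the cost model, not the actual unicoap data layout. All names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_OPTIONS_MAX 8

typedef struct {
    uint16_t number;        /* CoAP option number */
    const uint8_t *value;   /* points at the value, not copied */
    uint16_t len;
} demo_option_t;

typedef struct {
    demo_option_t entries[DEMO_OPTIONS_MAX];    /* sorted by number */
    size_t count;
} demo_options_t;

/* Inserts an option at its sorted position. Returns the index used, or -1 if
 * the array is full. Options with equal numbers keep their insertion order. */
static int demo_options_insert(demo_options_t *o, uint16_t number,
                               const uint8_t *value, uint16_t len)
{
    if (o->count >= DEMO_OPTIONS_MAX) {
        return -1;
    }
    size_t i = o->count;
    /* Walk back from the end; never iterates when options arrive in order. */
    while (i > 0 && o->entries[i - 1].number > number) {
        i--;
    }
    /* Shift higher-numbered options up by one slot (zero bytes if i == count). */
    memmove(&o->entries[i + 1], &o->entries[i],
            (o->count - i) * sizeof(demo_option_t));
    o->entries[i] = (demo_option_t){ number, value, len };
    o->count++;
    return (int)i;
}
```

Adding options in ascending order hits the fast path (no loop iterations, nothing to move); only out-of-order insertions pay for the shift.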
Only if you need async operation (so a separate send buffer is needed) or the message will be encrypted with DTLS. For plain blocking UDP, |
@benpicco Is sending chunked payload something you definitely need? If yes, I'd be interested in the concrete use case you have. |
It's certainly convenient, but my main use case was having a (small) buffer for the CoAP header and handing the payload over without copying it to a separate (CoAP header + payload length) buffer. On the other hand, are async requests something we definitely need? If yes, I'd be interested in the concrete use case you have. |
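The "small header buffer plus untouched payload" pattern can be sketched with a linked list of chunks, similar in spirit to RIOT's iolist_t. The demo_ names below are made up; the gather function only stands in for whatever the network layer does when it finally puts the bytes on the wire.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One piece of a message; the data is referenced, not copied. */
typedef struct demo_chunk {
    const struct demo_chunk *next;
    const void *data;
    size_t len;
} demo_chunk_t;

/* Total message size across all chunks. */
static size_t demo_chunks_size(const demo_chunk_t *c)
{
    size_t total = 0;
    for (; c != NULL; c = c->next) {
        total += c->len;
    }
    return total;
}

/* Stand-in for the network send: the only place where bytes are gathered.
 * Returns the number of bytes written, or 0 if out is too small. */
static size_t demo_chunks_gather(const demo_chunk_t *c, uint8_t *out, size_t cap)
{
    size_t off = 0;
    for (; c != NULL; c = c->next) {
        if (c->len > cap - off) {
            return 0;
        }
        if (c->len > 0) {
            memcpy(out + off, c->data, c->len);
        }
        off += c->len;
    }
    return off;
}
```

The application only needs a few bytes for header, options and payload marker; the payload chunk can point straight at sensor data or a flash page.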
Oh, I think we've misunderstood each other. Yes, that's how I was going to do it anyway. This avoids allocating or occupying a temporary buffer. Still, there's a case where you cannot live without a temporary buffer, and that's retransmissions 1. Regular client call: Receiving: Footnotes
|
I think I'm not quite getting the connection between chunked payload and async requests. |
Do we have any evidence that there are real world use cases (outside of research labs and testbeds) for CoAP proxies running on MMU-less nodes? I would guess that in real world deployments, the CoAP-Proxy will be running on some OpenWRT/Linux box nearby anyway. I suspect that a CoAP-Proxy on a RIOT node is mostly an academic demonstrator, and nothing that one would actually use anyway. |
The use of nested OSCORE is not limited to the proxy itself; nested contexts would also be present in the client or server. No evidence of use though either yet.
|
I did some digging to better understand the historical context on how we arrived at the current situation: Turns out both nanoCoAP (#5972) and GCoAP (#5598) arrived at roughly the same time. Looks like GCoAP was first based on MicroCoAP, but was then rewritten on top of @kaspar030's nanoCoAP implementation - but only using the parser / writer part. When SUIT came along, it needed a CoAP transport, but didn't want to use the heavy GCoAP library that comes with a separate thread and payload buffer, as this would increase the cost of providing firmware updates (and since SUIT needs to keep a flashpage in RAM, that's already rather high). It used nanoCoAP to implement block-wise to fetch the updates. Then I had a simple application where some sensors would just push some measurement data to a server via CoAP and found GCoAP very cumbersome to use for such a simple task. This was all under a 'only pay what you use' philosophy, so if you don't use a feature (e.g. server mode, async mode) you don't have to compile it in - unlike with GCoAP. With that nanoCoAP sock now almost has feature parity with GCoAP, the only things missing are Observe handling and Proxying - but I didn't have a use case for that.
I think @fabian18 and @mariemC were working on just that recently.
IMHO |
It's borderline off-topic at this point, but just for completeness' sake: There is also 6TiSCH minimal joining (RFC 9031), for which every routing node in a 6LoWPAN (easily MMU-less) acts as a proxy – not to do blockwise or retransmissions or to cache (it does none of that), but to forward before the network even tells the joining node which network addresses it is using. Granted, that may well not use the GCoAP based proxy we now have in RIOT (especially as it should be implemented statelessly), but it is a proxy on a very constrained node nonetheless. |
Hi Carl, I often find myself in a situation where I want to emit a CoAP packet to RAM (for further hacking, cuddling & debugging), can your API allow me to do so? E.g., if you require me to have a valid transport this won't work. This got me thinking: in combination with your approach to be agnostic of the transport driver, it should be easy to quickly build a "dummy" driver that emits packets to, e.g., a ring buffer in RAM & receives packets from another ring buffer. Right? Lastly I want to make you aware of SLIPMUX, if you aren't already. Within SLIPMUX, CoAP packets are transported via serial/UART. No IPv6 or UDP involved. |
I don't think that that use case description is entirely accurate: There is no such thing as a CoAP message serialized without a transport, because not only does the CoAP header depend on the transport, but also CoAP options need to be set differently depending on the transport (for example, Observe over TCP has only zero values in notifications). I think I can rephrase that though to do what you need: It could be convenient to have a transport that behaves like some particular transport (maybe UDP, then it builds a header, or maybe OSCORE, then it builds no header and just puts the CoAP method/code and relies on external matching of request/response), but only serializes into or reads from a dummy buffer. This transport might be configured statically as the only transport anyway, or it might be selected at runtime, or it might be configured for some Uri-Host value. This would allow easy access to serialized messages for inspection. In particular, this could be used to implement CoAP transports such as slipmux that use another transport's mechanisms (although if the new CoAP stack has good APIs, implementing a new transport should be just as easy). |
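To illustrate the transport dependence of the header: below is a minimal sketch of parsing the 4-byte header that CoAP over UDP uses (RFC 7252, section 3). Over TCP (RFC 8323) the first byte instead carries a length nibble and there is no message ID, so this function would not apply there. The struct and function names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t version;        /* must be 1 */
    uint8_t type;           /* 0 = CON, 1 = NON, 2 = ACK, 3 = RST */
    uint8_t token_len;      /* 0..8 */
    uint8_t code;           /* e.g. 0x01 = GET, 0x45 = 2.05 Content */
    uint16_t message_id;
    const uint8_t *token;   /* token_len bytes, points into the PDU */
    const uint8_t *rest;    /* options and payload, same on all transports */
    size_t rest_len;
} demo_udp_header_t;

/* Parses the CoAP-over-UDP header. Returns 0 on success, -1 if malformed. */
static int demo_parse_udp_header(const uint8_t *pdu, size_t len,
                                 demo_udp_header_t *h)
{
    if (len < 4) {
        return -1;
    }
    h->version = pdu[0] >> 6;
    h->type = (pdu[0] >> 4) & 0x3;
    h->token_len = pdu[0] & 0xF;
    h->code = pdu[1];
    h->message_id = (uint16_t)((pdu[2] << 8) | pdu[3]);
    /* Token lengths 9..15 are reserved and count as a format error. */
    if (h->version != 1 || h->token_len > 8 || len < 4u + h->token_len) {
        return -1;
    }
    h->token = pdu + 4;
    h->rest = pdu + 4 + h->token_len;
    h->rest_len = len - 4 - h->token_len;
    return 0;
}
```

Everything from rest onwards (options, payload marker, payload) has the same format on every transport, so it can be handed to shared code.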
I can also see this being useful with some Segger RTT style transport. I guess introducing some "buffer transport" backend would be an easy implementation choice that doesn't compromise the API design idea? For the use case of communicating from the host to a directly attached MCU the CoAP over websocket format should be pretty close to what would be perfect here, if I recall correctly. |
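A rough sketch of what such a "buffer transport" could boil down to: a send function that writes length-prefixed PDUs into a ring buffer in RAM, and a receive function that reads them back. The one-byte length prefix and all names are assumptions made for this example; how a driver would be registered with the stack is left out because that interface is not shown in this thread.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_RING_SIZE 64

typedef struct {
    uint8_t data[DEMO_RING_SIZE];
    size_t head;    /* index of the oldest byte */
    size_t used;    /* number of bytes stored */
} demo_ring_t;

static void ring_put(demo_ring_t *r, uint8_t b)
{
    r->data[(r->head + r->used) % DEMO_RING_SIZE] = b;
    r->used++;
}

static uint8_t ring_get(demo_ring_t *r)
{
    uint8_t b = r->data[r->head];
    r->head = (r->head + 1) % DEMO_RING_SIZE;
    r->used--;
    return b;
}

/* "Send": stores one length byte followed by the PDU. Returns 0, or -1 if
 * the PDU is longer than 255 bytes or does not fit. */
static int demo_buffer_send(demo_ring_t *r, const uint8_t *pdu, size_t len)
{
    if (len > 255 || len + 1 > DEMO_RING_SIZE - r->used) {
        return -1;
    }
    ring_put(r, (uint8_t)len);
    for (size_t i = 0; i < len; i++) {
        ring_put(r, pdu[i]);
    }
    return 0;
}

/* "Receive": copies the oldest PDU into out. Returns its length, or -1 if
 * the ring is empty or out is too small (the PDU then stays queued). */
static int demo_buffer_recv(demo_ring_t *r, uint8_t *out, size_t cap)
{
    if (r->used == 0) {
        return -1;
    }
    size_t len = r->data[r->head];  /* peek at the length byte */
    if (len > cap) {
        return -1;
    }
    (void)ring_get(r);
    for (size_t i = 0; i < len; i++) {
        out[i] = ring_get(r);
    }
    return (int)len;
}
```

Two such rings, one per direction, would give the loopback setup described above; the bytes in the ring are whatever the chosen header format (UDP-like, TCP-like, ...) produces.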
@Teufelchen1 Do you need the packet in RAM to be serialized as it would occur on the wire or is it just the options (and header?) you want to store and modify? |
Hello RIOT Community,
RIOT features multiple CoAP libraries: gcoap, nanocoap_sock, and the nanocoap parser. In the future, it’d be great to have a single, unified, and modular library, facilitating CoAP over various protocols such as UDP, DTLS, and potentially TCP, TLS.
It’s time for something new 🥳!
Below I will give a brief overview before I outline the proposed design of a new client and server API, the improved options API, and the modular transport driver design.
Overview of the New CoAP Library for RIOT
The new CoAP library should provide a unified, versatile CoAP API for RIOT that is both easily extensible and beginner-friendly. Until a final, better name is found, I’m calling it the unified CoAP API, or unicoap for short.
The new CoAP stack will be based upon gcoap. The goal is to extend gcoap to provide synchronous and asynchronous APIs, and support message deduplication optionally.
The new API aims to reduce the need for in-depth protocol knowledge and implementation details while offering convenience APIs for commonly used features such as block-wise transfer and OSCORE. It minimizes boilerplate code.
For example, this is what a simple blocking GET request would look like with the new client API (error handling omitted):
Design
The new CoAP implementation comprises four main parts: a new API, the library messaging internals, and a new, modular parser and transport design. The latter two components are extensible enough to support CoAP over TCP or TLS.
Client API
Today, users have to choose between nanocoap and gcoap for sending requests. nanocoap provides a synchronous interface for client requests, and gcoap on the other hand offers async callback-based functionality. Generally, the gcoap async interface is more versatile, yet sometimes, your application can’t perform any useful work until the response has come in. Plus, nesting your application logic in response handlers, async or sync, quickly becomes messy. Hence, the new API will define both synchronous and async APIs.
The following synchronous request function blocks until a response is received (or the corresponding timeout is exceeded), and copies relevant response data into the supplied buffer.
Request data (method, payload (+ size), and optionally any options) must be passed through the request parameter. To avoid unnecessary boilerplate, unicoap supports defining requests/responses via convenience functions, including, but not limited to:
The unicoap_resource_identifier_t consists of an identifier type and value. This API allows for different representations of resource identifiers: The URI passed to unicoap_uri(...) specifies the transport type via the URI’s scheme, the resource’s address, the Uri-Path and Uri-Queries. unicoap_endpoint_udp(...) and unicoap_endpoint_dtls(...) specify the endpoint, i.e. for UDP/DTLS the address and port. This design prevents constructing URIs for requests where the raw sock_udp_ep_t already exists in the application and avoids having too many client APIs, especially with convenience APIs for block-wise and potential future identifiers like CRIs.
The flags parameter serves for, e.g., signaling to the CoAP stack that the request should be sent as a confirmable message.
Details about the response (such as status code, the payload (+ size), and options) can be obtained from the response out parameter.
For callback-based response handling, unicoap defines an async and a synchronous variant.
Note
Block-Wise Transfer
Sending: unicoap defines an auto-slice flag that can be passed alongside a client request or server response to instruct the stack that the request should be sliced and transmitted block-wise. In addition to that, you can use a slicer to send blocks manually.
Receiving: unicoap is also going to provide callback-based client request functions similar to the ones above that support automatic Block2 block-wise transfers. With these APIs, the callback is called per response block arriving at the client. The client and server API will also optionally (resource-intensive for async scenarios) support block-wise reassembly of requests/responses via a flag present in the resource definition or the requests flags parameter.
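As background for the slicing described above, here is a minimal sketch of how a body is cut into blocks and how the Block1/Block2 option value is formed according to RFC 7959 (NUM in the upper bits, the M "more" bit, and SZX with block size 2^(SZX+4)). The function and struct names are made up for illustration and say nothing about how unicoap's slicer is implemented.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t option_value;  /* (NUM << 4) | (M << 3) | SZX */
    size_t offset;          /* start of this block within the body */
    size_t len;             /* number of body bytes in this block */
} demo_block_t;

/* Describes block number num of a body of body_len bytes. Returns 0 on
 * success, -1 if szx or num is invalid or the block lies beyond the body. */
static int demo_slice(size_t body_len, uint8_t szx, uint32_t num,
                      demo_block_t *out)
{
    if (szx > 6 || num > 0xFFFFF) {
        return -1;          /* SZX 7 is reserved, NUM has at most 20 bits */
    }
    size_t block_size = (size_t)1 << (szx + 4);     /* 16 .. 1024 bytes */
    size_t offset = (size_t)num * block_size;
    if (offset >= body_len) {
        return -1;
    }
    size_t remaining = body_len - offset;
    bool more = remaining > block_size;
    out->offset = offset;
    out->len = more ? block_size : remaining;
    out->option_value = (num << 4) | ((uint32_t)more << 3) | szx;
    return 0;
}
```

For a 100-byte body and 64-byte blocks (SZX 2), block 0 carries bytes 0..63 with the M bit set and block 1 carries the remaining 36 bytes with M cleared.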
Note
On OSCORE and Profiles
unicoap is going to enable support for OSCORE. In order to leave room for future CoAP extensions akin to OSCORE, I’d like to introduce profiles: common characteristics for requests and responses that dictate special treatment is to be applied to a given message. E.g., an OSCORE security context stores characteristics relevant for encrypting/decrypting CoAP requests and responses.
Server API
unicoap will leverage existing resource definition APIs (nanocoap XFA and gcoap listeners), with slight modifications concerning flags (for things like restricting a resource from being accessed over unsecured transports). In contrast to existing APIs however, unicoap will allow sending responses after the request handler returns (needed for proxy operation).
A Note About Options
Let’s look at that unicoap_options_t* options field from before. unicoap_options_t provides a view on CoAP options in a message buffer. The new API implements several APIs for option manipulation, with special focus on how repeatable and non-repeatable ("single-instance") options are handled.
Non-repeatable options like Accept:
Repeatable options like ETag and Uri-Query:
For options like Uri-Path whose combined values form another, aggregated value (here, the URI path is formed by concatenating all URI path components), you can optionally use convenience APIs like:
Note
APIs for getting, setting, adding, and removing options with arbitrary option numbers still exist: unicoap_options_get(...), unicoap_options_copy_value(...), unicoap_options_set(...), unicoap_options_remove (non-repeatable); unicoap_options_add(...), unicoap_options_copy_values, unicoap_options_remove_all (repeatable).
Tip
nanocoap also enforces the requirement that options be inserted in the order dictated by their option numbers. In unicoap this requirement is no longer present, yet inserting options in the order in which they’ll occur in the final packet will still deliver the best performance.
Exchangeable Transport Protocols and Message Formats
With unicoap, it’s easy to implement CoAP over, e.g., TCP. unicoap strives for driver equality, i.e., the built-in UDP and DTLS drivers won’t receive special treatment. Driver functionality is conditionally compiled in (e.g., IS_USED(MODULE_UNICOAP_TRANSPORT_CARRIER_PIGEON)).
In unicoap’s transport layer, each ’driver’ must support two basic functions and a parser. As an example, let’s pick the UDP driver which provides these functions:
When you call unicoap_init() in your application, unicoap will spin up a thread (again, like gcoap does), create an event queue and ask your driver to perform any setup work. _udp_init opens a GNRC socket and registers a callback for incoming datagrams on the queue via sock_udp_event_init. In the UDP case, the driver reads any available data from the socket and ultimately calls the following function (ignoring error handling).
Because the CoAP header format varies from transport to transport, you will need to pass in a parser. These parsers aren’t entirely transport-specific, i.e., a certain parser (that is, a header format) is shared between multiple transports. The option format and payload (i.e., payload marker + payload that follows) remain the same across all CoAP message formats, however. In the UDP case, the appropriate parser is passed. Ultimately, after parsing the header, the parser calls _parse_options_and_payload which is shared functionality among all parsers.
Sending works similarly, as each message is effectively passed down to the proper _my_transport_send_message(...) function which, in turn, also uses the corresponding header encoder.
(where unicoap_properties_t holds the message ID, token, type)
Implementation Status
At the moment the following components are implemented:
Thanks for reading! I’d love to hear your thoughts on the new API.