Configuration with MQTT sink throws "packet_id exhausted error" after a few hours of running #241
By quickly googling and looking through the code, this mqtt_cpp issue came up: redboltz/mqtt_cpp#540. It seems to have been fixed long ago, though. Still, it is not clear to me whether "acquire_unique_packet_id" should be used without an explicit release; commenters seem to have conflicting opinions on that. In cppagent I see it being used, e.g. here: cppagent_dev/src/mqtt/mqtt_client_impl.hpp, line 249 in 61a8537
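For reference, the pattern in question looks roughly like this. This is a minimal sketch, not the actual cppagent code; exact publish overloads differ between mqtt_cpp versions, and `client`, `topic`, and `payload` are placeholders:

```cpp
// Illustrative sketch only (not the cppagent implementation).
// acquire_unique_packet_id() reserves a uint16 packet id from a finite
// pool (at most 65535 usable ids, since 0 is not a valid MQTT packet id);
// when none are left, you presumably get the "packet_id exhausted" error
// this issue is about.
auto pid = client->acquire_unique_packet_id();

// Publish using the reserved id. For QoS >= 1 the library is expected to
// free the id once the PUBACK/PUBCOMP handshake completes, which would
// explain why the samples never call release_packet_id() themselves.
client->async_publish(pid, topic, payload, MQTT_NS::qos::at_least_once);

// If a publish is abandoned, or its ack never arrives, the id stays
// registered unless it is handed back explicitly:
// client->release_packet_id(pid);
```

One thing I have not verified: if an id is acquired but the message effectively goes out at QoS 0 (so no ack ever comes back), I would expect that id to stay reserved forever, which would leak one id per publish.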
|
I’ll look into it. In the samples they don’t release the packet ids after use; I’ll check the code. I’m also not sure why we call it clientId and not packetId, or why it’s an instance variable instead of a local. A lot was taken from the samples. The limit is uint16, so 65536. Not sure about the lifecycle and maximum rates. I remember one of the SpB people saying they overloaded the broker and needed to package multiple measurements together. This may be the same issue. Do you have any insights? I’ll look into the mqtt code this afternoon and see what the unique packet id method does.
|
Got another occurrence of that. What's interesting is that it's actually happening pretty quickly (after about 40 minutes of running), and the first error is not about packet_id but something else, which is then followed by the exhaustion of packet_ids. See the part of the log below; that's about the time the packet_id errors started:
Makes me wonder if the actual reason is in |
Another observation: when I start the agent, CPU is around 20% and memory around 37%, and it stays around those values during normal operation (for ~40 minutes). After that, CPU spikes to 300-400%. Hard to say which comes first. Restarting the agent process resets everything back to normal. UPDATE: Now I am seeing those packet_id errors with CPU very low.
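For what it's worth, the ~40 minute window also lines up with simply draining the packet id pool at a modest publish rate, assuming one id is consumed per publish and none are ever released (back-of-envelope only):

```
65,535 usable packet ids / (40 min × 60 s/min) ≈ 27 publishes per second
```

So a steady stream of a few dozen observations per second would exhaust the ids on roughly that timescale, CPU aside.
|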
Unfortunately #242 didn't appear to solve the issue. I've rebuilt the agent and, running under similar conditions, I'm getting the same results.
|
@jaxer what broker are you running this against? Can you share its configuration? |
@MRIIOT I was trying AWS IoT Core and the MQTT broker inside AWS Greengrass (based on Moquette). I was not changing any configuration of those, so I guess QoS should be 0. |
I remember running into this issue before. The problem had to do with the data rates to the broker and overloading the broker before it could free up another packet id. If we reduce the QoS we may lose data. The guys from SpB mentioned this as a problem they had, which is why they started combining multiple observations into a single message. There are a few solutions. We could do something similar to what SpB did, but then we have issues with retention, and we need a way to get the current state (SpB does this by publishing a "REBIRTH" from the subscriber). If this is what is happening and Greengrass can't offload fast enough, then we may need to do something. Another option is to publish a current to one topic periodically and then deltas to another topic; the streams will sync every so often, and an application can figure it out by checking the sequence. What do you think? A rough sketch of the batching option is below.
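To make the batching option concrete, something along these lines would cut the publish rate, and therefore the packet-id consumption, by the batch size. This is a rough sketch with made-up types and no actual MQTT calls, not a proposal for the real implementation:

```cpp
#include <chrono>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical batching layer: observations are appended as they arrive
// and flushed as one combined payload, so a single MQTT publish (and a
// single packet id) covers many observations instead of one per observation.
struct ObservationBatcher {
  std::vector<std::string> pending;  // already-serialized observations
  std::chrono::steady_clock::time_point lastFlush = std::chrono::steady_clock::now();

  void add(std::string obs) { pending.push_back(std::move(obs)); }

  // Returns the combined JSON-array payload and clears the batch when
  // either the size or the age threshold is reached; an empty string
  // means "nothing to publish yet".
  std::string flushIfDue(std::size_t maxBatch, std::chrono::milliseconds maxAge) {
    auto now = std::chrono::steady_clock::now();
    if (pending.empty() || (pending.size() < maxBatch && now - lastFlush < maxAge))
      return {};
    std::string payload = "[";
    for (std::size_t i = 0; i < pending.size(); ++i) {
      payload += pending[i];
      if (i + 1 < pending.size()) payload += ",";
    }
    payload += "]";
    pending.clear();
    lastFlush = now;
    return payload;
  }
};
```

The retention concern still applies either way: a subscriber that joins late needs some way to recover the current state, which is what the periodic "current" topic in the second option (or SpB's REBIRTH) would provide.
|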
Can you try it against a local Mosquitto instance, log everything, and share the broker logs? Something like the commands below should be enough to capture everything.
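This assumes a stock Mosquitto install; adjust the host and port to match the agent's MQTT sink configuration:

```sh
# Run a local broker in the foreground with verbose logging
mosquitto -v -p 1883

# In a second terminal, print everything the agent publishes
mosquitto_sub -h localhost -p 1883 -t '#' -v
```
|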
Hey, the full error is below. It is repeated many, many times.
I am running two agents, both on separate Ubuntu machines (one on AWS EC2 and the other in a Docker container on a Windows host). Both exhausted packet_ids at about the same time (overnight).
Part of agent.log
agent.cfg:
Full logs: logs.zip
Full Devices.xml: Devices.xml