
thingspeak hangs #5

Open
rpj opened this issue Dec 28, 2018 · 4 comments
Labels: bug (Something isn't working), P1 (Priority 1)

Comments

rpj (Owner) commented Dec 28, 2018

The thingspeak tool very occasionally hangs after running fine for long periods (~days), and it appears to do so somewhere in the PSubscriber code that listens to redis. That's about all I know right now.

@rpj rpj added the bug Something isn't working label Dec 28, 2018
@rpj rpj self-assigned this Dec 28, 2018
rpj (Owner, Author) commented Dec 28, 2018

It has to have something to do with how thingspeak is using the class, because other users of it (oled-display, sql-sink) can run for many days without trouble. sql-sink, for example, is just about to write its 4 millionth row on charlie without a single error...

rpj (Owner, Author) commented Dec 29, 2018

Might need a way (TS has examples) to detect a lack of data ingestion and send an alert, so I can at least examine the failure mode while it's live...
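A minimal sketch of what such a stall detector could look like. All names here are hypothetical (this class doesn't exist in the repo), and the actual alerting hook (email, a ThingSpeak React, etc.) is left out:

```python
import time

class IngestionWatchdog:
    """Hypothetical stall detector: note the time of each ingested
    sample and report a stall when the gap since the last one exceeds
    a threshold. The clock is injectable for testing."""

    def __init__(self, max_gap_seconds, clock=time.monotonic):
        self.max_gap = max_gap_seconds
        self.clock = clock
        self.last_seen = clock()

    def record_ingest(self):
        self.last_seen = self.clock()

    def is_stalled(self):
        return (self.clock() - self.last_seen) > self.max_gap

# exercise it with a fake clock so the behavior is deterministic
now = [0.0]
wd = IngestionWatchdog(60, clock=lambda: now[0])
wd.record_ingest()
now[0] = 30.0
stalled_early = wd.is_stalled()   # 30s gap, under the 60s threshold
now[0] = 120.0
stalled_late = wd.is_stalled()    # 120s gap, over the threshold
```

A real deployment would call record_ingest() from the redis message handler and poll is_stalled() from a separate thread or timer.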

rpj (Owner, Author) commented Dec 31, 2018

Actually, I now believe this was likely caused by the lack of a timeout argument to urllib2.urlopen(), which let that call hang indefinitely. I have a few processes with instrumentation to prove this running right now; should find out for sure soon enough...
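For reference, a sketch of the fix being described. This uses Python 3's urllib.request rather than the urllib2 the project actually uses, but the timeout parameter works the same way in both (urllib2.urlopen() has accepted timeout since Python 2.6). The hung endpoint is simulated locally with a socket that accepts connections but never replies:

```python
import socket
import urllib.request

def fetch_with_timeout(url, timeout=10.0):
    """Like a bare urlopen(url), but with an explicit timeout so a dead
    endpoint can't hang the process forever. Returns the response body,
    or None on timeout/error. (Hypothetical helper, not repo code.)"""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read()
    except OSError:
        # URLError and socket.timeout both subclass OSError in Python 3
        return None

# Simulate the hang: a server that accepts the TCP connection
# (via the listen backlog) but never sends a response.
srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)
url = "http://127.0.0.1:%d/" % srv.getsockname()[1]

result = fetch_with_timeout(url, timeout=0.5)  # returns None, doesn't hang
srv.close()
```

Without the timeout argument, the urlopen() read in this scenario would block forever, which matches the observed symptom.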

@rpj rpj changed the title PSubscriber sometimes hangs for unknown reasons in thingspeak thingspeak hangs Dec 31, 2018
rpj (Owner, Author) commented Jan 1, 2019

Ah ha, I do believe I've caught it! Managed to get one of the processes to hit 10 timeouts in a row over the course of about 5 minutes, but catching those instead of letting the process die allowed it to eventually recover just fine. Looping around those timeouts and trying a couple more times immediately (rather than just bailing on that send and waiting until the next send point rolls around) might make things even better. Sweet! Finally.
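That retry-the-send idea could be sketched roughly like this. All names are hypothetical; in practice send_fn would be the (now timeout-protected) HTTP send to ThingSpeak:

```python
import time

def send_with_retries(send_fn, retries=3, backoff=1.0):
    """Rather than abandoning a send after one timeout, retry a couple
    of times immediately before giving up until the next send point.
    send_fn is a hypothetical callable that raises TimeoutError on a
    timed-out send and returns a truthy result on success."""
    for attempt in range(retries):
        try:
            return send_fn()
        except TimeoutError:
            if attempt < retries - 1:
                time.sleep(backoff)  # brief pause before retrying
    return None  # give up; the next scheduled send will try again

# simulate a send that times out twice, then succeeds
calls = {"n": 0}
def flaky_send():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("simulated hung send")
    return "ok"

result = send_with_retries(flaky_send, retries=3, backoff=0.0)
```

The key behavioral change matches the observation above: a burst of timeouts is absorbed and the process recovers, instead of dying or silently dropping the sample.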

@rpj rpj added the P1 Priority 1 label Jan 3, 2019