Enviro Weather crashes and isn't rebooted by watchdog #230

Open
g7uvw opened this issue Dec 30, 2024 · 7 comments

Comments

@g7uvw

g7uvw commented Dec 30, 2024

I'm on my second Enviro Weather board; the first ended up literally covered in bugs that died and ate away some traces. I'm running the latest release firmware. Power is from a mains-derived 5V supply connected directly to the JST connector on the board. Periodically, after anywhere from a few hours to a few days of running, the board locks up with the white LED on permanently. It stays this way until I power-cycle the board. This occurred with both the old board and the new one.

Looking at the code, it seems that the RTC has a watchdog that should be rebooting the board. The reboot never happens.

I've tried different power supplies with no change. It's an annoying intermittent bug that's hard to diagnose, and because the board is 3 m in the air, it is very hard to get logs.

Does anyone have any idea what is going on?

@sjefferson99
Contributor

It seems to be an issue that affects individual boards differently. Mine was affected, and things got better once I switched to the improved wifi code that is now in release v0.2.0. However, it still locks up most days for a reason I can't pin down. I suspect it's something to do with the low power mode and the circuitry involved in that, and probably some kind of manufacturing tolerance, as it seems so varied across identical configurations. Or maybe the wifi fixed it and a couple of years outdoors has caused some other issues...

The code all looks fine and the circuit diagrams also all looked fine, so I am not sure why this happens. I started building my own firmware based on the enviro firmware to get realtime measurements and when I get some more time for this project I will probably completely rewrite it async and see if the problem continues. If so, I know it's hardware and I will build a custom board that doesn't have the deep sleep complexity that I don't need.

The only help I can give at this point is to make sure you're running the latest release with the wifi patch in; failing that, you may well get better results with another board. But as this is my suspicion and not a Pimoroni-supported position, you may not be able to swap without buying another board.

The board appears to be crashing execution between lines of python in a way even the watchdog can't recover from.

@g7uvw
Author

g7uvw commented Jan 1, 2025

I'll take the board down (when the weather improves and I can get to it) and add in the wifi patch - thank you.

For now, I'm just periodically rebooting the board by interrupting the power supply.

@AstroEng28

I have what sounds like an identical issue on an Enviro Urban running v0.2.0. I haven't been able to find any explanations yet on the forums or on GitHub.

My log doesn't show anything unusual, but it takes a power cycle or a reset to start it up again.

2024-12-31 15:30:14 [info / 137kB] > Disconnecting wireless after upload
2024-12-31 15:30:14 [info / 135kB] > going to sleep
2024-12-31 15:30:14 [debug / 133kB] - clearing and disabling previous alarm
2024-12-31 15:30:14 [info / 131kB] - setting alarm to wake at 15:45pm
2024-12-31 15:30:14 [info / 128kB] - shutting down
2024-12-31 23:23:59 [info / 142kB] > performing startup
2024-12-31 23:23:59 [debug / 140kB] - running Enviro 0.0.10, MicroPython v1.22.2, enviro v1.22.2 on 2024-03-06
2024-12-31 23:23:59 [info / 134kB] - wake reason: rtc_alarm
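
One way to spot gaps like the one in this log (alarm set for 15:45, next startup at 23:23:59) is to scan the log for startups that arrive well after the alarm that was set. This is a minimal sketch in desktop Python, assuming the log line layout shown above. The function name `find_missed_wakes` and the two-minute tolerance are illustrative and not part of the Enviro firmware.

```python
import re
from datetime import datetime, timedelta

# Assumed layout: "YYYY-MM-DD HH:MM:SS [level / NNNkB] > message" (or "- message")
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[(\w+)\s*/\s*\d+kB\]\s*[>-]\s*(.*)$"
)
ALARM_RE = re.compile(r"setting alarm to wake at (\d{1,2}):(\d{2})")


def find_missed_wakes(log_text, tolerance=timedelta(minutes=2)):
    """Return (expected_wake, actual_startup) pairs where startup came late."""
    missed = []
    expected = None
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            continue  # skip lines that do not look like log entries
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        message = match.group(3)

        alarm = ALARM_RE.search(message)
        if alarm:
            # The log only records HH:MM, so take the date from the log line
            expected = stamp.replace(
                hour=int(alarm.group(1)), minute=int(alarm.group(2)), second=0
            )
            if expected <= stamp:
                expected += timedelta(days=1)  # alarm falls after midnight
        elif message.startswith("performing startup"):
            if expected is not None and stamp - expected > tolerance:
                missed.append((expected, stamp))
            expected = None
    return missed
```

Running it over a log pulled from the board returns one pair per late wake, and subtracting the two timestamps in a pair gives the length of the outage.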

@AstroEng28

AstroEng28 commented Jan 19, 2025

By the way, there are multiple other posts about similar issues:

Activity LED is always on

0.0.9 - Enviro Urban - Exception while uploading - Caused board to hang fro 2hrs

@gbravery

I'm going to add an (annoying!!) me too on this issue.

I see the same thing - typically with the Weather.
It will be quite happy for a bit, and then randomly decide to go wrong.

Image

FWIW, it has been MUCH more stable since the new wifi code went in.

But all the breaks in data show the signs of the same things - It's as if this bit of code does something weird:

  # sleep until next scheduled reading
  rtc.set_alarm(0, minute, hour)
  rtc.enable_alarm_interrupt(True)

  # disable the vsys hold, causing us to turn off
  logging.info("  - shutting down")
  hold_vsys_en_pin.init(Pin.IN)

  # if we're still awake it means power is coming from the USB port in which
  # case we can't (and don't need to) sleep.
  stop_activity_led()

We see the " - shutting down" message, but instead of shutting down (and the LED turning off) it does "something" else and leaves the LED on.

Then when I plug it in to look it's as if the RTC has forgotten the time (as the first thing it did was fetch the time from NTP).

The hold_vsys_en_pin.init obviously did "something" as the code never hit the "stop_activity_led" call.
Any thoughts on diagnostics?

I'm happy to hack with code etc as my weather is accessible to me (i.e. it's not on my roof; it's just in the garden).

I'm almost wondering if the board needs a "gentle" clean (i.e. I'm now clutching at straws).
Would I potentially see something else if I plugged a USB cable into it whilst it's still "stuck" (rather than what I see when I power it off as I bring it inside)?
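
On the diagnostics question, one possible check is to watch for the case where the code is still running after the VSYS hold has been released and no USB power is present. This is a rough sketch that has not been tested on a board. Every name in it is hypothetical, and the hardware access is passed in as plain callables so that nothing is assumed about the firmware's own helpers.

```python
def watch_power_cut(usb_present, sleep_ms, log, reset,
                    timeout_ms=5000, poll_ms=250):
    """Call right after releasing the VSYS hold pin (hypothetical helper).

    usb_present: callable returning True when USB power is detected
    sleep_ms:    callable that blocks for the given number of milliseconds
    log:         callable that records a message somewhere persistent
    reset:       callable that restarts the board
    """
    if usb_present():
        # Staying awake is expected here: USB keeps the board powered
        log("still awake: USB power present, as expected")
        return "usb"

    waited = 0
    while waited < timeout_ms:
        # On battery or JST power the supply should drop during this wait,
        # in which case execution simply stops here
        sleep_ms(poll_ms)
        waited += poll_ms

    # Reaching this point means the code is running but power was not cut
    log("still awake %d ms after releasing VSYS hold without USB; forcing reset" % waited)
    reset()
    return "reset"
```

On the board, `sleep_ms` could be MicroPython's `time.sleep_ms` and `reset` could be `machine.reset`; `usb_present` would need to read whatever VBUS-detect signal the board exposes. If the CPU itself has hung, this code never runs, so a missing log entry would point toward a hard hang rather than a power rail that stayed up.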

@gbravery

Just to document, so my issue became worse (as in it was failing at the drop of a hat now, not every now and then) - even when just sat on my desk connected to a USB cable.

So, I went with my plan B and swabbed everything down with alcohol....
The problem seems to have gone away (touch wood).

Which implies dirt causing some sort of short possibly?

@g7uvw
Author

g7uvw commented Feb 13, 2025

> Just to document, so my issue became worse (as in it was failing at the drop of a hat now, not every now and then) - even when just sat on my desk connected to a USB cable.
>
> So, I went with my plan B and swabbed everything down with alcohol.... The problem seems to have gone away (touch wood).
>
> Which implies dirt causing some sort of short possibly?

My original one became very flaky and finally died. When I took it down, it had picked up an infestation of small black flies.
https://webshed.org/blog/debugging_a_weatherstation/

My new one I gave a conformal coating to stop corrosion, but it still seems to lock up. I'll take it down this weekend and have a look and update the wifi code.
