Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Poll bulbs for their Zigbee Connectivity status explicitly when refreshing #926

Merged
merged 2 commits into from
Oct 18, 2023

Conversation

dljsjr
Copy link
Contributor

@dljsjr dljsjr commented Aug 22, 2023

Adds explicit checking for zigbee connectivity status as part of refresh, instead of always waiting for it to come in on the SSE stream.

Note that the zigbee connectivity status itself within the Hue network can be delayed.

Because we're now doing more network calls during refresh, I also adjusted the "refresh all" handler to use the bulk endpoints.

To keep the changeset on this small, I did it by overloading the functionality of the existing refresh for single lights. It's noted in comments that this isn't the most ideal, however, I'm in the process of planning out a rearch of this driver including with a test coverage plan, so I didn't think deep surgery was warranted given that we're going to be doing a complete redo soon.

Pardon the Mess™

@github-actions
Copy link

github-actions bot commented Aug 22, 2023

Channel deleted.

@github-actions
Copy link

github-actions bot commented Aug 22, 2023

Test Results

     54 files     345 suites   0s ⏱️
1 622 tests 1 622 ✔️ 0 💤 0
2 842 runs  2 842 ✔️ 0 💤 0

Results for commit 6e83abd.

♻️ This comment has been updated with latest results.

@github-actions
Copy link

github-actions bot commented Aug 22, 2023

Minimum allowed coverage is 90%

Generated by 🐒 cobertura-action against 6e83abd

@@ -247,6 +248,7 @@ end
---@param light_device HueChildDevice
local function do_refresh_light(driver, light_device)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function has become exceedingly smelly, but I didn't think that now was the right time to be dealing with that. One of the major issues is that the whole driver is oriented around light devices on our platform representing a light resource in the Hue representation. We should probably revise that to orient around the Hue device representation instead of the light resource representation, but that's a deep surgery on the entire driver.

elseif light_resp ~= nil then
if #light_resp.errors > 0 then
for _, err in ipairs(light_resp.errors) do
rest_resp, rest_err = hue_api:get_device_by_id(hue_device_id)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So is this call to the device record required because we need this to get the zigbee resource, which is needed to get that connectivity state? After seeing these changes I'm a little more concerned about this approach of adding all this to refresh. Would we be better off explicitly checking all the children connectivity status via a separate method only on the transition from hub offline -> online.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm of two minds about it.

On the one hand, yes, that's a lot of REST calls in a single event handler. We could pare it down to 2 instead of 3 if we stored the zigbee ID on the ST device record, but we're already storing so many strings in the driver that there's a real memory usage concern.

On the other hand, I do think it makes sense to check the connectivity status when you do a pull-to-refresh for a variety of reasons, not least of which being that Hue bridge will sometimes return last-known-state for an offline bulb.

Copy link
Contributor Author

@dljsjr dljsjr Aug 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a different option could be not reusing the refresh code in the "refresh all" handler on the bridge, and having the "refresh all" code use the bulk endpoints, but that didn't seem like a change that was straight forward enough to have ready on the timelines we were targeting.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@varzac See the updated description and the fix-up commit where I address some of this.

@dljsjr dljsjr changed the base branch from bugfix/hue-children-online-on-init to main September 13, 2023 18:23
@dljsjr dljsjr force-pushed the fix/hue-poll-connectivity-on-refresh branch 2 times, most recently from 5e74b0e to 0f99437 Compare September 13, 2023 18:31
@dljsjr dljsjr force-pushed the fix/hue-poll-connectivity-on-refresh branch from e867e9f to fcf3ae8 Compare October 17, 2023 21:10
@dljsjr
Copy link
Contributor Author

dljsjr commented Oct 18, 2023

I validated this locally via a monkeypatched st.device to print trace logs when :online() and :offline() are called.

The test I ran:

  1. Join a Hue topology that has an offline light,
  2. Verify the light marks itself offline at join
  3. Disconnect the Hue bridge, wait till all offline
  4. Reconnect the Hue bridge, wait til the SSE recovers and verify that the light still marks itself offline
  5. Reconnect the bulb and verify that it marks itself online

The reason I had to patch the Device table and use traces to confirm this is because we want to land #931 on top of this PR, and so the mobile app will show online a lot of the time because of the attribute event emits. Based on tracing the offline/online calls, we have the right behavior. We just need to land the next PR on top of this one to get the correct behavior out of the mobile app.

Copy link
Contributor

@tandres tandres left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@dljsjr dljsjr force-pushed the fix/hue-poll-connectivity-on-refresh branch from 23ae1a9 to 4121c8f Compare October 18, 2023 22:13
cjswedes and others added 2 commits October 18, 2023 17:13
Whenever a driver is restart, a bridge will temporarily be marked
offline while the event source connection is created. This seems to be
affecting the online/offline state of children incorrectly post 0.49.x
FW.
@dljsjr dljsjr force-pushed the fix/hue-poll-connectivity-on-refresh branch from 4121c8f to 6e83abd Compare October 18, 2023 22:13
@dljsjr dljsjr merged commit 63214e1 into main Oct 18, 2023
10 checks passed
@dljsjr dljsjr deleted the fix/hue-poll-connectivity-on-refresh branch October 18, 2023 22:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants