Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RSDK-9233, RSDK-8904 - only call updateWeakDependents once per reconfiguration level #4585

Merged
merged 18 commits into from
Dec 17, 2024

Conversation

cheukt
Copy link
Member

@cheukt cheukt commented Nov 25, 2024

Bug:

  • If a resource with weak dependencies (data manager) depend on a resource (sensor), and the sensor reconfigures, the viam-server will get stuck in an infinite reconfigure loop. This is because in the process of getting dependencies for the data manager, we will run updateWeakDependents (because the sensor has updated but weak dependents have not), and in the process of running getDependencies inside updateWeakDependents we will continue to trip the check at r.lastWeakDependentsRound.Load() <= node.UpdatedAt(), meaning this will call updateWeakDependents in an infinite loop
  • an associated bug was if multiple resources were in the same level, there is a chance that updateWeakDependents will be called at the same time and reconfigure the same resources at the same time

Fix:

  • Only update weak dependents once per reconfiguration level
  • Only update weak dependents if something in the resource graph has updated (logical clock has to be greater than lastWeakDependentsRound
  • A secondary optimization was added so that we only updateWeakDependents if the resource actually depends on something with weak dependencies.

Since we can't guarantee that any resource depending on weak dependents will get "the most updated" weak dependent anyway, perhaps we should just guarantee that those weak dependents will be as updated as the last reconfiguration level

@viambot viambot added the safe to test This pull request is marked safe to test from a trusted zone label Nov 25, 2024
@cheukt cheukt requested a review from dgottlieb November 25, 2024 22:53
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Nov 25, 2024
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Nov 26, 2024
@dgottlieb
Copy link
Member

dgottlieb commented Nov 26, 2024

Mostly looks good. I'd like to get more detail on the expected behavior written down somewhere. So if we have a robot config where:

  • sensor explicitly depends on datamanager
  • datamanger weakly depends on everything

For the following scenarios:

  1. Robot first comes up
  2. the sensor config changes
  3. a third component config changes (triggering data manager reconfig)

What's the order of the following events:

  • Which resources have their Reconfigure called + when are they called
  • When the logical clock is read from
  • When the logical clock is bumped

@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Nov 27, 2024
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Nov 27, 2024
@cheukt
Copy link
Member Author

cheukt commented Nov 27, 2024

added tests and updated comments

@cheukt cheukt changed the title RSDK-9233 - Fix infinite loop inside getDependencies() RSDK-9233, RSDK-8904 - only call updateWeakDependents once per reconfiguration level Nov 27, 2024
@cheukt cheukt requested a review from benjirewis November 27, 2024 23:23
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Dec 2, 2024
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Dec 2, 2024
Copy link
Member

@benjirewis benjirewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good, some initial comments.

@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Dec 2, 2024
@cheukt cheukt requested a review from benjirewis December 2, 2024 20:35
Copy link
Member

@benjirewis benjirewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM assuming tests pass; thanks!

// (that also has an explicit dependency on one of the components).
// The following scenario is expected:
// 1) The two bases will configure first in parallel (which will each bump the logical clock).
// 2) The completeConfig loop will check to see if there is a need to updateWeakDependents by
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We run completeConfig at the end of each "level" of reconfigure? At the end of each resource?

I always assumed "complete" inferred the last step before localrobot.Reconfigure returns.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking more closely, I'm not sure what @cheukt means by "The completeConfig loop." completeConfig is, indeed, one of the final steps of Reconfigure/reconfigure for a local robot. completeConfig also runs every 5s if any resources are still in NodeStateConfiguring or NodeStateUnhealthy.

With @cheukt's changes, updateWeakDependents no longer runs as part of getDependencies. It now runs in the following situations:

  1. Every 5s when any changes in remote resources or resources are still in NodeStateConfiguring / NodeStateUnhealthy
  2. Right after initial construction
  3. Upon orphaned resource removal
  4. After completeConfig in reconfigure
  5. Potentially at the end of each level in completeConfig
    a. This one is new and only happens when "any resources in the level needs reconfiguration and depends on weak dependents, and the resource graph has updated since the last time weak dependents were updated"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mentioned offline that I was worried one of these 5 updateWeakDependents calls might be unnecessary. Now, I can't convince myself that we can remove any of them 😆 . In any case, let's not worry about the other updateWeakDependents callsites in this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This mostly makes sense to me, thanks! Maybe I asked this question in the wrong test. I think the case I've asked you to describe is when base1 strongly depends on weak1. And in this test was just have weak1 strongly and weakly depending on base1.

Potentially at the end of each level in completeConfig

That reads to me as:

  • First we reconfigure base1+base2
  • Then we reconfigure weak1

Which sounds fine because we want to update weak1 because things it depended on changed.

But then the sub-bullet says:

a. This one is new and only happens when "any resources in the level needs reconfiguration and depends on weak dependents, and the resource graph has updated since the last time weak dependents were updated"

Editorializing to make sure we're on the same page:

any resources in the level needs reconfiguration (e.g: base1) and depends on weak dependents (e.g: weak1), and the resource graph has updated since the last time weak dependents (e.g: base1) were updated

That to me reads as we're reconfiguring the weak dependents because something else depends on them. Normally we want to reconfigure "leaf nodes" first, then work our way up (or down?) to the things that depended on those leaf nodes. But my interpretation of this algorithm is that we're intentionally updating things in the opposite order that we normally do?

Copy link
Member Author

@cheukt cheukt Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potentially at the end of each level in completeConfig

is inaccurate based on the code - I'm pretty sure updateWeakDependents is being run in the beginning of the loop, and that is the intention because of the case Dan is describing

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure updateWeakDependents is being run in the beginning of the loop, and that is the intention because of the case Dan is describing

You're referencing this call to updateWeakDependents?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// The following scenario is expected:
// 1) The two bases will configure first in parallel (which will each bump the logical clock).
// 2) The completeConfig loop will check to see if there is a need to updateWeakDependents by
// checking the logical clock on also whether there are dependencies on weak dependents.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does "dependencies on weak dependents" mean "explicit/"strong" dependencies on weak dependents"?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I interpreted this as "resources that have an API/model pair with weak dependencies." I believe @cheukt is using the words "weak dependents" to mean resources upon which other resources weakly depend.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made a suggested change on your earlier in-line comment to potentially change the language here and elsewhere for %s:dependencies on weak dependents/has weak dependencies/g.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my comment is meant to convey what @dgottlieb is saying. I'm using "weak dependents" to mean "resources with weak dependencies" but can see why it's confusing. Can't really think of a better name though, but I also feel that "resources with dependencies that have weak dependencies" is also just as confusing. Open to other phrasing!

// 1) The two bases will configure first in parallel (which will each bump the logical clock).
// 2) The completeConfig loop will check to see if there is a need to updateWeakDependents by
// checking the logical clock on also whether there are dependencies on weak dependents.
// The check will fail.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think on -> and? ("checking the logical clock on")

Which conditions fail? I think the "last round" is still at 0 and the "clock" is at 2 now? So this would fail because we haven't initialized weak1 yet? ("There are no components with weak dependencies?")

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think on -> and?

Agree, it should be and.

Which conditions fail?

Failure (no update of weak dependents from @cheukt 's new updateWeakDependents call in completeConfig) occurs when the "last round" is GTE the clock value or when no resources have weak dependencies.

I think the "last round" is still at 0 and the "clock" is at 2 now?

Yep.

So this would fail because we haven't initialized weak1 yet? ("There are no components with weak dependencies?")

Yep.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this would fail because we haven't initialized weak1 yet? ("There are no components with weak dependencies?")

No, this failed because there are no dependencies on weak dependents in level 2, e.g. nothing depends on weak1

// checking the logical clock on also whether there are dependencies on weak dependents.
// The check will fail.
// 3) weak1 will configure.
// 4) At the end of reconfiguration, weak1 will reconfigure once.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logical clock will be at 3 now? And last round is still 0?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logical clock will be at 3 now?

4 now.

And last round is still 0?

Correct.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

last round is 4 - adding a check and additional documentation to clarify

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm definitely misunderstanding last round; will need some clarification on its value offline.


// Reconfigure base2
// The following scenario is expected:
// 1) The base2 will reconfigure (which will bump the logical clock).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

5?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would bet it's 7 at this point. I'll stop answering these ones, but we definitely need assertions on the logical clock.

Copy link
Member Author

@cheukt cheukt Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added the assertions

test.That(t, weak1.reconfigCount, test.ShouldEqual, 2)

// Reconfigure base2
// The following scenario is expected:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the "last weak dependency round" here? And when does it get bumped?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added language around each scenario about the expected last weak dependency round

test.That(t, weakCfg3.Ensure(false, logger), test.ShouldBeNil)
robot.Reconfigure(context.Background(), &weakCfg3)

res, err = robot.ResourceByName(weak1Name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above -- ideally changed to weakRes.

test.That(t, err, test.ShouldBeNil)
test.That(t, weak1.reconfigCount, test.ShouldEqual, 3)

// check that getting dependencies for either resource does not increase reconfigCount
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd recommend changing "getting dependencies" to "calling getDependencies". And "does not increase reconfigCount" to "does not change weak1.reconfigCount".

lRobot := robot.(*localRobot)
node, ok := lRobot.manager.resources.Node(weak1Name)
test.That(t, ok, test.ShouldBeTrue)
lRobot.getDependencies(weak1Name, node)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment says "getting dependencies for either resource does not".

I was assuming that'd be "base1" and "base2". But instead we're checking "weak1" and "base1". I can't tell if those cases have specific logic we're trying to exercise? Or if "base2" is considered redundant? But I'm not sure why it would be. Given it's different from base1 in that it is not an explicit dependency for weak1.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the infinite loop issue happened because getDependencies ended up calling reconfig on weak1 which chained infinitely. this check is basically a check that getDependencies do not have side effects on the resource graph, and the tests here are already set up. Can move to a different test if that makes more sense

@@ -629,6 +629,41 @@ func (manager *resourceManager) completeConfig(
levels := manager.resources.ReverseTopologicalSortInLevels()
timeout := rutils.GetResourceConfigurationTimeout(manager.logger)
for _, resourceNames := range levels {
// if any resources in the level needs reconfiguration and depends on weak dependents,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// if any resources in the level needs reconfiguration and depends on weak dependents,
// if any resources in the level need reconfiguration and have weak dependencies,

The phrase "depend on weak dependents" is pretty confusing to me. Is my rephrasing there accurate with what you're meaning?

Copy link
Member Author

@cheukt cheukt Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's not accurate - what I mean is if any resource in the level need reconfiguration and depends on a resource that have weak dependencies. So if the resource depends on something that would be affected by updateWeakDependents.

I ended up using the same phrasing as the function name, but definitely can be confusing

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for clarifying; I see now you're using GetAllParentsOf on resources in the level and then checking whether those parents (resources that depend on the original resource) has weak dependencies. Now I'm more confused on how we arrived on that being part of the requirement for updateWeakDependents; let's talk about in the meeting later.

// (that also has an explicit dependency on one of the components).
// The following scenario is expected:
// 1) The two bases will configure first in parallel (which will each bump the logical clock).
// 2) The completeConfig loop will check to see if there is a need to updateWeakDependents by
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mentioned offline that I was worried one of these 5 updateWeakDependents calls might be unnecessary. Now, I can't convince myself that we can remove any of them 😆 . In any case, let's not worry about the other updateWeakDependents callsites in this PR.


// Reconfigure base1
// The following scenario is expected:
// 1) The base1 will reconfigure (which will bump the logical clock).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To 5.

// 1) The base1 will reconfigure (which will bump the logical clock).
// 2) The completeConfig loop will check to see if there is a need to updateWeakDependents by
// checking the logical clock on also whether there are dependencies on weak dependents.
// The check will fail.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused, too, @cheukt ; shouldn't the check pass? As @dgottlieb pointed out, it definitely looks like weak1 gets a second Reconfigure... Step (4) here therefore sounds incorrect.

// 2) The completeConfig loop will check to see if there is a need to updateWeakDependents by
// checking the logical clock on also whether there are dependencies on weak dependents.
// The check will fail.
// 3) weak1 will not reconfigure, as base1 was not newly built or error during reconfiguration.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why (2) fails (if it does,) and I do not think the "newly built" or "reconfig error" pieces here have to do with it. Agree with @dgottlieb 's rewording, too: something like weak1 will not reconfigure, as base1 was neither newly built nor errored during reconfiguration.

// We need the following `robot.Reconfigure` to call `Reconfigure` on this `base1`
// component. We change the `Attributes` field from the previous (nil) value to
// accomplish that.
Attributes: rutils.AttributeMap{"version": 1},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would work, too, but I have no strong opinion on what you currently have vs. @dgottlieb 's suggestion.


// Reconfigure base2
// The following scenario is expected:
// 1) The base2 will reconfigure (which will bump the logical clock).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would bet it's 7 at this point. I'll stop answering these ones, but we definitely need assertions on the logical clock.

@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Dec 6, 2024
Copy link
Member Author

@cheukt cheukt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I addressed everything, let me know if there are further qs

weak1, err := resource.AsType[*someTypeWithWeakAndStrongDeps](res)
test.That(t, err, test.ShouldBeNil)
// Assert that the weak dependency was tracked.
test.That(t, weak1.resources, test.ShouldHaveLength, 2)
Copy link
Member Author

@cheukt cheukt Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so this is a bug that the weak dependency system creates and I'm not looking to fix here. weak1 will be called a few times during reconfiguration. if reconfigure is called during the "regular" reconfiguration loop (the one with level), the reconfigure will only have the strong dependency passed in. during updateWeakDependents however, we will call it with both bases . Because we do make sure to call updateWeakDependents before we exit out of reconfiguration, the list of dependencies should be 2.

Edit: the paragraph above is wrong. We do grab both weak and strong dependencies and pass them in at the same time - we dedupe using a map of [resource.Name]{resource}

// 1) The base1 will reconfigure (which will bump the logical clock).
// 2) The completeConfig loop will check to see if there is a need to updateWeakDependents by
// checking the logical clock on also whether there are dependencies on weak dependents.
// The check will fail.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adding additional words to explain, but basically these checks fail because there is nothing that depends on weak dependents (ex: weak1) in the next level

// 2) The completeConfig loop will check to see if there is a need to updateWeakDependents by
// checking the logical clock on also whether there are dependencies on weak dependents.
// The check will fail.
// 3) weak1 will not reconfigure, as base1 was not newly built or error during reconfiguration.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

// checking the logical clock on also whether there are dependencies on weak dependents.
// The check will fail.
// 3) weak1 will not reconfigure, as base1 was not newly built or error during reconfiguration.
// 4) At the end of reconfiguration, weak1 will reconfigure once.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updateWeakDependents will reconfigure weak1. adding language to clarify


// Reconfigure base2
// The following scenario is expected:
// 1) The base2 will reconfigure (which will bump the logical clock).
Copy link
Member Author

@cheukt cheukt Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added the assertions

lRobot := robot.(*localRobot)
node, ok := lRobot.manager.resources.Node(weak1Name)
test.That(t, ok, test.ShouldBeTrue)
lRobot.getDependencies(weak1Name, node)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the infinite loop issue happened because getDependencies ended up calling reconfig on weak1 which chained infinitely. this check is basically a check that getDependencies do not have side effects on the resource graph, and the tests here are already set up. Can move to a different test if that makes more sense

// 2) The completeConfig loop will check to see if there is a need to updateWeakDependents by
// checking the logical clock and also whether there are dependencies on weak dependents.
// The check will fail.
// 3) weak1 will not reconfigure, as base1 was not newly built or error during reconfiguration.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did intend for the base1 bit there, since I wanted to give a reason as to why weak1 did not reconfigure. Can remove if it's confusing

test.That(t, weak1.reconfigCount, test.ShouldEqual, 2)

// Reconfigure base2
// The following scenario is expected:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added language around each scenario about the expected last weak dependency round

// We need the following `robot.Reconfigure` to call `Reconfigure` on this `base1`
// component. We change the `Attributes` field from the previous (nil) value to
// accomplish that.
Attributes: rutils.AttributeMap{"version": 1},
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

test.That(t, err, test.ShouldBeNil)
weak1, err = resource.AsType[*someTypeWithWeakAndStrongDeps](res)
test.That(t, err, test.ShouldBeNil)
test.That(t, weak1.reconfigCount, test.ShouldEqual, 2)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we run updateWeakDependents at the end of reconfiguration, which is why this has been bumped. let me know if the comments communicate the story better now.

@benjirewis
Copy link
Member

Since this PR is pretty noisy already, I assume you two are alright with continuing to discuss the weak dependency system in person later this week?

test.That(t, err, test.ShouldBeNil)
test.That(t, weak1.reconfigCount, test.ShouldEqual, 2)

test.That(t, lRobot.lastWeakDependentsRound.Load(), test.ShouldEqual, 5)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So reconfiguring base1 calls SwapResource and bumps the clock. But "weak reconfiguring" weak1 does not result in a SwapResource?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, we don't call SwapResource when "weak reconfiguring."

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, we don't call SwapResource when "weak reconfiguring."

Why do we not call SwapResource when weak reconfiguring?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that we don't want the logical clock to increase when weak reconfiguring.

Copy link
Member Author

@cheukt cheukt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

responded, @dgottlieb you said you had some changes ready so did not make further doc changes yet

@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Dec 10, 2024
@cheukt cheukt requested a review from dgottlieb December 10, 2024 17:54
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Dec 11, 2024
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Dec 13, 2024
@cheukt
Copy link
Member Author

cheukt commented Dec 13, 2024

revamped documentation once more, hopefully things are a bit clearer now

Copy link
Member

@benjirewis benjirewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the state of this PR now? The current comments LGTM and I answered my remaining question (I think.)

test.That(t, err, test.ShouldBeNil)
test.That(t, weak1.reconfigCount, test.ShouldEqual, 2)

test.That(t, lRobot.lastWeakDependentsRound.Load(), test.ShouldEqual, 5)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that we don't want the logical clock to increase when weak reconfiguring.

@cheukt cheukt merged commit 9098976 into viamrobotics:main Dec 17, 2024
16 checks passed
@cheukt cheukt deleted the fix-infinite-loop branch December 17, 2024 22:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
safe to test This pull request is marked safe to test from a trusted zone
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants