Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make SendToChildrenInParallel asynchronous #682

Open
ineiti opened this issue Dec 15, 2021 · 10 comments
Open

Make SendToChildrenInParallel asynchronous #682

ineiti opened this issue Dec 15, 2021 · 10 comments

Comments

@ineiti
Copy link
Member

ineiti commented Dec 15, 2021

The current SendToChildrenInParallel is synchronous: it waits for each children to receive the message OR to return an error. Unfortunately some connections only return an error after a timeout. For the blscosi protocol, this timeout is too long for the protocol to work correctly.

So SendToChildrenInParallel should return immediately and still allow to be notified:

  • when it's done
  • eventual errors

@tharvik proposed the following code:

	ret := make(chan error)
	childsRets := make(chan error)
	go func() {
		for range p.Children() {
			if err := <-childsRets; err != nil {
				ret <- xerrors.Errorf("send to children: %v", err)
			}
		}
		close(ret)
	}()
	for _, c := range p.Children() {
		go func(tn *onet.TreeNode) {
			childsRets <- p.SendTo(tn, msg)
		}(c)
	}
	return ret

As this is a breaking change, one could add the following to SendToChildrenInParallel:

Deprecated: this is synchronous with eventual connection errors, so please use `SendToChildrenInBackground`.

Naming propositions (add yours):

  • SendToChildrenInBackground

Once this is merged, the calls to SendToChildrenInParallel should be replaced with this one. Also the hacky ones in the cothority/blscosi/protocol/sub_protocol.go and cothority/messaging/propagate.go.

@nkcr
Copy link
Contributor

nkcr commented Dec 15, 2021

With the proposed solution we are open to a zombie goroutine in case the caller is not fully reading the channel. I see two solutions:

  1. Concatenate the errors, and return a channel with one error, if any
  2. Return a buffered chan so that we are not concerned about blocking when filling the chan and avoid a loop

Solution 2 seems easier:

	ret := make(chan error, len(p.Children))

	wait := sync.WaitGroup{}
	wait.Add(len(p.Children))

	for _, c := range p.Children() {
		go func(tn *onet.TreeNode) {
			defer wait.Done()
			ret <- p.SendTo(tn, msg)
		}(c)
	}

	go func() {
		wait.Wait()
		close(ret)
	}()

	return ret

Then the caller can do:

ret := SendToChildrenInBackground(...)
for err, more := range ret {
	if more {
		// do something with err
	}
}

@ineiti
Copy link
Member Author

ineiti commented Dec 15, 2021

@tharvik what do you think of @nkcr 's proposal?

@tharvik
Copy link
Contributor

tharvik commented Dec 15, 2021

Return a buffered chan so that we are not concerned about blocking when filling the chan and avoid a loop

Right, good idea, that do avoid having non-terminating goroutine.

I would avoid sending nil error on the channel

func SendToChildrenInBackground(msg interface{}) <-chan error {
	ret := make(chan error, len(p.Children()))

	var wait sync.WaitGroup
	wait.Add(len(p.Children()))

	for _, c := range p.Children() {
		go func(tn *onet.TreeNode) {
			defer wait.Done()
			if err := p.SendTo(tn, msg); err != nil {
				ret <- err
			}
		}(c)
	}

	go func() {
		wait.Wait()
		close(ret)
	}()

	return ret
}

This way, the caller can simply

for err := range SendToChildrenInBackground(...) {
	// some error happened
}

@ineiti
Copy link
Member Author

ineiti commented Dec 15, 2021

The nice thing about having a potential nil in the channel is that the caller can also count how many successful connections have been made. So in case he needs t out of n successful connections to continue, the caller can simply count the nils and then go on, dropping the rest of the results. That's something I think might be used in some cases.

@tharvik
Copy link
Contributor

tharvik commented Dec 15, 2021

Alright, let's add the next helper then, to easily choose between the two behaviors

func FilterNilError(recv <-chan error) <-chan error {
	ret := make(chan error) // should we use cap(recv) here?

	go func() {
		for err := range recv {
			if err != nil {
				ret <- err
			}
		}
		close(ret)
	}()

	return ret
}

@ineiti
Copy link
Member Author

ineiti commented Dec 15, 2021

That makes the main function quite small:

func (n *TreeNodeInstance) SendToChildrenInBackground(
	msg interface{}) <-chan error {
	ret := make(chan error, len(n.Children()))

	for _, c := range n.Children() {
		go func(tn *TreeNode) {
			ret <- n.SendTo(tn, msg)
		}(c)
	}

	return ret
}

Or we make a SendToChildrenInBackgroundErrorOnly method (better name needed) that calls the SendToChildrenInBackground.

@tharvik
Copy link
Contributor

tharvik commented Dec 15, 2021

That makes the main function quite small:
...

No, it needs the WaitGroup, because we really have to close the channel, otherwise, we can't range on it.

Or we make a SendToChildrenInBackgroundErrorOnly method (better name needed) that calls the SendToChildrenInBackground.

Having a helper allows us to reuse this pattern without linking it to SendToChildrenInBackground.

And to come back to the need to have nil errors: that's quite a specific use case, do you have anywhere it's needed now? We can always add it latter should the need arise.

@ineiti
Copy link
Member Author

ineiti commented Dec 16, 2021

I thought of using it in a

childrenErrors := onet.SendToChildrenInBackground(...)
for {
   select {
   case err := <- childrenErrors:
   case other := <- otherStuff:
   }
}

If I remember correctly, closing the channel would always choose the first case, correct? And the case other would never be called?

If that is true, then I prefer not to have the close in the method. Because that's how it's used in the blscosi protocol.

@nkcr
Copy link
Contributor

nkcr commented Dec 16, 2021

If I remember correctly, closing the channel would always choose the first case

No, there are absolutely no priority in select statement. Use default if you want that.

I seems you're trying to be too smart there. The function should have a simple behavior: return result of sends on a channel and close it when its done. period. Let the caller implement its own utility function if needed.

@ineiti
Copy link
Member Author

ineiti commented Dec 16, 2021

I like being smart...

Anyway - if you both think the channel should be closed, I'll add some additional logic in the 2 places (out of 6) where this is needed.

And for the nil use-case, it could be used in the propagate-protocol. But I'm not sure it's a good thing to do there.

Now I just need an easy way to write the test for this and some time...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants