Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race Condition in StartExec #1076

Open
mpass99 opened this issue Jul 25, 2024 · 1 comment
Open

Race Condition in StartExec #1076

mpass99 opened this issue Jul 25, 2024 · 1 comment
Labels

Comments

@mpass99
Copy link

mpass99 commented Jul 25, 2024

Expected Behavior

We consistently receive no error for an exec in a container (thread-safe).

Current Behavior

Sometimes the StartExec function is returning the error io: read/write on closed pipe.

This happens once with around:

  • 30.000 exec calls in production
  • 60.000 exec calls running the following minimal reproduction setup in a local environment
  • 500 exec call when enabling Go's Data Race Detector (go run -race main)

Steps to Reproduce

  • docker pull ubuntu
  • go mod init main
  • go get github.com/fsouza/go-dockerclient
  • Run main.go with go run -race main.go.

Possible Solution

The error originates in the forwarding of the StdIn stream, here.

One Goroutine (1) is started to copy the StdIn to the Docker Deamon.

go-dockerclient/client.go

Lines 826 to 835 in 87c2a33

go func() {
var err error
if hijackOptions.in != nil {
_, err = io.Copy(rwc, hijackOptions.in)
}
errChanIn <- err
rwc.(interface {
CloseWrite() error
}).CloseWrite()
}()

Another Goroutine (2) copies the StdOut from the Daemon. However, when the copy of the StdOut stream finishes, the Input Reader is closed.

go-dockerclient/client.go

Lines 806 to 823 in 87c2a33

go func() {
defer func() {
if hijackOptions.in != nil {
if closer, ok := hijackOptions.in.(io.Closer); ok {
closer.Close()
}
errChanIn <- nil
}
}()
var err error
if hijackOptions.setRawTerminal {
_, err = io.Copy(hijackOptions.stdout, br)
} else {
_, err = stdcopy.StdCopy(hijackOptions.stdout, hijackOptions.stderr, br)
}
errChanOut <- err
}()

Common Readers (such as io.Pipe) return an error when someone tries to read from it after it has been closed. This happens here when Goroutine (2) closes the Reader and Goroutine (1) is still trying to copy data from it.

When the Go scheduler now decides to switch Goroutines between these two lines

go-dockerclient/client.go

Lines 810 to 812 in 87c2a33

closer.Close()
}
errChanIn <- nil

to these two lines

go-dockerclient/client.go

Lines 829 to 831 in 87c2a33

_, err = io.Copy(rwc, hijackOptions.in)
}
errChanIn <- err

an unexpected error is being returned.

Context (Environment)

This scenario is relevant as the Nomad orchestrator triggers it regularly.

@MrSerth
Copy link

MrSerth commented Oct 21, 2024

Recently, Nomad switched from go-dockerclient to the native SDK. I would say that the bug is still valid and relevant, but not in the context of Nomad.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants