-
Notifications
You must be signed in to change notification settings - Fork 603
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ensure that the millRun goroutine terminates when Close called. #100
base: v2.0
Are you sure you want to change the base?
Conversation
Currently the millRun goroutines leaks. This is very noticable if a Logger is constructed periodically, used and then closed. This change ensures that the millCh channel is closed if it exists.
I notice that the travis-ci has failures. Interestingly I noticed that could happen when I was reading the tests as there is no synchronisation between the goroutine running the millRunOnce and the tests, so the test triggers the mill, but doesn't wait for the milling to complete before it checks the files. Do you have any preference on how you'd like to see the synchronisation added? |
No preference, do whatever you feel is best and is most straight forward. I'm not too picky. I really need to make a v3 of this thing and clean some of this stuff up. |
Wow, adding synchronisation and determinism has proved to be way more complicated than I thought. |
FYI, here is the comment I wrote while evaluating the race in
|
While fixing the tests I noticed that the original patch closed the millRun goroutine in the wrong place. I didn't realise that the `close` method was called internally as well as part of the `rotate` method. The closing of the mill signalling channel is now done in the `Close` method. There were a bunch of race errors detected, mostly around the updating of the time, and the `fakeFS`. Synchronisation is added to these. All of the `time.After` calls have been removed from the tests and the execution time has gone from 700ms to 7ms on my machine. Two different notify channels were added to the `Logger` internals. These are only ever set in the tests, and no notification is done for any normal `Logger` use. In order to avoid spurious errors the `Close` method needed to wait for the `millRun` goroutine to complete, otherwise there was potential for the `millRunOnce` method to return errors. I temporarily added a panic to that method while testing. I use a wait group to wait for the goroutine to be complete. However due to the way the `sync.WaitGroup` works, you can't reuse them, so we have a pointer to a `sync.WaitGroup`. This patch does introduce a change in behaviour, which is in evidence due to the deleted test. Effectively I was left with two choices: allow the compression of existing old log files as part of writing to a new logger (which wouldn't rotate the files yet); or have a race in the `TestCleanupExistingBackups`. This test failure was intermittent, due to the race. I decided on determinism as the likelihood of having old uncompressed files around that needed to be compressed was small.
756caf0
to
6dba581
Compare
OK, after disussion with @4a6f656c, realised that we really do need the rotate on resume functionality. To make the test deterministic, I changed it to not write enough to trigger a rotate. This way we always just have one notification. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally looks good - some minor issues and cleanup suggestions inline.
@@ -165,7 +175,16 @@ func (l *Logger) Write(p []byte) (n int, err error) { | |||
func (l *Logger) Close() error { | |||
l.mu.Lock() | |||
defer l.mu.Unlock() | |||
return l.close() | |||
if err := l.close(); err != nil { | |||
return err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should not return here - if openExistingOrNew gets called, it will call l.mill and l.millCh will become non-nil. If the l.close call then fails (for example, Close fails on the underlying file descriptor), the millCh will not be closed.
Either the close error can be preserved and returned late, or you may be able to close the millCh first.
case <-notify: | ||
// All good. | ||
case <-time.After(2 * time.Second): | ||
fmt.Println("logger notify not signalled") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Logging testing failures to stdout is less than ideal - use t.Fatal
@@ -814,3 +815,13 @@ func exists(path string, t testing.TB) { | |||
_, err := os.Stat(path) | |||
assertUp(err == nil, t, 1, "expected file to exist, but got error from os.Stat: %v", err) | |||
} | |||
|
|||
func waitForNotify(notify <-chan struct{}, t testing.TB) { | |||
select { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should call t.Helper so that the calls to t are more useful.
// It is safe to check the millCh here as we are inside the mutex lock. | ||
if l.millCh == nil { | ||
l.millCh = make(chan struct{}, 1) | ||
l.wg = &sync.WaitGroup{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: A sync.WaitGroup seems like overkill here - you're only ever having one waiter, I'd probably just use a millShutdownCh channel instead.
return err | ||
} | ||
|
||
// millRun runs in a goroutine to manage post-rotation compression and removal | ||
// of old log files. | ||
func (l *Logger) millRun() { | ||
for _ = range l.millCh { | ||
func (l *Logger) millRun(ch <-chan struct{}) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless I'm missing something, there should be no need to pass the channel as an argument - we should be still able to use l.millCh directly.
Filename: filename, | ||
MaxBackups: 1, | ||
MaxSize: 100, // megabytes | ||
notifyCompressed: notify, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could initialise inline and call waitForNotify with l.notifyCompressed (which seems more readable/self-documenting).
// we need to wait a little bit since the files get compressed on a different | ||
// goroutine. | ||
<-time.After(10 * time.Millisecond) | ||
waitForNotify(notify, t) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i.e. waitForNotify(l.notifyCompressed, t)
Filename: filename, | ||
MaxSize: 10, | ||
MaxBackups: 1, | ||
notifyRemoved: notify, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as earlier - consider initialising inline and calling waitForNotify(l.notifyRemoved, t)
below.
@howbazaar sorry bothering, but when can we expect this to merge? |
One more for @natefinch |
Any progress on this? |
This is now superseded by #211 |
Currently the millRun goroutines leaks. This is very noticable if
a Logger is constructed periodically, used and then closed.
This change ensures that the millCh channel is closed if it exists.
Existing log rotation tests cover the duplicate Close calls.