Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Faster shutdown of pgxpool.Pool background goroutines #1642

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

bgentry
Copy link
Contributor

@bgentry bgentry commented Jun 14, 2023

When a pool is closed, some background goroutines may be left open, particularly for health checks as detailed in #1641. Two specific examples have been refactored here to avoid a blocking sleep and instead also select on the pool being closed to potentially return/continue sooner.

Comment on lines +420 to +425
ticker := time.NewTicker(500 * time.Millisecond)
defer ticker.Stop()
Copy link
Contributor Author

@bgentry bgentry Jun 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opted for a time.Ticker on this one for simplicity, but this does have the side effect of changing this behavior from a "sleep 500ms between attempts" to a "make an attempt every 500ms" type behavior. If the health checks take awhile then there could be minimal sleeping time.

I can refactor to use a timer, it would just be messier. I wanted to propose this option first and see what you think.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't actually understand the purpose of this change. Aside from the wait 500ms between vs try every 500ms it doesn't change anything does it?

@jackc
Copy link
Owner

jackc commented Jun 16, 2023

I took a fresh look at this, and I wonder if time.AfterFunc would be a simpler solution. Instead of triggerHealthCheck starting a goroutine and sleeping, the whole triggerHealthCheck call could be in a time.AfterFunc. The goroutine wouldn't even exist except when the timer was firing.

And if we wanted to be super rigorous, instead of calling time.AfterFunc every time, it could be created once and stored in a field on the pool. Any existing call to triggerHealthCheck would instead call Reset on the timer for 500ms in the future.
Closing the pool could stop the timer.

When a pool is closed, some background goroutines may be left open,
particularly for health checks as detailed in jackc#1641. Two specific
examples have been refactored here to avoid a blocking sleep and instead
also select on the pool being closed to potentially return/continue
sooner.
@bgentry bgentry force-pushed the bg-pool-fast-exit-health-checks branch from d2d8635 to 749542e Compare July 3, 2023 20:59
@bgentry
Copy link
Contributor Author

bgentry commented Jul 3, 2023

@jackc Just pushed an updated version, is this roughly what you had in mind?

Note that I needed to introduce a mutex for synchronization over the new timer field. I considered instead just setting this field up as part of pool init so it never needs to be directly written to, but if I did that I would need to specify an initial duration for the timer. Always triggering a health check 500ms after startup seems like a behavioral change AFAICT so I didn't want to go down that route without discussing.

@jackc
Copy link
Owner

jackc commented Jul 7, 2023

Note that I needed to introduce a mutex for synchronization over the new timer field. I considered instead just setting this field up as part of pool init so it never needs to be directly written to, but if I did that I would need to specify an initial duration for the timer. Always triggering a health check 500ms after startup seems like a behavioral change AFAICT so I didn't want to go down that route without discussing.

I think it would be fine to initially set the timer to 100 years or something like that. Then triggerHealthCheck would reset it.

@jameshartig
Copy link
Contributor

jameshartig commented Jul 20, 2023

I realized a cleaner way to do this is to call triggerHealthCheck (or something like it) in the destructor callback sent to puddle. Unfortunately the destructor is called before the actual resource is removed but it's called directly after it so we might be able to relax the sleep time to be something even smaller.

One "hack" would be in the destructor passed to puddle to do:

go func() {
  p.p.Stat()
  select {
  case p.healthCheckChan <- struct{}{}:
  default:
  }
}()

The reason this works is that destroyAcquiredResource has a lock on the puddle's mux and is not unlocked until after the resource is removed. Stat tries to lock the same mux so it would lock until the resource has been removed.

That's obviously relying on something that puddle isn't guaranteed to maintain but maybe it'll inspire some better ideas.

@batazor
Copy link

batazor commented Oct 4, 2024

any update?

@rubiagatra
Copy link

Any update for this PR @bgentry?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants