Frequent short-lived connections lead to OOM #1
Comments
Happened again to this same DB. I'm kind of skeptical of more frequent suicide at this point. Could Go 1.1 be worth a shot here? Alternatively, we could use a worker pool rather than spawning a new goroutine per connection (kinda gross, but if that's where the problem is, it could help).
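For reference, the worker-pool idea could look roughly like the Go sketch below. It is a minimal illustration under stated assumptions, not pg_logplexcollector's actual code: `servePool`, `runDemo`, the handler, and the worker count are all hypothetical names and values chosen for the example. A fixed set of goroutines pulls accepted connections off a channel, so the goroutine count stays constant no matter how many short-lived connections arrive.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"sync"
)

// servePool (hypothetical name) accepts connections from ln and hands them
// to a fixed number of worker goroutines instead of starting one goroutine
// per connection. It returns once ln is closed and the workers have drained.
func servePool(ln net.Listener, workers int, handle func(net.Conn)) {
	conns := make(chan net.Conn)
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for c := range conns {
				handle(c)
				c.Close()
			}
		}()
	}
	for {
		c, err := ln.Accept()
		if err != nil {
			break // listener closed or failed: stop accepting
		}
		conns <- c // blocks while every worker is busy
	}
	close(conns)
	wg.Wait()
}

// runDemo opens n short-lived loopback connections against a pool of the
// given size and returns how many of them were handled.
func runDemo(n, workers int) int {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		panic(err)
	}
	handled := make(chan struct{}, n)
	done := make(chan struct{})
	go func() {
		servePool(ln, workers, func(c net.Conn) {
			io.Copy(io.Discard, c) // read until the client hangs up
			handled <- struct{}{}
		})
		close(done)
	}()
	for i := 0; i < n; i++ {
		c, err := net.Dial("tcp", ln.Addr().String())
		if err != nil {
			panic(err)
		}
		c.Write([]byte("log line\n"))
		c.Close()
	}
	count := 0
	for i := 0; i < n; i++ {
		<-handled // wait until every connection has been served
		count++
	}
	ln.Close()
	<-done
	return count
}

func main() {
	fmt.Println("handled connections:", runDemo(10, 2))
}
```

The trade-off is that a slow connection occupies a worker for its whole lifetime, so new connections wait in the accept backlog while all workers are busy. If the memory growth comes from somewhere other than per-connection goroutines, a pool like this would not be expected to change it.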
I don't think that's where the problem is. Some discussion suggested the more likely culprit is the threads that are spawned to make system calls. Go 1.1 may help, but there's also some evidence it could hurt. Faster suicide may help. I told Dave Cheney I'd get around to making a test driver to help track Go bloat. I committed to that, but if you feel inclined to do it, of course feel free.
Interesting. And yeah, perhaps. My plate is equally full at the moment, but maybe when things clear up...

Incidentally, I took a stab at quantifying the problem by polling ps periodically. This is what happened around the restart:

heroku@ip-10-60-95-103:~$ while true
do
ps --no-headers -opid,start,rss,vsz -C pg_logplexcollector
sleep 5
done
...
13869 23:23:21 193996 638164
13869 23:23:21 194260 638292
13869 23:23:21 194260 638420
13869 23:23:21 194520 638676
13869 23:23:21 194784 638804
13869 23:23:21 194784 638932
3737 00:23:24 4 35920
3737 00:23:24 4188 290120
3737 00:23:24 4680 291656
3737 00:23:24 5132 291912
3737 00:23:24 5560 301324
3737 00:23:24 5812 301452
3737 00:23:24 6060 301708

(The columns are PID, start time, RSS, and VSZ, with both sizes in KiB.) So RSS grew to nearly 200MB for this particular DB after an hour, and this is after the workload was tweaked to use more persistent (but still relatively short-lived) connections.
After a lot of activity like this:
I get this:
Perhaps the suicide exit to prevent bloat needs to happen faster? This is on a machine with just 2GB.
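As an illustration of what a size-triggered exit could look like, here is a minimal Go sketch. It is an assumption-laden example, not the collector's existing suicide logic: `parseStatmRSS`, `overLimit`, and the 150 MiB limit are hypothetical, and reading `/proc/self/statm` only works on Linux. It checks resident set size as the kernel reports it, so it also sees memory that lives outside the Go heap.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseStatmRSS (hypothetical helper) extracts the resident set size in
// bytes from the contents of /proc/<pid>/statm. The second field of that
// file is the number of resident pages.
func parseStatmRSS(statm string, pageSize int) (uint64, error) {
	fields := strings.Fields(statm)
	if len(fields) < 2 {
		return 0, fmt.Errorf("unexpected statm contents: %q", statm)
	}
	pages, err := strconv.ParseUint(fields[1], 10, 64)
	if err != nil {
		return 0, err
	}
	return pages * uint64(pageSize), nil
}

// overLimit reports whether the process has grown past the allowed size.
func overLimit(rss, limit uint64) bool {
	return rss > limit
}

func main() {
	// Assumed threshold for illustration only; a real value would depend
	// on the machine (the one in this report has 2GB of RAM).
	const limit = 150 << 20 // 150 MiB

	data, err := os.ReadFile("/proc/self/statm") // Linux only
	if err != nil {
		fmt.Println("cannot read RSS:", err)
		return
	}
	rss, err := parseStatmRSS(string(data), os.Getpagesize())
	if err != nil {
		fmt.Println("cannot parse RSS:", err)
		return
	}
	if overLimit(rss, limit) {
		// A long-running process would exit here and rely on its
		// supervisor to restart it.
		fmt.Println("RSS over limit, would exit now:", rss)
		return
	}
	fmt.Println("RSS within limit:", rss)
}
```

In a long-running process this check would run periodically, for example from a `time.Ticker` loop, rather than once at startup. Triggering on size instead of a fixed interval means a quiet instance is left alone while a fast-growing one restarts sooner.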