The server starts taking 100% #7

Open
pocha opened this issue Aug 13, 2013 · 7 comments

Comments

@pocha
Owner

pocha commented Aug 13, 2013

Leaving the server running under forever eventually makes it take 100% CPU. There is no conclusive evidence of what triggers it. With a daily load of about 200 users, each spending around 7-10 minutes, this happens a couple of times every day. For now, I have a cron script that checks whether the process has high CPU usage and restarts the server using forever. That fixes the problem temporarily.
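For anyone who wants a similar stopgap, here is a minimal sketch of such a watchdog in Node. It is not the actual cron script from production. The threshold, the number of consecutive samples, and all function names are assumptions made up for illustration. It relies on `ps -o %cpu= -p <pid>` and on `forever restartall`. Note that on Linux `ps` reports CPU usage averaged over the lifetime of the process, so a long-running server may need a different sampling method.

```javascript
const { execFile } = require('child_process');

const CPU_THRESHOLD = 90; // percent; assumed value, tune for your server
const REQUIRED_HITS = 3;  // consecutive high samples before restarting; assumed

// Parse the output of `ps -o %cpu= -p <pid>` into a number (NaN if unparsable).
function parseCpu(psOutput) {
  return parseFloat(String(psOutput).trim());
}

// True only when the last `hits` samples are all at or above the threshold,
// so that a single short spike does not trigger a restart.
function shouldRestart(samples, threshold = CPU_THRESHOLD, hits = REQUIRED_HITS) {
  if (samples.length < hits) return false;
  return samples.slice(-hits).every((s) => s >= threshold);
}

// Take one CPU sample for `pid`; calls back with (err, cpuPercent).
function sampleCpu(pid, callback) {
  execFile('ps', ['-o', '%cpu=', '-p', String(pid)], (err, stdout) => {
    if (err) return callback(err);
    callback(null, parseCpu(stdout));
  });
}

// Restart every process managed by forever.
function restartAll(callback) {
  execFile('forever', ['restartall'], callback);
}
```

A cron job or a `setInterval` loop could call `sampleCpu`, append the result to an array, and call `restartAll` once `shouldRestart` returns true.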

@dereXeus did some research and, by running the server under strace, found that the Node server is not able to close file descriptors. The situation is partially reproducible if you run the benchmark tests as instructed in the README: once all the connections are done, the CPU does not go back to 0%. So it looks similar to what I am seeing on the production server.
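One way to check the file descriptor theory without strace is to count the entries in `/proc/<pid>/fd` over time. The sketch below is only an illustration, assuming a Linux host and a current Node.js; `countOpenFds` and `looksLikeLeak` are hypothetical helper names, and the leak heuristic (counts never drop and end higher than they started) is a rough assumption, not a proven test.

```javascript
const fs = require('fs');

// Count open file descriptors of a process by listing /proc/<pid>/fd.
// Linux-only: returns null where /proc is unavailable or unreadable.
function countOpenFds(pid) {
  try {
    return fs.readdirSync(`/proc/${pid}/fd`).length;
  } catch (err) {
    return null;
  }
}

// Rough heuristic: the series never decreases and ends higher than it began.
function looksLikeLeak(counts) {
  if (counts.length < 2) return false;
  for (let i = 1; i < counts.length; i++) {
    if (counts[i] < counts[i - 1]) return false;
  }
  return counts[counts.length - 1] > counts[0];
}
```

Sampling `countOpenFds` before the benchmark, right after it, and a few minutes later would show whether descriptors are released once the clients disconnect.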

While Googling, I came across nodejs/node-v0.x-archive#5504, and it seems upgrading Node.js might fix the issue. I upgraded nodejs on production to v0.10.12 (it is an Ubuntu server), but the problem remains.

I also tried upgrading to the latest node version, but 'npm install' failed on it.

@tomgco

tomgco commented Aug 23, 2013

Hello,

I have had a quick look at this and couldn't reproduce it using the benchmark (are you able to reproduce it using just the multiple-clients.js benchmark?).

However I have found a cause of 100% CPU usage by malicious intent.

It is possible to cause an infinite loop on the server by using the command:

$ while true
> do
> echo 'a'
> done

I will look into this further later to see if any application logic is a different cause, but it may be worth looking at your mongo logs to see if this or any variation of this command appears regularly.
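If the logged commands can be exported as plain strings, a rough filter like the following could help spot such loops. This is only a heuristic sketch: the patterns cover a few common spellings of an unbounded shell loop, the function names are hypothetical, and the shape of the log entries is an assumption, since the thread does not show the log format.

```javascript
// A few common ways to write an unbounded shell loop (not exhaustive).
const LOOP_PATTERNS = [
  /\bwhile\s+(true|:|\[\s*1\s*\])/, // while true / while : / while [ 1 ]
  /\bfor\s*\(\(\s*;\s*;\s*\)\)/,    // for (( ; ; ))
  /^\s*yes\b/,                      // yes, which prints forever
];

// True when the command string matches any of the patterns above.
function looksLikeInfiniteLoop(command) {
  return LOOP_PATTERNS.some((re) => re.test(command));
}

// Keep only the commands that look like unbounded loops.
function findSuspiciousCommands(commands) {
  return commands.filter(looksLikeInfiniteLoop);
}
```

A match only says that the command is worth a closer look; a loop with a `break` or a `sleep` inside would be flagged too.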

Thanks,

Tom Gallacher

@pocha
Owner Author

pocha commented Aug 23, 2013

@tomgco thanks for checking. Which node and npm versions did you test with? Would you mind posting the output of npm version from inside the nodejs app?

As for recreating the issue, @dereXeus reported that if you run the benchmark test, the CPU usage does not come back to 0% once it is done. It stayed at around 50%, and I have a hunch it is the same issue that eventually manifests as 100% CPU usage. I am in the process of reproducing it myself on my PC.

I just checked and I still have v0.10.9 running on my production server. If everything works fine on my PC with v0.10.12 (including the benchmark test), I will update the version on my production server too. I will post my findings in this thread as well.

@tomgco

tomgco commented Aug 23, 2013

@pocha I tested it on node 0.10.17 on Mac OS X. I do have an Ubuntu machine, which I will try to test it on later.

npm version

{ http_parser: '1.0',
  node: '0.10.17',
  v8: '3.14.5.9',
  ares: '1.9.0-DEV',
  uv: '0.10.14',
  zlib: '1.2.3',
  modules: '11',
  openssl: '1.0.1e',
  npm: '1.3.8',
  'terminal-codelearn': '0.0.3' }

and this is the output from npm ls

https://gist.github.com/tomgco/6317063

@arunkjn

arunkjn commented Aug 23, 2013

Hi,

I had the same problem nodejs/node-v0.x-archive#5108 (comment)
I was running node-proxy in production using forever with node v0.8.7, and it ran for weeks at a time without any issue. When I upgraded to node v0.9.x, the CPU spiked to 100% a couple of times a day and the proxy would become unresponsive. The only solution was a restart. I have reverted to node v0.8.7 and it seems to be working fine again.

@pocha
Owner Author

pocha commented Aug 24, 2013

@arunkjn I tried running my app on node v0.8.x, but unfortunately the tests failed. It is probably an issue with some of the dependencies.

For now, I have the app running on the server with node v0.10.17.

As for replicating the issue: I could not do it on my own PC with either the single-client or the multiple-client benchmarking script. When I directed the single-client script at the server, though, the server did reach 100% CPU. I had to run forever restartall to bring the CPU back down to a few percent.

This is pretty strange now. My server is Ubuntu 12.04 AMD 64 bit.

@pocha
Owner Author

pocha commented Aug 25, 2013

Some more updates.

An SO thread suggested profiling with node-tick. The output of node-tick-processor while the CPU was at 100% is at http://pastebin.com/uAM3ncpd .

https://groups.google.com/forum/#!topic/nodejs/_U0MmS6rUl4 says that a lot of ticks in libc probably means the system is spending a lot of time in epoll_wait().

Can you suggest something off the top of your head to add a timeout, so that the system waits in epoll_wait() only for some time and simply moves on once the timeout is hit?

Unfortunately, I am finding it pretty hard to recreate the situation on my PC, but it does occur on the server.

@pocha
Owner Author

pocha commented Aug 25, 2013

There is an update on this. I wanted to check whether you have any insight.

When the CPU was at 100% on the server, I used pstree -p to find the processes. It looked like this:

ubuntu@Ubuntu-1204-precise-64-minimal ~ $ pstree -p  32762
node(32762)─┬─su(582)───bash(583)
            ├─su(704)───bash(705)───rails-codelearn(747)───ruby(758)─┬─{ruby}(760)
            │                                                        └─{ruby}(762)
            ├─{node}(32763)
            ├─{node}(32764)
            ├─{node}(32765)
            ├─{node}(32766)
            └─{node}(32767)

When I killed su(582) with kill -9 582, the CPU usage came back to normal.
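To catch this automatically next time, the child processes of the node server could be listed together with their CPU usage. The sketch below assumes the text output of `ps -eo pid=,ppid=,pcpu=,comm=`; the helper names and the 90% threshold are hypothetical. It only parses and filters, it does not kill anything.

```javascript
// Parse the output of `ps -eo pid=,ppid=,pcpu=,comm=` into records.
function parsePs(output) {
  return output
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => {
      const [pid, ppid, pcpu, ...comm] = line.split(/\s+/);
      return {
        pid: Number(pid),
        ppid: Number(ppid),
        cpu: parseFloat(pcpu),
        comm: comm.join(' '),
      };
    });
}

// Collect every descendant of rootPid by following the ppid links
// breadth-first; `seen` guards against malformed, cyclic input.
function descendantsOf(rootPid, procs) {
  const result = [];
  const seen = new Set([rootPid]);
  const queue = [rootPid];
  while (queue.length > 0) {
    const parent = queue.shift();
    for (const p of procs) {
      if (p.ppid === parent && !seen.has(p.pid)) {
        seen.add(p.pid);
        result.push(p);
        queue.push(p.pid);
      }
    }
  }
  return result;
}

// Descendants of rootPid whose CPU usage is at or above the threshold.
function busyDescendants(rootPid, procs, threshold = 90) {
  return descendantsOf(rootPid, procs).filter((p) => p.cpu >= threshold);
}
```

Feeding it the `ps` output captured during a spike, with the node PID as `rootPid`, would show whether a spawned shell such as the `su` process above is the one burning CPU.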

3 participants