Some URLs hang and PhantomJS refuses to die #7

Closed
anjackson opened this issue Dec 1, 2017 · 1 comment
anjackson commented Dec 1, 2017

The crawler ran out of memory because hung PhantomJS processes were left hanging about. We need to try to improve the rendering behaviour, but also ensure the processes get killed either way. I'm not sure which URIs cause the problem. The following process was killed, so it may be a useful test case:

/opt/phantomjs/bin/phantomjs --ssl-protocol=any --ignore-ssl-errors=true --web-security=false --proxy=warcprox:8000 phantomjs/phantomjs-render.js http://www.telegraph.co.uk/education/2017/04/19/british-children-spend-time-internet-almost-every-developed/ /tmp/webrender/tmpkh2sqnhe :root

IIRC I've also seen a lot of hung processes from rendering http://www.thejc.com/

There are a lot of instances of http://www.theatrffynnon.co.uk/ and http://www.ysgolllanfynydd.co.uk/ at the moment, so those sites might be slow/difficult to render.

anjackson added the bug label Dec 1, 2017
anjackson changed the title from "Some URLs hang and the PhantomJS processes refuse to die" to "Some URLs hang and PhantomJS refuses to die" Dec 1, 2017
@anjackson
Contributor Author

So, due to some apparent bugs/oddities in PhantomJS, the page.open call can fail, after which the process hangs rather than exiting cleanly.

Some clean-up of the code (see 18f894d and preceding commits) and testing indicated that the simplest option is to wait and then force the exit, whether or not page.open reported success. This has been included in v2.0.8.
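
For anyone wanting a second line of defence on the calling side, here is a minimal sketch of a hard deadline around a child process, so that a renderer that never exits still gets killed. This is not the fix from 18f894d (that lives in the PhantomJS script itself); the function name `run_with_deadline` and the timeout values are hypothetical, and the sleeping child is only a stand-in for a hung renderer.

```python
import subprocess
import sys


def run_with_deadline(cmd, timeout_s):
    """Run cmd; return its exit code, or None if it had to be killed."""
    try:
        # When the timeout expires, subprocess.run kills the child,
        # waits for it to terminate, and then raises TimeoutExpired.
        completed = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return None
    return completed.returncode


if __name__ == "__main__":
    # Stand-in for a hung renderer: a child that sleeps far past the deadline.
    hung = [sys.executable, "-c", "import time; time.sleep(60)"]
    print(run_with_deadline(hung, timeout_s=1))
```

Note that this only kills the direct child; if the command spawns further processes, those would need separate handling (for example, a process group).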
