Run phantom scripts in multiple managed reusable workers
Running a script in phantomjs can quickly become a performance bottleneck at scale. Starting a phantomjs process is not a cheap operation, and you cannot start hundreds of them at once. This package addresses that by using the phantomjs webserver module and keeping multiple phantomjs processes running in parallel.
##First create a phantomjs script wrapped in a webserver
//every worker gets a unique port, read it from the process environment variables
var system = require("system");
var port = system.env['PHANTOM_WORKER_PORT'];
var host = system.env['PHANTOM_WORKER_HOST'];

require('webserver').create().listen(host + ':' + port, function (req, res) {
    //standard phantomjs script which gets its input parameters from the request body
    var page = require('webpage').create();
    page.open(JSON.parse(req.post).url, function(status) {
        var title = page.evaluate(function() {
            return document.title;
        });
        //write the result to the response as a JSON string
        res.statusCode = 200;
        res.write(JSON.stringify({ title: title }));
        res.close();
        //release the page so the reused worker does not accumulate open pages
        page.close();
    });
});
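The script above parses `req.post` as JSON, so whatever calls a worker has to send a JSON body in a POST request. The package takes care of this for you. Purely as an illustration of the shape of such a request, here is a minimal sketch; `buildWorkerRequest` is a hypothetical helper, not part of phantom-workers, and the request the package actually sends may differ.

```javascript
// Hypothetical helper (not part of phantom-workers): describes a POST
// request that carries a JSON payload to a worker listening on host:port.
function buildWorkerRequest(host, port, payload) {
    var body = JSON.stringify(payload);
    return {
        options: {
            hostname: host,
            port: port,
            path: "/",
            method: "POST",
            headers: {
                "Content-Type": "application/json",
                "Content-Length": Buffer.byteLength(body)
            }
        },
        body: body
    };
}

// Sending it with Node's built-in http module could look like this
// (host and port are example values):
//   var request = buildWorkerRequest("127.0.0.1", 8000, { url: "http://jsreport.net" });
//   var req = require("http").request(request.options, function (res) { /* read res */ });
//   req.end(request.body);
```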
##Start phantomjs workers
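var path = require("path");

var phantom = require("phantom-workers")({
    //pathToPhantomScript must be an absolute path
    pathToPhantomScript: path.join(__dirname, "script.js"),
    timeout: 5000,
    numberOfWorkers: 10
});

phantom.start(function() {
    phantom.execute({ url: "http://jsreport.net" }, function(err, res) {
        if (err) {
            return console.error(err);
        }
        console.log(res.title);
    });
});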
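Since the point of the package is to serve many requests, you will usually call `execute` more than once. The following is a minimal sketch of collecting several results in input order. `executeAll` is a hypothetical helper, not part of phantom-workers, and it only assumes the `execute(input, callback)` shape shown above. How calls are scheduled across workers is up to the package.

```javascript
// Hypothetical helper (not part of phantom-workers): runs every input
// through pool.execute and collects the results in input order.
// `pool` is any object exposing execute(input, function (err, res) {}).
function executeAll(pool, inputs, done) {
    var results = new Array(inputs.length);
    var pending = inputs.length;
    var failed = false;

    if (pending === 0) {
        return done(null, results);
    }

    inputs.forEach(function (input, index) {
        pool.execute(input, function (err, res) {
            if (failed) {
                return; // an earlier call already reported an error
            }
            if (err) {
                failed = true;
                return done(err);
            }
            results[index] = res;
            pending -= 1;
            if (pending === 0) {
                done(null, results);
            }
        });
    });
}
```

Inside the `phantom.start` callback it could be used as `executeAll(phantom, [{ url: "http://jsreport.net" }], function (err, results) { ... })`. The helper reports only the first error and ignores any results that arrive after it.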
##Options
- `pathToPhantomScript` (required) - absolute path to the phantom script
- `timeout` - execution timeout in ms
- `numberOfWorkers` - number of phantomjs instances
- `host` - IP address or hostname on which the phantomjs web service listens, defaults to 127.0.0.1
- `portLeftBoundary` - lower bound of the port range used for workers; leave it unset to take any random free port
- `portRightBoundary` - upper bound of the port range used for workers; leave it unset to take any random free port
- `hostEnvVarName` - name of the environment variable that passes the worker host to the phantom script, defaults to PHANTOM_WORKER_HOST
- `portEnvVarName` - name of the environment variable that passes the worker port to the phantom script, defaults to PHANTOM_WORKER_PORT
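To show how these options fit together, here is a sketch of a configuration that restricts the ports and renames the environment variables. All values (the port numbers, the worker count, the `MY_WORKER_*` names) are made up for illustration, and the sketch assumes the two boundaries delimit the range from which worker ports are picked.

```javascript
var path = require("path");

// All values below are illustrative assumptions, not recommended defaults.
var options = {
    // resolved against the current working directory to get an absolute path
    pathToPhantomScript: path.resolve("script.js"),
    timeout: 5000,
    numberOfWorkers: 4,
    host: "127.0.0.1",
    portLeftBoundary: 9000,
    portRightBoundary: 9100,
    hostEnvVarName: "MY_WORKER_HOST",
    portEnvVarName: "MY_WORKER_PORT"
};

// Pass the object to the package the same way as in the earlier example:
//   var phantom = require("phantom-workers")(options);

// The phantom script then has to read the renamed variables:
//   var host = system.env['MY_WORKER_HOST'];
//   var port = system.env['MY_WORKER_PORT'];
```

If you rename the environment variables, remember to update the phantom script as well, otherwise it will still look for the default names.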
##License
See license