PhantomjsFetcher

A Python web fetcher that uses PhantomJS and Tornado to emulate a browser.

Before using

  1. Install PhantomJS and start the fetcher script:

$ phantomjs phantomjs_fetcher.js [port]

  2. Install Tornado with pip:

$ pip install tornado
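Before creating a fetcher, it can help to confirm that the PhantomJS process from step 1 is actually listening. The following is a minimal sketch using only the standard library; `is_port_open` is a hypothetical helper that is not part of this project, and the port 12306 is an assumption taken from the `phantomjs_proxy` value in the sample code.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        conn = socket.create_connection((host, port), timeout)
    except socket.error:
        # Connection refused, timed out, or host unreachable
        return False
    conn.close()
    return True


# Example (assumes phantomjs_fetcher.js was started on port 12306):
#     is_port_open("localhost", 12306)
```

If this returns False, start PhantomJS as shown in step 1, passing the same port that the fetcher will later use.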

Sample Code

from tornado_fetcher import Fetcher

# create a fetcher
# note: `async` is a reserved word in Python 3.7+, so this call only parses on older interpreters
fetcher = Fetcher(
    user_agent='phantomjs',  # user agent
    phantomjs_proxy='http://localhost:12306',  # phantomjs url
    pool_size=10,  # max httpclient num
    async=False,
)

url = 'http://example.com/'  # placeholder; replace with the page to fetch

# fetch the HTML after the page's JavaScript has been rendered
fetcher.fetch(url)
# or run additional JavaScript once rendering ends; js_script must be a function
fetcher.fetch(url, js_script='function(){setTimeout(function(){window.scrollTo(0,100000)}, 1000)}')
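Since the comment above says `js_script` must be a function, one way to avoid hand-writing nested quotes is to build the string in Python. This is only a sketch: `as_js_function` is a hypothetical helper, not part of `tornado_fetcher`, and it does nothing more than wrap JavaScript statements in an anonymous function expression.

```python
def as_js_function(body):
    """Wrap JavaScript statements in an anonymous function expression string."""
    return "function(){" + body + "}"


# Inner callback: scroll far down the page
scroll = as_js_function("window.scrollTo(0,100000)")
# Outer function: schedule the scroll one second after rendering ends
delayed = as_js_function("setTimeout(" + scroll + ", 1000)")
```

The resulting `delayed` string could then be passed as the `js_script` argument of `fetcher.fetch`.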

Reference

pyspider