huojian2weibospiderAPI

weibospider API for original articles or more

Preface

Dependency

  • This project was developed with Python 2.7 on Windows, using the Scrapy framework. It has not been verified on Python 3.x, but it will probably work there too.

Usage

  1. Download this project and open it in your IDE (e.g. PyCharm).
  2. First, modify spider.py: set "start urls" (whose pages you want to crawl), "page" (how many pages you want to crawl), and "parse_item" (which fields you want to extract).
  3. Next, modify items.py to match the fields you chose in "parse_item".
  4. Finally, modify pipeline.py to choose where the items are stored. I write the items to a JSON file; you can store them in MySQL, MongoDB, or anything else instead.
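
As a rough illustration of step 4, here is a minimal sketch of a pipeline that writes one JSON object per line. It is not the code shipped in pipeline.py: the class name `JsonWriterPipeline` and the default output path `items.jl` are assumptions made for this example. It relies only on the fact that a Scrapy item pipeline is a plain class with a `process_item(item, spider)` method and optional `open_spider` / `close_spider` hooks.

```python
import json


class JsonWriterPipeline(object):
    """Hypothetical pipeline: writes each scraped item as one JSON line."""

    def __init__(self, path="items.jl"):
        # "items.jl" is an assumed output path, not taken from this project.
        self.path = path
        self.file = None

    def open_spider(self, spider):
        # Called once when the spider starts.
        self.file = open(self.path, "w")

    def close_spider(self, spider):
        # Called once when the spider finishes.
        self.file.close()

    def process_item(self, item, spider):
        # dict(item) accepts plain dicts as well as scrapy.Item objects.
        # Non-ASCII text is escaped as \uXXXX, which is still valid JSON.
        self.file.write(json.dumps(dict(item)) + "\n")
        return item
```

To activate a pipeline like this, it would need an entry in the `ITEM_PIPELINES` setting in settings.py; the module path to use depends on how the project is laid out. Storing items in MySQL or MongoDB follows the same shape: open the connection in `open_spider`, insert in `process_item`, and close it in `close_spider`.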

PS

  • This project is just an experiment in finding an easier way to crawl weibo.com. Unfortunately, the API entry point is not easy to discover.
  • With some extra work, more items can be collected, such as user information, news, and trending events.
  • If you like this project, please star it. Thanks!