A project to build a current map of 绘本馆 (subscription children's libraries) in China.
The file scrape.py contains the current version of the code.
- Raleigh
- Angela
- Josue
- JJ
- Connie
- Steven
- signed up for an API key, but they never replied, so we'll need to screen-scrape
- decide on a language. Are we going to split into 2 groups?
- worked on getting our Python environments to work
- make sure you are installing packages into the right version of Python!
- spoofed a web client and scraped a page from dianping.com
- we figured out that different URLs return results for different cities, e.g. https://www.dianping.com/search/keyword/26/0_[string], where 26 is a city code. Known codes: 1 = Shanghai, 2 = Beijing, 3 = Hangzhou
- code goal: figure out how to send a query to dianping. J-search-input is the element name of the keyword search bar.
- load query results into a data frame
- Connie looped over 200 cities. Still needs to share the code!
- format the results more carefully and explore our data
- code goal: figure out how to loop over a list of cities
- work on compiling query results into a workable table
- how does dianping treat locations? lat-long pairs or street addresses?
- if addresses, how do we batch geocode Chinese addresses?
- compile tables and join to addresses
- prepare other data for analysis (Chinese Census info from CD-ROM and Provincial Statistical Yearbooks @ UMichigan)
- exploratory visual GIS analysis
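
The URL pattern and city codes noted above can be sketched in Python roughly as follows. This is a minimal sketch, not the contents of scrape.py: the function names, the User-Agent string, and the choice to percent-encode the keyword after `0_` are all assumptions made for illustration. Only the three city codes recorded in the notes are included.

```python
import urllib.request
from urllib.parse import quote

BASE_URL = "https://www.dianping.com/search/keyword"

# City codes recorded in the notes; other codes are not verified here.
CITY_CODES = {1: "Shanghai", 2: "Beijing", 3: "Hangzhou"}

# Assumption: an illustrative browser-like User-Agent string.
USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36"


def build_search_url(city_code, keyword):
    """Return the search URL for one city code and keyword.

    Assumption: the [string] part of the URL is the percent-encoded keyword.
    """
    return f"{BASE_URL}/{city_code}/0_{quote(keyword)}"


def fetch_page(url, timeout=10):
    """Fetch a page while sending a browser-like User-Agent header."""
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def build_city_queries(keyword, city_codes=CITY_CODES):
    """Return one row (dict) per city, sorted by city code."""
    return [
        {"city_code": code, "city": name, "url": build_search_url(code, keyword)}
        for code, name in sorted(city_codes.items())
    ]


if __name__ == "__main__":
    # Print the query URL for each known city; no network request is made.
    for row in build_city_queries("绘本馆"):
        print(row["city"], row["url"])
```

The list of dicts returned by `build_city_queries` is shaped so that it could be passed to `pandas.DataFrame` once pandas is installed. `fetch_page` is left uncalled here so the sketch runs without network access.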