Dcard-Crawler-Analyzer

Crawl content from the Dcard and Meteor forums and analyze it.
A Docker image version is also provided.

Dependencies

python3

PyPI packages

  • flask
  • flask_script
  • flask_migrate
  • Flask-SQLAlchemy
  • gunicorn
  • requests
  • cfscrape
  • beautifulsoup4
  • jieba

apt packages

  • nodejs 8+

Install

# install dependencies
apt-get install nodejs
pip install -r requirements.txt   # installing the packages in a virtual environment is highly recommended

# environment variable
export APP_SETTINGS="config.DevelopmentConfig"
export DATABASE_URL="sqlite:///WHERE_YOU_WANT_PUT_DB"

# database migration
cd app
python3 manage.py db init
python3 manage.py db migrate
python3 manage.py db upgrade
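
As a rough illustration of how the two environment variables above might be consumed, here is a minimal sketch using only the standard library. The function `load_settings` and the shape of its return value are hypothetical and are not taken from this repository. Flask applications often pass a dotted path such as `config.DevelopmentConfig` to `app.config.from_object`, and the sketch only shows how such a path splits into a module and a class name.

```python
import os


def load_settings(environ=None):
    """Read APP_SETTINGS and DATABASE_URL and do basic validation.

    Hypothetical helper for illustration; the project's own config module
    may read these variables differently.
    """
    environ = os.environ if environ is None else environ
    # Fall back to the development config named in the Install section.
    app_settings = environ.get("APP_SETTINGS", "config.DevelopmentConfig")
    database_url = environ.get("DATABASE_URL")
    if not database_url:
        raise RuntimeError("DATABASE_URL is not set")
    # "config.DevelopmentConfig" -> module "config", class "DevelopmentConfig"
    module_name, _, class_name = app_settings.rpartition(".")
    return {
        "module": module_name,
        "class": class_name,
        "database_url": database_url,
    }


if __name__ == "__main__":
    print(load_settings())
```

Running this before the migration commands gives an early error if `DATABASE_URL` was not exported in the current shell.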

Run

cd app
python3 dcard.py  # for Dcard
python3 meteor.py   # for Meteor

Docker image

You can also build the crawler from the Dockerfile:

docker build . --tag uranusx86/forum_crawler --no-cache
docker run -dt --name forumcrawler -p 80:8000 uranusx86/forum_crawler
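
The `docker run` command above publishes container port 8000 on host port 80. A small sketch like the following can check whether anything is answering on that port once the container has started. The helper `is_up` is hypothetical, and the root path `/` is an assumption, since this README does not list the routes the application serves.

```python
import urllib.error
import urllib.request


def is_up(url="http://localhost:80/", timeout=3.0):
    """Return True if a server answers HTTP at `url`, even with an error status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server replied (for example with 404), so the container is reachable.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar problems.
        return False


if __name__ == "__main__":
    print("crawler reachable:", is_up())
```

An HTTP error status is treated as "reachable" here because the goal is only to confirm that the port mapping works, not that a particular page exists.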