Grebe Social Data Aggregator

Grebe aggregates geo-fenced Canadian Twitter data for research in sociology and public health. View our demo to see how the data collected by Grebe can be analyzed and visualized in various ways.

Please cite the following publication when using our source code for your research.

@inproceedings{SamuelNooriFaraziZaiane2018,
  title = {{Context Prediction in the Social Web Using Applied Machine Learning: A Study of Canadian Tweeters}},
  author = {Samuel, Hamman and Noori, Benyamin and Farazi, Sara and Zaiane, Osmar},
  booktitle = {IEEE/WIC/ACM International Conference on Web Intelligence (WI)},
  pages = {230--237},
  year = {2018},
  organization = {IEEE}
}

Demo

A working live web app is available for demo purposes.

Prerequisites

For hosting, you can use IaaS with Cybera or Digital Ocean, PaaS with OpenShift or Heroku, or just use your laptop/computer (not recommended due to space and processing limitations).
Install Python.
Install Flask by using pip install flask.
Install Flask's HTTP Auth dependency via pip install flask-httpauth.
Install TwitterAPI via pip install TwitterAPI.
Install MariaDB.
Run the SQL commands in schema.sql to set up a database.
Edit config.py to enter your database username and password.
Install the MySQL Connector via pip install mysql-connector.
Install Python MySQL Connector by using pip install mysql-connector-python-rf.
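
Once the packages above are installed, a quick check can confirm that Python can find them. The following is a minimal sketch, not part of Grebe itself. The helper name missing_packages is hypothetical, and the import names are assumed to be the usual ones for these pip packages (flask, flask_httpauth, TwitterAPI, mysql.connector).

```python
import importlib.util

# Maps each pip package from the list above to the module it is assumed to provide.
REQUIRED_MODULES = {
    "flask": "flask",
    "flask-httpauth": "flask_httpauth",
    "TwitterAPI": "TwitterAPI",
    "mysql-connector": "mysql.connector",
}


def missing_packages(required=REQUIRED_MODULES):
    """Return the pip package names whose modules cannot be found."""
    missing = []
    for package, module in required.items():
        try:
            found = importlib.util.find_spec(module) is not None
        except ModuleNotFoundError:
            # Raised when a parent package (for example "mysql") is absent.
            found = False
        if not found:
            missing.append(package)
    return missing


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All Python prerequisites found")
```

This only checks that the modules are importable. It does not verify the MariaDB installation or the credentials in config.py.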

Workflow

Aggregate tweets by running spyder.py.
Initialize cache by running scripts/cacher.py.
View web app by running webapp/server.py.

Aggregating Tweets

Sign up for a Twitter Developer account.
Set up your Twitter API keys.
Edit config.py and enter your API keys.
In a terminal, run the aggregator with the following command: python spyder.py [status | search | stream].
To aggregate data automatically, schedule the command above to run at regular intervals, for example as a cron job on Linux or with Task Scheduler on Windows.
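
As one way to set up scheduled runs, the sketch below builds a crontab entry for the aggregator command. The helper name cron_line, the 15-minute schedule, and the /home/user/Grebe install path are all assumptions for illustration. The source does not say which mode suits scheduled runs, so search is used here only as an example.

```python
import shlex

VALID_MODES = ("status", "search", "stream")  # modes listed in the command above


def cron_line(schedule, project_dir, mode, python="python"):
    """Build one crontab entry that runs spyder.py in the given mode."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode}")
    # Change into the project folder first so spyder.py can find config.py.
    command = f"cd {shlex.quote(project_dir)} && {python} spyder.py {mode}"
    return f"{schedule} {command}"


if __name__ == "__main__":
    # Hypothetical schedule and install path; adjust both for your setup.
    print(cron_line("*/15 * * * *", "/home/user/Grebe", "search"))
```

The printed line can be pasted into the file opened by crontab -e.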

Initializing Cache

The web app reads from a cache so that it can visualize and display data faster. The cache directory is set in config.py as HOME_DIR.
To set up the cache, run python cacher.py [data tags stats] from the scripts folder.
Clean up your cache directory regularly so it doesn't fill your drive. The sample bash script below can be scheduled to run regularly (replace HOME_DIR with the actual path to your directory).

#!/bin/bash
LIMIT="1000000" # in 1 KiB blocks as reported by du, roughly 1 GB
SIZE=$(du -s --apparent-size HOME_DIR | cut -f1) # -s prints a single total for the directory
if (( SIZE > LIMIT ))
then
    rm -f HOME_DIR/*
    echo "Cache cleared"
else
    echo "Cache preserved"
fi
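
For setups without bash (for example Windows with Task Scheduler), the same idea can be sketched in Python. This is an approximation of the script above, not a part of Grebe. The function names and the LIMIT_BYTES constant are hypothetical, and the size is computed by summing file sizes rather than by calling du.

```python
import os

# Mirrors the bash script's limit of 1,000,000 KiB (roughly 1 GB).
LIMIT_BYTES = 1_000_000 * 1024


def directory_size(path):
    """Sum the sizes, in bytes, of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


def clear_cache_if_large(cache_dir, limit_bytes=LIMIT_BYTES):
    """Delete the files directly inside cache_dir if it exceeds the limit.

    Returns True when the cache was cleared, False when it was preserved.
    """
    if directory_size(cache_dir) <= limit_bytes:
        return False
    for entry in os.scandir(cache_dir):
        # Like rm -f HOME_DIR/*, subdirectories are left in place.
        if entry.is_file(follow_symlinks=False):
            os.remove(entry.path)
    return True
```

Call clear_cache_if_large with the path you set as HOME_DIR in config.py. Unlike the shell glob, this sketch also removes hidden files in the top level of the cache directory.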

Viewing Web App

In a terminal from the webapp folder, use the command python server.py to run the Flask server.
In your web browser, go to http://127.0.0.1:5000/grebe/
When using IaaS hosting, you can serve the Flask web app using uWSGI.
PaaS hosting configurations depend on the provider, but here is one for Heroku.
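
After starting the server, a short script can confirm that it responds at the address given above. This is a minimal sketch using only the standard library. The function name server_responds is hypothetical, and any HTTP reply (including an authentication or not-found error) is treated as a sign that the server is running.

```python
import urllib.error
import urllib.request

GREBE_URL = "http://127.0.0.1:5000/grebe/"  # address from the steps above


def server_responds(url=GREBE_URL, timeout=5):
    """Return True if any HTTP response comes back from url."""
    # An empty ProxyHandler skips system proxies, which is appropriate
    # for a server running on the local machine.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, but with an error status such as 401 or 404.
        return True
    except (urllib.error.URLError, OSError):
        # Nothing is listening, or the connection timed out.
        return False


if __name__ == "__main__":
    print("up" if server_responds() else "down")
```

For a hosted deployment, pass the public URL of your instance instead of the default.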
