
gabriele-decapoa/RTwUP

 
 


#RTwUP - Realtime Twitter Url Popularity

Given a suitably filtered stream of tweets from Twitter, RTwUP computes real-time statistics on the URLs tweeted most often since the software started running. These statistics group links by domain.

For example:

| Domain | Link | Frequency |
| --- | --- | --- |
| foursquare.com | expanded.url.com/123 | 9 times |
| foursquare.com | expanded.url.com/456 | 8 times |
| youtube.com | ... | ... |
| instagram.com | ... | ... |
| ... | | |

##Data Flow Description

Tweets are captured from a Twitter stream through the Twitter APIs, then filtered by geolocation (for example, according to the coordinates of a polygon centered on a city).
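The geolocation filter can be pictured with a minimal sketch. It assumes each tweet carries a longitude/latitude pair and that the area of interest is a simple polygon small enough to treat coordinates as planar. The class name `GeoFilter` and the example vertices are hypothetical and not taken from the RTwUP code base.

```java
import java.awt.geom.Path2D;

/** Hypothetical helper: tests whether a tweet's coordinates fall inside a polygon. */
final class GeoFilter {
    private final Path2D.Double area = new Path2D.Double();

    /** @param vertices polygon vertices as {longitude, latitude} pairs, listed in order */
    GeoFilter(double[][] vertices) {
        if (vertices.length < 3) {
            throw new IllegalArgumentException("a polygon needs at least 3 vertices");
        }
        area.moveTo(vertices[0][0], vertices[0][1]);
        for (int i = 1; i < vertices.length; i++) {
            area.lineTo(vertices[i][0], vertices[i][1]);
        }
        area.closePath();
    }

    /** Planar point-in-polygon test; adequate at city scale, not across the antimeridian. */
    boolean contains(double longitude, double latitude) {
        return area.contains(longitude, latitude);
    }

    public static void main(String[] args) {
        // Rough, illustrative box around Rome (assumed coordinates).
        GeoFilter rome = new GeoFilter(new double[][] {
            {12.3, 41.7}, {12.7, 41.7}, {12.7, 42.0}, {12.3, 42.0}
        });
        System.out.println(rome.contains(12.4964, 41.9028)); // a point in central Rome
        System.out.println(rome.contains(9.19, 45.4642));    // a point in Milan
    }
}
```

A tweet without coordinates would have to be dropped or handled separately; the sketch leaves that decision open.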
Links found in a tweet are processed as follows:

  • first, links are expanded from Twitter's URL-shortening service (t.co);
  • if the resulting link is already in its expanded form, the system counts its frequency directly;
  • otherwise, if the link is still shortened (e.g. 'http://bit.ly/13NHE7v', 'http://goo.gl/uJH2Y', 'http://instagr.am/p/S3l5rQjCcA/', etc.), it is expanded further, possibly through several steps, until it reaches its final form, and only then is its frequency counted.
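The expansion steps above can be sketched as a loop that follows one redirect at a time until no further hop exists. This is only an illustration under stated assumptions: the class `LinkExpander`, the `MAX_HOPS` limit, and the idea of detecting shortened links through HTTP redirects are hypothetical and may differ from what RTwUP actually does.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical sketch of multi-step link expansion. */
final class LinkExpander {
    private static final int MAX_HOPS = 10; // assumed safety limit

    /**
     * Follows hops until the resolver reports no further redirect,
     * a URL repeats (redirect loop), or MAX_HOPS is reached.
     */
    static String expand(String url, Function<String, Optional<String>> nextHop) {
        Set<String> seen = new HashSet<>();
        String current = url;
        for (int hop = 0; hop < MAX_HOPS && seen.add(current); hop++) {
            Optional<String> target = nextHop.apply(current);
            if (!target.isPresent()) {
                return current; // already in its final form
            }
            current = target.get();
        }
        return current; // best effort after a loop or too many hops
    }

    /** One possible resolver: a HEAD request that reads the Location header. */
    static Optional<String> httpNextHop(String url) {
        try {
            HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setInstanceFollowRedirects(false); // inspect each hop ourselves
            conn.setRequestMethod("HEAD");
            conn.setConnectTimeout(3000);
            conn.setReadTimeout(3000);
            int status = conn.getResponseCode();
            String location = conn.getHeaderField("Location");
            conn.disconnect();
            if (status >= 300 && status < 400 && location != null) {
                // Location may be relative, so resolve it against the current URL.
                return Optional.of(URI.create(url).resolve(location).toString());
            }
            return Optional.empty();
        } catch (IOException | RuntimeException e) {
            return Optional.empty(); // treat unreachable or malformed URLs as final
        }
    }
}
```

A call such as `LinkExpander.expand("http://bit.ly/13NHE7v", LinkExpander::httpNextHop)` would perform real network requests; passing a different resolver (for example, one backed by a map) keeps the loop testable offline.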

Starting from the expanded URLs, RTwUP extracts each page's domain to organize the results.
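As a rough illustration of this grouping, the sketch below keeps a per-domain counter of expanded URLs. It assumes the "domain" is simply the URL's host with a leading `www.` removed, which does not handle registrable domains such as `example.co.uk` versus subdomains. The class `UrlPopularity` and its methods are hypothetical names.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: counts expanded URLs, grouped by domain. */
final class UrlPopularity {
    // domain -> (expanded URL -> number of times seen)
    private final Map<String, Map<String, Integer>> counts = new TreeMap<>();

    /** Assumed definition of "domain": lower-cased host without a leading "www.". */
    static String domainOf(String url) {
        String host = URI.create(url).getHost();
        if (host == null) {
            throw new IllegalArgumentException("no host in " + url);
        }
        host = host.toLowerCase(Locale.ROOT);
        return host.startsWith("www.") ? host.substring(4) : host;
    }

    /** Records one occurrence of an already expanded URL. */
    void record(String expandedUrl) {
        counts.computeIfAbsent(domainOf(expandedUrl), d -> new HashMap<>())
              .merge(expandedUrl, 1, Integer::sum);
    }

    /** Returns how many times the URL has been recorded so far. */
    int frequency(String expandedUrl) {
        return counts.getOrDefault(domainOf(expandedUrl), Collections.emptyMap())
                     .getOrDefault(expandedUrl, 0);
    }

    /** Returns up to {@code limit} links of a domain, most frequent first. */
    List<Map.Entry<String, Integer>> topLinks(String domain, int limit) {
        List<Map.Entry<String, Integer>> entries =
            new ArrayList<>(counts.getOrDefault(domain, Collections.emptyMap()).entrySet());
        entries.sort(Map.Entry.<String, Integer>comparingByValue().reversed());
        return entries.subList(0, Math.min(limit, entries.size()));
    }
}
```

In a streaming setting this state would live inside whatever component aggregates the counts; the in-memory maps here are only meant to show the shape of the result table.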

##Technologies adopted

RTwUP is written in Java.
Twitter4j was chosen to listen to the Twitter stream; Hosebird is an alternative client. Tweets are processed in real time with Storm. The user interface is written as a Node.js application.

For more information, see the wiki pages.

##Wiki
