Dani7B/RTwUP-giw

RTwUP - Realtime Twitter Url Popularity

Given a suitably filtered stream of documents returned by a Twitter query, calculate real-time statistics and show the ranking of the most tweeted URLs since system activation. The statistics must be updated on screen every N seconds.
They show the links grouped by domain, each with its popularity count:

Domain          Link                  Frequency
foursquare.com  expanded.url.com/123  9 times
foursquare.com  expanded.url.com/456  8 times
youtube.com     ...                   ...
instagram.com   ...                   ...
...
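
The ranking above can be kept as a per-domain map of link counters. The following is only a minimal in-memory sketch of that idea: the class `UrlRanking` and its methods are hypothetical names chosen for illustration, and it does not reflect how RTwUP actually stores its counts.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory counter; names are illustrative, not taken from RTwUP. */
final class UrlRanking {

    /** One row of the ranking: domain, expanded link, number of occurrences. */
    static final class Row {
        final String domain;
        final String link;
        final int frequency;

        Row(String domain, String link, int frequency) {
            this.domain = domain;
            this.link = link;
            this.frequency = frequency;
        }

        @Override
        public String toString() {
            return domain + " " + link + " " + frequency + " times";
        }
    }

    // domain -> (expanded link -> occurrences since activation)
    private final Map<String, Map<String, Integer>> counts = new HashMap<>();

    /** Registers one occurrence of an already expanded link under its domain. */
    void record(String domain, String link) {
        counts.computeIfAbsent(domain, d -> new HashMap<>()).merge(link, 1, Integer::sum);
    }

    /** Returns up to limit rows, most frequent first; ties are ordered by link. */
    List<Row> top(int limit) {
        List<Row> rows = new ArrayList<>();
        counts.forEach((domain, links) ->
                links.forEach((link, n) -> rows.add(new Row(domain, link, n))));
        rows.sort(Comparator.comparingInt((Row r) -> r.frequency).reversed()
                .thenComparing((Row r) -> r.link));
        return new ArrayList<>(rows.subList(0, Math.min(limit, rows.size())));
    }
}
```

Calling `top(limit)` every N seconds and pushing the rows to the display would give the periodic refresh described above.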

Data Stream Description and Requirements

The system has to use the Twitter APIs (through Twitter4j or Hosebird, for instance) to perform queries and retrieve Tweets, filtering them suitably (e.g. according to the coordinates of a polygon centered on Rome, Milan or a city of your choice).
The links of interest are the ones retrieved from the entities/urls field of the Tweet JSON:
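
As an illustration of the polygon filter mentioned above, here is a small ray-casting sketch that checks whether a Tweet's coordinates fall inside a polygon. It does not call Twitter4j or Hosebird; the class name `GeoFilter` and the vertex layout (`{longitude, latitude}` pairs) are assumptions made for this example.

```java
/** Hypothetical helper; not part of RTwUP, Twitter4j or Hosebird. */
final class GeoFilter {

    /**
     * Ray-casting point-in-polygon test.
     * polygon holds the vertices in order, each as {longitude, latitude}.
     */
    static boolean contains(double[][] polygon, double lon, double lat) {
        boolean inside = false;
        for (int i = 0, j = polygon.length - 1; i < polygon.length; j = i++) {
            double xi = polygon[i][0], yi = polygon[i][1];
            double xj = polygon[j][0], yj = polygon[j][1];
            // Does a horizontal ray going east from the point cross edge (j, i)?
            boolean crosses = (yi > lat) != (yj > lat)
                    && lon < (xj - xi) * (lat - yi) / (yj - yi) + xi;
            if (crosses) {
                inside = !inside;
            }
        }
        return inside;
    }
}
```

For example, with a rough rectangle around Rome (vertex values chosen only for illustration) such as `{{12.2, 41.7}, {12.8, 41.7}, {12.8, 42.1}, {12.2, 42.1}}`, a Tweet geotagged in the city centre would pass the filter while one geotagged in Milan would not.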

  • first of all, links have to be expanded, reversing the output of Twitter's shortening service (t.co);
  • if the Tweet contains the expanded form of the URL, the count is assigned to it;
  • if the Tweet contains a “shortened” form of the URL (e.g. bit.ly/13NHE7v, goo.gl/uJH2Y, http://instagr.am/p/S3l5rQjCcA/, etc.), then it has to be expanded until the fully expanded form is obtained (possibly after several expansions); the count can then be assigned to it.
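
The repeated expansion described in the list above can be sketched as a loop that follows one redirect at a time. This is an assumption-laden illustration, not RTwUP's code: `UrlExpander`, `expand`, `httpHop` and the `maxHops` limit are hypothetical names, and the single-hop lookup is passed in as a function so that the loop can be exercised without network access.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical helper; names and limits are illustrative. */
final class UrlExpander {

    /**
     * Follows redirects hop by hop and returns the last URL reached.
     * Stops when there is no further redirect, when a URL repeats (redirect
     * loop), or after maxHops lookups.
     */
    static String expand(String url, Function<String, Optional<String>> nextHop, int maxHops) {
        Set<String> seen = new HashSet<>();
        String current = url;
        for (int hop = 0; hop < maxHops && seen.add(current); hop++) {
            Optional<String> next = nextHop.apply(current);
            if (!next.isPresent()) {
                return current;
            }
            current = next.get();
        }
        return current;
    }

    /** One possible nextHop: a HEAD request that reads the Location header. */
    static Optional<String> httpHop(String url) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setInstanceFollowRedirects(false); // inspect each hop ourselves
            conn.setRequestMethod("HEAD");
            int status = conn.getResponseCode();
            String location = conn.getHeaderField("Location");
            conn.disconnect();
            if (status >= 300 && status < 400 && location != null) {
                // Location may be relative, so resolve it against the current URL.
                return Optional.of(URI.create(url).resolve(location).toString());
            }
            return Optional.empty();
        } catch (IOException | RuntimeException e) {
            // Unreachable or malformed URL: treat the current form as final.
            return Optional.empty();
        }
    }
}
```

A call such as `UrlExpander.expand(shortUrl, UrlExpander::httpHop, 10)` would then return the form to which the count is assigned. The URL passed in is assumed to carry a scheme (`http://` or `https://`).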

Starting from the final expanded form, domain information can be extracted to organize the current results.
This must be done in real time, using Apache Storm.
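
Extracting the domain from the final expanded form can be sketched with `java.net.URI`. The class `DomainExtractor` is a hypothetical name, and stripping a leading `www.` is a simplifying assumption of this example; it does not compute the registrable domain (subdomains other than `www` are kept as they are).

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;
import java.util.Optional;

/** Hypothetical helper; not taken from RTwUP. */
final class DomainExtractor {

    /**
     * Returns the lower-cased host of an expanded URL without a leading "www.",
     * or an empty Optional when the URL has no host (e.g. it lacks a scheme).
     */
    static Optional<String> domainOf(String expandedUrl) {
        try {
            String host = new URI(expandedUrl).getHost();
            if (host == null) {
                return Optional.empty();
            }
            host = host.toLowerCase(Locale.ROOT);
            return Optional.of(host.startsWith("www.") ? host.substring(4) : host);
        } catch (URISyntaxException e) {
            return Optional.empty();
        }
    }
}
```

The returned value can serve as the grouping key for the per-domain results.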

Adopted Technologies

RTwUP is developed in Java.
To listen to Twitter's stream, Twitter4j was chosen, in particular its Streaming API support.
To process the Tweets in real time, Apache Storm was chosen.
The user interface is written as a Node.js application, making use of socket.io and Redis to display results in real time.

For more information, you can refer to the wiki pages.
