
fquellec/twitter-retweets-analysis


Retweet networks and analysis of Twitter discussions

In this repo, we propose several tools to get a quick overview of debates on Twitter using Python. We first propose three ways to collect tweets with the Twitter API. We then build different types of interaction graphs from these tweets to visualize polarization in the discussions and to obtain statistics on the actors involved. Finally, using the statistics obtained from the graphs, we propose two visualizations to help interpret the results.


Workspace

Data Mining

We propose three ways to collect tweets in the data-mining folder:

  • searchKeyword.py: collects tweets from the Twitter API using the search call. By specifying one or more keywords or hashtags, we can retrieve the matching tweets from up to the last week (with the free Twitter license).
  • streamUsers.py: uses the Twitter streaming API to fetch, in real time, all tweets related to the list of accounts we specify.
  • getTimelines.py: retrieves up to 3200 tweets (with the free Twitter license) for each user we specify.

Graph generation

Once we have gathered our tweets, we can generate two types of interaction graphs in the graphs folder:

  • Retweet Graph: One of the most popular ways to visualize interactions on Twitter. A retweet graph is a directed graph of the users who have participated in the discussion on a specific topic/keyword. Each node corresponds to a Twitter account, and a directed edge from node u to node v indicates that user v has been retweeted by user u. Since a retweet often indicates endorsement (source), we can use this kind of graph to detect polarization and communities in Twitter discussions.

  • Co-retweet Graph: The co-retweeted network is constructed as an undirected weighted graph that connects highly visible accounts that have been retweeted by members of the audience during some real-time event. If two accounts have been retweeted several times by the same people, an edge is added between these two accounts. The co-retweeted network can be seen as a way of consulting the opinion of the crowd following the discussion about the similarity (or difference) of the positions expressed by the highly visible accounts. When applied to polarized discussions, it can be used to retrieve the orientation of the major players in the discussion (source).
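The two definitions above can be sketched in plain Python. This is only an illustration under stated assumptions, not the code in the graphs folder: the input is assumed to be a list of (retweeter, retweeted account) pairs, the function names are hypothetical, and the co-retweet weight is taken to be the number of distinct users who retweeted both accounts, with a hypothetical `min_shared` threshold standing in for "several times".

```python
from collections import defaultdict
from itertools import combinations


def build_retweet_graph(retweets):
    """Directed weighted edges: (u, v) -> number of times u retweeted v."""
    edges = defaultdict(int)
    for retweeter, author in retweets:
        if retweeter != author:  # ignore self-retweets
            edges[(retweeter, author)] += 1
    return dict(edges)


def build_coretweet_graph(retweets, min_shared=2):
    """Undirected weighted edges between retweeted accounts.

    The weight of (a, b) is the number of distinct users who retweeted
    both a and b. Pairs below min_shared are dropped (assumed threshold).
    """
    retweeted_by_user = defaultdict(set)
    for retweeter, author in retweets:
        if retweeter != author:
            retweeted_by_user[retweeter].add(author)

    weights = defaultdict(int)
    for authors in retweeted_by_user.values():
        # Sorting gives each undirected pair a single canonical key.
        for a, b in combinations(sorted(authors), 2):
            weights[(a, b)] += 1
    return {pair: w for pair, w in weights.items() if w >= min_shared}


# Toy data: (retweeter, retweeted account)
retweets = [
    ("alice", "news_a"), ("alice", "news_b"),
    ("bob", "news_a"), ("bob", "news_b"),
    ("carol", "news_a"), ("carol", "news_c"),
]
retweet_graph = build_retweet_graph(retweets)
coretweet_graph = build_coretweet_graph(retweets, min_shared=2)
```

With this toy data, only news_a and news_b are retweeted by at least two common users, so they are the only pair linked in the co-retweet graph.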

These two interaction graphs serve different purposes. The retweet graph is directed, which allows us to use well-known algorithms such as HITS to compute hubs and authorities, whereas the co-retweet graph highlights the important actors and better represents the polarization in Twitter debates.
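To make the hubs and authorities idea concrete, here is a minimal HITS sketch in plain Python. It is an assumption-laden illustration rather than the repository's implementation: the edge list format and the `hits` function are hypothetical, edge weights are ignored, and scores are normalized to sum to 1 at each step.

```python
def hits(edges, iterations=50):
    """Return (hubs, authorities) for directed edges (u, v), u retweeted v."""
    edges = list(edges)
    nodes = {node for edge in edges for node in edge}
    hubs = {node: 1.0 for node in nodes}
    auths = {node: 1.0 for node in nodes}

    for _ in range(iterations):
        # Authority score: sum of the hub scores of the accounts pointing in.
        auths = {node: 0.0 for node in nodes}
        for u, v in edges:
            auths[v] += hubs[u]
        norm = sum(auths.values()) or 1.0
        auths = {node: score / norm for node, score in auths.items()}

        # Hub score: sum of the authority scores of the accounts pointed to.
        hubs = {node: 0.0 for node in nodes}
        for u, v in edges:
            hubs[u] += auths[v]
        norm = sum(hubs.values()) or 1.0
        hubs = {node: score / norm for node, score in hubs.items()}

    return hubs, auths


# Toy retweet graph: (retweeter, retweeted account)
edges = [
    ("alice", "news_a"), ("bob", "news_a"),
    ("carol", "news_a"), ("carol", "news_b"),
]
hubs, authorities = hits(edges)
top_authority = max(authorities, key=authorities.get)
top_hub = max(hubs, key=hubs.get)
```

In a retweet graph, a high authority score marks an account that is retweeted by many active retweeters, and a high hub score marks an account that retweets many such authorities.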

Example of a co-retweet network for the keyword "police" in French:

Plots

In this section, we offer two ways to visualize some of the information contained in the collected data.

  • Reactions versus Number of followers: This plot helps us understand which accounts generate the most reactions relative to their number of followers. Only a list of tweets is needed to generate it.
  • Partisanship: This plot visualizes the orientation of some Twitter accounts in a discussion. For now it only makes sense when a discussion is polarized into two opposite sides: it counts the number of links a node shares with each side of the discussion. (Carefully check that the retweet or co-retweet graph is polarized into two opposite sides and that the partition is well defined.)
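The quantities behind these two plots can be sketched as follows. This is a hedged illustration, not the repository's plotting code: the dictionary keys (`user`, `reactions`, `followers`), the `side_of` mapping, and both function names are hypothetical, and the side labels would in practice come from a graph partition.

```python
def reactions_per_follower(tweets):
    """Total reactions of each user divided by that user's follower count."""
    totals = {}
    followers = {}
    for tweet in tweets:
        user = tweet["user"]
        totals[user] = totals.get(user, 0) + tweet["reactions"]
        followers[user] = tweet["followers"]
    # max(..., 1) avoids dividing by zero for accounts with no followers.
    return {user: totals[user] / max(followers[user], 1) for user in totals}


def partisanship(node, edges, side_of):
    """Count the links that `node` shares with each side of the discussion."""
    counts = {}
    for u, v in edges:
        if node == u:
            other = v
        elif node == v:
            other = u
        else:
            continue
        side = side_of.get(other)
        if side is not None:  # skip neighbours outside the partition
            counts[side] = counts.get(side, 0) + 1
    return counts


tweets = [
    {"user": "u1", "reactions": 50, "followers": 100},
    {"user": "u1", "reactions": 30, "followers": 100},
    {"user": "u2", "reactions": 40, "followers": 1000},
]
ratios = reactions_per_follower(tweets)

edges = [("x", "a1"), ("x", "a2"), ("x", "b1"), ("a1", "a2")]
side_of = {"a1": "A", "a2": "A", "b1": "B"}
x_links = partisanship("x", edges, side_of)
```

Here account x shares two links with side A and one with side B, which would place it closer to side A on a partisanship plot.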

Getting Started

Clone

Clone this repo to your local machine using

$ git clone https://github.com/Fanfou02/twitter-retweets-analysis.git

Install Dependencies

Go to the root of the project in your terminal and install the dependencies by running:

$ pip3 install -r requirements.txt

If you want to use the partitioning function, which is useful for computing partitions and visualizing polarization in graphs, install METIS for Python by following the instructions here.

Add your twitter API credentials

First, you need to have or create a Twitter account, then apply for a developer account here. Once you have a developer account, you need to register an app to be able to make API calls. Go to your Twitter apps page and select the Create an app option. Fill in the form, then return to your Twitter apps page and click the Details button of your app. This takes you to a page where you can generate the credentials. Once you have your credentials, fill in the data-mining/credentials.py file.

Get some tweets

First, we need some data to generate interaction graphs. Go to the data-mining folder and choose one of the three methods. For example, if you want to search for all discussions mentioning the keyword police in the last week, open searchKeyword.py, change the query option and execute the following command:

$ python3 searchKeyword.py

A CSV file containing all the tweets matching your query will then be generated in the results directory.
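Before building a graph, it can help to check that the generated file is not empty. The sketch below uses only the standard library and makes no assumption about column names; it does assume that the first row of the CSV is a header, and `summarize_csv` is a hypothetical helper, not part of the repository.

```python
import csv


def summarize_csv(path):
    """Return (header, row_count) for a CSV file whose first row is a header."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader, [])          # empty list if the file is empty
        row_count = sum(1 for _ in reader)  # remaining rows are tweets
    return header, row_count
```

For example, calling summarize_csv("results/police_tweets.csv") from the data-mining folder would return the column names and the number of collected tweets, assuming the file was written under that name.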

Build the retweet/co-retweet graph

Once you have some tweets, go to the graphs folder and choose between the co-retweet and the retweet graph. Open the corresponding Python file and set the input/output filename variables to point to the previously generated CSV file. For example, open retweet_graph.py and change the following variables as follows:

FILENAME_TWEET              = "../data-mining/results/police_tweets.csv"        # CSV of all tweets considered in the graph, built with one of the data-mining scripts
FILENAME_GRAPH              = "police_graph"                                    # Output filename

Then execute retweet_graph.py:

$ python3 retweet_graph.py

Finally, a JSON file and a Gephi file containing the resulting graph will be generated in the retweet_graphs folder. You can view the graph directly using Gephi or networkx.

Plot your results

Not implemented yet.

License

License
