Springboard Capstone 2: A Few Months in the Troll Factory - An Analysis of Russian Troll Tweets in the 2016 US Election

chudzikr/Capstone2

Capstone2

Problem Statement
In the lead-up to the 2016 US election, and for a short time afterwards, social media users in the US were targeted by a Russian “troll factory” whose campaign was designed to “sow disinformation and discord into American politics via social media.”

Earlier this year, as part of special counsel Robert Mueller’s investigation, the Justice Department charged 13 Russian nationals with interfering in American electoral and political processes. The defendants worked for a well-funded “troll factory” called the Internet Research Agency, which had 400 employees, according to one Russian news report. From a bland office building in St. Petersburg, the agency ran a sophisticated and coordinated campaign to sow disinformation and discord into American politics via social media. This often involved Trump’s favorite medium: Twitter. (via FiveThirtyEight.com, Why We’re Sharing 3 Million Russian Troll Tweets, July 31, 2018)

I will use the tweets to explore questions about the nature of the disinformation campaign, such as:

  • Did the tweets increase in frequency or volume around the time of major events?
  • Did other trolls retweet and amplify troll tweets?
  • Can Twitter handles (“users”) with similar features be grouped into clusters?
  • Can common topics or themes be identified?
  • What were the most-used hashtags?
  • Did the tweets predominantly support one candidate or political party, or seek to undermine the other?

By exploring the patterns, topics and methods of the disinformation campaign, I will seek to create insight into these efforts and understand how to recognize, identify and potentially avoid future disinformation attacks.
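
As a rough illustration of how two of the questions above (most-used hashtags and tweet volume over time) might be approached, here is a minimal sketch using only the Python standard library. The function names, the sample rows, and the timestamp format `%m/%d/%Y %H:%M` are assumptions made for this example, not taken from the project code; the real data may need a different format string.

```python
import re
from collections import Counter
from datetime import datetime

# A hashtag is taken to be "#" followed by word characters (an assumption).
HASHTAG_RE = re.compile(r"#(\w+)")


def top_hashtags(tweets, n=10):
    """Return the n most common hashtags (case-insensitive) in an iterable of tweet texts."""
    counts = Counter()
    for text in tweets:
        counts.update(tag.lower() for tag in HASHTAG_RE.findall(text))
    return counts.most_common(n)


def tweets_per_day(timestamps, fmt="%m/%d/%Y %H:%M"):
    """Count tweets per calendar day; fmt is an assumed timestamp format."""
    return Counter(datetime.strptime(ts, fmt).date() for ts in timestamps)


# Made-up sample rows (timestamp, text), for illustration only.
sample = [
    ("10/1/2016 19:58", "Example text #Politics #news"),
    ("10/1/2016 20:05", "Another example #politics"),
    ("10/2/2016 08:15", "A third example with no hashtag"),
]

hashtags = top_hashtags(text for _, text in sample)
daily = tweets_per_day(ts for ts, _ in sample)
```

Plotting the daily counts against a timeline of major events would be one way to look for spikes in volume; the counts alone do not show whether an event caused a spike.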

Client
My client for this project is the American voter, and I intend to provide analysis to aid them in discerning manufactured disinformation from “real” opinion and information.

Data
The data for this project has been made publicly available by FiveThirtyEight on their GitHub at https://github.com/fivethirtyeight/russian-troll-tweets/ . The data was originally gathered by two professors at Clemson University, Darren Linvill and Patrick Warren, and shared with FiveThirtyEight.

Using advanced social media tracking software, they pulled the tweets from thousands of accounts that Twitter has acknowledged as being associated with the IRA. The professors shared their data with FiveThirtyEight in the hope that other researchers, and the broader public, will explore it and share what they find. (via FiveThirtyEight.com, Why We’re Sharing 3 Million Russian Troll Tweets, July 31, 2018)

Solution Approach
I plan to use Python data analysis tools to clean, reshape and aggregate the data. I will then use Natural Language Processing and text mining techniques to extract topics and sentiments, and data visualization libraries to create visualizations that support my findings.
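
The cleaning and text-mining steps described above could start from something like the following sketch, which drops empty tweets, strips URLs, tokenizes the text and counts terms per account. It uses only the standard library; the column names `author` and `content`, the tiny stopword list, and the sample rows are assumptions for illustration, and a real analysis would likely use a fuller NLP toolkit.

```python
import csv
import io
import re
from collections import Counter

URL_RE = re.compile(r"https?://\S+")
TOKEN_RE = re.compile(r"[a-z']+")
# Deliberately tiny stopword list, for illustration only.
STOPWORDS = {"the", "a", "an", "and", "to", "of", "in", "is", "for", "on"}


def clean_tokens(text):
    """Lowercase the text, remove URLs, and return non-stopword tokens longer than one character."""
    text = URL_RE.sub(" ", text.lower())
    return [t for t in TOKEN_RE.findall(text) if t not in STOPWORDS and len(t) > 1]


def term_counts_by_author(rows):
    """Build a per-author Counter of cleaned tokens, skipping rows with empty content."""
    counts = {}
    for row in rows:
        content = (row.get("content") or "").strip()
        if not content:
            continue
        counts.setdefault(row["author"], Counter()).update(clean_tokens(content))
    return counts


# Made-up rows standing in for the real CSV files (column names are assumed).
SAMPLE_CSV = """author,content
EXAMPLE_USER_1,Breaking news about the election https://example.com/x
EXAMPLE_USER_1,More election news
EXAMPLE_USER_2,
"""

rows = csv.DictReader(io.StringIO(SAMPLE_CSV))
counts = term_counts_by_author(rows)
```

Per-account term counts like these could feed later steps such as topic extraction or clustering of accounts with similar vocabulary.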

Project Deliverables
The project files are structured in the following manner:

Code

Reports

Data
Due to the large file size, the data is not available in this repository. See the README.
