Get all the tweets from your favorite twitter accounts.

mayankgautam/Tweets-Scrapper

Tweets-Scrapper

This script has helped me scrape more than 30K tweets from more than 40 authors. You only have to give it a list of Twitter handles and an output CSV file path, and it will download all the tweets, process them and save them to the CSV file without any hassle. You can check out the dataset here on GitHub and here on Kaggle. I have also done a comprehensive data analysis, which you can find here. The Jupyter notebook I used to scrape the 30K+ tweets is available here.

How the script works

The script downloads tweets from every author whose Twitter handle is listed in the authors.txt file, one handle per line. It downloads direct tweets, retweets and retweets with a comment. For a retweet, the script stores the information of the original author (name, handle, tweet content and creation time). Furthermore, every retweet with a comment contains <Q> and </Q> tags: the author's comment comes first, followed by the <Q> tag, then the content of the retweeted tweet, and finally the closing </Q> tag.
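As an illustration of the tag layout described above, here is a minimal sketch for splitting a stored tweet back into the author's comment and the quoted content. The function name `split_quote_tweet` is hypothetical and is not part of scrap.py; the sketch assumes each tweet contains at most one `<Q>...</Q>` pair and that the pair sits at the end of the text.

```python
import re

# Assumed layout, taken from the description above:
#   "<author's comment> <Q><content of the retweeted tweet></Q>"
_QUOTE_RE = re.compile(r"^(?P<comment>.*?)<Q>(?P<quoted>.*)</Q>\s*$", re.DOTALL)


def split_quote_tweet(text):
    """Return (comment, quoted) for a retweet with a comment.

    For a tweet without <Q> tags, return (text, None).
    Hypothetical helper for illustration; not part of scrap.py.
    """
    match = _QUOTE_RE.match(text)
    if match is None:
        return text.strip(), None
    return match.group("comment").strip(), match.group("quoted").strip()


if __name__ == "__main__":
    print(split_quote_tweet("Great read! <Q>Original tweet text</Q>"))
    print(split_quote_tweet("Just a plain tweet"))
```

Returning `None` for the quoted part makes it easy to tell ordinary tweets apart from retweets with a comment when post-processing the CSV.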

How to run it

  1. First, clone the repository:
git clone https://github.com/Hsankesara/Tweets-Scrapper.git
  2. Then install the Python dependencies:
cd Tweets-Scrapper
pip3 install -r requirements.txt
  3. Create a cred.json file as a copy of cred.json.sample:
cp cred.json.sample cred.json
  4. Get your Twitter credentials. You can follow this to get your access tokens. Then update the cred.json file with the tokens you've received from Twitter.
  5. Write the Twitter handles of the accounts you want to scrape in authors.txt, one handle per line.
  6. Run the script:
python3 scrap.py authors.txt tweets.csv
  7. Wait for it! You'll soon get all the tweets in CSV format.
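Before running the script, it can help to sanity-check authors.txt. The following is a rough sketch, not code from this repository: `parse_handles` is a hypothetical helper, and it assumes handles should be written without a leading "@", with blank lines and duplicates ignored. How scrap.py itself treats such input is not documented here.

```python
def parse_handles(text):
    """Normalise the contents of a newline-separated handles file.

    Hypothetical pre-flight helper; not part of scrap.py.
    Assumption: one handle per line, stored without the leading "@".
    """
    handles = []
    seen = set()
    for line in text.splitlines():
        handle = line.strip().lstrip("@")
        if not handle:
            continue  # skip blank lines
        key = handle.lower()  # compare handles case-insensitively
        if key in seen:
            continue  # skip duplicates, keep the first spelling
        seen.add(key)
        handles.append(handle)
    return handles


if __name__ == "__main__":
    sample = "@jack\n\nNASA \n@jack\n"
    print(parse_handles(sample))
```

To apply it to the real file, read authors.txt with `open("authors.txt", encoding="utf-8").read()` and pass the result to `parse_handles`.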
