Tweets-Scrapper

This script has helped me scrape more than 30K tweets from more than 40 authors. You only have to give it a list of Twitter handles and an output CSV file path; it downloads all the tweets, processes them and saves them to the CSV file without any hassle. You can check out the dataset here on GitHub and here on Kaggle. I have also done a comprehensive data analysis, which you can find here. You can also check out the Jupyter notebook I used to scrape the 30K+ tweets here.

How the script works

The script downloads tweets from every author whose Twitter handle is listed in the authors.txt file, one handle per line. It downloads direct tweets, retweets and retweets with a comment. For a retweet, it stores the information (name, handle, tweet content and creation time) of the original author. Furthermore, every retweet with a comment contains <Q> and </Q> tags: the author's comment comes first, followed by the <Q> tag, then the content of the retweeted tweet, and finally the </Q> tag.
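To work with retweets that carry a comment, the <Q> and </Q> layout described above can be split back into its two parts. The following is a minimal sketch, not code from this repository: the function name split_quoted_tweet and the sample text are hypothetical, and it assumes each tweet contains at most one <Q>...</Q> pair.

```python
import re

# Matches "<comment><Q><quoted content></Q>", the layout described above.
# Assumption: at most one <Q>...</Q> pair per tweet.
QUOTE_PATTERN = re.compile(r"^(?P<comment>.*?)<Q>(?P<quoted>.*)</Q>\s*$", re.DOTALL)


def split_quoted_tweet(text):
    """Return (comment, quoted_content); quoted_content is None if no tags are found."""
    match = QUOTE_PATTERN.match(text)
    if match is None:
        # Direct tweet or plain retweet: there is nothing to split.
        return text.strip(), None
    return match.group("comment").strip(), match.group("quoted").strip()


# Hypothetical sample text, for illustration only.
comment, quoted = split_quoted_tweet("Worth reading <Q>Original tweet text</Q>")
print(comment)
print(quoted)
```

Tweets without the tags are returned unchanged with None as the second element, so the same function can be applied to every row of the dataset.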

How to run it

First, clone the repository:

git clone https://github.com/Hsankesara/Tweets-Scrapper.git

Then install the Python dependencies:

cd Tweets-Scrapper
pip3 install -r requirements.txt

Now create a cred.json file as a copy of cred.json.sample:

cp cred.json.sample cred.json

Get your Twitter credentials; you can follow this to get your access tokens. Then update the cred.json file with the tokens you've received from Twitter.
Write the Twitter handles of the accounts you want to scrape in authors.txt, one handle per line.
Run the script:

python3 scrap.py authors.txt tweets.csv

Wait for it! You'll soon have all the tweets in CSV format.
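Once the run finishes, the output file can be inspected with Python's standard csv module. This is a rough sketch under two assumptions that are not confirmed by this README: that tweets.csv starts with a header row, and that it is UTF-8 encoded. The helper name load_tweets is hypothetical, and the column names are read from the file itself instead of being assumed.

```python
import csv
import os


def load_tweets(csv_path):
    """Read the scraped CSV into a list of dicts keyed by the file's own header row."""
    # Assumption: the first row of the file is a header and the file is UTF-8.
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


if __name__ == "__main__":
    # tweets.csv is the output path used in the command above.
    if os.path.exists("tweets.csv"):
        rows = load_tweets("tweets.csv")
        print(len(rows), "tweets loaded")
        if rows:
            print("columns:", list(rows[0].keys()))
```

If the file turns out to have no header row, csv.reader can be used in place of csv.DictReader to get plain lists.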
