Fetch and save real-time data anonymously from any Instagram profile without using the official API.
Before you continue, ensure you have met the following requirements.
- You are using a Linux or Windows machine.
- You have installed the latest versions of Python, Firefox, and Geckodriver.
- You have installed the latest version of Tor, and it is running and listening on SOCKSPort 9050.
- You have installed xvfb (Linux only).
Detailed step-by-step installation instructions for both Windows and Linux are available here.
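Before starting the scraper, it can help to confirm that something is actually accepting connections on the SOCKS port listed in the requirements. The following is a minimal sketch using only the standard library; the helper name `is_port_open` and the default host `127.0.0.1` are assumptions for illustration and are not part of this project.

```python
import socket


def is_port_open(host: str = "127.0.0.1", port: int = 9050, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds (hypothetical helper)."""
    try:
        # create_connection raises OSError if nothing is listening
        # or if the attempt times out.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if is_port_open():
        print("Something is listening on 127.0.0.1:9050")
    else:
        print("Nothing is listening on 127.0.0.1:9050; is Tor running?")
```

Note that this only shows that the port is open. It does not prove that the listener is Tor or that traffic is being routed through it.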
- Clone or download this project, then run the command below in the project directory.
pip install -r requirements.txt
- Open config.py in your favourite text editor and make the following changes.
- Set the timezone for your country or state.
TIMEZONE = timezone("Asia/Kolkata")
- Add your temporary Instagram account credentials to the ids dictionary.
ids = { "<USERNAME_OR_EMAIL_HERE>" : "<PASSWORD_HERE>", "<USERNAME_OR_EMAIL_HERE>" : "<PASSWORD_HERE>" }
- Add the usernames of the profiles you want to scrape to the usernames list.
usernames = ["<USERNAME1>", "<USERNAME2>"]
- Add your Slack webhook URL to get notified about errors and exceptions while the scraper is running.
slack = Slack(url = "<<ADD_YOUR_SLACK_WEBHOOK_URL_HERE>>")
- Congratulations! You are ready to go. Now run scraper.py. Ping me if you run into any errors.
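It is easy to leave one of the template values in config.py untouched. The following is a small sketch of a pre-run check that flags values still wrapped in angle brackets; the function `find_unfilled_placeholders` is hypothetical and not part of this project, and it takes the Slack URL as a plain string rather than the `Slack` object used in config.py.

```python
def find_unfilled_placeholders(ids: dict, usernames: list, slack_url: str) -> list:
    """Return the names of settings that are empty or still hold <...> template text."""

    def is_placeholder(value: str) -> bool:
        # Template values in config.py look like "<PASSWORD_HERE>".
        return value.startswith("<") and value.endswith(">")

    problems = []
    if not ids or any(is_placeholder(user) or is_placeholder(pw) for user, pw in ids.items()):
        problems.append("ids")
    if not usernames or any(is_placeholder(name) for name in usernames):
        problems.append("usernames")
    if not slack_url or is_placeholder(slack_url):
        problems.append("slack")
    return problems


if __name__ == "__main__":
    # The URL below is a made-up placeholder, not a real webhook.
    print(find_unfilled_placeholders(
        {"<USERNAME_OR_EMAIL_HERE>": "<PASSWORD_HERE>"},
        ["some_profile"],
        "https://example.invalid/webhook",
    ))
```

A real password that happens to start with `<` and end with `>` would be flagged by mistake, so treat the result as a hint rather than a hard failure.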
- Profile scraping
  - Full name and biography (both encoded as UTF-8)
  - Follower and following counts
  - Number of public posts and owned media
  - Whether the account is private, a business account, verified, has a channel, or joined recently
  - Profile page ID
  - Connected FB page
  - External URL
- Saves data to a unique CSV file in the output folder.
- Checks for an existing CSV file and creates a new one if it doesn't exist.
- Random sleep times (to make the timing less predictable).
- Automatic login and logout (to switch ids every 8 hours).
- Automatic browser screenshots in the ss_log/browser folder.
- Slack webhook integration for error notifications.
- Tor connectivity and public IP check.
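Three of the features listed above (creating the CSV file only when it is missing, random sleep times, and switching ids every 8 hours) can be sketched with the standard library alone. This is an illustration and not the code in scraper.py: the function names, the column names, and the 5 to 15 second range are all assumptions; only the 8-hour interval comes from the list above.

```python
import csv
import os
import random


def append_row(csv_path: str, row: dict) -> None:
    """Append one row, writing the header first if the file does not exist yet."""
    is_new_file = not os.path.exists(csv_path)
    # Create the output folder on first use.
    os.makedirs(os.path.dirname(csv_path) or ".", exist_ok=True)
    with open(csv_path, "a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(row))
        if is_new_file:
            writer.writeheader()
        writer.writerow(row)


def random_delay(low: float = 5.0, high: float = 15.0) -> float:
    """Pick a sleep duration in seconds; pass the result to time.sleep()."""
    return random.uniform(low, high)


def pick_account(accounts: list, elapsed_hours: float, rotate_every: float = 8.0) -> str:
    """Cycle through the accounts, moving to the next one every rotate_every hours."""
    return accounts[int(elapsed_hours // rotate_every) % len(accounts)]
```

For example, `append_row("output/some_profile.csv", {"username": "some_profile", "followers": 10})` creates the file with a header row on the first call and only appends data rows on later calls.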
The project license can be found here.
MIT © Rahul Meena