Scrape n latest posts from a user's instagram profile

Syn3rman/instaScrape

InstaScrape


Uses headless Chrome to scrape a specified number of images from Instagram for a particular user. Also provides fast download scripts that save the images to your system, using Python's multiprocessing and Go's goroutines.


Features

  • Supports public profiles
  • Supports private profiles
  • Runs with Docker
  • Go and Python scripts to download images

To-dos

  • Implement image download in Rust to compare performance.
  • Try using PyO3 to integrate Rust and Python and see whether it gives a significant performance boost.

Try it out

Set up locally using git

$ git clone https://github.com/Syn3rman/instaScrape.git && cd instaScrape

$ npm install

$ node run.js

Or using Docker:

$ docker pull syn3rman/instascrape:latest

$ docker run --rm -it syn3rman/instascrape

Navigate to localhost in a browser and change the GET request parameters as required.

Downloading images to filesystem

$ cd download_scripts

Using Python:
$ python3 dwn.py

Using Go:
$ go run main.go
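
For readers curious how a parallel download of this kind can work, here is a minimal sketch. It is not the repository's dwn.py: it assumes the scraped image URLs are already available as a list of strings, and it uses the thread-backed pool from Python's multiprocessing package (multiprocessing.pool.ThreadPool), whereas the actual script may use process pools. The names filename_for, download_one and download_all, and the default images directory, are hypothetical.

```python
import os
import urllib.request
from multiprocessing.pool import ThreadPool
from urllib.parse import urlparse


def filename_for(url, index):
    """Build a local file name from the URL path, prefixed with the index to avoid clashes."""
    base = os.path.basename(urlparse(url).path) or "image"
    return f"{index:04d}_{base}"


def download_one(job):
    """Fetch one URL and write it to disk. Returns the saved path, or None on failure."""
    index, url, dest_dir = job
    path = os.path.join(dest_dir, filename_for(url, index))
    try:
        # Read the whole response first so a failed fetch leaves no empty file behind.
        with urllib.request.urlopen(url, timeout=30) as resp:
            data = resp.read()
        with open(path, "wb") as out:
            out.write(data)
    except (OSError, ValueError):
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return None
    return path


def download_all(urls, dest_dir="images", workers=8):
    """Download every URL concurrently; results are returned in the same order as urls."""
    os.makedirs(dest_dir, exist_ok=True)
    jobs = [(i, url, dest_dir) for i, url in enumerate(urls)]
    # Threads suit this workload because it is dominated by waiting on the network.
    with ThreadPool(workers) as pool:
        return pool.map(download_one, jobs)
```

Calling download_all(urls) returns one entry per URL: the path of the saved file, or None if that download failed. The worker count of 8 is an arbitrary starting point, not a value taken from the project.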

Performance

With ~500 image URLs, the Go script takes around 6-7 s to complete, while the Python script takes around 12-15 s.
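
To reproduce timings like these on your own machine, a small wall-clock harness is enough. The sketch below is an illustration and not part of the repository; time_command is a hypothetical helper name, and results will vary with network speed and hardware.

```python
import subprocess
import time


def time_command(cmd, runs=3):
    """Run cmd `runs` times and return the wall-clock duration of each run in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises CalledProcessError if the command exits with an error.
        subprocess.run(cmd, check=True)
        durations.append(time.perf_counter() - start)
    return durations
```

From inside download_scripts, time_command(["python3", "dwn.py"]) and time_command(["go", "run", "main.go"]) time the two scripts shown above. Note that go run compiles before executing, so its measurement includes build time.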