Google Image Scraper

An image scraper for Google Images that collects images from their original websites (Python 3 and Selenium).

This code is used to scrape images from Google Images. Typical image scrapers query a keyword in Google Images and download the result images. However, that technique returns small images without their original file names.

Scraper Logic

This crawler collects images from their original websites. It follows these steps:

1. Query Google Images for a particular keyword, e.g. "cat".
2. Using the Selenium library, open the results page and scroll down to load as many images as possible.
3. Save the resulting HTML of the Google Images page locally.
4. Using the Beautiful Soup library, parse the saved HTML to find the original website of each image, and visit each website individually.
5. For each website, collect all the images and save them locally in a new folder named after the query.
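
The last step can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the function names `image_urls_from_html`, `original_name` and `save_images` are hypothetical, and it assumes that every `<img>` tag with a usable `src` attribute counts as an image worth saving. The third-party libraries are imported inside the functions that need them, so the pure helper can be tried without them.

```python
import os
import posixpath
from urllib.parse import unquote, urljoin, urlparse


def image_urls_from_html(html, page_url):
    """Return absolute URLs for all <img> tags found in a page's HTML."""
    from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4

    soup = BeautifulSoup(html, "html.parser")
    urls = []
    for img in soup.find_all("img"):
        src = img.get("src")
        # Skip missing sources and inline data: URIs, which have no file name.
        if src and not src.startswith("data:"):
            urls.append(urljoin(page_url, src))
    return urls


def original_name(image_url, fallback="image"):
    """Derive the original file name from the path part of an image URL."""
    path = urlparse(image_url).path
    name = posixpath.basename(unquote(path))
    return name or fallback


def save_images(image_urls, folder):
    """Download each image into `folder`, keeping its original file name."""
    import requests  # third-party: pip install requests

    os.makedirs(folder, exist_ok=True)
    saved = []
    for url in image_urls:
        try:
            response = requests.get(url, timeout=10)
            response.raise_for_status()
        except requests.RequestException:
            continue  # skip images that cannot be fetched
        target = os.path.join(folder, original_name(url))
        with open(target, "wb") as fh:
            fh.write(response.content)
        saved.append(target)
    return saved
```

Deriving the name from the URL path is what preserves the original file names, which a thumbnail-based scraper loses. Two images with the same name would overwrite each other in this sketch.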

Parameters

For this version, I hardcoded the following parameters, so change them before you run the script:

Selenium driver path (for Firefox this is geckodriver, which you can download online)
The query keywords
The output folder name
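
One way to keep these three parameters in a single place is sketched below. It is only an illustration under stated assumptions: `ScraperConfig`, `search_url` and `fetch_results_html` are hypothetical names, the `tbm=isch` URL pattern for image search is an assumption about Google's current URL scheme, and the driver setup assumes Selenium 4 with Firefox.

```python
from dataclasses import dataclass
from urllib.parse import urlencode


@dataclass
class ScraperConfig:
    driver_path: str             # path to the Selenium driver executable
    query: str                   # the query keywords, e.g. "cat"
    output_dir: str = "Dataset"  # the output folder name


def search_url(query):
    """Build the image search URL (assumed pattern: tbm=isch selects images)."""
    return "https://www.google.com/search?" + urlencode({"q": query, "tbm": "isch"})


def fetch_results_html(config, scrolls=5, pause=1.0):
    """Open the results page in Firefox, scroll down, and return its HTML."""
    import time
    from selenium import webdriver  # third-party: pip install selenium
    from selenium.webdriver.firefox.service import Service

    driver = webdriver.Firefox(service=Service(executable_path=config.driver_path))
    try:
        driver.get(search_url(config.query))
        for _ in range(scrolls):
            # Scroll to the bottom so the page loads more results.
            driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
            time.sleep(pause)
        return driver.page_source
    finally:
        driver.quit()
```

A call such as `fetch_results_html(ScraperConfig(driver_path="/path/to/geckodriver", query="cat"))` would then return the HTML to be saved locally. The `scrolls` and `pause` values are arbitrary defaults chosen for the sketch.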

Output

When you run the Jupyter notebook, you will get a folder called "Dataset". Inside Dataset, you will have two sub-folders:

images: contains one folder per query, holding the images collected for that query.
soups: contains the HTML dump of the Google Images results page for each query.
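
The layout described above could be created with a helper along these lines. This is a sketch rather than the notebook's code: `prepare_output` and `save_soup` are hypothetical names, and naming the dump file `<query>.html` is an assumption.

```python
import os


def prepare_output(query, root="Dataset"):
    """Create <root>/images/<query> and <root>/soups and return both paths."""
    images_dir = os.path.join(root, "images", query)
    soups_dir = os.path.join(root, "soups")
    os.makedirs(images_dir, exist_ok=True)
    os.makedirs(soups_dir, exist_ok=True)
    return images_dir, soups_dir


def save_soup(html, query, root="Dataset"):
    """Write the results-page HTML for a query into the soups folder."""
    _, soups_dir = prepare_output(query, root)
    path = os.path.join(soups_dir, query + ".html")  # file name is an assumption
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(html)
    return path
```

Using `exist_ok=True` lets the same query be run again without failing on folders that already exist.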

Requirements

The requirements for this project are:

BeautifulSoup (the beautifulsoup4 package on pip)
Selenium (I'm using Firefox here)
Python 3, of course :)
tqdm
requests

Just download everything with pip, and you are ready to go!
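
To check whether everything is installed before opening the notebook, a small script like the following may help. It is a convenience sketch, not part of the project: `missing_packages` is a hypothetical helper, and the mapping assumes the usual import names (`bs4` for the beautifulsoup4 package).

```python
from importlib.util import find_spec

# Import name -> name to pass to pip.
REQUIRED = {
    "bs4": "beautifulsoup4",
    "selenium": "selenium",
    "tqdm": "tqdm",
    "requests": "requests",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of required packages that cannot be imported."""
    return [pip_name for module, pip_name in required.items()
            if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Install with: pip install " + " ".join(missing))
    else:
        print("All requirements are installed.")
```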

wesamalnabki/Google-Image-Scraper
