A web scraper written in Python using the BeautifulSoup library.

mendyk-ja/google_scraper

Google Scraper

This project uses Python 3 and the BeautifulSoup4 library for web scraping. It automates part of the process of searching for information on Google.

General Information

  • A web scraper that automates part of the process of searching for information on Google.
  • A script that shows a basic pattern for writing a web scraper with BeautifulSoup.
  • The main purpose of writing it was to become familiar with the BeautifulSoup library.

Technologies Used

  • Python 3
  • BeautifulSoup4 library

Usage

Google may block the IP address used to run this code, so a VPN or proxy may be necessary.

  1. The script starts by importing the libraries the rest of the code needs.
  2. It opens the keywords.txt file and reads the keywords from it.
  3. For each keyword, it builds a Google query with that keyword and opens it in a browser.
  4. It scrapes the total number of results.
  5. It writes this number to a CSV file.
  6. It loops over every container that stores a link and scrapes the link.
  7. It writes these links to a CSV file.
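
The steps from building the query to writing the CSV rows can be sketched roughly as below. This is a minimal illustration, not the project's actual code: the function names, the `result-stats` id, the `div.g` container class, the CSV row layout, and the sample HTML are all assumptions. Google's real markup changes over time and has to be checked by hand. Loading the page is left out, so the sketch parses an HTML string instead of a live response.

```python
import csv
import io
from urllib.parse import urlencode

from bs4 import BeautifulSoup


def build_query_url(keyword):
    """Build a Google search URL for one keyword."""
    return "https://www.google.com/search?" + urlencode({"q": keyword})


def parse_results(html):
    """Return (result-count text, list of links) from a results page.

    The "result-stats" id and the "div.g" container class are
    assumptions about the page markup, not taken from the project.
    """
    soup = BeautifulSoup(html, "html.parser")
    stats = soup.find(id="result-stats")
    total = stats.get_text(strip=True) if stats else ""
    links = []
    for container in soup.select("div.g"):
        # Take the first anchor with an href inside each container.
        anchor = container.find("a", href=True)
        if anchor:
            links.append(anchor["href"])
    return total, links


def write_rows(out, keyword, total, links):
    """Write one summary row, then one row per link (assumed layout)."""
    writer = csv.writer(out)
    writer.writerow([keyword, total])
    for link in links:
        writer.writerow([keyword, link])


# Made-up HTML standing in for a downloaded results page.
SAMPLE_HTML = """
<div id="result-stats">About 1,230 results</div>
<div class="g"><a href="https://example.com/a">A</a></div>
<div class="g"><a href="https://example.org/b">B</a></div>
"""

if __name__ == "__main__":
    buffer = io.StringIO()
    total, links = parse_results(SAMPLE_HTML)
    write_rows(buffer, "web scraping", total, links)
    print(build_query_url("web scraping"))
    print(buffer.getvalue())
```

Keeping the parsing in a function that takes a string makes it possible to try the selectors on a saved page without sending requests to Google.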

Project Status

Project is: complete, although some improvements can still be made.

Room for Improvement

Room for improvement:

  • making the code cleaner by moving functionality into functions
  • handling exceptions (for example, when the keywords.txt file doesn't exist)
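
Both improvements could look something like the sketch below, which moves the keyword loading into a function and handles a missing file. The function name `read_keywords` and the one-keyword-per-line file format are assumptions, since the layout of keywords.txt is not described here.

```python
import sys
from pathlib import Path


def read_keywords(path="keywords.txt"):
    """Return the non-empty, stripped lines of the keywords file.

    Assumes one keyword per line. If the file is missing, prints a
    message to stderr and returns an empty list instead of crashing.
    """
    try:
        text = Path(path).read_text(encoding="utf-8")
    except FileNotFoundError:
        print(f"Keyword file not found: {path}", file=sys.stderr)
        return []
    return [line.strip() for line in text.splitlines() if line.strip()]
```

Returning an empty list lets the calling loop simply do nothing when there are no keywords. Raising a custom error instead would be an equally reasonable choice.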

Contact

Created by Jacek Mendyk - feel free to contact me!
