- This project is a Google Maps scraper built using Python and Playwright. It consists of two main parts: the scraper and the server.
- After a search query is entered, it scrolls to the end of the scrollable results list and scrapes all listings shown there.
- Server: Provides two API endpoints:
  - /status: Checks the status of the application.
  - /request: Accepts a JSON payload to enqueue search queries.
- Scraper: Listens to a Redis queue for search queries, opens multiple tabs with different browsers and user agents, and scrapes Google Maps.
The /request endpoint accepts a JSON payload with the following structure:
{
"city": "",
"listing_category": "",
"listing_type": "",
"province": "",
"verb": ""
}
The scraper combines these fields (listing_type + verb + city + province) to create a search query and enqueues it for processing. All fields are also used to create the Excel file that stores the search results.
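The field combination described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name `build_search_query`, the single-space separator, and the sample values are all assumptions.

```python
def build_search_query(payload: dict) -> str:
    """Join listing_type, verb, city, and province into one search string."""
    # Field order follows the README: listing_type + verb + city + province.
    ordered_keys = ("listing_type", "verb", "city", "province")
    parts = [str(payload.get(key, "")).strip() for key in ordered_keys]
    # Assumption: fields are separated by one space, and empty fields are skipped.
    return " ".join(part for part in parts if part)


# Hypothetical sample payload; the values are made up for illustration.
sample_payload = {
    "city": "Toronto",
    "listing_category": "food",
    "listing_type": "restaurants",
    "province": "Ontario",
    "verb": "in",
}
query = build_search_query(sample_payload)
```

Note that `listing_category` is not part of the query string. Per the description above, it is only used when writing the Excel file.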
The /status endpoint returns the current status of the server.
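A client for the two endpoints might look like the sketch below, which uses only the Python standard library. The base URL (`http://localhost:8000`), the HTTP methods (GET for /status, POST for /request), and the helper names are assumptions; this README does not state them.

```python
import json
import urllib.request

# Assumption: the server's host and port are not given in this README.
BASE_URL = "http://localhost:8000"


def build_status_request(base_url: str = BASE_URL) -> urllib.request.Request:
    # Assumes /status answers GET requests.
    return urllib.request.Request(f"{base_url}/status", method="GET")


def build_search_request(payload: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    # Assumes /request accepts a POST with a JSON body.
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/request",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 10.0) -> str:
    # Performs the HTTP call and returns the response body as text.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the application running, `send(build_status_request())` would fetch the status, and `send(build_search_request(payload))` would submit a search payload.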
- The server receives requests via the /request endpoint and enqueues the search queries into a Redis queue.
- The scraper listens to the Redis queue, dequeues search queries, and opens multiple tabs with different browser instances and user agents.
- The scraper processes the search query by interacting with Google Maps.
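The queue hand-off in the workflow above can be sketched as a pair of helpers. This is an illustration under stated assumptions: the queue name `search_queries`, the JSON job format, and the function names are hypothetical. The `client` argument is expected to be a redis-py client such as `redis.Redis()`, whose `rpush` and `blpop` methods are the only ones used here.

```python
import json

# Hypothetical queue name; the project's real key is not given in this README.
QUEUE_NAME = "search_queries"


def enqueue_query(client, query: str, payload: dict) -> None:
    """Server side: push one search job onto the tail of the Redis list."""
    job = json.dumps({"query": query, "payload": payload})
    client.rpush(QUEUE_NAME, job)


def dequeue_query(client, timeout: int = 5):
    """Scraper side: block for up to `timeout` seconds waiting for a job.

    Returns the decoded job dict, or None if the wait timed out.
    """
    item = client.blpop(QUEUE_NAME, timeout=timeout)
    if item is None:
        return None
    _key, raw = item  # blpop returns a (list name, value) pair
    return json.loads(raw)
```

Pairing `rpush` with `blpop` gives first-in, first-out ordering, so queries are scraped in the order the server received them.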
- Python 3.12
- Poetry for dependency management
- Redis server
- Playwright
- Clone the repository:
git clone [email protected]:AmirEspahbodi/google-map-scraper.git
cd google-map-scraper
- Install dependencies:
poetry install
- Install Playwright browsers:
poetry run playwright install
poetry run playwright install-deps
- Start the application:
poetry run python run_app.py