A standalone client that scrapes lobbyist disclosure pages from www.sec.state.ma.us/LobbyistPublicSearch/ and uploads them to a Postgres database.
The `docker-compose.yml` file configures a Postgres database and a Python container for the scraper.
Note that the container runs as root, so any files it writes end up owned by root. Run `sudo chown -R $(id -u):$(id -g) .` to reset ownership.
- Install Docker and Docker Compose v2.
- Build the images with `docker compose build`.
- Start the services with `docker compose up -d`. This returns once they're up.
- Open a shell in the Python container with `docker compose exec lobby bash`. This gives you a terminal in the development environment, connected to your source directory, so it reflects changes you make.
- Run your scraper commands, such as `poetry run python main.py`.
- Shut down the services with `docker compose down`. Add `-v` to also delete the database.
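
For unattended runs, the steps above could be collected into one shell function. This is only a sketch: `run`, `scrape_cycle`, and the `DRY_RUN` variable are hypothetical names that are not part of this repository. It assumes the container's default working directory is the one that holds `main.py`, and it runs the scraper directly through `docker compose exec` rather than opening an interactive shell first.

```shell
# Hypothetical helper: print each command, and skip running it when DRY_RUN=1.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-0}" != "1" ]; then
    "$@"
  fi
}

# Hypothetical wrapper around the documented steps.
# Run it from the directory that contains docker-compose.yml.
scrape_cycle() {
  run docker compose build &&
  run docker compose up -d &&
  run docker compose exec lobby poetry run python main.py
  local status=$?
  # Always shut the services down, even if an earlier step failed.
  run docker compose down
  return "$status"
}
```

Setting `DRY_RUN=1` before calling `scrape_cycle` prints the four commands without running them, which is a cheap way to check the sequence. When there is no terminal attached (for example under cron), `docker compose exec` may need the `-T` flag.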