This directory holds the code for an Apache Beam pipeline written in Python.
Create a virtual environment and install the dependencies first.
python3 -m venv .venv
source .venv/bin/activate
pip install -U -r requirements.txt
Run the following command to execute the pipeline on your local machine.
python3 main.py --source resources/catsum.txt --output /tmp/wordcount/output
Alternatively, read the input from Cloud Storage.
This assumes you have already run gcloud auth application-default login.
python3 main.py --source 'gs://apache-beam-samples/shakespeare/*' --output /tmp/wordcount/output
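
Both commands pass two flags to main.py, --source and --output. The contents of main.py are not shown here, so the following is only a sketch of how such a script might read those flags, assuming it uses argparse. The function name parse_args and the help texts are hypothetical.

```python
import argparse


def parse_args(argv=None):
    """Split the script's own flags from any remaining command-line flags.

    Hypothetical helper: the real main.py may parse its flags differently.
    """
    parser = argparse.ArgumentParser(description="Beam pipeline flags (sketch)")
    parser.add_argument("--source", required=True,
                        help="Input path or glob, local or gs://")
    parser.add_argument("--output", required=True,
                        help="Output path prefix")
    # parse_known_args leaves unrecognized flags (for example --runner)
    # untouched, so they could be forwarded to Beam's pipeline options.
    known, remaining = parser.parse_known_args(argv)
    return known, remaining
```

Calling parse_args with the flags from the local command would yield known.source set to resources/catsum.txt and known.output set to /tmp/wordcount/output, with any other flags collected in remaining.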