Overview

This directory holds the code for an Apache Beam pipeline written in Python.

Requirements

Usage

Set up a virtual environment

python3 -m venv .venv
source .venv/bin/activate

Install dependencies

pip install -U -r requirements.txt

Run Word Count

Run the following command to execute the pipeline on your local machine.

python3 main.py --source resources/catsum.txt --output /tmp/wordcount/output

Alternatively, read the input from Google Cloud Storage. This assumes you have already run gcloud auth application-default login.

python3 main.py --source gs://apache-beam-samples/shakespeare/* --output /tmp/wordcount/output
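
For orientation, here is a minimal sketch of what a word count pipeline accepting the --source and --output flags could look like. The contents of main.py are not shown in this README, so the function names (parse_args, extract_words, format_count, run), the word-splitting rule, and the output format are all assumptions made for illustration, not the actual implementation.

```python
import argparse
import re


def parse_args(argv=None):
    """Parse the --source and --output flags used in the commands above."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--source", required=True, help="Input file or glob")
    parser.add_argument("--output", required=True, help="Output path prefix")
    # Flags this parser does not recognize are returned separately so they
    # can be handed to Beam as pipeline options.
    return parser.parse_known_args(argv)


def extract_words(line):
    """Split one line of text into lowercase words (assumed tokenization)."""
    return re.findall(r"[a-z']+", line.lower())


def format_count(word_count):
    """Render a (word, count) pair as one line of output text."""
    word, count = word_count
    return f"{word}: {count}"


def run(argv=None):
    # Imported here so the helpers above stay usable without Beam installed.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    args, beam_args = parse_args(argv)
    with beam.Pipeline(options=PipelineOptions(beam_args)) as pipeline:
        (
            pipeline
            | "Read" >> beam.io.ReadFromText(args.source)
            | "ExtractWords" >> beam.FlatMap(extract_words)
            | "CountWords" >> beam.combiners.Count.PerElement()
            | "Format" >> beam.Map(format_count)
            | "Write" >> beam.io.WriteToText(args.output)
        )
```

In this sketch, calling run() from a script entry point would execute the pipeline with the command-line flags. Because the output is written with WriteToText, the --output value acts as a file name prefix, and results land in sharded files whose names start with that prefix.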