This repository has been archived by the owner on Sep 5, 2023. It is now read-only.
🐋 Docker image for AWS Glue Spark/Python

webysther/aws-glue-docker

Supported tags and respective Dockerfile links

Simple Tags

Python Shell

Spark

You can use Python extension modules and libraries with your AWS Glue ETL scripts as long as they are written in pure Python. C libraries such as pandas are not supported at the present time, nor are extensions written in other languages.
-- AWS

Deprecated, please migrate to v3/v4

AWS Glue Docker

AWS Glue development environment based on the svajiraya/aws-glue-libs fix.

Getting started

# install docker and configure aliases
curl -sSL https://raw.githubusercontent.com/webysther/aws-glue-docker/master/start.sh | sh

# to use pandas
glue

# or pyspark
glue-spark

# here you are inside docker

# Glue PySpark (REPL)
pyspark

# Glue PySpark
# /app is your current folder
glue-spark sparksubmit /app/spark_script.py

# Test
glue pytest

# aliases inside docker (backwards compatibility)
gluesparksubmit == sparksubmit
gluepyspark == pyspark
gluepytest == pytest
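
As an illustration of what could be passed to `sparksubmit`, here is a minimal sketch of a `/app/spark_script.py`. The script contents, the `split_words` helper, and the sample data are hypothetical and are not part of this repository. It assumes `pyspark` is importable inside the `glue-spark` container and falls back gracefully when it is not, so the helper can also be exercised with `pytest` outside Spark.

```python
"""Hypothetical /app/spark_script.py: counts words in a small in-memory dataset."""

try:
    # Assumed to be available inside the glue-spark container.
    from pyspark.sql import SparkSession
except ImportError:
    # Allows the pure-Python helper below to be imported and tested without Spark.
    SparkSession = None


def split_words(line):
    """Lower-case a line and split it on whitespace."""
    return line.lower().split()


def main():
    if SparkSession is None:
        print("pyspark is not installed; run this inside the glue-spark container")
        return

    spark = SparkSession.builder.appName("word-count-sketch").getOrCreate()

    # Illustrative sample data; a real job would read from a data source instead.
    lines = spark.sparkContext.parallelize(["Glue runs Spark", "Spark runs Python"])

    # Classic word count: split lines, emit (word, 1) pairs, sum per word.
    counts = (
        lines.flatMap(split_words)
        .map(lambda word: (word, 1))
        .reduceByKey(lambda a, b: a + b)
        .collect()
    )

    for word, count in sorted(counts):
        print(word, count)

    spark.stop()


if __name__ == "__main__":
    main()
```

Keeping the transformation logic in a plain function such as `split_words` is one way to make it testable without starting a Spark session.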

License

MIT License. Please see License File for more information.