Official latest aws-glue-libs docker image with arm64 does not seem to work on Macbook Pro M2 specs #205

Open
awongCM opened this issue Mar 30, 2024 · 1 comment

Comments

@awongCM

awongCM commented Mar 30, 2024

https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-libraries.html#develop-local-docker-image

I followed the instructions above to set up the Docker image from the DockerHub link.
I ran docker pull amazon/aws-glue-libs:glue_libs_4.0.0_image_01 and tried to start the container in Docker Desktop. It exited abruptly.

At this point, I thought the image might not include an arm64 layer and so might not be compatible with my MacBook Pro M2.

So I ran docker pull amazon/aws-glue-libs:glue_libs_4.0.0_image_01-arm64 and tried starting it in Docker Desktop. It also exited abruptly.

Now I'm confused.

Have any of these Docker images ever worked on Apple M-series machines since those machines were introduced less than 4 years ago?

Can anyone help to shed light on this?

FYI - I followed the existing request here - #83 (comment),
which allowed me to raise this issue.

@svajiraya
Contributor

@awongCM What docker run command are you using? Are you seeing any errors? Can you please post the docker logs to this thread so I can investigate further?

docker run -it --rm -p 8888:8888 --name glue_pyspark public.ecr.aws/glue/aws-glue-libs:glue_libs_4.0.0_image_01

If the above command exits as you said, run docker logs glue_pyspark in another shell.
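
One caveat with the command above: because it passes --rm, Docker removes the container as soon as it exits, so docker logs glue_pyspark may report that the container no longer exists. A minimal sketch of a diagnostic run that keeps the stopped container around is below. It only builds and prints the command lines; nothing is executed. The helper names diagnostic_commands and describe_exit_code are hypothetical, and the exit-status reading follows the usual shell convention (128 + signal number), which is an assumption about how this particular failure surfaces.

```python
import shlex
from typing import List

IMAGE = "public.ecr.aws/glue/aws-glue-libs:glue_libs_4.0.0_image_01"
NAME = "glue_pyspark"


def diagnostic_commands(image: str = IMAGE, name: str = NAME) -> List[List[str]]:
    """Build the docker commands for one diagnostic run (nothing runs here)."""
    return [
        # Same as the suggested command, minus --rm so the exited container is kept.
        ["docker", "run", "-it", "-p", "8888:8888", "--name", name, image],
        # Whatever the container wrote before it stopped.
        ["docker", "logs", name],
        # Exit status Docker recorded for the stopped container.
        ["docker", "inspect", "--format", "{{.State.ExitCode}}", name],
        # Manual cleanup, since --rm was dropped.
        ["docker", "rm", name],
    ]


def describe_exit_code(code: int) -> str:
    """Rough reading of a container exit status (hypothetical helper)."""
    if code == 0:
        return "exited normally"
    if code > 128:
        # Convention: 128 + N means the process was terminated by signal N.
        return f"terminated by signal {code - 128}"
    return f"exited with error status {code}"


# Print the commands in copy-and-paste form.
for cmd in diagnostic_commands():
    print(shlex.join(cmd))
```

Posting the docker logs output together with the exit status gives more to go on than "it exited abruptly"; a status of 137, for instance, corresponds to signal 9 (SIGKILL), which is often seen when a container runs out of memory.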

The images are built with multi-arch support for amd64 and arm64.
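
Since the image is multi-arch, one thing worth ruling out is Docker selecting the wrong variant. The sketch below maps the host machine type to a Docker platform string and builds a docker run command that pins it with the --platform flag. The function names (docker_platform, run_command) and the mapping table are illustrative assumptions, not part of aws-glue-libs, and the example only prints the command rather than running it.

```python
import platform
import shlex
from typing import List, Optional

IMAGE = "public.ecr.aws/glue/aws-glue-libs:glue_libs_4.0.0_image_01"

# Machine names as reported by platform.machine(), mapped to Docker platforms.
_ARCH_TO_DOCKER = {
    "arm64": "linux/arm64",    # macOS on Apple silicon
    "aarch64": "linux/arm64",  # Linux on ARM, e.g. Graviton
    "x86_64": "linux/amd64",
    "AMD64": "linux/amd64",    # Windows on x86-64
}


def docker_platform(machine: Optional[str] = None) -> str:
    """Return the Docker platform string for a machine type (host by default)."""
    machine = machine or platform.machine()
    try:
        return _ARCH_TO_DOCKER[machine]
    except KeyError:
        raise ValueError(f"unrecognised machine type: {machine!r}") from None


def run_command(image: str = IMAGE, machine: Optional[str] = None) -> List[str]:
    """Build a docker run command that requests one platform explicitly."""
    return [
        "docker", "run", "-it", "--rm",
        "--platform", docker_platform(machine),
        "-p", "8888:8888",
        "--name", "glue_pyspark",
        image,
    ]


# Example for an Apple silicon host; machine is passed explicitly here.
print(shlex.join(run_command(machine="arm64")))
```

Note that a Python interpreter running under Rosetta on an M-series Mac reports x86_64, so passing the machine type explicitly, as in the example, is the safer choice there.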

I tested the image on an EC2 Graviton m6g instance (arm64 CPU architecture) and it seems to work fine. It would be great if I could get some more details to investigate. For reference, my test session:

[glue_user@9c05327ab205 workspace]$ uname -r
5.10.223-212.873.amzn2.aarch64
[glue_user@9c05327ab205 workspace]$ pyspark
Python 3.10.2 (main, Oct  8 2024, 04:02:18) [GCC 7.3.1 20180712 (Red Hat 7.3.1-17)] on linux
Type "help", "copyright", "credits" or "license" for more information.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/glue_user/spark/jars/log4j-slf4j-impl-2.17.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/glue_user/spark/jars/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/glue_user/aws-glue-libs/jars/log4j-slf4j-impl-2.17.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/glue_user/aws-glue-libs/jars/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
24/10/10 14:47:23 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 3.3.0-amzn-1
      /_/

Using Python version 3.10.2 (main, Oct  8 2024 04:02:18)
Spark context Web UI available at http://9c05327ab205:4041
Spark context available as 'sc' (master = local[*], app id = local-1728571643790).
SparkSession available as 'spark'.
>>> df = spark.createDataFrame([('X', )], "dummy STRING")
>>> df.printSchema()
root
 |-- dummy: string (nullable = true)

>>>
