DynamoDB Locking Mechanism Failing for AWS S3 Storage Backend in Version 0.20.1 #2930

Closed
donotpush opened this issue Oct 7, 2024 · 7 comments
Labels: binding/rust (Issues for the Rust crate), bug (Something isn't working), storage/aws (AWS S3 storage related)


donotpush commented Oct 7, 2024

Bug Report

Environment

Delta-rs version: 0.20.1
Environment: Docker


Description

Issue: Writing data to AWS S3 with the DynamoDB locking mechanism fails in version 0.20.1, but works in version 0.19.2.


Error Messages

  1. First Execution Failure (table does not exist):

    Traceback (most recent call last):
      File "/app/test.py", line 21, in <module>
          df.write_delta(
      File "/usr/local/lib/python3.11/site-packages/polars/dataframe/frame.py", line 4286, in write_delta
          write_deltalake(
      File "/usr/local/lib/python3.11/site-packages/deltalake/writer.py", line 323, in write_deltalake
          write_deltalake_rust(
    _internal.CommitFailedError: Transaction failed: dynamodb client failed to write log entry
    
  2. Subsequent Execution Failure (after it worked once, table already exists):

    Traceback (most recent call last):
      File "/app/test.py", line 22, in <module>
          df.write_delta(
      File "/usr/local/lib/python3.11/site-packages/polars/dataframe/frame.py", line 4286, in write_delta
          write_deltalake(
      File "/usr/local/lib/python3.11/site-packages/deltalake/writer.py", line 302, in write_deltalake
          table.update_incremental()
      File "/usr/local/lib/python3.11/site-packages/deltalake/table.py", line 1258, in update_incremental
          self._table.update_incremental()
    _internal.DeltaError: Generic error: error in DynamoDb
    

How to Reproduce

Dockerfile:

FROM python:3.11

WORKDIR /app

RUN pip install deltalake==0.20.1 polars

# Uncomment to see it working
# RUN pip install deltalake==0.19.2

COPY test.py .

CMD [ "python", "test.py" ]

test.py:

import polars
import os

df = polars.DataFrame({'x': [1, 2, 3]})

storage_options = {
    'AWS_S3_LOCKING_PROVIDER': 'dynamodb',
    'DELTA_DYNAMO_TABLE_NAME': 'delta_log',
    'AWS_ACCESS_KEY_ID': os.environ["AWS_ACCESS_KEY_ID"],
    'AWS_SECRET_ACCESS_KEY': os.environ["AWS_SECRET_ACCESS_KEY"],
    'AWS_REGION': os.environ['AWS_REGION'],
}

df.write_delta(
    f"s3://{os.environ['BUCKET_NAME']}/delta/test",
    storage_options=storage_options,
)

# You will need a bucket and a DynamoDB table.
# How to create the DynamoDB table:
#   aws dynamodb create-table \
#     --table-name delta_log \
#     --attribute-definitions AttributeName=tablePath,AttributeType=S AttributeName=fileName,AttributeType=S \
#     --key-schema AttributeName=tablePath,KeyType=HASH AttributeName=fileName,KeyType=RANGE \
#     --provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5
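
For anyone who would rather create the table from Python than from the AWS CLI, the same table definition could be expressed with boto3. This is only a minimal sketch: it assumes boto3 is installed and credentials are available, and the helper names `delta_log_table_spec` and `create_delta_log_table` are made up for illustration and are not part of delta-rs.

```python
def delta_log_table_spec(table_name="delta_log"):
    """Build create_table keyword arguments mirroring the CLI command above."""
    return {
        "TableName": table_name,
        "AttributeDefinitions": [
            {"AttributeName": "tablePath", "AttributeType": "S"},
            {"AttributeName": "fileName", "AttributeType": "S"},
        ],
        "KeySchema": [
            {"AttributeName": "tablePath", "KeyType": "HASH"},
            {"AttributeName": "fileName", "KeyType": "RANGE"},
        ],
        "ProvisionedThroughput": {
            "ReadCapacityUnits": 5,
            "WriteCapacityUnits": 5,
        },
    }


def create_delta_log_table(region_name, table_name="delta_log"):
    """Create the table with boto3 (hypothetical helper, not a delta-rs API)."""
    # boto3 is imported lazily so the spec can be inspected without it installed.
    import boto3

    client = boto3.client("dynamodb", region_name=region_name)
    return client.create_table(**delta_log_table_spec(table_name))
```

Keeping the spec in its own function makes it easy to print and compare against an existing table before creating anything.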

Run the following commands:

docker build -t test:latest .
docker run \
  -e AWS_ACCESS_KEY_ID=your_access_key \
  -e AWS_SECRET_ACCESS_KEY=your_secret_key \
  -e BUCKET_NAME=your_bucket_name \
  -e AWS_REGION=your_region \
  test:latest

If you uncomment line 8 in the Dockerfile and then run docker build and docker run again, you will see that it works correctly with version 0.19.2.

Reference: https://delta-io.github.io/delta-rs/integrations/object-storage/s3/

@donotpush donotpush added the bug Something isn't working label Oct 7, 2024
@donotpush donotpush changed the title Dynamodb locking mechanism failing for AWS S3 Storage Backend on 0.20.1 DynamoDB Locking Mechanism Failing for AWS S3 Storage Backend in Version 0.20.1 Oct 7, 2024
@rtyler rtyler self-assigned this Oct 8, 2024
@rtyler rtyler added storage/aws AWS S3 storage related binding/rust Issues for the Rust crate labels Oct 8, 2024
Member

rtyler commented Oct 8, 2024

@donotpush I cannot imagine this being the case, but the IAM user that is being used does have the necessary DynamoDB permissions granted, right?
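
One way to check this independently of delta-rs might be to query the lock table directly with the same credentials. The sketch below is an assumption-laden illustration: `check_lock_table` is a hypothetical helper, it expects a boto3 DynamoDB client to be passed in, and it only exercises `describe_table`, so it cannot prove that write permissions such as `PutItem` are granted.

```python
# Key schema taken from the create-table command in the issue description.
EXPECTED_KEY_SCHEMA = {"tablePath": "HASH", "fileName": "RANGE"}


def check_lock_table(client, table_name="delta_log"):
    """Return a list of problems; an empty list means the table looks usable."""
    try:
        table = client.describe_table(TableName=table_name)["Table"]
    except Exception as exc:  # e.g. a botocore ClientError (missing table or permission)
        return [f"describe_table failed: {exc}"]

    problems = []
    actual = {k["AttributeName"]: k["KeyType"] for k in table.get("KeySchema", [])}
    if actual != EXPECTED_KEY_SCHEMA:
        problems.append(f"unexpected key schema: {actual}")
    status = table.get("TableStatus")
    if status != "ACTIVE":
        problems.append(f"table status is {status!r}, expected 'ACTIVE'")
    return problems


# Possible usage, assuming boto3 is installed and AWS_REGION is set:
#   import os, boto3
#   client = boto3.client("dynamodb", region_name=os.environ["AWS_REGION"])
#   print(check_lock_table(client) or "lock table looks fine")
```

Running this inside the same container as test.py would show whether the credentials passed via `docker run -e` can reach DynamoDB at all.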

@rtyler rtyler added this to the Rust v1.0.0 milestone Oct 8, 2024
@donotpush
Author

> @donotpush I cannot imagine this being the case, but the IAM user that is being used does have the necessary dynamodb permissions granted right?

@rtyler, thanks for looking into this. It’s a strange issue—it doesn’t happen in all environments, and the error message doesn’t provide much insight.

My AWS credentials have admin-level permissions, and the problem is easy to reproduce. I’ve tried it in several scenarios:

  • Locally without Docker: works (using credentials + AWS CLI configured)
  • Locally with Docker: fails (only using credentials)
  • Running on AWS Lambda with IAM roles (no credentials): fails

Regardless, the error message isn’t helpful. It took me 4 hours to figure out what was wrong. It’s also suspicious that everything works fine with version 0.19.2.

I’m running this locally on an Apple M2 (ARM), though I doubt that’s related. If you can reproduce the issue with the example I provided, it would be very helpful.
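
Because the failure seems to depend on the environment, a short report of the platform and package versions could make the working and failing setups easier to compare. This is a rough sketch using only the standard library; `environment_report` is a hypothetical helper name, and the package list is an assumption based on the Dockerfile in this issue.

```python
import platform
import sys
from importlib import metadata


def environment_report(packages=("deltalake", "polars")):
    """Collect interpreter, platform, and package versions for a bug report."""
    report = {
        "python": sys.version.split()[0],
        "system": platform.system(),
        "machine": platform.machine(),  # CPU architecture as reported by the OS
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


# Possible usage: run once on the host and once inside the container.
#   for key, value in environment_report().items():
#       print(f"{key}: {value}")
```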

Member

rtyler commented Oct 8, 2024

> Locally with Docker: fails (only using credentials)

Can you expand a little bit on what this means? Does this mean that access keys and secrets are set in storage_options or in the environment? I'm having trouble understanding how this case differs from the first scenario you described 🤔
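
To make that distinction concrete, something like the following could report where each setting comes from in a given scenario without printing any secret values. It is a sketch only: `credential_sources` is a hypothetical helper, and the key list is simply the set of keys used in test.py above, not an exhaustive list of what delta-rs reads.

```python
import os

# Keys taken from the storage_options dict in test.py.
KEYS = (
    "AWS_S3_LOCKING_PROVIDER",
    "DELTA_DYNAMO_TABLE_NAME",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_REGION",
)


def credential_sources(storage_options, environ=None):
    """Map each key to 'storage_options', 'environment', 'both', or 'missing'."""
    environ = os.environ if environ is None else environ
    report = {}
    for key in KEYS:
        in_options = bool(storage_options.get(key))
        in_env = bool(environ.get(key))
        if in_options and in_env:
            report[key] = "both"
        elif in_options:
            report[key] = "storage_options"
        elif in_env:
            report[key] = "environment"
        else:
            report[key] = "missing"
    return report
```

Calling `credential_sources(storage_options)` just before `df.write_delta(...)` in each scenario would show whether the Docker and non-Docker runs actually differ in how the settings are supplied.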

I am hoping this might be a case of mismatched key names which I recently fixed in #2931

Author

donotpush commented Oct 8, 2024

> Locally with Docker: fails (only using credentials)
>
> Can you expand a little bit on what this means? Does this mean that access keys and secrets are set in storage_options or in the environment? I'm having trouble understanding how this case differs from the first scenario you described 🤔
>
> I am hoping this might be a case of mismatched key names which I recently fixed in #2931

The first scenario is the same code, test.py, but run without Docker and without environment variables. If you follow the steps from "How to Reproduce" in the issue description, you should get an error when running in Docker.

@rtyler thanks for looking at it. It would be great to get confirmation that you also see the problem when running in Docker. I tried multiple things, and my conclusion is that something might be wrong in version 0.20.1.

Member

rtyler commented Oct 19, 2024

😒 so I tried the exact steps with the Dockerfile and have still not been able to reproduce the issue. I'm curious if you still see the issue? If so, what region?

The IAM keys I used had the AdministratorAccess IAM policy added. Perhaps there's a permission missing 🤔

@donotpush
Author

Thanks for testing it out! If it works on your end, the issue is likely specific to my local machine. I’ll try reproducing it on another laptop, since so far I have only been running it on an M2 chip.

Member

rtyler commented Dec 1, 2024

I haven't seen this crop up again so I'm going to close it out. 🤞

@rtyler rtyler closed this as completed Dec 1, 2024