The amazon.ion.simple_ion.dumps method output doesn't work with DynamoDB import table #363

Open
MacHu-GWU opened this issue Aug 2, 2024 · 4 comments

@MacHu-GWU

I am trying to generate an Ion data file manually with this library so that I can use it with DynamoDB import table. This is my DynamoDB item as a Python dictionary:

{
    "id": 1,  # this is the hash key
    "name": "Alice"
}

The amazon.ion.simpleion.dumps method gives me: $ion_1_0 {Item:{id:1,name:"Alice"}}. Note that there is no dot after the number 1. The import_table API then fails.

However, if I manually add a dot after the number 1, making it $ion_1_0 {Item:{id:1.,name:"Alice"}}, then it works.

I also tried exporting a manually created DynamoDB table and found that the exported Ion file has a dot after each integer number.

I also tried the loads method, and I think an integer without a dot is a valid value for deserialization. However, it doesn't work with DynamoDB table import.

How do I ensure that there's a dot after every integer in the text form of my data?

@MacHu-GWU MacHu-GWU added the bug label Aug 2, 2024
@rmarrowstone
Contributor

This isn't really a bug with ion-python.

In the Ion text format, 1 is an Integer and 1. is a Decimal. See: https://amazon-ion.github.io/ion-docs/docs/spec.html

Per the DynamoDB Import Docs: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/S3DataImport.Format.html#S3DataImport.Requesting.Formats.Ion

They import an Ion Decimal as a DynamoDB Number. I do not know why they don't map an Ion Integer to a DynamoDB Number, but they don't. That's a possible feature request for DynamoDB.

Assuming that it's faster to change your code than to get DynamoDB to change, and that your code block is your Python code:
To serialize an Ion Decimal from Python, you need to create a decimal.Decimal. It will be emitted in your Ion stream as an Ion Decimal.

See https://github.com/amazon-ion/ion-python/blob/master/amazon/ion/simpleion.py#L33
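As a rough illustration of the advice above, here is a minimal sketch that converts every Python int in an item to decimal.Decimal before serializing. The helper names ints_to_decimals and dumps_item are hypothetical and not part of ion-python. The sketch assumes simpleion.dumps accepts binary=False to produce text, and I have not verified the exact text ion-python emits for a Decimal or that DynamoDB accepts every form of it.

```python
from decimal import Decimal


def ints_to_decimals(value):
    """Recursively replace Python ints with decimal.Decimal (hypothetical helper)."""
    # bool is a subclass of int, so check it first and leave it unchanged.
    if isinstance(value, bool):
        return value
    if isinstance(value, int):
        return Decimal(value)
    if isinstance(value, dict):
        return {key: ints_to_decimals(val) for key, val in value.items()}
    if isinstance(value, list):
        return [ints_to_decimals(val) for val in value]
    # Strings, Decimals, floats, None, etc. pass through untouched.
    return value


def dumps_item(item):
    """Serialize one item as Ion text, wrapped as {Item: ...} (hypothetical helper)."""
    # Imported lazily; requires the ion-python package (amazon.ion).
    from amazon.ion import simpleion

    return simpleion.dumps({"Item": ints_to_decimals(item)}, binary=False)


item = {"id": 1, "name": "Alice"}
converted = ints_to_decimals(item)  # "id" is now a Decimal, "name" is unchanged
```

Calling dumps_item(item) should then emit id as an Ion Decimal rather than an Ion Integer. It is worth comparing the result against a file exported by DynamoDB before running a large import.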

@rmarrowstone
Contributor

I would further advise that for your production code you serialize your Ion as Binary for the imports. Obviously the text format is great for debugging and developing, but the binary format has improved data density and will be faster to import.

Please check out the pydoc in simpleion and let us know how we can improve that if needed.

@MacHu-GWU
Author

Thanks @rmarrowstone.

I believe it is still a bug, though not in ion-python; it is actually in DynamoDB import.

The simpleion.dumps() method gives the correct value, $ion_1_0 {Item:{id:1,name:"Alice"}} (I expect the id to be an integer). However, the DynamoDB import table feature doesn't recognize it. In my TableCreationParam I defined the attribute type as N, yet the DynamoDB import table feature raises an error for it.

@rmarrowstone another issue concerns your advice to "serialize your Ion as Binary for the imports": the DynamoDB import table documentation doesn't mention how to use the Ion binary format to prepare the data. It says "Items in an Ion file are delimited by newlines. Each line begins with an Ion version marker, followed by an item in Ion format.", which implies that I should encode my data as text. So how can I do this?

@rmarrowstone
Contributor

Sadly, it does look like they only support the text format, so that was bad advice, sorry. It would be better if they supported the binary format, but alas...
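Given that only the text format appears to be supported, the newline-delimited layout quoted from the DynamoDB documentation could be produced roughly as follows. This is a sketch under assumptions: to_import_text, its dumps parameter, the file name data.ion, and the second sample item are hypothetical. It relies on simpleion.dumps(..., binary=False) returning a single line of text that starts with the $ion_1_0 version marker, as the output earlier in this thread suggests.

```python
from decimal import Decimal


def to_import_text(items, dumps=None):
    """Return Ion text with one {Item: ...} value per line (hypothetical helper)."""
    if dumps is None:
        # Imported lazily; requires the ion-python package (amazon.ion).
        from amazon.ion import simpleion

        dumps = simpleion.dumps

    lines = []
    for item in items:
        # binary=False requests the text encoding. Each call is expected to
        # emit its own $ion_1_0 version marker followed by the item.
        text = dumps({"Item": item}, binary=False)
        if "\n" in text:
            raise ValueError("expected a single-line Ion value per item")
        lines.append(text)
    # One item per line, newline-terminated.
    return "\n".join(lines) + "\n"


# Numbers are already decimal.Decimal so they serialize as Ion Decimals.
items = [
    {"id": Decimal(1), "name": "Alice"},
    {"id": Decimal(2), "name": "Bob"},  # illustrative second item
]

# Usage (hypothetical file name):
# with open("data.ion", "w", encoding="utf-8") as f:
#     f.write(to_import_text(items))
```

The dumps parameter exists only so the line-building logic can be exercised without ion-python installed; in normal use it can be left at its default.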
