[SUPPORT] Hudi fails ACID verification test #11170
Comments
@matthijseikelenboom I don't see any lock-related configuration in your setup. I see that you are using 2 parallel writers, so you may need to configure a lock provider for writes. Hudi follows the OCC (optimistic concurrency control) principle. Let me know if I am missing anything here. Thanks a lot.
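As a rough illustration of the lock-related settings mentioned above, here is a minimal sketch of multi-writer options collected in a Python dict. The three `hoodie.*` keys are standard Hudi write configs. The choice of `FileSystemBasedLockProvider` and the helper name `hudi_multi_writer_options` are assumptions for illustration only; the thread does not say which lock provider the reporter used.

```python
# Sketch only: options that enable optimistic concurrency control (OCC)
# for multiple Hudi writers. The lock provider below is an assumption;
# the thread does not state which provider was actually configured.

FILESYSTEM_LOCK_PROVIDER = (
    "org.apache.hudi.client.transaction.lock.FileSystemBasedLockProvider"
)


def hudi_multi_writer_options(lock_provider=FILESYSTEM_LOCK_PROVIDER):
    """Return write options for concurrent writers (hypothetical helper)."""
    return {
        # Switch from the default single-writer mode to OCC.
        "hoodie.write.concurrency.mode": "optimistic_concurrency_control",
        # Failed writes are cleaned up lazily when several writers are active.
        "hoodie.cleaner.policy.failed.writes": "LAZY",
        # Lock provider used to serialize commits between writers.
        "hoodie.write.lock.provider": lock_provider,
    }


if __name__ == "__main__":
    for key, value in sorted(hudi_multi_writer_options().items()):
        print(f"{key} = {value}")
```

With PySpark, a dict like this could be passed to the writer with `df.write.format("hudi").options(**opts)`, alongside the usual table name and record key options. Other lock providers (for example a ZooKeeper-based one) need additional provider-specific settings that are not shown here.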
@ad1happy2go Ah yes, you're right. I seem to have forgotten to add the hudi-defaults.conf file to this project. I've added it to my repository and run the test again. It gets further along, but still breaks down. Stacktrace (be warned, it's a big one):
@matthijseikelenboom It looks like there are some library conflicts in the project. I need to reproduce it.
@matthijseikelenboom I noticed you are using Java 17. Hudi 0.14.1 doesn't support Java 17 yet; newer Hudi versions will support it. For a similar Java 17 issue, see EsotericSoftware/kryo#885. Can you try once with Java 8? Thanks.
Okay, yeah sure. The original test was written with Java 11, but I updated to 17 because I thought why not, and Spark 3.4.2 supports it. Is it known that Hudi (or Kryo) also doesn't work with Java 11, and is that why you suggest Java 8?
@ad1happy2go I've pushed a new branch to the repo where the project is downgraded to Java 8. When running the test on that branch, the writers no longer seem to fail, but the verification test still fails.
@matthijseikelenboom I tried to run it locally but am again seeing issues. Can we connect? If you are on the Apache Hudi Slack, please ping me ("Aditya Goenka").
@matthijseikelenboom I was able to test it successfully. There were two issues -
@matthijseikelenboom Please let us know if it works for you as well. Thanks.
Tested and verified. Closing the issue.
More info
Solution has been tested on:
Thanks @matthijseikelenboom for the update.
Describe the problem you faced
For work, we needed concurrent read/write support for our data lake, which uses Spark. We were noticing some inconsistencies, so we wrote a test that verifies whether something like Hudi adheres to ACID. However, we found that Hudi fails this test.
Now, it may be that we've misconfigured Hudi or that there is a mistake in the test code.
My question is whether one of you can take a look at it and perhaps explain what is going wrong here.
To Reproduce
How to run the test, and its findings, are described in the README of the repository, but here is a short rundown.
Steps to reproduce the behavior:
Expected behavior
Environment Description
Hudi version : 0.14.1
Spark version : 3.4.2
Hive version : 4.0.0-beta-1
Hadoop version : 3.2.2
Storage (HDFS/S3/GCS..) : NTFS(Windows), APFS(macOS) & HDFS
Running on Docker? (yes/no) : No
Additional context
It's worth noting that other solutions, Iceberg and Delta Lake, have also been tested this way. Iceberg also didn't pass this test. Delta Lake did pass the test.
Stacktrace