Compatibility test may trigger segfaults #1979

Open
IvoDD opened this issue Nov 5, 2024 · 2 comments
Labels: bug (Something isn't working), flaky test (Tracking tests that fail inconsistently in CI)

Comments

Collaborator

IvoDD commented Nov 5, 2024

We've seen azure storage compat tests trigger a segfault here:

FAILED tests/compat/arcticdb/test_compatibility.py::test_modify_old_library_option_with_current[4.5.0-azurite] - tests.compat.conftest.ErrorInVenv: Executing ['from arcticdb import Arctic', 'import pandas as pd', 'import numpy as np', "ac = Arctic('azure://DefaultEndpointsProtocol=http;AccountName=devstoreaccount1;AccountKey=Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==;BlobEndpoint=http://localhost:10047/devstoreaccount1;Container=container3;CA_cert_path=/tmp/tmpos9yrqltAzuriteStorageFixtureFactory/client.pem')", "expected_df = pd.read_parquet('/tmp/tmp2dkdtq4p/expected_df.parquet')", "lib = ac.get_library('test_modify_old_library_option.269_2024-11-04T17_39_52_672424')", "read_df = lib.read('sym').data", 'pd.testing.assert_frame_equal(read_df, expected_df)'] failed with return code -11

We've seen such segfaults previously with lmdb and mongo, both of which had race conditions at destruction time. We believe we've fixed the lmdb and mongo segfaults from 4.5.1 onward.

Because of how the compat tests work, they are very good at reproducing this type of destructor segfault: they run many short-lived processes, each of which ends with many destructor calls.

I suspect we might have a similar destructor-related bug with azure.
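
A rough way to chase this kind of exit-time crash locally is to run the same short snippet in many fresh interpreter processes and count how many die with SIGSEGV. The sketch below is only an illustration of that idea, not the actual compat-test harness: the helper name `count_segfaults` and the `runs` parameter are made up here, and it assumes a POSIX system where a segfault shows up as return code -11 or 139.

```python
import signal
import subprocess
import sys


def count_segfaults(lines, runs=20):
    """Run `lines` as a script in `runs` fresh Python processes.

    Returns how many of those processes were terminated by SIGSEGV.
    Hypothetical helper for illustration; not part of arcticdb or its tests.
    """
    script = "\n".join(lines)
    crashes = 0
    for _ in range(runs):
        # Each iteration starts and tears down a whole interpreter,
        # so any exit-time destructor race gets another chance to fire.
        proc = subprocess.run([sys.executable, "-c", script], capture_output=True)
        # -11 is how subprocess reports SIGSEGV; 139 (128 + 11) is the shell form.
        if proc.returncode in (-signal.SIGSEGV, 128 + signal.SIGSEGV):
            crashes += 1
    return crashes
```

Passing the list of statements from the failure message above as `lines` (with a storage URI and library that exist locally) would give a crude crash rate for a given arcticdb version.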

IvoDD added the bug and flaky test labels Nov 5, 2024
IvoDD added a commit that referenced this issue Nov 5, 2024
Also since 4.5.1 is now available we replace 4.5.0 with 4.5.1 which
should re-enable the lmdb and mongo tests with their segfaults fixed.
@IvoDD IvoDD mentioned this issue Nov 5, 2024
5 tasks
Collaborator Author

IvoDD commented Nov 5, 2024

Relatedly, the mongo segfault which we thought we fixed in #1862 is still observed in 4.5.1, e.g. here:

FAILED tests/compat/arcticdb/test_compatibility.py::test_modify_old_library_option_with_current[4.5.1-mongo] - tests.compat.conftest.ErrorInVenv: Executing ['from arcticdb import Arctic', 'import pandas as pd', 'import numpy as np', "ac = Arctic('mongodb://localhost:15584')", "expected_df = pd.read_parquet('/tmp/tmpn5c751c7/expected_df.parquet')", "lib = ac.get_library('test_modify_old_library_option.475_2024-11-05T10_46_39_809132')", "read_df = lib.read('sym').data", 'pd.testing.assert_frame_equal(read_df, expected_df)'] failed with return code 139

I had previously identified that the segfault is in the mongo destructor.

We need to fix that as well.

IvoDD changed the title from "Compatibility test azure storage segfault" to "Compatibility test azure and mongo storage segfaults" Nov 5, 2024
Collaborator Author

IvoDD commented Nov 5, 2024

Aaand lmdb can segfault too:

FAILED tests/compat/arcticdb/test_compatibility.py::test_modify_old_library_option_with_current[4.5.1-lmdb] - tests.compat.conftest.ErrorInVenv: Executing ['from arcticdb import Arctic', 'import pandas as pd', 'import numpy as np', "ac = Arctic('lmdb:///tmp/pytest-of-runner/pytest-0/popen-gw0/test_modify_old_library_option0')", "expected_df = pd.read_parquet('/tmp/tmpzgvmc357/expected_df.parquet')", "lib = ac.get_library('test_modify_old_library_option.966_2024-11-05T13_05_18_218166')", "read_df = lib.read('sym').data", 'pd.testing.assert_frame_equal(read_df, expected_df)'] failed with return code 139

Given how widespread it is, this is likely not an issue with our storage destructors. I've recently seen it only on conda linux builds, so it might be related to that?
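
For anyone comparing the failures above: the azure run reports return code -11 while the mongo and lmdb runs report 139, and both normally mean the same thing, a process killed by SIGSEGV (signal 11). Python's `subprocess` reports death by signal as the negated signal number, while POSIX shells report it as 128 plus the signal number. A minimal sketch of a decoder, where the function name `describe_return_code` is made up for illustration and a value above 128 is assumed to come from a shell rather than from a plain `exit(139)`:

```python
import signal


def describe_return_code(code: int) -> str:
    """Describe a process return code, decoding signal-terminated exits."""
    if code < 0:
        # subprocess convention: -N means "killed by signal N".
        signum = -code
    elif code > 128:
        # Shell convention: 128 + N means "killed by signal N" (assumed here).
        signum = code - 128
    else:
        return f"exited with status {code}"
    try:
        name = signal.Signals(signum).name
    except ValueError:
        # Not a signal number known on this platform.
        return f"terminated by unknown signal {signum}"
    return f"terminated by {name}"
```

Under those assumptions, `describe_return_code(-11)` and `describe_return_code(139)` both describe a SIGSEGV termination.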

IvoDD changed the title from "Compatibility test azure and mongo storage segfaults" to "Compatibility test may trigger segfaults" Nov 5, 2024
IvoDD added a commit that referenced this issue Nov 5, 2024
After we fix the segfaults we can re-enable
IvoDD added a commit that referenced this issue Nov 6, 2024
Also since 4.5.1 is now available we replace 4.5.0 with 4.5.1 which
should re-enable the lmdb and mongo tests with their segfaults fixed.
IvoDD added a commit that referenced this issue Nov 6, 2024
After we fix the segfaults we can re-enable
grusev pushed a commit that referenced this issue Nov 25, 2024
Also since 4.5.1 is now available we replace 4.5.0 with 4.5.1 which
should re-enable the lmdb and mongo tests with their segfaults fixed.
grusev pushed a commit that referenced this issue Nov 25, 2024
After we fix the segfaults we can re-enable
@G-D-Petrov G-D-Petrov self-assigned this Dec 18, 2024