GH-37621: [Packaging][Conda] Sync conda recipes with feedstocks #37624
@github-actions crossbow submit -g conda |
Revision: c04c157 Submitted crossbow builds: ursacomputing/crossbow @ actions-630421cd79 |
dev/tasks/conda-recipes/.ci_support/linux_ppc64le_cuda_compiler_version11.2.yaml
Something is not working out with the CUDA cross-compilation here:
It does work in conda-forge, and CUDA on aarch/ppc is a bit more of a niche setup, so I guess we could also drop it here. For completeness, I looked at the diff for the build scripts, but I couldn't determine something that would touch upon where the CUDA_HOME would point -- the error is pretty clearly that we're not pointing to the ppc version of
|
Is it intended that gandiva now has a run-time dependence on
Adding the respective host-dependence (+ respective run-export) is not hard, I'm just double-checking that this is intentional. CC @pitrou @kou @raulcd @jorisvandenbossche PS. This is a rare case where windows looks better than unix, only one test failure:
|
Yes. It's caused by #37412. |
I started debugging this PR on conda-forge infrastructure in conda-forge/arrow-cpp-feedstock#1170, and it turns out that the problem hits us there as well. I'm not sure what changed since 13.0, but it seems something is now overriding (or not respecting) our CUDA_HOME, which is necessary to make cross-compilation work with CUDA... |
Could you try
|
Thanks for the quick response, that sounds like a very promising candidate! Rebased & retriggered conda-forge/arrow-cpp-feedstock#1170. Will sync back to this PR if passing. |
# currently broken
{% set tests_to_skip = tests_to_skip + " or test_fastparquet_cross_compatibility" %}
Haven't raised an issue for this yet, but this is consistently failing. Not sure what the difference is between our CI and the one here.
Do you know a CI build log that shows the error for this? (the last ones will have this already skipped)
(to have an idea what is going on here)
_____________________ test_fastparquet_cross_compatibility _____________________
tempdir = PosixPath('/tmp/pytest-of-conda/pytest-0/test_fastparquet_cross_compati0')
@pytest.mark.pandas
@pytest.mark.fastparquet
@pytest.mark.filterwarnings("ignore:RangeIndex:FutureWarning")
@pytest.mark.filterwarnings("ignore:tostring:DeprecationWarning:fastparquet")
def test_fastparquet_cross_compatibility(tempdir):
    fp = pytest.importorskip('fastparquet')
    df = pd.DataFrame(
        {
            "a": list("abc"),
            "b": list(range(1, 4)),
            "c": np.arange(4.0, 7.0, dtype="float64"),
            "d": [True, False, True],
            "e": pd.date_range("20130101", periods=3),
            "f": pd.Categorical(["a", "b", "a"]),
            # fastparquet writes list as BYTE_ARRAY JSON, so no roundtrip
            # "g": [[1, 2], None, [1, 2, 3]],
        }
    )
    table = pa.table(df)
    # Arrow -> fastparquet
    file_arrow = str(tempdir / "cross_compat_arrow.parquet")
    pq.write_table(table, file_arrow, compression=None)
    fp_file = fp.ParquetFile(file_arrow)
    df_fp = fp_file.to_pandas()
>   tm.assert_frame_equal(df, df_fp)
pyarrow/tests/parquet/test_basic.py:741:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
testing.pyx:55: in pandas._libs.testing.assert_almost_equal
???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> ???
E AssertionError: DataFrame.iloc[:, 3] (column name="d") are different
E
E DataFrame.iloc[:, 3] (column name="d") values are different (66.66667 %)
E [index]: [0, 1, 2]
E [left]: [True, False, True]
E [right]: [False, False, False]
E At positional index 0, first diff: True != False
testing.pyx:173: AssertionError
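For anyone reading the log above: the 66.66667 % in the assertion message matches the share of positions where the two columns disagree. The following is a minimal stdlib sketch of that arithmetic using the values from the traceback; it is not how pandas implements `assert_frame_equal`, and `mismatch_report` is a hypothetical helper name used only for illustration.

```python
def mismatch_report(left, right):
    """Return the differing positions and the percentage of mismatches.

    Hypothetical helper for illustration only; pandas computes its
    message internally and differently.
    """
    if len(left) != len(right):
        raise ValueError("columns must have the same length")
    # Collect every index where the two values disagree.
    diffs = [i for i, (lv, rv) in enumerate(zip(left, right)) if lv != rv]
    pct = 100.0 * len(diffs) / len(left)
    return diffs, pct


# Values copied from the [left] / [right] lines of the traceback.
left = [True, False, True]      # what pyarrow wrote (column "d")
right = [False, False, False]   # what fastparquet read back

diff_positions, pct = mismatch_report(left, right)
print(diff_positions, round(pct, 5))
```

Two of the three values differ (positions 0 and 2), so every `True` in the column was read back as `False`, while the one `False` happened to match.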
Searching for fastparquet in our code base, I assume that we actually don't have any CI build that includes it ... So we are just not running that test anywhere, whoops.
Now, the cross-compatibility is also tested in the pandas test suite, so maybe it's not too important to have it here as well. Anyway, opened an issue at #37853
Although fastparquet wrongly reading a column of booleans from a pyarrow-written file seems like quite a serious issue...
Now, similar data is used in the pandas test suite, where this is passing.
Thanks for raising the issue. I agree it does look potentially serious, but I hadn't looked too closely because it was just one test. The first thing would be to add testing here in CI. I think this PR is fine as is (I try to keep the test skips to an absolute minimum and remove them whenever possible; that then gets picked up by the next recipe sync anyway).
I took a closer look at it (could reproduce this locally), and it was good to do so: this is a problem in fastparquet reading Parquet files generated by the latest pyarrow; see details at #37853 (comment)
But yes, so skipping for now is perfectly fine.
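As a side note on the skip mechanism discussed in this thread: the recipe line under review appends `" or <test name>"` to a `tests_to_skip` string. The sketch below shows, in plain Python, how such a string accumulates and how it could be turned into a pytest `-k` filter. It assumes the recipe ultimately passes the string as `-k "not (...)"`, and the initial placeholder `_not_a_real_test` is an assumption for illustration; neither is shown in the snippet quoted here.

```python
def append_skip(tests_to_skip, test_name):
    """Mirror the Jinja line: tests_to_skip = tests_to_skip + " or <name>"."""
    return tests_to_skip + " or " + test_name


def pytest_k_argument(tests_to_skip):
    """Wrap the accumulated names so pytest deselects all of them.

    Assumption: the recipe feeds the string to `pytest -k`; the exact
    invocation is not part of the snippet under review.
    """
    return f"not ({tests_to_skip})"


# Assumed starting value: a placeholder so that every later addition
# can uniformly start with " or ".
tests_to_skip = "_not_a_real_test"
tests_to_skip = append_skip(tests_to_skip, "test_fastparquet_cross_compatibility")

k_arg = pytest_k_argument(tests_to_skip)
print(k_arg)
```

Starting from a placeholder name means each skip is a single self-contained line in the recipe, which is what makes it easy to drop an individual skip again at the next sync.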
@github-actions crossbow submit -g conda |
Revision: d7af614 Submitted crossbow builds: ursacomputing/crossbow @ actions-20144be8d4 |
Revision: 53a4b70 Submitted crossbow builds: ursacomputing/crossbow @ actions-80a2017be8 |
Failures:
* linux-64: Failed during artefact upload -- irrelevant
* linux-aarch64: Connection lost to agent -- irrelevant
* osx-64
I can skip the respective test on osx if that's what people prefer. Otherwise this PR should be ready. |
Anything else to do here? |
I don't think so, thanks for the ping, and for all the work here! |
After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit c9674bc. There were no benchmark performance regressions. 🎉 The full Conbench report has more details. It also includes information about possible false positives for unstable benchmarks that are known to sometimes produce them. |
…apache#37624)
Syncing after the release of 13.0.0 + a couple of migrations (state as of conda-forge/arrow-cpp-feedstock#1168 & conda-forge/r-arrow-feedstock#68)
Relevant updates:
* we're not building twice for different protobuf versions anymore
* new abseil version (fixes apache#36908)
* we've finally upgraded the aws-sdk to 1.11
* the default R versions (on unix) are now 4.2 & 4.3.
Also some further hardening of the activation scripts & clean-ups for dependencies & test skips.
* Closes: apache#37621
Lead-authored-by: H. Vetinari <[email protected]>
Co-authored-by: h-vetinari <[email protected]>
Signed-off-by: Joris Van den Bossche <[email protected]>
Syncing after the release of 13.0.0 + a couple of migrations (state as of conda-forge/arrow-cpp-feedstock#1168 & conda-forge/r-arrow-feedstock#68)
Relevant updates:
Also some further hardening of the activation scripts & clean-ups for dependencies & test skips.