@dongjoon-hyun dongjoon-hyun commented Oct 8, 2025

What changes were proposed in this pull request?

This PR aims to skip the test_profile_pandas_udf and test_profile_pandas_function_api tests when pandas or pyarrow is unavailable, as the other test cases already do, e.g., test_memory_profiler_pandas_udf.

$ git grep test_profile_pandas
python/pyspark/tests/test_memory_profiler.py:    def test_profile_pandas_udf(self):
python/pyspark/tests/test_memory_profiler.py:    def test_profile_pandas_function_api(self):

Why are the changes needed?

We should check the test requirements explicitly. In other words, PySpark unit tests should pass without those packages installed, as the other existing unit test cases already do.

    @unittest.skipIf(
        not have_pandas or not have_pyarrow,
        cast(str, pandas_requirement_message or pyarrow_requirement_message),
    )
    def test_memory_profiler_pandas_udf(self):
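
Applied to the two tests named in this PR, the same guard might look like the sketch below. It is a standalone illustration, not the actual diff: the class name MemoryProfilerSkipSketch and the test bodies are hypothetical, and the have_* flags and *_requirement_message values are local stand-ins computed with importlib so that the snippet runs without PySpark.

```python
import importlib.util
import unittest
from typing import Optional, cast

# Local stand-ins (assumption): in PySpark these flags and messages are
# provided by the test utilities; here they are derived from importlib so
# the sketch is self-contained.
have_pandas = importlib.util.find_spec("pandas") is not None
have_pyarrow = importlib.util.find_spec("pyarrow") is not None
pandas_requirement_message: Optional[str] = (
    None if have_pandas else "pandas is not installed"
)
pyarrow_requirement_message: Optional[str] = (
    None if have_pyarrow else "pyarrow is not installed"
)


class MemoryProfilerSkipSketch(unittest.TestCase):  # hypothetical class name
    @unittest.skipIf(
        not have_pandas or not have_pyarrow,
        cast(str, pandas_requirement_message or pyarrow_requirement_message),
    )
    def test_profile_pandas_udf(self):
        # Placeholder body: only reached when both packages are importable.
        import pandas as pd

        self.assertEqual(int(pd.Series([1, 2]).sum()), 3)

    @unittest.skipIf(
        not have_pandas or not have_pyarrow,
        cast(str, pandas_requirement_message or pyarrow_requirement_message),
    )
    def test_profile_pandas_function_api(self):
        # Placeholder body: only reached when both packages are importable.
        import pyarrow as pa

        self.assertEqual(len(pa.array([1, 2, 3])), 3)
```

With either package missing, both tests are reported as skipped rather than failing with an ImportError, which matches the skip output shown in the test log of this PR.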

Does this PR introduce any user-facing change?

No. This is a test change.

How was this patch tested?

Pass the CIs and manually test without pyarrow.

...
Tests passed in 159 seconds

Skipped tests in pyspark.tests.test_memory_profiler with python3:
      test_memory_profiler_aggregate_in_pandas (pyspark.tests.test_memory_profiler.MemoryProfiler2Tests.test_memory_profiler_aggregate_in_pandas) ... skip (0.000s)
      test_memory_profiler_cogroup_apply_in_arrow (pyspark.tests.test_memory_profiler.MemoryProfiler2Tests.test_memory_profiler_cogroup_apply_in_arrow) ... skip (0.001s)
      test_memory_profiler_cogroup_apply_in_pandas (pyspark.tests.test_memory_profiler.MemoryProfiler2Tests.test_memory_profiler_cogroup_apply_in_pandas) ... skip (0.000s)
      test_memory_profiler_group_apply_in_arrow (pyspark.tests.test_memory_profiler.MemoryProfiler2Tests.test_memory_profiler_group_apply_in_arrow) ... skip (0.000s)
      test_memory_profiler_group_apply_in_pandas (pyspark.tests.test_memory_profiler.MemoryProfiler2Tests.test_memory_profiler_group_apply_in_pandas) ... skip (0.000s)
      test_memory_profiler_map_in_pandas_not_supported (pyspark.tests.test_memory_profiler.MemoryProfiler2Tests.test_memory_profiler_map_in_pandas_not_supported) ... skip (0.000s)
      test_memory_profiler_pandas_udf (pyspark.tests.test_memory_profiler.MemoryProfiler2Tests.test_memory_profiler_pandas_udf) ... skip (0.000s)
      test_memory_profiler_pandas_udf_iterator_not_supported (pyspark.tests.test_memory_profiler.MemoryProfiler2Tests.test_memory_profiler_pandas_udf_iterator_not_supported) ... skip (0.000s)
      test_memory_profiler_pandas_udf_window (pyspark.tests.test_memory_profiler.MemoryProfiler2Tests.test_memory_profiler_pandas_udf_window) ... skip (0.000s)
      test_memory_profiler_udf_with_arrow (pyspark.tests.test_memory_profiler.MemoryProfiler2Tests.test_memory_profiler_udf_with_arrow) ... skip (0.000s)
      test_profile_pandas_function_api (pyspark.tests.test_memory_profiler.MemoryProfilerTests.test_profile_pandas_function_api) ... skip (0.000s)
      test_profile_pandas_udf (pyspark.tests.test_memory_profiler.MemoryProfilerTests.test_profile_pandas_udf) ... skip (0.000s)
...

Was this patch authored or co-authored using generative AI tooling?

No.

@dongjoon-hyun
Member Author

Thank you, @zhengruifeng . Merged to master.

@dongjoon-hyun dongjoon-hyun deleted the SPARK-53846 branch October 9, 2025 01:37