Commit 18f0463

[SPARK-53846][PYTHON][TESTS] Skip test_profile_pandas_* tests if pandas or pyarrow are unavailable
### What changes were proposed in this pull request?

This PR aims to skip `test_profile_pandas_udf` and `test_profile_pandas_function_api` tests if `pandas` or `pyarrow` are unavailable like the other test cases, e.g., `test_memory_profiler_pandas_udf`.

```
$ git grep test_profile_pandas
python/pyspark/tests/test_memory_profiler.py:    def test_profile_pandas_udf(self):
python/pyspark/tests/test_memory_profiler.py:    def test_profile_pandas_function_api(self):
```

### Why are the changes needed?

We had better check the test requirements explicitly. In other words, PySpark unit tests should pass without those packages like the existing other unit test cases.

https://github.com/apache/spark/blob/bf2457b6db77b911874a22e6d73f07793f44bef1/python/pyspark/tests/test_memory_profiler.py#L307-L311

### Does this PR introduce _any_ user-facing change?

No. This is a test change.

### How was this patch tested?

Pass the CIs and manually test without `pyarrow`.

```
...
Tests passed in 159 seconds

Skipped tests in pyspark.tests.test_memory_profiler with python3:
    test_memory_profiler_aggregate_in_pandas (pyspark.tests.test_memory_profiler.MemoryProfiler2Tests.test_memory_profiler_aggregate_in_pandas) ... skip (0.000s)
    test_memory_profiler_cogroup_apply_in_arrow (pyspark.tests.test_memory_profiler.MemoryProfiler2Tests.test_memory_profiler_cogroup_apply_in_arrow) ... skip (0.001s)
    test_memory_profiler_cogroup_apply_in_pandas (pyspark.tests.test_memory_profiler.MemoryProfiler2Tests.test_memory_profiler_cogroup_apply_in_pandas) ... skip (0.000s)
    test_memory_profiler_group_apply_in_arrow (pyspark.tests.test_memory_profiler.MemoryProfiler2Tests.test_memory_profiler_group_apply_in_arrow) ... skip (0.000s)
    test_memory_profiler_group_apply_in_pandas (pyspark.tests.test_memory_profiler.MemoryProfiler2Tests.test_memory_profiler_group_apply_in_pandas) ... skip (0.000s)
    test_memory_profiler_map_in_pandas_not_supported (pyspark.tests.test_memory_profiler.MemoryProfiler2Tests.test_memory_profiler_map_in_pandas_not_supported) ... skip (0.000s)
    test_memory_profiler_pandas_udf (pyspark.tests.test_memory_profiler.MemoryProfiler2Tests.test_memory_profiler_pandas_udf) ... skip (0.000s)
    test_memory_profiler_pandas_udf_iterator_not_supported (pyspark.tests.test_memory_profiler.MemoryProfiler2Tests.test_memory_profiler_pandas_udf_iterator_not_supported) ... skip (0.000s)
    test_memory_profiler_pandas_udf_window (pyspark.tests.test_memory_profiler.MemoryProfiler2Tests.test_memory_profiler_pandas_udf_window) ... skip (0.000s)
    test_memory_profiler_udf_with_arrow (pyspark.tests.test_memory_profiler.MemoryProfiler2Tests.test_memory_profiler_udf_with_arrow) ... skip (0.000s)
    test_profile_pandas_function_api (pyspark.tests.test_memory_profiler.MemoryProfilerTests.test_profile_pandas_function_api) ... skip (0.000s)
    test_profile_pandas_udf (pyspark.tests.test_memory_profiler.MemoryProfilerTests.test_profile_pandas_udf) ... skip (0.000s)
...
```

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #52549 from dongjoon-hyun/SPARK-53846.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
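
The skip pattern this commit applies can be sketched in isolation. The block below is a minimal, self-contained illustration and not the PySpark code: `_requirement_message`, `ProfilePandasTests`, and `run_suite` are hypothetical names, and the availability check uses `importlib.util.find_spec` as a stand-in. The real tests take `have_pandas`, `have_pyarrow`, and the two requirement messages from PySpark's own test utilities, which may check more than mere importability.

```python
import importlib.util
import unittest
from typing import Optional, cast


def _requirement_message(module_name: str) -> Optional[str]:
    """Hypothetical helper: None if the module is importable, else a reason."""
    if importlib.util.find_spec(module_name) is None:
        return f"{module_name} must be installed to run this test"
    return None


# Assumption: these names mirror the ones used in the diff, but are
# computed here with the simplified helper above.
pandas_requirement_message = _requirement_message("pandas")
pyarrow_requirement_message = _requirement_message("pyarrow")
have_pandas = pandas_requirement_message is None
have_pyarrow = pyarrow_requirement_message is None


class ProfilePandasTests(unittest.TestCase):
    # Skip, rather than fail, when either optional dependency is missing.
    @unittest.skipIf(
        not have_pandas or not have_pyarrow,
        cast(str, pandas_requirement_message or pyarrow_requirement_message),
    )
    def test_needs_pandas_and_pyarrow(self):
        # Imported inside the test so the module loads without them.
        import pandas as pd
        import pyarrow as pa

        table = pa.Table.from_pandas(pd.DataFrame({"x": [1, 2, 3]}))
        self.assertEqual(table.num_rows, 3)


def run_suite() -> unittest.TestResult:
    """Run the hypothetical test case and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ProfilePandasTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_suite()` in an environment without `pandas` or `pyarrow` should report the test as skipped with the requirement message as the reason, which is the behavior shown in the skipped-test listing of the commit message.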
1 parent 22e24df commit 18f0463

1 file changed: +8 -0 lines changed

python/pyspark/tests/test_memory_profiler.py

Lines changed: 8 additions & 0 deletions
@@ -112,6 +112,10 @@ def test_memory_profiler(self):
             self.sc.dump_profiles(d)
             self.assertTrue(f"udf_{id}_memory.txt" in os.listdir(d))
 
+    @unittest.skipIf(
+        not have_pandas or not have_pyarrow,
+        cast(str, pandas_requirement_message or pyarrow_requirement_message),
+    )
     def test_profile_pandas_udf(self):
         udfs = [self.exec_pandas_udf_ser_to_ser, self.exec_pandas_udf_ser_to_scalar]
         udf_names = ["ser_to_ser", "ser_to_scalar"]
@@ -130,6 +134,10 @@ def test_profile_pandas_udf(self):
                 "Profiling UDFs with iterators input/output is not supported" in str(user_warns[0])
             )
 
+    @unittest.skipIf(
+        not have_pandas or not have_pyarrow,
+        cast(str, pandas_requirement_message or pyarrow_requirement_message),
+    )
     def test_profile_pandas_function_api(self):
         apis = [self.exec_grouped_map]
         f_names = ["grouped_map"]
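
The reason passed to `unittest.skipIf` in both hunks relies on Python's `or` returning its first truthy operand. The sketch below shows that selection on its own, assuming each requirement message is `None` when the package is available and a string otherwise (the `cast(str, ...)` in the diff suggests an `Optional[str]`, but the source does not show the definitions). `skip_reason` is a hypothetical name used only for illustration.

```python
from typing import Optional, cast


def skip_reason(pandas_msg: Optional[str], pyarrow_msg: Optional[str]) -> str:
    # `a or b` evaluates to `a` when it is truthy, otherwise to `b`.
    # `cast` only informs the type checker; it does nothing at runtime.
    return cast(str, pandas_msg or pyarrow_msg)


# Only pandas is missing: its message is used.
only_pandas = skip_reason("pandas is required", None)

# Only pyarrow is missing: the expression falls through to its message.
only_pyarrow = skip_reason(None, "pyarrow is required")

# Both are missing: the pandas message comes first and wins.
both_missing = skip_reason("pandas is required", "pyarrow is required")
```

When both packages are present the expression evaluates to `None`, but the skip condition is then false, so the reason is never displayed.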
