Create a BatchOnly category to exclude some batch tests from Dataflow Validates Runner Streaming #36615

Abacn · 2025-10-24T14:55:48Z

The Dataflow Validates Runner Streaming tests are set up by gathering all tests in the ValidatesRunner category and adding the --streaming pipeline option. There are Dataflow Validates Runner test suites that also exercise these tests.

Over time the number of tests has grown, and so has the execution time. We have already increased the timeout several times in the past. Here I try to dedup some test cases that

already run in batch test suites, and
are error-handling tests, test a feature irrelevant to streaming mode, or have a base test case that is not excluded


Abacn · 2025-10-24T15:03:06Z

cc: @Amar3tto @tvalentyn for #36577 (comment)

tvalentyn · 2025-10-24T19:16:18Z

cc: @kennknowles

kennknowles · 2025-10-25T14:55:33Z

I would prefer a semantic tag that justifies why this is OK. For example,

Maybe it doesn't validate the runner but just validates the SDK harness so we can remove the tag.
Maybe it is always the same whether it is batch or streaming, so it could be @Category(SameInBatchAndStreaming.class)

But the problem is that the existence of a batch vs streaming mode is actually quite under-the-hood. In fact it would be better for the execution mode to be chosen at finer granularity than "whole job". And if there really is a potentially different codepath in batch and streaming we can't safely disable it.

The real solution to the issue is to combine multiple ValidatesRunner pipelines into a single Dataflow job so that we can run 100 at a time or more. This is just sort of hard to do with JUnit. Or at least it requires hackery that I didn't figure out yet.

If it is a hack to make just Dataflow work better I would rather make it in the Dataflow build.gradle than tag on the test, since it isn't a property of the test.
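To make the idea of a semantic tag concrete, here is a minimal sketch of category-based selection. It is not Beam's or JUnit's code: the `Category` annotation below is a local stand-in for JUnit's `@Category` so the sketch runs without JUnit on the classpath, matching is by exact class (JUnit also honors subtypes), `SameInBatchAndStreaming` is the name proposed in the comment above, and the test class and method names are hypothetical.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class CategoryFilterSketch {
  // Local stand-in for JUnit's @Category (assumption: simplified, methods only).
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.METHOD)
  @interface Category {
    Class<?>[] value();
  }

  // Marker interfaces used as categories.
  interface ValidatesRunner {}

  // Hypothetical semantic tag: behavior does not depend on execution mode.
  interface SameInBatchAndStreaming {}

  // Hypothetical test class with one mode-sensitive and one mode-independent test.
  static class ExampleTests {
    @Category({ValidatesRunner.class})
    public void modeSensitive() {}

    @Category({ValidatesRunner.class, SameInBatchAndStreaming.class})
    public void modeIndependent() {}
  }

  // Returns the sorted names of methods tagged with `include` but not with `exclude`.
  static List<String> select(Class<?> testClass, Class<?> include, Class<?> exclude) {
    List<String> selected = new ArrayList<>();
    for (Method m : testClass.getDeclaredMethods()) {
      Category c = m.getAnnotation(Category.class);
      if (c == null) {
        continue; // untagged methods are never selected
      }
      List<Class<?>> categories = Arrays.asList(c.value());
      if (categories.contains(include) && !categories.contains(exclude)) {
        selected.add(m.getName());
      }
    }
    Collections.sort(selected);
    return selected;
  }

  public static void main(String[] args) {
    // A streaming suite that skips mode-independent tests keeps only modeSensitive.
    System.out.println(
        select(ExampleTests.class, ValidatesRunner.class, SameInBatchAndStreaming.class));
  }
}
```

The tag describes a property of the test, while the decision to exclude it stays in the suite that calls `select`, which roughly mirrors the suggestion to keep runner-specific choices in the runner's build configuration.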

kennknowles · 2025-10-25T14:57:53Z

sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/CombineTest.java


    @Test
-    @Category({ValidatesRunner.class, UsesSideInputs.class})
+    @Category({ValidatesRunner.class, UsesSideInputs.class, BatchOnly.class})


Since streaming and batch have really different execution models, I would not opt this in to being batch only.

Abacn (Oct 27, 2025): Agree, and the name is subject to change. Currently I just use it for manual testing.

kennknowles · 2025-10-25T14:58:09Z

sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/CreateTest.java


  @Test
-  @Category(ValidatesRunner.class)
+  @Category({ValidatesRunner.class, BatchOnly.class})


Streaming and batch actually have somewhat different Create implementations, I think.

kennknowles · 2025-10-25T15:46:37Z

Another question: How many Dataflow jobs are in parallel? I think at some point we single-threaded these to avoid quota issues, or some other build issue, but we could run with 10+ threads and just need to adjust our quota.

Abacn · 2025-10-27T14:20:29Z

It's still timing out after removing 30+ tests (which should save ~1h). It is hard to check whether there was a stuck test without any logs. It is currently running on #36634; if there are stuck tests, they will show in the Dataflow console as long as they are in the running state. I will therefore check again 8h later.

Abacn · 2025-10-27T14:25:36Z

> Another question: How many Dataflow jobs are in parallel? I think at some point we single-threaded these to avoid quota issues, or some other build issue, but we could run with 10+ threads and just need to adjust our quota.

It was set to the default value:

beam/runners/google-cloud-dataflow-java/build.gradle

Line 274 in 4325e2b

maxParallelForks Integer.MAX_VALUE

Checking the Dataflow console while the suite was running, it is effectively 4 jobs in parallel.

> but we could run with 10+ threads and just need to adjust our quota.

We'd definitely need a much larger quota if we bump this value. Currently one of the quotas most easily hit is in-use IPv4 addresses in us-central1. IPv4 quota is not easy to get approved. I wonder if we can run validates runner tests (and supported Dataflow tests in general) on IPv6 only, or disable external networks (could be a follow-up).

kennknowles · 2025-10-27T15:41:15Z

This seems like a good idea. I'm not sure why we would need external networks. Certainly not for all tests.

Another possibility could be to shard the test suite for Dataflow. I think ParDoTest and CombineTest probably have a ton of pipelines, and if you separate some "core ValidatesRunner suite" from an "extended ValidatesRunner suite" it might reduce flakes and speed up reruns while reducing quota usage.

Abacn · 2025-10-27T19:29:21Z

I understand what is going on now. The parallelism isn't even. As of now (after 8h+), a single test class, ViewTest, is running (presumably the other tests have finished). There isn't a stuck pipeline. That explains why this PR doesn't work in terms of decreasing test running time.

ViewTest has 42 validates runner tests. We need to split it as well. However, pipeline running time has definitely gone up recently. Checking an example pipeline log, it appears pipeline teardown time has increased (it usually takes 4h to start up, then needs 2-5 min from "stopping worker pool" to pipeline finish). At "stopping worker pool" the PipelineResult is still running. I vaguely remember this previously being around 1.5-3 min.

I think we're good for now in terms of validation (no SDK side regression found)
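The uneven parallelism described above can be illustrated with a small model. This is a sketch under stated assumptions, not a measurement: it assumes whole test classes are the unit handed to a fork, that every test takes the same time, and that classes are assigned greedily to the least-loaded fork. Only the 42 tests in ViewTest and the 4 effective parallel jobs come from this thread; the other per-class counts are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

public class ShardingSketch {
  // Greedy model: hand each load (largest first) to the currently least-loaded fork
  // and return the load of the busiest fork, i.e. the wall-clock time in "test units".
  static int makespan(List<Integer> loads, int forks) {
    List<Integer> sorted = new ArrayList<>(loads);
    sorted.sort(Collections.reverseOrder());
    PriorityQueue<Integer> forkLoads = new PriorityQueue<>();
    for (int i = 0; i < forks; i++) {
      forkLoads.add(0);
    }
    for (int load : sorted) {
      forkLoads.add(forkLoads.poll() + load);
    }
    return Collections.max(forkLoads);
  }

  // Splits every class into chunks of at most maxChunk tests.
  static List<Integer> split(List<Integer> loads, int maxChunk) {
    List<Integer> chunks = new ArrayList<>();
    for (int load : loads) {
      for (int rest = load; rest > 0; rest -= maxChunk) {
        chunks.add(Math.min(rest, maxChunk));
      }
    }
    return chunks;
  }

  public static void main(String[] args) {
    // 42 = ViewTest (from the thread); the remaining counts are hypothetical.
    List<Integer> testsPerClass = Arrays.asList(42, 10, 8, 6, 6);
    System.out.println(makespan(testsPerClass, 4));            // prints 42
    System.out.println(makespan(split(testsPerClass, 10), 4)); // prints 20
  }
}
```

In this model the largest class alone sets the total running time, so removing tests from smaller classes changes nothing, while splitting the largest class brings the total close to the ideal of 72 / 4 = 18 units.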

Amar3tto · 2025-10-27T19:39:08Z

cc: @tvalentyn

github-actions bot added runners dataflow labels Oct 24, 2025

Create a BatchOnly category to exclude some batch tests from Dataflow…

4325e2b

… Validates Runner Streaming

Abacn force-pushed the dedup-validate-runner branch from 89b0901 to 4325e2b Compare October 24, 2025 14:56

github-actions bot added the java label Oct 24, 2025

kennknowles reviewed Oct 25, 2025

Abacn closed this Nov 4, 2025

Abacn deleted the dedup-validate-runner branch November 4, 2025 21:28
