improve testing framework #4962

Merged: RustedBones merged 26 commits into main from testing-improvements on Dec 13, 2023

Contributor

@RustedBones RustedBones commented Aug 22, 2023

Fix #4963

  • Add runWithOverrides to run a pipeline in unit tests while replacing some of its transforms
  • Add JobTest IO support for SMB
  • Add JobTest API for anonymous jobs (defined from a context)

@@ -36,15 +37,15 @@ trait PipelineTestUtils {
* }
* }}}
*/
def runWithContext[T](fn: ScioContext => T): ScioExecutionContext = {
Contributor Author

This is a breaking change. The type parameter was not really useful here.

codecov bot commented Aug 22, 2023

Codecov Report

Attention: 178 lines in your changes are missing coverage. Please review.

Comparison is base (1e0cb80) 63.42% compared to head (c69d279) 63.37%.

Files                                                     Patch %   Lines
.../smb/syntax/SortMergeBucketScioContextSyntax.scala     19.42%    141 Missing ⚠️
...cala/com/spotify/scio/smb/SortMergeTransform.scala     45.71%    19 Missing ⚠️
.../main/scala/com/spotify/scio/testing/JobTest.scala     67.39%    15 Missing ⚠️
...sdk/extensions/smb/ParquetTypeSortedBucketIO.scala     0.00%     2 Missing ⚠️
.../smb/syntax/SortMergeBucketSCollectionSyntax.scala     90.00%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4962      +/-   ##
==========================================
- Coverage   63.42%   63.37%   -0.06%     
==========================================
  Files         288      291       +3     
  Lines       10805    10841      +36     
  Branches      778      781       +3     
==========================================
+ Hits         6853     6870      +17     
- Misses       3952     3971      +19     

@RustedBones RustedBones force-pushed the testing-improvements branch 2 times, most recently from 3b481e3 to 434d558 Compare August 28, 2023 14:18
@RustedBones RustedBones force-pushed the testing-improvements branch from e051fc5 to ae86acf Compare August 30, 2023 09:22
@RustedBones RustedBones changed the base branch from main to v0.14.x August 30, 2023 09:23
Contributor Author

RustedBones commented Aug 30, 2023

In order to make testing work with sortMergeTransform, I had to introduce some breaking API changes.
Rebased the branch on v0.14.x

@RustedBones RustedBones added this to the 0.14.0 milestone Aug 30, 2023
@RustedBones RustedBones marked this pull request as ready for review September 11, 2023 07:31
Contributor

@kellen kellen left a comment

Nice work!

* }
* }}}
*/
def runWithOverrides(
Contributor

Could this instead be chained onto ScioExecutionContext so we could avoid the distinct naming? e.g.

runWithContext { sc =>
  // ...
}
.withOverrides(
  TransformOverride.of("operation", (v: Int) => v.toString)
)

Contributor Author

I tried that, but then there is no way to decide when the pipeline should be run.
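
The timing problem can be sketched with a toy model. This is a hedged illustration, not Scio's implementation: ToyContext, ToyResult and this runWithContext are hypothetical stand-ins, and the sketch assumes that runWithContext runs the pipeline before it returns.

```scala
// Toy model only: none of these classes are the real Scio API.
object EagerRunSketch {
  // A "pipeline" here is just a set of named Int => String steps.
  final class ToyContext {
    private var steps = Map.empty[String, Int => String]

    def transform(name: String)(f: Int => String): Unit =
      steps = steps.updated(name, f)

    // Evaluates every step immediately and freezes the outputs.
    def run(input: Int): ToyResult =
      new ToyResult(steps.map { case (name, f) => name -> f(input) })
  }

  final class ToyResult(val outputs: Map[String, String]) {
    // The outputs already exist when this is called, so an override
    // supplied here has nothing left to replace.
    def withOverrides(name: String, f: Int => String): ToyResult = this
  }

  // Assumption: like the helper under review, this runs before returning.
  def runWithContext(fn: ToyContext => Any): ToyResult = {
    val sc = new ToyContext
    fn(sc)
    sc.run(21)
  }

  // Chaining the override after the call: the original step has already run.
  val chained: ToyResult =
    runWithContext(sc => sc.transform("operation")(v => (v * 2).toString))
      .withOverrides("operation", v => v.toString)
}
```

In this model, chained.outputs("operation") still holds the result of the original step, because the override arrives after run has completed.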

Contributor

Ahh yeah that makes sense. Could we nest it somehow?

runWithContext { sc =>
  sc.withOverrides(
    TransformOverride.of(...)
  )
}

Contributor Author

Looks nice and should be possible. I will try that.

Contributor Author

This change is a bit risky: either the transform must be defined before the withOverrides call, or we need to store the override and call replaceAll just before run.

I tried to integrate the TestDataManager here, but it is inconvenient.
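
The second option, storing the override and applying it just before run, can be sketched with a toy model. All names are hypothetical: this withOverrides is not the Scio or Beam API, and the real replaceAll call is only mimicked by a map lookup.

```scala
// Toy model only: none of these classes are the real Scio or Beam API.
object DeferredOverrideSketch {
  final class ToyContext {
    private var steps = Map.empty[String, Int => String]
    private var overrides = Map.empty[String, Int => String]

    def transform(name: String)(f: Int => String): Unit =
      steps = steps.updated(name, f)

    // Only records the override. Nothing is replaced yet, so the call
    // order relative to transform(...) does not matter.
    def withOverrides(name: String, f: Int => String): Unit =
      overrides = overrides.updated(name, f)

    def run(input: Int): Map[String, String] = {
      // Stand-in for "call replaceAll just before run":
      // swap in any stored override, then evaluate every step.
      val effective = steps.map { case (name, f) =>
        name -> overrides.getOrElse(name, f)
      }
      effective.map { case (name, f) => name -> f(input) }
    }
  }

  def runWithContext(fn: ToyContext => Any): Map[String, String] = {
    val sc = new ToyContext
    fn(sc)
    sc.run(21)
  }

  // The override is registered before the transform exists, yet still applies.
  val overrideFirst: Map[String, String] = runWithContext { sc =>
    sc.withOverrides("operation", v => v.toString)
    sc.transform("operation")(v => (v * 2).toString)
  }
}
```

The trade-off in this sketch is extra mutable state on the context, which is one way to read the "bit risky" concern: the override is invisible until run, so a misspelled transform name would go unnoticed unless run validates it.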

Base automatically changed from v0.14.x to main November 29, 2023 07:52
Contributor

@kellen kellen left a comment

🙌

Contributor

@clairemcginty clairemcginty left a comment

amazing work!

@RustedBones RustedBones merged commit 5648e79 into main Dec 13, 2023
13 checks passed
@RustedBones RustedBones deleted the testing-improvements branch December 13, 2023 10:42
import com.spotify.scio.io.{KeyedIO, TapOf, TapT, TestIO}
import com.spotify.scio.util.ScioUtil

final class SortedBucketIO[K, T](path: String, override val keyBy: T => K)(implicit
Contributor

Now that I think about it, this class name is a bit confusing, as it conflicts with the Beam-packaged SortedBucketIO (https://github.com/spotify/scio/blob/main/scio-smb/src/main/java/org/apache/beam/sdk/extensions/smb/SortedBucketIO.java). I was testing out a custom SortedBucketIO.Read override, but since I had import com.spotify.scio.smb._ in my job (to get the Scala API bindings for reads/writes), I had to rename the org.apache.beam.sdk.extensions.smb.SortedBucketIO import to be able to use it.

Maybe SmbIO? SortedBucketTestIO?
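
The clash and the import-rename workaround can be reproduced in isolation. In the sketch below, beamsmb and sciosmb are hypothetical stand-ins for org.apache.beam.sdk.extensions.smb and com.spotify.scio.smb; only the import syntax is the point.

```scala
// Hypothetical stand-ins for the two packages that both expose a SortedBucketIO.
object beamsmb { object SortedBucketIO { val origin: String = "beam" } }
object sciosmb { object SortedBucketIO { val origin: String = "scio" } }

object ImportRenameSketch {
  // The wildcard import brings the Scala-side SortedBucketIO into scope...
  import sciosmb._
  // ...so the Beam-side one is imported under a different local name.
  import beamsmb.{SortedBucketIO => BeamSortedBucketIO}

  val scioOrigin: String = SortedBucketIO.origin
  val beamOrigin: String = BeamSortedBucketIO.origin
}
```

Renaming the Scala class, as suggested above, would remove the need for the aliased import in user jobs.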

Contributor Author

Yes, I noticed that. I tried to stay consistent with the Java naming, but it's probably better to avoid the conflict by using SmbIO.

Contributor

Cool, I can make that change today!

Successfully merging this pull request may close these issues.

TransformOverride runtime typeValidation in wrong order
4 participants