Skip to content

[SPARK-52937][SDP] Sinks #51644

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from
Draft

[SPARK-52937][SDP] Sinks #51644

wants to merge 1 commit into from

Conversation

sryza
Copy link
Contributor

@sryza sryza commented Jul 24, 2025

What changes were proposed in this pull request?

We were not resetting checkpoint dirs on full refresh.

What changes were proposed in this pull request?

As proposed in the Declarative Pipelines SPIP, a sink is a generic target for a flow to send data that is external to the pipeline. Add support for defining them and executing flows that target them.

This PR implements sinks:

  • Spark Connect message for registering sink definitions – DefineSink
  • New FlowExecution subtype for executing flows that write to sinks – SinkWrite

TODO: explain sink checkpoint locations and pipeline storage root

Why are the changes needed?

Implement the Declarative Pipelines SPIP.

Does this PR introduce any user-facing change?

How was this patch tested?

Was this patch authored or co-authored using generative AI tooling?

@sryza sryza changed the title Sinks [SPARK-52937][SDP] Sinks Jul 24, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant