Stub in a run_pipeline CLI and add example usage #492

Merged
merged 1 commit into from
Jan 29, 2025

bbrowning
Contributor

A typical research flow with SDG is to run a single pipeline against an input dataset and produce an output dataset. We need a simple way to do this within the SDG repo, ideally without having to write code just to run a Pipeline. This stubs in a simple run_pipeline CLI for that purpose.

This also adds example usage by documenting the IterBlock and showing how to use that block with the new run_pipeline CLI.
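
For readers unfamiliar with the shape of such a tool, here is a minimal sketch of a "one pipeline, one input dataset, one output dataset" CLI. It is not the code added in this PR. The `--input` and `--output` flags, the JSONL file format, and the `identity_pipeline` placeholder are all assumptions made for illustration, and the sketch deliberately does not import the SDG library.

```python
import argparse
import json
from pathlib import Path


def identity_pipeline(records):
    """Hypothetical placeholder for a real pipeline: returns the records unchanged."""
    return list(records)


def read_jsonl(path):
    """Read one JSON object per non-empty line (JSONL is an assumed format)."""
    with Path(path).open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def write_jsonl(path, records):
    """Write one JSON object per line."""
    with Path(path).open("w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")


def build_parser():
    # Flag names are illustrative, not the flags of the real run_pipeline CLI.
    parser = argparse.ArgumentParser(
        description="Run one pipeline over an input dataset and write the result."
    )
    parser.add_argument("--input", required=True, help="path to the input JSONL file")
    parser.add_argument("--output", required=True, help="path for the output JSONL file")
    return parser


def main(argv=None, pipeline=identity_pipeline):
    """Parse arguments, run the pipeline over the input, and write the output."""
    args = build_parser().parse_args(argv)
    records = read_jsonl(args.input)
    write_jsonl(args.output, pipeline(records))
    return 0
```

Calling `main(["--input", "in.jsonl", "--output", "out.jsonl"])` copies the records through the placeholder. Passing a different callable as `pipeline` is how a real pipeline would be swapped in under these assumptions.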

@mergify mergify bot added the documentation, testing, and ci-failure labels Jan 21, 2025
Signed-off-by: Ben Browning <[email protected]>
@mergify mergify bot added the CI/CD label and removed the ci-failure label Jan 21, 2025
@bbrowning bbrowning added this to the 0.8.0 milestone Jan 23, 2025
@bbrowning bbrowning requested a review from a team January 27, 2025 19:06
@bbrowning
Contributor Author

This is blocking additional examples and tests in our repo that will depend on this CLI and the test setup created here, such as an annotation pipeline example demonstrating how SDG can be used for tasks other than pure data generation.

Member

@aakankshaduggal aakankshaduggal left a comment

Thanks @bbrowning
LGTM

@mergify mergify bot added the one-approval label Jan 29, 2025
@mergify mergify bot merged commit 597e372 into instructlab:main Jan 29, 2025
24 checks passed
@mergify mergify bot removed the one-approval label Jan 29, 2025
Labels
CI/CD, documentation, testing
3 participants