[Task]: Improve Enrichment docs #33012
Comments
@claudevdm this would be good to pick up at some point when you have space (don't drop other things, just when this fits in nicely) |
Hello, I would like to work on this issue. Please assign it to me. |
Thanks! Please check https://beam.apache.org/contribute/: Comment “.take-issue” on the issue you'd like to work on. This will cause the issue to be assigned to you. |
.take-issue |
Hello, I am updating the file by adding this section. Is it correct?
BigQuery Support
The enrichment transform supports integration with BigQuery to dynamically enrich data using BigQuery datasets. By leveraging BigQuery as an external data source, users can execute efficient lookups for data enrichment directly in their Apache Beam pipelines. To use BigQuery for enrichment:
This integration is particularly beneficial for use cases that require augmenting real-time streaming data with information stored in BigQuery.
Batching
To optimize requests to external services, the enrichment transform uses batching. Instead of performing a lookup for each individual element, the transform groups multiple elements into a batch and performs a single lookup for the entire batch. Advantages of Batching:
Users can configure the batch size by specifying parameters in their pipeline setup. Adjusting the batch size can help fine-tune the balance between throughput and latency.
Caching with
|
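The batching idea described in the proposed section above can be illustrated with a minimal, library-free sketch. It is not the enrichment transform's actual implementation or API: the names `batched`, `enrich_in_batches`, and `lookup_batch` are hypothetical, and the external service is assumed to accept a list of keys and return a dict of results.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Optional, Tuple


def batched(items: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[str] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly smaller, batch.
        yield batch


def enrich_in_batches(
    keys: Iterable[str],
    lookup_batch: Callable[[List[str]], Dict[str, dict]],
    batch_size: int,
) -> Iterator[Tuple[str, Optional[dict]]]:
    """Pair each key with its lookup result, using one call per batch.

    lookup_batch is a hypothetical stand-in for the external service.
    Keys missing from the response are paired with None.
    """
    for batch in batched(keys, batch_size):
        results = lookup_batch(batch)  # one request covers the whole batch
        for key in batch:
            yield key, results.get(key)
```

With five keys and a batch size of 2, this sketch issues three lookups instead of five. A larger batch size means fewer requests, but each element waits longer for its batch to fill, which is the throughput and latency trade-off mentioned above.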
I am also adding this section to explain cross-joins. Please tell me if there are any mistakes or anything that needs updating.
What is a Cross-Join?
A cross-join is a Cartesian product operation where each row from one table is combined with every row from another table. It is useful when we want to create all possible combinations of two datasets. Example:
Result of Cross-Join:
Cross-joins can be computationally expensive for large datasets, so use them judiciously. |
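A cross-join as defined in the comment above can be sketched in plain Python. This is only an illustration of the Cartesian product, not how the enrichment transform performs joins; the `cross_join` helper and the sample rows are hypothetical.

```python
from itertools import product
from typing import Dict, List


def cross_join(left: List[Dict], right: List[Dict]) -> List[Dict]:
    """Combine every row of left with every row of right.

    On a column-name clash, the value from the right row wins.
    """
    return [{**l_row, **r_row} for l_row, r_row in product(left, right)]


# Hypothetical sample data: 2 rows crossed with 3 rows.
colors = [{"color": "red"}, {"color": "blue"}]
sizes = [{"size": "S"}, {"size": "M"}, {"size": "L"}]
combined = cross_join(colors, sizes)
```

The output has one row per pair, so `m` rows crossed with `n` rows produce `m * n` rows. That growth is why cross-joins become expensive on large datasets.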
@damccorm, where can I find the correct URL to replace the broken one? |
I think that resource may have been deleted, but we could probably link to
The other changes look reasonable to me at a high level. It is probably easier to just go ahead and open a PR when you have a chance, though; that will make it a bit easier to see the difference and review it. Thanks for doing this! |
Hi @Vishesh-Tripathi, are you currently working on this issue? I’d like to take it on otherwise. 🙏 |
Yes, I am working on this issue.
What needs to happen?
There are a few targeted fixes needed for the Enrichment docs:
example handler with composite row key support
link is dead
Issue Priority
Priority: 2 (default / most normal work should be filed as P2)
Issue Components