Pipeline Action support for multiple index pages #12
Comments
@saumier I propose this design. Let me know what you think.
@dev-aravind I have a question. Do you see one entity-identifier applied per page-url? In my use case: I am wondering whether we need a list of entity-identifier values at all. Perhaps a single entity-identifier could apply to all page-urls, and when we come across a need for multiple entity-identifiers, we add a check: if entity-identifier is a list, we apply one entity-identifier per page-url. But I like the consistency of your solution, so let's go with one entity-identifier applied per page-url as you propose. Please go ahead and implement this as a minor version, as long as it remains backwards compatible.
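The pairing rule agreed on above (one entity-identifier per page-url, with a single value still accepted for backwards compatibility) can be sketched as follows. This is only an illustrative sketch, not the Artsdata Pipeline Action's actual code: the function names `normalize` and `pair_inputs` are hypothetical, the identifier values in the usage note are placeholders, and the real input syntax for passing multiple values is not shown in this thread.

```python
from typing import List, Sequence, Tuple, Union

Value = Union[str, Sequence[str]]


def normalize(value: Value) -> List[str]:
    """Wrap a single string in a list; copy a sequence as-is.

    Accepting a bare string keeps the old single-value usage working.
    """
    if isinstance(value, str):
        return [value]
    return list(value)


def pair_inputs(page_url: Value, entity_identifier: Value) -> List[Tuple[str, str]]:
    """Pair each page-url with exactly one entity-identifier (hypothetical helper)."""
    urls = normalize(page_url)
    identifiers = normalize(entity_identifier)
    if len(urls) != len(identifiers):
        raise ValueError(
            f"expected one entity-identifier per page-url, "
            f"got {len(urls)} page-url(s) and {len(identifiers)} identifier(s)"
        )
    return list(zip(urls, identifiers))
```

For the theplayhouse.ca test case, calling `pair_inputs` with the two sitemap URLs and two placeholder identifiers would yield two (page-url, entity-identifier) pairs, while passing a single URL and a single identifier as plain strings yields one pair, matching the existing behaviour.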
@saumier You can test this now using our custom crawl test workflow (use the enhancement/issue-12 branch). Please let me know if you find any issues; if not, we can move on to releasing this. If you want to test the multiple page crawl, the input should be like:
both without the quotes.
@dev-aravind I ran a test with theplayhouse.ca and it collected all the pages. Excellent. Please go ahead and release.
@saumier Done. Let me know if you want to add any workflows in orion to use this feature.
@dev-aravind Please design an enhancement to the Artsdata Pipeline Action that would allow multiple page-url and entity-identifier values to be crawled for the same artifact. The test case can be theplayhouse-ca, which has 2 sitemaps: https://theplayhouse.ca/fr/sitemap.xml and https://theplayhouse.ca/en/sitemap.xml. The idea is to load all pages from both sitemaps into the artifact orion/theplayhouse-ca.