(EAI-375): Ingest snooty docs facets and meta #558
Jira: https://jira.mongodb.org/browse/EAI-375
Changes
Ingest the snooty meta and facets fields as Page.metadata
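As an illustration of the change above, here is a minimal sketch of how meta and facets values might be folded into a page's metadata. The node shape (`SketchNode`), the `collectMetadata` helper, and the example values are all hypothetical; they are not taken from this PR, and the real snooty AST may look different.

```typescript
// Hypothetical, simplified node shape. The real snooty AST may differ.
interface SketchNode {
  type: string;
  name?: string;
  options?: Record<string, string>;
  children?: SketchNode[];
}

// Walk the tree and collect the options of `meta` and `facet` directives
// into one flat metadata record (assumed layout, for illustration only).
function collectMetadata(root: SketchNode): Record<string, string> {
  const metadata: Record<string, string> = {};
  const stack: SketchNode[] = [root];
  while (stack.length > 0) {
    const node = stack.pop() as SketchNode;
    if (node.type === "directive" && node.options) {
      if (node.name === "meta") {
        // Assumption: meta options are plain key/value strings.
        Object.assign(metadata, node.options);
      } else if (node.name === "facet") {
        // Assumption: a facet carries a `name` and a `values` option.
        const { name, values } = node.options;
        if (name !== undefined && values !== undefined) {
          metadata[name] = values;
        }
      }
    }
    for (const child of node.children ?? []) {
      stack.push(child);
    }
  }
  return metadata;
}

// Made-up example page, not real docs content.
const examplePage: SketchNode = {
  type: "root",
  children: [
    { type: "directive", name: "meta", options: { description: "How to install" } },
    { type: "directive", name: "facet", options: { name: "genre", values: "tutorial" } },
    { type: "paragraph", children: [{ type: "text" }] },
  ],
};

const pageMetadata = collectMetadata(examplePage);
console.log(pageMetadata);
```

Flattening everything into one record keeps the sketch short; a real implementation would likely need to handle nested facets and key collisions between meta and facet names.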
Notes
facets.toml files. There's a DOP ticket to capture the facets.toml
https://jira.mongodb.org/browse/DOP-5182
Experiment Results
The experiment compares the ingestion pipeline with the new snooty metadata against the previous baseline. The results can be found here: mongodb-chatbot-retrieval/experiments/mongodb-chatbot-retrieval-snooty-metadata
The results actually show a very slight decrease in search quality as a result of these changes.
Next Steps
Based on the results, I think we should take one of the following paths:
pages collection.
My inclination is to proceed with path 2. Leave this to the side, and if we can't get the metadata inclusion to improve system performance, then do some light experimentation on path 3, and only do 4 if we can't get it to work.
But then again, maybe I'm overcomplicating and we should just YOLO path 1; it shouldn't have a meaningful negative impact...