DYN-6055 Lucene Search Category Based. #14663

Merged
merged 3 commits into from
Dec 11, 2023

Contributor

@RobertGlobant20 RobertGlobant20 commented Nov 30, 2023

Purpose

Implementing category-based Lucene search.
I've updated the Lucene search so that when the user types the "." character, the search becomes category based (e.g. the search criteria "list.r" finds all nodes that belong to the list category and whose name starts with "r").
For this implementation I've indexed two new fields: NameSplitted and CategorySplitted. For NameSplitted, when the node name contains the category (like List.Shop), we use the last part (after the "." character). The same applies to CategorySplitted: we use the last part after the "." character.
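
A minimal sketch of how the two split fields could be derived, assuming each one keeps only the text after the last "." as described above. `SplitFieldSketch` and `LastSegment` are hypothetical names used for illustration; they are not taken from the PR.

```csharp
using System;

// Hypothetical helper, for illustration only (not the PR's code).
public static class SplitFieldSketch
{
    // Returns the text after the last '.', or the whole value when there is no '.'.
    public static string LastSegment(string value)
    {
        if (string.IsNullOrEmpty(value)) return string.Empty;

        int lastDot = value.LastIndexOf('.');
        return lastDot >= 0 ? value.Substring(lastDot + 1) : value;
    }
}
```

Under this assumption, `SplitFieldSketch.LastSegment("List.Shop")` yields "Shop" for NameSplitted, and `SplitFieldSketch.LastSegment("Core.Input")` yields "Input" for CategorySplitted.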

Declarations

Check these if you believe they are true

  • The codebase is in a better state after this PR
  • Is documented according to the standards
  • The level of testing this PR includes is appropriate
  • User facing strings, if any, are extracted into *.resx files
  • All tests pass using the self-service CI.
  • Snapshot of UI changes, if any.
  • Changes to the API follow Semantic Versioning and are documented in the API Changes document.
  • This PR modifies some build requirements and the readme is updated
  • This PR contains no files larger than 50 MB

Release Notes

Implementing category-based Lucene search.

Reviewers

@QilongTang

FYIs

@Amoursol

@RobertGlobant20
Contributor Author

This is the GIF showing the added search behavior.
LuceneSearchCategoryBased

@QilongTang QilongTang added this to the 3.0 milestone Dec 5, 2023
f != nameof(LuceneConfig.NodeFieldsEnum.CategorySplitted))
continue;

var categorySearchBased = searchTerm.Split('.');
Contributor

Would there be an edge case where we have, A.B.C? So the category would be A, and node name is B.C?

Contributor Author

@QilongTang for this implementation I'm only considering the basic "category.node" search criteria. If we want to support more complex categories, we will need to think of another solution.
For this case I can add some code to prevent a search of the type "A.B.C", so that in that case we won't return results. What do you think?

Contributor

@QilongTang QilongTang Dec 6, 2023

Hi @RobertGlobant20, not necessarily. I am trying to confirm that when the user searches for A.B.C with the current code, we are searching for the B.C node under the category A, right?

Contributor Author

I've added validation to support the criteria A.B.C. I'm not sure if this is what you expected (if not, please give a more detailed example); see the attached GIF for the behavior.
commit: a3f7bd6

PackagesGuideBugFix2

@@ -317,19 +355,31 @@ private WildcardQuery CalculateFieldWeight(string fieldName, string searchTerm,
{
WildcardQuery query;

var termText = fieldName == nameof(LuceneConfig.NodeFieldsEnum.NameSplitted) ? searchTerm + "*" : "*" + searchTerm + "*";
Contributor

Would you add some comments on this line?

Contributor Author

I've added a comment in the next commit: a3f7bd6
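
For readers following this thread: the `termText` line picks a prefix pattern for the NameSplitted field and a contains pattern for every other field. The sketch below restates that rule as a standalone function. `WildcardTextSketch.Build` is a hypothetical name, and the field name is passed as a plain string (assumed to equal `nameof(LuceneConfig.NodeFieldsEnum.NameSplitted)`) so the block stays self-contained.

```csharp
using System;

// Hypothetical restatement of the termText rule, for illustration only.
public static class WildcardTextSketch
{
    // Assumed field name; the PR uses nameof(LuceneConfig.NodeFieldsEnum.NameSplitted).
    public const string NameSplitted = "NameSplitted";

    // NameSplitted gets a prefix pattern ("r*"), so the node name must start with the term.
    // Any other field gets a contains pattern ("*r*"), so the term may appear anywhere.
    public static string Build(string fieldName, string searchTerm)
    {
        return fieldName == NameSplitted
            ? searchTerm + "*"
            : "*" + searchTerm + "*";
    }
}
```

The prefix form is what makes "list.r" match only node names that start with "r" instead of names that merely contain it.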


var categoryParts = node.FullCategoryName.Split('.');
string categoryParsed = categoryParts.Length > 1 ? categoryParts[categoryParts.Length - 1] : node.FullCategoryName;
SetDocumentFieldValue(doc, nameof(LuceneConfig.NodeFieldsEnum.CategorySplitted), categoryParsed);
Contributor

Would you add some comments?

Contributor Author

I've added a comment in the next commit: a3f7bd6

Contributor

@QilongTang QilongTang left a comment

Overall looks good, need some more comments for book keeping and could use a unit test if doable

I've added more comments.
I've changed the validation to always take the last two sections (NameSplitted can be empty because there is a later validation), so if the search criteria uses a long category like "Core.File.FileSystem.A", it takes only the last two sections.
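
A rough sketch of the "last two sections" rule described in this commit message, assuming the search term is simply split on "." and the final two pieces become the category and node-name parts. `CategorySearchTermSketch` and `TryParse` are hypothetical names; the PR's actual validation may differ.

```csharp
using System;

// Hypothetical parser, for illustration only (not the PR's code).
public static class CategorySearchTermSketch
{
    // Splits a search term on '.' and keeps only the last two sections:
    // the category part and the (possibly empty) node-name part.
    // Returns false when the term contains no '.', i.e. it is not category based.
    public static bool TryParse(string searchTerm, out string category, out string name)
    {
        category = string.Empty;
        name = string.Empty;
        if (string.IsNullOrEmpty(searchTerm)) return false;

        string[] parts = searchTerm.Split('.');
        if (parts.Length < 2) return false;

        category = parts[parts.Length - 2];
        name = parts[parts.Length - 1];
        return true;
    }
}
```

With this sketch, "Core.File.FileSystem.A" parses to category "FileSystem" and name "A", and "list.r" parses to category "list" and name "r".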
@RobertGlobant20
Contributor Author

Overall looks good, need some more comments for book keeping and could use a unit test if doable

@QilongTang does it mean that I need to add a unit test for this behavior?

github-actions bot commented Dec 6, 2023

UI Smoke Tests

Test: success. 2 passed, 0 failed.
TestComplete Test Result
Workflow Run: UI Smoke Tests
Checks: UI Smoke Tests - net8.0, UI Smoke Tests - net6.0

@QilongTang
Contributor

Overall looks good, need some more comments for book keeping and could use a unit test if doable

@QilongTang does it mean that I need to add a unit test for this behavior?

Please do

Added a unit test that validates category-based search.
@RobertGlobant20
Contributor Author

Overall looks good, need some more comments for book keeping and could use a unit test if doable

@QilongTang does it mean that I need to add a unit test for this behavior?

Please do

@QilongTang a new unit test validating this functionality was added; please check the next commit: d38e79a

github-actions bot commented Dec 7, 2023

UI Smoke Tests

Test: success. 2 passed, 0 failed.
TestComplete Test Result
Workflow Run: UI Smoke Tests
Checks: UI Smoke Tests - net6.0, UI Smoke Tests - net8.0

Contributor

@QilongTang QilongTang left a comment

LGTM

@QilongTang QilongTang merged commit eaff9b3 into DynamoDS:master Dec 11, 2023
22 checks passed
QilongTang pushed a commit that referenced this pull request Dec 11, 2023
* DYN-6055 Lucene Search Category Based.

* DYN-6055 Lucene Search Category Based Code Review

* DYN-6055 Lucene Search Category Based Code Review

@QilongTang QilongTang mentioned this pull request Dec 11, 2023
9 tasks
QilongTang added a commit that referenced this pull request Dec 12, 2023
* DYN-6524 Packages Tour Regression (#14721)

I made several updates to the Packages guide:
I've split the functionality of collapseExpandPackage, so now there is one method for collapsing and another for expanding; the function collapseExpandPackage was renamed to expandPackageDiv.
Some steps in dynamo_guides.json were modified to match the behavior of expanding the package when moving from Step8 to Step9.
Finally, I added a validation in the function that highlights a div, so if the div is already highlighted then we don't add the functionality (this prevents showing the orange rectangle when closing the guide).

* DYN-6055 Lucene Search Category Based. (#14663)

* Fixing Search Regressions (#14738)

For the test SearchingForACategoryReturnsAllItsChildren: with my changes for category-based search, the term "Category.Child" now searches for nodes with Category = Category and Name = Child, which is why it was not returning results. Adding an extra "." to the search term fixed it.

The same applies to the test LuceneSearchNodesByCategoryValidation, which was expecting all the nodes under the category "Core.Input", so we have to add an extra "." to the search term.

For the test LuceneSearchNodesOrderingValidation I had to change a node because with my changes the order changed and it is now one position lower (for the search term "list.join" we only guarantee that the items at the top will have Category = list and a name that starts with "join").
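
To illustrate why the extra "." matters in these tests, here is an in-memory stand-in for the category-based match. It is not Lucene and does no ranking; it assumes the last two sections of the term are compared against the split category and the start of the node name, and that an empty name part matches every node in the category. `CategoryMatchSketch` and the sample node list are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for the category-based search (no Lucene, no ranking).
public static class CategoryMatchSketch
{
    public static List<string> Search(
        IEnumerable<(string Category, string Name)> nodes, string searchTerm)
    {
        string[] parts = searchTerm.Split('.');
        if (parts.Length < 2) return new List<string>(); // not a category-based term

        // Only the last two sections are used: category, then name prefix.
        string category = parts[parts.Length - 2];
        string namePrefix = parts[parts.Length - 1];

        return nodes
            .Where(n => string.Equals(n.Category, category, StringComparison.OrdinalIgnoreCase)
                     && n.Name.StartsWith(namePrefix, StringComparison.OrdinalIgnoreCase))
            .Select(n => n.Category + "." + n.Name)
            .ToList();
    }
}
```

Given sample nodes ("Input", "Number"), ("Input", "String") and ("List", "Join"), the term "Core.Input" is read as category "Core" with name prefix "Input" and matches nothing, while "Core.Input." is read as category "Input" with an empty name prefix and matches both Input nodes.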

* mark test failure

* Add selection handler after binding. (#14744)

---------

Co-authored-by: Roberto T <[email protected]>
Co-authored-by: Trygve Wastvedt <[email protected]>
2 participants