IBX-7987: Node filter for extraction text #155
base: 4.6
Interesting approach and convenient usage. LGTM.
tests/lib/RichText/TextExtractor/NodeFilter/NodePathFilterTest.php
/**
 * Return false to preserve the node, true to remove it.
 */
public function filter(DOMNode $node): bool;
IMHO the `false`/`true` result combined with the name `filter` might be a bit misleading. My first thought was that it should work the opposite way. Maybe changing it to `filterOut` would be clearer?
I personally stand with the original naming.
I have to agree with @alongosz here.
Most methods (`ArrayCollection::filter`) and functions (`array_filter`) work in the opposite way:
- when `true`, an entry is preserved
- when `false`, an entry is removed
Therefore, I'd suggest reversing the logic to comply with generally established PHP practice.
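To illustrate the convention referred to in this comment, here is a minimal sketch using only PHP's built-in `array_filter`. The node names in the list are sample data chosen for illustration and are not taken from the PR's code.

```php
<?php

declare(strict_types=1);

// Sample data only: a flat list of element names.
$nodeNames = ['para', 'ezconfig', 'title'];

// array_filter() keeps the entries for which the callback returns true
// and drops the ones for which it returns false.
$kept = array_filter(
    $nodeNames,
    static fn (string $name): bool => $name !== 'ezconfig'
);

// array_filter() preserves the original keys, so reindex to get a plain list.
$kept = array_values($kept);
```

With a callback that returns `true` for entries to keep, the contract quoted in the review (`true` removes the node) reads as the inverse of this behaviour, which is the source of the naming concern.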
….php Co-authored-by: Andrew Longosz <[email protected]>
LGTM. Before giving approval, I would love to see a test that includes both `FullTextExtractor` and the filters that you're using, as we're solving a specific need: to filter out nodes when working with `FullTextExtractor`, right?
Quality Gate passed
Background
See JIRA issue.
Node filtering
This PR introduces an extension point that allows excluding nodes from text extraction:
`\Ibexa\Contracts\FieldTypeRichText\RichText\TextExtractor\NodeFilterInterface` implementations should be tagged with `ibexa.field_type.richtext.text_extractor.node_filter`.
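A custom filter could look roughly like the sketch below. It declares a local stand-in for `NodeFilterInterface` so that it runs on its own, and it follows the contract quoted in the review (`false` preserves the node, `true` removes it); the semantics that were finally merged may differ. `ElementNameFilter` is a hypothetical class, not part of the package, and the service registration with the tag is not shown.

```php
<?php

declare(strict_types=1);

// Stand-in for the contract quoted in the review, declared locally so the
// sketch is self-contained. The real interface lives in the
// Ibexa\Contracts\FieldTypeRichText\RichText\TextExtractor namespace.
interface NodeFilterInterface
{
    /**
     * Return false to preserve the node, true to remove it.
     */
    public function filter(DOMNode $node): bool;
}

// Hypothetical filter: excludes every element with the given local name.
final class ElementNameFilter implements NodeFilterInterface
{
    private string $name;

    public function __construct(string $name)
    {
        $this->name = $name;
    }

    public function filter(DOMNode $node): bool
    {
        return $node instanceof DOMElement && $node->localName === $this->name;
    }
}

// Simplified sample document (no DocBook namespace).
$document = new DOMDocument();
$document->loadXML('<section><para>Hello</para><ezconfig>hidden</ezconfig></section>');

$filter = new ElementNameFilter('ezconfig');

// Record the filter decision for each child of the root element.
$results = [];
foreach ($document->documentElement->childNodes as $child) {
    $results[$child->nodeName] = $filter->filter($child);
}
```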
Common use case
`\Ibexa\Contracts\FieldTypeRichText\RichText\TextExtractor\NodeFilterFactoryInterface` [1] covers the common use case: excluding a node with a given path, e.g. `/foo/bar/baz`.
Usage example:
[1] The Factory pattern prevents the exposure of implementation details within contracts.
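The idea of matching a node against a path such as `/foo/bar/baz` can be sketched as follows. This is not the factory's API or the PR's implementation: `nodeMatchesPath` is a hypothetical helper, and it assumes the path is compared against the node and its ancestors by local element name.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: returns true when the node and its ancestors, read
 * from the top down, end with the given element names.
 */
function nodeMatchesPath(DOMNode $node, string ...$path): bool
{
    $current = $node;

    // Walk from the node upwards, comparing against the path from its end.
    foreach (array_reverse($path) as $name) {
        if (!$current instanceof DOMElement || $current->localName !== $name) {
            return false;
        }
        $current = $current->parentNode;
    }

    return true;
}

// Simplified sample document mirroring the /foo/bar/baz example.
$document = new DOMDocument();
$document->loadXML('<foo><bar><baz>excluded</baz><qux>kept</qux></bar></foo>');

$baz = $document->getElementsByTagName('baz')->item(0);
$qux = $document->getElementsByTagName('qux')->item(0);

$bazMatches = nodeMatchesPath($baz, 'foo', 'bar', 'baz');
$quxMatches = nodeMatchesPath($qux, 'foo', 'bar', 'baz');
```

Here `$bazMatches` is `true` and `$quxMatches` is `false`, so under the contract quoted in the review only the `baz` element would be removed.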
Nodes excluded out of the box
The following nodes are excluded by default (xpath syntax):
//eztemplate/ezconfig
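The effect of this default exclusion can be illustrated with PHP's DOM extension. This is a sketch only, not the extractor's actual code: it assumes a simplified document without the DocBook namespace, and the `ezcontent` element and the text values are sample data.

```php
<?php

declare(strict_types=1);

// Simplified sample document; real RichText XML is namespaced DocBook.
$document = new DOMDocument();
$document->loadXML(
    '<section><para>Visible text</para>'
    . '<eztemplate><ezcontent>Shown</ezcontent><ezconfig>{"a":1}</ezconfig></eztemplate>'
    . '</section>'
);

$xpath = new DOMXPath($document);

// Collect the matches first, then remove them from the tree.
$excluded = iterator_to_array($xpath->query('//eztemplate/ezconfig'));
foreach ($excluded as $node) {
    $node->parentNode->removeChild($node);
}

// Whatever text remains is what a full-text extractor would see.
$text = $document->documentElement->textContent;
```

After the removal, `$text` no longer contains the configuration payload, while the text of the other elements is still present.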
Notes for QA
Identify other elements that should potentially be excluded from the full-text index.
TODO:
`$ composer fix-cs`