-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
IBX-7987: Node filter for extraction text #155
Open
adamwojs
wants to merge
3
commits into
4.6
Choose a base branch
from
filter_node_for_text_extraction
base: 4.6
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from 1 commit
Commits
Show all changes
3 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
14 changes: 14 additions & 0 deletions
14
src/contracts/RichText/TextExtractor/NodeFilterFactoryInterface.php
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
<?php | ||
|
||
/** | ||
* @copyright Copyright (C) Ibexa AS. All rights reserved. | ||
* @license For full copyright and license information view LICENSE file distributed with this source code. | ||
*/ | ||
declare(strict_types=1); | ||
|
||
namespace Ibexa\Contracts\FieldTypeRichText\RichText\TextExtractor; | ||
|
||
interface NodeFilterFactoryInterface | ||
{ | ||
public function createPathFilter(string ...$path): NodeFilterInterface; | ||
} |
22 changes: 22 additions & 0 deletions
22
src/contracts/RichText/TextExtractor/NodeFilterInterface.php
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
<?php | ||
|
||
/** | ||
* @copyright Copyright (C) Ibexa AS. All rights reserved. | ||
* @license For full copyright and license information view LICENSE file distributed with this source code. | ||
*/ | ||
declare(strict_types=1); | ||
|
||
namespace Ibexa\Contracts\FieldTypeRichText\RichText\TextExtractor; | ||
|
||
use DOMNode; | ||
|
||
/** | ||
* Filters nodes for text extraction. | ||
*/ | ||
interface NodeFilterInterface | ||
{ | ||
/** | ||
* Return false to preserve the node, true to remove it. | ||
*/ | ||
public function filter(DOMNode $node): bool; | ||
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
37 changes: 37 additions & 0 deletions
37
src/lib/RichText/TextExtractor/NodeFilter/AggregateFilter.php
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,37 @@ | ||
<?php | ||
|
||
/** | ||
* @copyright Copyright (C) Ibexa AS. All rights reserved. | ||
* @license For full copyright and license information view LICENSE file distributed with this source code. | ||
*/ | ||
declare(strict_types=1); | ||
|
||
namespace Ibexa\FieldTypeRichText\RichText\TextExtractor\NodeFilter; | ||
|
||
use DOMNode; | ||
use Ibexa\Contracts\FieldTypeRichText\RichText\TextExtractor\NodeFilterInterface; | ||
|
||
final class AggregateFilter implements NodeFilterInterface | ||
{ | ||
/** @var \Ibexa\Contracts\FieldTypeRichText\RichText\TextExtractor\NodeFilterInterface[] */ | ||
private iterable $filters; | ||
|
||
/** | ||
* @param \Ibexa\Contracts\FieldTypeRichText\RichText\TextExtractor\NodeFilterInterface[]|iterable $filters | ||
*/ | ||
public function __construct(iterable $filters) | ||
{ | ||
$this->filters = $filters; | ||
} | ||
|
||
public function filter(DOMNode $node): bool | ||
{ | ||
foreach ($this->filters as $filter) { | ||
if ($filter->filter($node)) { | ||
return true; | ||
} | ||
} | ||
|
||
return false; | ||
} | ||
} |
20 changes: 20 additions & 0 deletions
20
src/lib/RichText/TextExtractor/NodeFilter/NodeFilterFactory.php
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
<?php | ||
|
||
/** | ||
* @copyright Copyright (C) Ibexa AS. All rights reserved. | ||
* @license For full copyright and license information view LICENSE file distributed with this source code. | ||
*/ | ||
declare(strict_types=1); | ||
|
||
namespace Ibexa\FieldTypeRichText\RichText\TextExtractor\NodeFilter; | ||
|
||
use Ibexa\Contracts\FieldTypeRichText\RichText\TextExtractor\NodeFilterFactoryInterface; | ||
use Ibexa\Contracts\FieldTypeRichText\RichText\TextExtractor\NodeFilterInterface; | ||
|
||
final class NodeFilterFactory implements NodeFilterFactoryInterface | ||
{ | ||
public function createPathFilter(string ...$path): NodeFilterInterface | ||
{ | ||
return new NodePathFilter(...$path); | ||
} | ||
} |
40 changes: 40 additions & 0 deletions
40
src/lib/RichText/TextExtractor/NodeFilter/NodePathFilter.php
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,40 @@ | ||
<?php | ||
|
||
/** | ||
* @copyright Copyright (C) Ibexa AS. All rights reserved. | ||
* @license For full copyright and license information view LICENSE file distributed with this source code. | ||
*/ | ||
declare(strict_types=1); | ||
|
||
namespace Ibexa\FieldTypeRichText\RichText\TextExtractor\NodeFilter; | ||
|
||
use DOMNode; | ||
use Ibexa\Contracts\FieldTypeRichText\RichText\TextExtractor\NodeFilterInterface; | ||
|
||
final class NodePathFilter implements NodeFilterInterface | ||
{ | ||
/** | ||
* Path in reverse order. | ||
* | ||
* @var string[] | ||
*/ | ||
private array $path; | ||
|
||
public function __construct(string ...$path) | ||
{ | ||
$this->path = array_reverse($path); | ||
} | ||
|
||
public function filter(DOMNode $node): bool | ||
{ | ||
foreach ($this->path as $name) { | ||
if ($node === null || $node->nodeName !== $name) { | ||
return false; | ||
} | ||
|
||
$node = $node->parentNode; | ||
} | ||
|
||
return true; | ||
} | ||
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
33 changes: 33 additions & 0 deletions
33
tests/lib/RichText/TextExtractor/NodeFilter/AggregateFilterTest.php
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,33 @@ | ||
<?php | ||
|
||
/** | ||
* @copyright Copyright (C) Ibexa AS. All rights reserved. | ||
* @license For full copyright and license information view LICENSE file distributed with this source code. | ||
*/ | ||
declare(strict_types=1); | ||
|
||
namespace Ibexa\Tests\FieldTypeRichText\RichText\TextExtractor\NodeFilter; | ||
|
||
use DOMNode; | ||
use Ibexa\Contracts\FieldTypeRichText\RichText\TextExtractor\NodeFilterInterface; | ||
use Ibexa\FieldTypeRichText\RichText\TextExtractor\NodeFilter\AggregateFilter; | ||
use PHPUnit\Framework\TestCase; | ||
|
||
final class AggregateFilterTest extends TestCase | ||
{ | ||
public function testFilter(): void | ||
{ | ||
$node = $this->createMock(DOMNode::class); | ||
|
||
$filterA = $this->createMock(NodeFilterInterface::class); | ||
$filterA->expects(self::once())->method('filter')->with($node)->willReturn(false); | ||
$filterB = $this->createMock(NodeFilterInterface::class); | ||
$filterB->expects(self::once())->method('filter')->with($node)->willReturn(true); | ||
$filterC = $this->createMock(NodeFilterInterface::class); | ||
$filterC->expects(self::never())->method('filter'); | ||
|
||
$aggregateFilter = new AggregateFilter([$filterA, $filterB, $filterC]); | ||
|
||
self::assertTrue($aggregateFilter->filter($node)); | ||
} | ||
} |
47 changes: 47 additions & 0 deletions
47
tests/lib/RichText/TextExtractor/NodeFilter/NodePathFilterTest.php
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,47 @@ | ||
<?php | ||
|
||
/** | ||
* @copyright Copyright (C) Ibexa AS. All rights reserved. | ||
* @license For full copyright and license information view LICENSE file distributed with this source code. | ||
*/ | ||
declare(strict_types=1); | ||
|
||
namespace Ibexa\Tests\FieldTypeRichText\RichText\TextExtractor\NodeFilter; | ||
|
||
use DOMDocument; | ||
use DOMNode; | ||
use DOMNodeList; | ||
use DOMXPath; | ||
use Ibexa\FieldTypeRichText\RichText\TextExtractor\NodeFilter\NodePathFilter; | ||
use PHPUnit\Framework\TestCase; | ||
use RuntimeException; | ||
|
||
final class NodePathFilterTest extends TestCase | ||
{ | ||
public function testFilter(): void | ||
{ | ||
$document = new DOMDocument(); | ||
$document->loadXML('<a><b><c></c></b></a>'); | ||
|
||
$nodeA = $this->getNode($document, '//a'); | ||
$nodeB = $this->getNode($document, '//b'); | ||
$nodeC = $this->getNode($document, '//c'); | ||
|
||
self::assertFalse((new NodePathFilter('b', 'c'))->filter($nodeB)); | ||
self::assertTrue((new NodePathFilter('b', 'c'))->filter($nodeC)); | ||
self::assertFalse((new NodePathFilter('a', 'b', 'c', 'd'))->filter($nodeA)); | ||
} | ||
|
||
private function getNode(DOMDocument $document, string $expression): DOMNode | ||
{ | ||
$xpath = new DOMXPath($document); | ||
|
||
$results = $xpath->query($expression); | ||
if ($results instanceof DOMNodeList) { | ||
/** @var \DOMNode */ | ||
return $results->item(0); | ||
} | ||
|
||
throw new RuntimeException("Expression '$expression' did not return a node."); | ||
adamwojs marked this conversation as resolved.
Show resolved
Hide resolved
|
||
} | ||
} |
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO the result of false/true combined with
filter
might be a bit misleading. My first thought was that it should work the opposite way. Maybe changing it tofilterOut
would be more clear?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personalny stand with the original naming.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have to agree with @alongosz here.
Most methods (
ArrayCollection::filter
) and functions (array_filter
) work in the opposite way:true
, an entry is preservedfalse
, an entry is removedTherefore, I'd suggest reversing the logic to comply with generally established PHP practice.