Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DOMXPath::quote(string $str): string #13456

Merged
merged 19 commits into from
Feb 22, 2024
Merged

Conversation

divinity76
Copy link
Contributor

method to quote strings in XPath,
similar to PDO::quote() / mysqli::real_escape_string()

sample usage:

$xp->query("//span[contains(text()," . $xp->quote($string) . ")]")

some variant of this is useful in HTML scraping scenarios.

the algorithm is derived from Robert Rossney's research into XPath quoting published at https://stackoverflow.com/a/1352556/1067003 (but using an improved implementation I wrote myself, originally for chrome-php/chrome#575 )

method to quote strings in XPath,
similar to PDO::quote() / mysqli::real_escape_string

sample usage: $xp->query("//span[contains(text()," . $xp->quote($string) . ")]")

the algorithm is derived from Robert Rossney's research into XPath quoting published at https://stackoverflow.com/a/1352556/1067003
(but using an improved implementation I wrote myself, originally for chrome-php/chrome#575 )
ext/dom/tests/DOMXPath_quote.phpt Show resolved Hide resolved
ext/dom/tests/DOMXPath_quote.phpt Outdated Show resolved Hide resolved
ext/dom/xpath.c Outdated Show resolved Hide resolved
Copy link
Member

@nielsdos nielsdos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this patch! I just have some minor comments, it looks mostly good.
I think it's a useful addition that makes a lot of sense.

ext/dom/xpath.c Outdated Show resolved Hide resolved
ext/dom/xpath.c Outdated Show resolved Hide resolved
ext/dom/xpath.c Outdated Show resolved Hide resolved
ext/dom/tests/DOMXPath_quote.phpt Show resolved Hide resolved
ext/dom/xpath.c Outdated Show resolved Hide resolved
ext/dom/xpath.c Outdated Show resolved Hide resolved
ext/dom/tests/DOMXPath_quote.phpt Show resolved Hide resolved
ext/dom/tests/DOMXPath_quote.phpt Show resolved Hide resolved
ext/dom/xpath.c Outdated Show resolved Hide resolved
divinity76 and others added 6 commits February 22, 2024 00:13
Co-authored-by: Niels Dossche <[email protected]>
requested in PR feedback :o
Co-authored-by: Niels Dossche <[email protected]>
Co-authored-by: Niels Dossche <[email protected]>
Copy link
Member

@nielsdos nielsdos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 more minor things and then it's good!
Thanks for your work.

ext/dom/xpath.c Show resolved Hide resolved
ext/dom/xpath.c Outdated Show resolved Hide resolved
* @param string $string string to quote.
* @return string quoted string.
*/
function UserlandDOMXPathQuote(string $string): string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it makes sense trying to add this to the fuzzer to see if it finds differences in behaviour?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't know what fuzzer you're talking about, but sounds good because if there is any difference, it's probably a bug in the C version 👍

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could fuzz this, but it will be a bit cumbersome to write a specific driver for this. We could ask ourselves whether we want to do it generalised for all these kinds of quoting methods.

Copy link
Member

@nielsdos nielsdos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, tests pass. Tonight I will throw this in a fuzzer to find crashes and merge it if it's fine. Thanks for your work!

@nielsdos
Copy link
Member

I've let it fuzz to search for crashes for a while now. It tried about 23 million cases and all seems good.

If anyone is interested, here's the fuzz harness: https://gist.github.com/nielsdos/44981a753808129b0222f3ef28429c69
Compile & run with: clang tmp.c -O1 -g -o ./tmp -I./main -I./Zend -I. -I./TSRM -fsanitize=fuzzer,address && ./tmp

Basically, I just copied the function's code and made some wrappers for necessary functionality or copied some stuff over from the Zend code into the harness. That way I can run it without having to start the interpreter, which is easier.

@nielsdos nielsdos merged commit 2f9320c into php:master Feb 22, 2024
9 of 10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants