Gumbo PHP is a low-level extension for HTML5 parsing.
Gumbo PHP builds a DOMDocument using the Gumbo HTML5 parser. It is intended to avoid the problems that commonly arise when parsing HTML5 markup or pages with inline JavaScript.
use Layershifter\Gumbo\Parser;
$document = Parser::load('<a>Apples and bananas.</a>');
var_dump($document->saveHTML());
string(33) "<a>Apples and bananas.</a>
"
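Because the parser builds a DOMDocument, the usual DOM API should be available on the result. The following is a minimal sketch under that assumption. The helper `collectLinks()` is hypothetical and not part of the extension, and the call to `Parser::load()` is guarded so the script still runs when the extension is not installed.

```php
<?php
// Hypothetical helper (not part of gumbo-php): collects href => text pairs
// from every <a> element of a DOMDocument.
function collectLinks(DOMDocument $document)
{
    $links = [];
    foreach ($document->getElementsByTagName('a') as $anchor) {
        $links[$anchor->getAttribute('href')] = trim($anchor->textContent);
    }
    return $links;
}

// Assumption: Parser::load() returns a standard DOMDocument, as in the example above.
if (class_exists('Layershifter\Gumbo\Parser')) {
    $document = \Layershifter\Gumbo\Parser::load(
        '<p><a href="/fruit">Apples and bananas.</a></p>'
    );
    print_r(collectLinks($document));
}
```

The helper only relies on `DOMDocument::getElementsByTagName()`, so it works the same way on a document built by any other loader.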
The following versions of PHP are supported.
- PHP 5.6
- PHP 7.0
To build the gumbo-php extension, the PHP-devel package is required. The package should contain the phpize utility.
$ git clone https://github.com/layershifter/gumbo-php.git
$ cd gumbo-php
$ phpize
$ ./configure
$ make
$ make install
This builds a 'gumbo.so' shared extension. Load it in php.ini using:
[gumbo]
extension = gumbo.so
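To confirm that PHP picked up the php.ini change, a short check along these lines may help. It is a sketch that assumes the extension registers itself under the name `gumbo` (inferred from the `gumbo.so` file name, not confirmed by this document). The function `gumboStatus()` is a hypothetical helper.

```php
<?php
// Assumption: the extension registers under the name "gumbo", matching gumbo.so.
// Pass a different name if your build registers it differently.
function gumboStatus($name = 'gumbo')
{
    if (!extension_loaded($name)) {
        return "Extension '$name' is not loaded; check the extension line in php.ini.";
    }
    return "Extension '$name' is loaded.";
}

echo gumboStatus(), PHP_EOL;
```

The CLI and the web server often read different php.ini files, so it is worth running the check in the same environment that will use the parser.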
- double encoding of entities (#6)
$doc = \Layershifter\Gumbo\Parser::load('<h1>Hello world!</h1>');
var_dump($doc->saveHTML());
string "<h1>Hello&nbsp;world!</h1>"
$ composer install
$ composer test
SORGE - website tracking tool
This library is released under the Apache 2.0 license. Please see License File for more information.