Processor Instructions
Processing instructions are defined in the XML 1.0 specification and drifted into HTML via XHTML. A processing instruction lets a document carry instructions that are intended to be read, and replaced, by a processor. PHP itself is an example: its <?php ... ?> tags have the form of a processing instruction.
In the HTML5-PHP library, we opted to support processing instructions because they can be put to good use on the server side. HTML5 itself does not allow processing instructions, so to remain compliant you should remove them from the document before sending it to a client.
That said, the HTML5-PHP library provides two ways of parsing processor instructions.
- (Default) Insert the processing instructions into the DOM.
- Run the instructions through a processor (that you define) and put the results into the DOM.
Here's how those two modes work.
Take the document:
    <!DOCTYPE html>
    <html>
    <?foo bar?>
    </html>
The <?foo bar?> is a processing instruction. Processing instructions start with <?, are followed by a node name (foo in this case), and close with ?>.
When this is parsed using \HTML5::loadHTML(), the processing instruction becomes a \DOMProcessingInstruction node with a nodeName property of foo and a data property of bar.
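To make those two properties concrete, here is a minimal sketch that uses only PHP's bundled DOM extension rather than the HTML5-PHP parser. It assumes the node the library produces behaves like any other \DOMProcessingInstruction; the XML string is a cut-down stand-in for the document above.

```php
<?php
// Build a stand-in document with PHP's bundled DOM extension.
// (The HTML5-PHP parser itself is not used in this sketch.)
$doc = new DOMDocument();
$doc->loadXML('<html><?foo bar?></html>');

// The processing instruction is the first child of <html>.
$pi = $doc->documentElement->firstChild;

var_dump($pi instanceof DOMProcessingInstruction); // bool(true)
echo $pi->nodeName, "\n"; // foo
echo $pi->data, "\n";     // bar
```

For a processing instruction node, nodeName reports the same string as the target property.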
There is another way of handling processor instructions. You can process them at parse-time, and replace them in the DOM tree with DOM nodes. (In other words, you can "render" your processing instructions at parse time.) This section explains how that process works.
Processing instructions become useful when something acts on them, for example by manipulating the DOM. That is the job of the instruction processor: it takes an instruction and acts on it. An instruction processor is defined by the interface \HTML5\InstructionProcessor, which has a single method, process. For example, let's create a dummy counter.
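    <?php
    use \HTML5\InstructionProcessor;

    // A dummy processor that counts the instructions it sees.
    class foo implements InstructionProcessor {
        public $bar = 0;

        public function process(\DOMElement $element, $name, $data) {
            // Count the instruction, then hand the element back unchanged.
            $this->bar++;
            return $element;
        }
    }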
This class is deliberately simple. Every time the parser encounters a processing instruction, the counter is incremented and the element passed in is returned unchanged. The returned element is what gets attached to the DOM, so a processor that wants an instruction to be replaced with a different element should return that element instead.
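As a sketch of that second case, the processor below returns a new element instead of the one it was given. Everything named here is hypothetical: InstructionProcessorSketch is a local stand-in that copies the process() signature shown above so the example runs without the library, and ShoutProcessor and the "shout" instruction name are invented for illustration. How the tree builder attaches the returned element is not shown.

```php
<?php
// Hypothetical stand-in mirroring the process() signature shown above,
// so this sketch runs without the HTML5-PHP library installed.
interface InstructionProcessorSketch {
    public function process(\DOMElement $element, $name, $data);
}

// Hypothetical processor: for an instruction named "shout" it returns a new
// <strong> element holding the upper-cased data; anything else passes through.
class ShoutProcessor implements InstructionProcessorSketch {
    public function process(\DOMElement $element, $name, $data) {
        if ($name !== 'shout') {
            return $element;
        }
        $doc = $element->ownerDocument;
        $strong = $doc->createElement('strong');
        $strong->appendChild($doc->createTextNode(strtoupper($data)));
        return $strong;
    }
}

// Drive the processor by hand, roughly as a tree builder might call it.
$doc = new DOMDocument();
$placeholder = $doc->createElement('div');
$result = (new ShoutProcessor())->process($placeholder, 'shout', 'hello');

echo $doc->saveXML($result), "\n"; // <strong>HELLO</strong>
```

Checking $name first keeps the processor from acting on instructions it does not understand.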
To be used, the instruction processor must be attached to the DOM tree builder, which means writing a custom parsing function. Because the building blocks already exist, this takes only a few lines.
    // Assumes Scanner and Tokenizer live in \HTML5\Parser alongside DOMTreeBuilder.
    use \HTML5\Parser\DOMTreeBuilder;
    use \HTML5\Parser\Scanner;
    use \HTML5\Parser\Tokenizer;

    function my_parser(\HTML5\Parser\InputStream $input) {
        // Create an instance of the instruction processor.
        $foo = new foo();
        $events = new DOMTreeBuilder();
        // Attach it to the event-based DOM tree builder.
        $events->setInstructionProcessor($foo);
        $scanner = new Scanner($input);
        $parser = new Tokenizer($scanner, $events);
        $parser->parse();
        return $events->document();
    }
To parse the document, use my_parser instead of one of the built-in parsers, and the instruction processor will be called for each processing instruction.
For more details on how this works, take a peek inside \HTML5\Parser\DOMTreeBuilder.
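As noted earlier, HTML5 does not allow processing instructions, so any that remain in the DOM should be removed before the document is sent to a client. The following is a minimal sketch of one way to do that with PHP's bundled DOMXPath. It assumes the parsed result is an ordinary DOMDocument; the sample markup is a stand-in loaded with loadXML so the example runs without the library.

```php
<?php
// Build a small stand-in document containing a processing instruction.
$doc = new DOMDocument();
$doc->loadXML('<html><body><?foo bar?><p>Hello</p></body></html>');

// Select every processing instruction node in the document.
$xpath = new DOMXPath($doc);
$instructions = $xpath->query('//processing-instruction()');

// Copy the matches into an array first, then detach each one from its parent.
foreach (iterator_to_array($instructions) as $pi) {
    $pi->parentNode->removeChild($pi);
}

echo $doc->saveXML($doc->documentElement), "\n";
// <html><body><p>Hello</p></body></html>
```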