
TokenAnalyzer Proposal

Matthias Molitor edited this page Mar 24, 2012 · 2 revisions

TokenAnalyzer Proposal (performance improvement)

Currently the analyzer traverses the token list many times. Check whether indexing the tokens can reduce that effort.

The index should map each token type to the positions of all its occurrences.

The index of the following list...

array(
    '1',
    '2',
    '3',
    '2',
    '1'
);

... would look like this:

array(
    '1' => array(0, 4),
    '2' => array(1, 3),
    '3' => array(2)
);
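
Building such an index takes a single pass over the token list. The following is a minimal sketch, assuming each token has already been reduced to a scalar type (as in the list above). The name `buildTokenIndex()` is hypothetical and not part of the existing analyzer.

```php
<?php
/**
 * Maps each token type to the list of positions where it occurs.
 * Hypothetical helper for illustration; assumes scalar token types.
 *
 * @param array $tokens List of token types, keyed by position.
 * @return array Token type => list of positions (ascending).
 */
function buildTokenIndex(array $tokens)
{
    $index = array();
    foreach ($tokens as $position => $type) {
        // PHP creates the inner array on first use.
        $index[$type][] = $position;
    }
    return $index;
}

$index = buildTokenIndex(array('1', '2', '3', '2', '1'));
```

Because the list is traversed in order, the positions stored for each type are already sorted, which later lookups can rely on.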

At least in some cases this should improve performance. For example, searching for a matching brace no longer requires iterating over the whole token list:

// A token index.
$index = array(
    '{' => array(0, 2, 7),
    '}' => array(9, 13, 20)
);
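// Search for the brace that matches the one at position 0.
// With strictly nested braces, the n-th opening brace matches
// the n-th closing brace counted from the end.
$posInIndex = array_search(0, $index['{']);   // 0
$candidates = array_reverse($index['}']);     // array(20, 13, 9)
$matching   = $candidates[$posInIndex];       // 20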

One should check whether it is also possible to use simple algorithms in more complex scenarios. The above is just an idea and does not handle real-life code, where braces are not only nested but can also form separate blocks next to each other.
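
One possible way to cover blocks that sit next to each other is to walk only the indexed brace positions and keep a depth counter. This is a sketch under assumptions, not part of the proposal itself: the function name `findMatchingBrace()` is hypothetical, and the index is assumed to contain only `{` and `}` positions that really are braces (not characters inside strings or comments).

```php
<?php
/**
 * Finds the closing brace for the opening brace at $openPosition.
 * Only positions stored in the index are visited, not the full token list.
 * Hypothetical helper for illustration.
 *
 * @param array   $index        Token index with the keys '{' and '}'.
 * @param integer $openPosition Position of an opening brace.
 * @return integer|null Position of the matching brace, or null if none exists.
 */
function findMatchingBrace(array $index, $openPosition)
{
    if (!in_array($openPosition, $index['{'], true)) {
        return null; // Not an opening brace.
    }
    // Collect all braces behind the start: +1 opens a level, -1 closes one.
    $events = array();
    foreach ($index['{'] as $position) {
        if ($position > $openPosition) {
            $events[$position] = 1;
        }
    }
    foreach ($index['}'] as $position) {
        if ($position > $openPosition) {
            $events[$position] = -1;
        }
    }
    ksort($events); // Visit the braces in source order.

    $depth = 1; // The brace at $openPosition is already open.
    foreach ($events as $position => $delta) {
        $depth += $delta;
        if ($depth === 0) {
            return $position;
        }
    }
    return null; // Unbalanced braces.
}

// Two blocks next to each other inside an outer block: { { } { } }
$siblings = array(
    '{' => array(0, 2, 7),
    '}' => array(5, 9, 10)
);
$matching = findMatchingBrace($siblings, 0);
```

The effort then depends on the number of braces behind the start position rather than on the total number of tokens. Whether that is fast enough in practice would still need to be measured.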
