Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Huge performance improvements for taint analysis #11342

Merged
merged 44 commits into from
Mar 11, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
44 commits
Select commit Hold shift + click to select a range
2a2ac99
Refactor taints to use bitmaps
danog Feb 26, 2025
0dc132a
Finalize
danog Feb 26, 2025
4f8725e
Finalize
danog Feb 26, 2025
ac46aa8
Cleanup
danog Feb 26, 2025
ce1513b
Cleanup
danog Feb 27, 2025
aeab154
Cleanup
danog Feb 27, 2025
4230cf5
Simplify
danog Feb 27, 2025
4a44d55
Cleanup
danog Feb 27, 2025
fffe341
Cleanup
danog Feb 27, 2025
0d7f3c5
Cleanup
danog Feb 27, 2025
e632fba
Fix
danog Feb 27, 2025
f65e027
More cleanup
danog Feb 27, 2025
487feb3
Merge remote-tracking branch 'origin/6.x' into refactor_taints
danog Feb 27, 2025
6ea418c
Simplify
danog Feb 27, 2025
17d2d23
Fix
danog Feb 27, 2025
aeaf0b0
Cleanup
danog Feb 27, 2025
eceba2e
Performance refactoring
danog Feb 28, 2025
8982e90
Merge remote-tracking branch 'origin/master' into refactor_taints
danog Mar 3, 2025
5f0d74e
Finalize refactoring
danog Mar 4, 2025
a032b3d
Finalize
danog Mar 4, 2025
eeff979
Finalize
danog Mar 4, 2025
a4f8a56
Finalize
danog Mar 4, 2025
1179930
Bump docs
danog Mar 4, 2025
b01faad
Finalize
danog Mar 4, 2025
1b83acb
Final performance fix
danog Mar 4, 2025
bf39498
Finalize
danog Mar 5, 2025
aa23279
Fixes
danog Mar 5, 2025
bda7a57
cs-fix
danog Mar 5, 2025
b513b32
Tmp
danog Mar 7, 2025
53d4113
Fix
danog Mar 7, 2025
fd5d042
fix
danog Mar 7, 2025
e04dda9
Fixes
danog Mar 10, 2025
abcfaab
Fixes
danog Mar 10, 2025
0e7c2d4
Fix
danog Mar 10, 2025
0f32350
Bump upgrading
danog Mar 10, 2025
f78967c
Fix
danog Mar 10, 2025
1cf2511
Fixes
danog Mar 10, 2025
e26e62d
Fix
danog Mar 10, 2025
399383b
Fix
danog Mar 10, 2025
89f9112
Improvement
danog Mar 10, 2025
1844ba5
Fix
danog Mar 11, 2025
f10a077
Finalize
danog Mar 11, 2025
91a1699
cs-fix
danog Mar 11, 2025
caafc3c
Revert
danog Mar 11, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
16 changes: 16 additions & 0 deletions UPGRADING.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,22 @@

## Changed

- [BC] Taints are now *internally* represented by a bitmap (an integer), instead of an array of strings. Users can still use the usual string taint identifiers (including custom ones, which will be automatically registered by Psalm), but internally, the type of `Psalm\Type\TaintKind` taint types is now an integer.

- [BC] The maximum number of usable taint *types* (including both native taints and custom taints) is now equal to 32 on 32-bit systems and 64 on 64-bit systems: this should be enough for the vast majority of usecases, if more taint types are needed, consider merging some taint types or using some native taint types.

- [BC] `Psalm\Plugin\EventHandler\AddTaintsInterface::addTaints` and `Psalm\Plugin\EventHandler\RemoveTaintsInterface::removeTaints` now must return an integer taint instead of an array of strings (see the new [taint documentation](https://psalm.dev/docs/security_analysis/custom_taint_sources/) for more info).

- [BC] The type of the `$taints` parameter of `Psalm\Codebase::addTaintSource` and `Psalm\Codebase::addTaintSink` was changed to an integer

- [BC] Type of property `Psalm\Storage\FunctionLikeParameter::$sinks` changed from `array|null` to `int`

- [BC] Type of property `Psalm\Storage\FunctionLikeStorage::$taint_source_types` changed from `array` to `int`

- [BC] Type of property `Psalm\Storage\FunctionLikeStorage::$added_taints` changed from `array` to `int`

- [BC] Type of property `Psalm\Storage\FunctionLikeStorage::$removed_taints` changed from `array` to `int`

- [BC] The `startScanningFiles`, `startAnalyzingFiles`, `startAlteringFiles` of `Psalm\Progress\Progress` and subclasses were removed and replaced with a new `startPhase` method, taking a `Psalm\Progress\Phase` enum case.

- [BC] The `start` method was removed, use `expand`, instead; the progress is reset to 0 when changing the current phase.
Expand Down
106 changes: 53 additions & 53 deletions dictionaries/InternalTaintSinkMap.php
Original file line number Diff line number Diff line change
Expand Up @@ -5,59 +5,59 @@
// This maps internal function names to sink types that we don’t want to end up there

/**
* @var non-empty-array<string, non-empty-list<list<TaintKind::*>>>
* @var non-empty-array<string, non-empty-list<int-mask-of<TaintKind::*>>>
*/
return [
'exec' => [['shell']],
'create_function' => [[], ['eval']],
'file_get_contents' => [['file']],
'file_put_contents' => [['file']],
'fopen' => [['file']],
'unlink' => [['file']],
'copy' => [['file'], ['file']],
'file' => [['file']],
'link' => [['file'], ['file']],
'mkdir' => [['file']],
'move_uploaded_file' => [['file'], ['file']],
'parse_ini_file' => [['file']],
'chown' => [['file']],
'lchown' => [['file']],
'readfile' => [['file']],
'rename' => [['file'], ['file']],
'rmdir' => [['file']],
'header' => [['header']],
'symlink' => [['file']],
'tempnam' => [['file']],
'igbinary_unserialize' => [['unserialize']],
'ldap_search' => [[], ['ldap'], ['ldap']],
'mysqli_query' => [[], ['sql']],
'mysqli::query' => [['sql']],
'mysqli_real_query' => [[], ['sql']],
'mysqli::real_query' => [['sql']],
'mysqli_multi_query' => [[], ['sql']],
'mysqli::multi_query' => [['sql']],
'mysqli_prepare' => [[], ['sql']],
'mysqli::prepare' => [['sql']],
'mysqli_stmt::__construct' => [[], ['sql']],
'mysqli_stmt_prepare' => [[], ['sql']],
'mysqli_stmt::prepare' => [['sql']],
'passthru' => [['shell']],
'pcntl_exec' => [['shell']],
'pg_exec' => [[], ['sql']],
'pg_prepare' => [[], [], ['sql']],
'pg_put_line' => [[], ['sql']],
'pg_query' => [[], ['sql']],
'pg_query_params' => [[], ['sql']],
'pg_send_prepare' => [[], [], ['sql']],
'pg_send_query' => [[], ['sql']],
'pg_send_query_params' => [[], ['sql'], []],
'setcookie' => [['cookie'], ['cookie']],
'shell_exec' => [['shell']],
'system' => [['shell']],
'unserialize' => [['unserialize']],
'popen' => [['shell']],
'proc_open' => [['shell']],
'curl_init' => [['ssrf']],
'curl_setopt' => [[], [], ['ssrf']],
'getimagesize' => [['ssrf']],
'exec' => [TaintKind::INPUT_SHELL],
'create_function' => [0, TaintKind::INPUT_EVAL],
'file_get_contents' => [TaintKind::INPUT_FILE],
'file_put_contents' => [TaintKind::INPUT_FILE],
'fopen' => [TaintKind::INPUT_FILE],
'unlink' => [TaintKind::INPUT_FILE],
'copy' => [TaintKind::INPUT_FILE, TaintKind::INPUT_FILE],
'file' => [TaintKind::INPUT_FILE],
'link' => [TaintKind::INPUT_FILE, TaintKind::INPUT_FILE],
'mkdir' => [TaintKind::INPUT_FILE],
'move_uploaded_file' => [TaintKind::INPUT_FILE, TaintKind::INPUT_FILE],
'parse_ini_file' => [TaintKind::INPUT_FILE],
'chown' => [TaintKind::INPUT_FILE],
'lchown' => [TaintKind::INPUT_FILE],
'readfile' => [TaintKind::INPUT_FILE],
'rename' => [TaintKind::INPUT_FILE, TaintKind::INPUT_FILE],
'rmdir' => [TaintKind::INPUT_FILE],
'header' => [TaintKind::INPUT_HEADER],
'symlink' => [TaintKind::INPUT_FILE],
'tempnam' => [TaintKind::INPUT_FILE],
'igbinary_unserialize' => [TaintKind::INPUT_UNSERIALIZE],
'ldap_search' => [0, TaintKind::INPUT_LDAP, TaintKind::INPUT_LDAP],
'mysqli_query' => [0, TaintKind::INPUT_SQL],
'mysqli::query' => [TaintKind::INPUT_SQL],
'mysqli_real_query' => [0, TaintKind::INPUT_SQL],
'mysqli::real_query' => [TaintKind::INPUT_SQL],
'mysqli_multi_query' => [0, TaintKind::INPUT_SQL],
'mysqli::multi_query' => [TaintKind::INPUT_SQL],
'mysqli_prepare' => [0, TaintKind::INPUT_SQL],
'mysqli::prepare' => [TaintKind::INPUT_SQL],
'mysqli_stmt::__construct' => [0, TaintKind::INPUT_SQL],
'mysqli_stmt_prepare' => [0, TaintKind::INPUT_SQL],
'mysqli_stmt::prepare' => [TaintKind::INPUT_SQL],
'passthru' => [TaintKind::INPUT_SHELL],
'pcntl_exec' => [TaintKind::INPUT_SHELL],
'pg_exec' => [0, TaintKind::INPUT_SQL],
'pg_prepare' => [0, 0, TaintKind::INPUT_SQL],
'pg_put_line' => [0, TaintKind::INPUT_SQL],
'pg_query' => [0, TaintKind::INPUT_SQL],
'pg_query_params' => [0, TaintKind::INPUT_SQL],
'pg_send_prepare' => [0, 0, TaintKind::INPUT_SQL],
'pg_send_query' => [0, TaintKind::INPUT_SQL],
'pg_send_query_params' => [0, TaintKind::INPUT_SQL, 0],
'setcookie' => [TaintKind::INPUT_COOKIE, TaintKind::INPUT_COOKIE],
'shell_exec' => [TaintKind::INPUT_SHELL],
'system' => [TaintKind::INPUT_SHELL],
'unserialize' => [TaintKind::INPUT_UNSERIALIZE],
'popen' => [TaintKind::INPUT_SHELL],
'proc_open' => [TaintKind::INPUT_SHELL],
'curl_init' => [TaintKind::INPUT_SSRF],
'curl_setopt' => [0, 0, TaintKind::INPUT_SSRF],
'getimagesize' => [TaintKind::INPUT_SSRF],
];
53 changes: 47 additions & 6 deletions docs/security_analysis/custom_taint_sources.md
Original file line number Diff line number Diff line change
Expand Up @@ -26,29 +26,70 @@ For example this plugin treats all variables named `$bad_data` as taint sources.
namespace Psalm\Example\Plugin;

use PhpParser\Node\Expr\Variable;
use Psalm\Codebase;
use Psalm\Plugin\EventHandler\AddTaintsInterface;
use Psalm\Plugin\EventHandler\Event\AddRemoveTaintsEvent;
use Psalm\Type\TaintKindGroup;
use Psalm\Type\TaintKind;

/**
* Add input taints to all variables named 'bad_data'
* Add input taints to all variables named 'bad_data' or 'even_badder_data'.
*
* RemoveTaintsInterface is also available to remove taints.
*/
class TaintBadDataPlugin implements AddTaintsInterface
{
private static int $myCustomTaint;
private static int $myCustomTaintAlias;
/**
* Must be called by the PluginEntryPointInterface (__invoke) of your plugin.
*/
public static function init(Codebase $codebase): void
{
// Register a new custom taint
// The taint name may be used in @psalm-taint-* annotations in the code.
self::$myCustomTaint = $codebase->getOrRegisterTaint("my_custom_taint");

// Register a taint alias that combines multiple pre-registered taint types
// Taint alias names may be used in @psalm-taint-* annotations in the code.
self::$myCustomTaintAlias = $codebase->registerTaintAlias(
"my_custom_taint_alias",
self::$myCustomTaint | TaintKind::ALL_INPUT
);
}

/**
* Called to see what taints should be added
*
* @return list<string>
* @return int A bitmap of taint from the IDs
*/
public static function addTaints(AddRemoveTaintsEvent $event): array
public static function addTaints(AddRemoveTaintsEvent $event): int
{
$expr = $event->getExpr();

if ($expr instanceof Variable && $expr->name === 'bad_data') {
return TaintKindGroup::ALL_INPUT;
return TaintKind::ALL_INPUT;
}

if ($expr instanceof Variable && $expr->name === 'even_badder_data') {
return self::$myCustomTaint;
}

if ($expr instanceof Variable && $expr->name === 'even_badder_data_2') {
return self::$myCustomTaintAlias;
}

if ($expr instanceof Variable && $expr->name === 'secret_even_badder_data_3') {
// Combine taints using |
return self::$myCustomTaintAlias | USER_SECRET;
}

if ($expr instanceof Variable && $expr->name === 'bad_data_but_ok_cookie') {
// Remove taints using & and ~ to negate a taint (group)
return self::$myCustomTaintAlias & ~TaintKind::INPUT_COOKIE;
}

return [];
// No taints
return 0;
}
}
```
1 change: 1 addition & 0 deletions examples/TemplateChecker.php
Original file line number Diff line number Diff line change
Expand Up @@ -178,6 +178,7 @@ private function checkWithViewClass(Context $context, array $stmts): void
$statements_source = new StatementsAnalyzer(
$view_method_analyzer,
new NodeDataProvider(),
false,
);

$statements_source->analyze($pseudo_method_stmts, $context);
Expand Down
11 changes: 6 additions & 5 deletions examples/plugins/SafeArrayKeyChecker.php
Original file line number Diff line number Diff line change
Expand Up @@ -6,21 +6,22 @@
use Psalm\Internal\Analyzer\StatementsAnalyzer;
use Psalm\Plugin\EventHandler\Event\AddRemoveTaintsEvent;
use Psalm\Plugin\EventHandler\RemoveTaintsInterface;
use Psalm\Type\TaintKind;

final class SafeArrayKeyChecker implements RemoveTaintsInterface
{
/**
* Called to see what taints should be removed
*
* @return list<string>
* @return int
*/
#[\Override]
public static function removeTaints(AddRemoveTaintsEvent $event): array
public static function removeTaints(AddRemoveTaintsEvent $event): int
{
$item = $event->getExpr();
$statements_analyzer = $event->getStatementsSource();
if (!($item instanceof ArrayItem) || !($statements_analyzer instanceof StatementsAnalyzer)) {
return [];
return 0;
}
$item_key_value = '';
if ($item->key) {
Expand All @@ -34,8 +35,8 @@ public static function removeTaints(AddRemoveTaintsEvent $event): array
}

if ($item_key_value === 'safe_key') {
return ['html'];
return TaintKind::INPUT_HTML;
}
return [];
return 0;
}
}
11 changes: 6 additions & 5 deletions examples/plugins/TaintActiveRecords.php
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@
use Psalm\Plugin\EventHandler\Event\AddRemoveTaintsEvent;
use Psalm\Type\Atomic;
use Psalm\Type\Atomic\TNamedObject;
use Psalm\Type\TaintKind;
use Psalm\Type\TaintKindGroup;
use Psalm\Type\Union;

Expand All @@ -27,16 +28,16 @@ final class TaintActiveRecords implements AddTaintsInterface
/**
* Called to see what taints should be added
*
* @return list<string>
* @return int
*/
#[Override]
public static function addTaints(AddRemoveTaintsEvent $event): array
public static function addTaints(AddRemoveTaintsEvent $event): int
{
$expr = $event->getExpr();

// Model properties are accessed by property fetch, so abort here
if ($expr instanceof ArrayItem) {
return [];
return 0;
}

$statements_source = $event->getStatementsSource();
Expand All @@ -51,11 +52,11 @@ public static function addTaints(AddRemoveTaintsEvent $event): array
}

if (self::containsActiveRecord($expr_type)) {
return TaintKindGroup::ALL_INPUT;
return TaintKind::ALL_INPUT;
}
} while ($expr = self::getParentNode($expr));

return [];
return 0;
}

/**
Expand Down
22 changes: 10 additions & 12 deletions psalm-baseline.xml
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
<?xml version="1.0" encoding="UTF-8"?>
<files psalm-version="6.x-dev@4258dc813b28c2b03865f34a17ddf072f006b357">
<files psalm-version="dev-master@b513b32ef4a8f5a46c4412133f345182e34db54c">
<file src="examples/TemplateChecker.php">
<PossiblyUndefinedIntArrayOffset>
<code><![CDATA[$comment_block->tags['variablesfrom'][0]]]></code>
Expand Down Expand Up @@ -911,7 +911,6 @@
<code><![CDATA[$context->calling_method_id]]></code>
<code><![CDATA[$context->calling_method_id]]></code>
<code><![CDATA[$context->calling_method_id]]></code>
<code><![CDATA[$function_param->sinks]]></code>
<code><![CDATA[$function_params]]></code>
<code><![CDATA[$function_params]]></code>
<code><![CDATA[$function_params]]></code>
Expand Down Expand Up @@ -1041,7 +1040,6 @@
<code><![CDATA[$method_id]]></code>
</ImplicitToStringCast>
<RiskyTruthyFalsyComparison>
<code><![CDATA[!$parent_node->specialization_key]]></code>
<code><![CDATA[$var_id]]></code>
</RiskyTruthyFalsyComparison>
</file>
Expand Down Expand Up @@ -1621,12 +1619,7 @@
</file>
<file src="src/Psalm/Internal/Codebase/TaintFlowGraph.php">
<RiskyTruthyFalsyComparison>
<code><![CDATA[$node->specialization_key]]></code>
<code><![CDATA[$node->unspecialized_id]]></code>
<code><![CDATA[$path->escaped_taints]]></code>
<code><![CDATA[$path->unescaped_taints]]></code>
<code><![CDATA[$source->specialization_key]]></code>
<code><![CDATA[end($source->path_types)]]></code>
<code><![CDATA[end($path_types)]]></code>
</RiskyTruthyFalsyComparison>
</file>
<file src="src/Psalm/Internal/Composer.php">
Expand All @@ -1635,9 +1628,14 @@
</RiskyTruthyFalsyComparison>
</file>
<file src="src/Psalm/Internal/DataFlow/DataFlowNode.php">
<RiskyTruthyFalsyComparison>
<code><![CDATA[$specialization_key]]></code>
</RiskyTruthyFalsyComparison>
<RedundantPropertyInitializationCheck>
<code><![CDATA[new self('closure-use', null, null, 'closure use')]]></code>
<code><![CDATA[new self('unknown-origin', null, null, 'unknown origin')]]></code>
<code><![CDATA[new self('variable-use', null, null, 'variable use')]]></code>
<code><![CDATA[self::$forClosureUse]]></code>
<code><![CDATA[self::$forUnknownOrigin]]></code>
<code><![CDATA[self::$forVariableUse]]></code>
</RedundantPropertyInitializationCheck>
</file>
<file src="src/Psalm/Internal/Diff/ClassStatementsDiffer.php">
<ImplicitToStringCast>
Expand Down
Loading