feat(langchain): enhance SQL security with validation, parameterization and injection protection #8377

Open
wants to merge 1 commit into
base: main

christian-bromann
Contributor

This PR introduces comprehensive security improvements to the SqlDatabase class to protect against SQL injection attacks and other security vulnerabilities.

🚀 Key Features

Security Enhancements:

  • SQL Injection Protection: Detects and blocks 20+ dangerous SQL patterns including:
    • Multiple statement execution (; DROP TABLE)
    • OR-based injection (OR 1=1)
    • Comment-based injection (--, /* */)
    • Union-based injection (UNION SELECT)
    • Command execution attempts (xp_cmdshell, sp_executesql)
  • Parameterized Queries: New overloaded run() method supports safer parameter binding
  • Statement Validation: Configurable allowed SQL statements (default includes all for backward compatibility)
  • Query Length Limits: Prevents DoS attacks with configurable maximum query length (default: 10,000 chars)
  • Multiple Statement Prevention: Blocks stacked queries and command chaining
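
A minimal sketch of how checks like these could fit together, assuming a hypothetical `validateQuery` helper and an illustrative handful of patterns (this is not the PR's actual implementation, and the real pattern list is longer). Regex blocklists of this kind can produce false positives, for example `--` inside a string literal, so they complement parameter binding rather than replace it.

```typescript
// Hypothetical sketch: names and patterns are illustrative, not the PR's code.
interface ValidationOptions {
  allowedStatements: string[]; // e.g. ["SELECT"]
  maxQueryLength: number; // e.g. 10000
}

const DANGEROUS_PATTERNS: RegExp[] = [
  /;\s*\S/, // stacked statements: any non-space content after a semicolon
  /--|\/\*/, // SQL comment markers
  /\bUNION\b\s+(ALL\s+)?SELECT\b/i, // UNION-based injection
  /\bOR\b\s+\d+\s*=\s*\d+/i, // numeric tautologies such as OR 1=1
  /\b(xp_cmdshell|sp_executesql)\b/i, // command execution procedures
];

function validateQuery(query: string, opts: ValidationOptions): void {
  // Reject oversized queries before doing any pattern matching.
  if (query.length > opts.maxQueryLength) {
    throw new Error(`Query exceeds ${opts.maxQueryLength} characters`);
  }
  // The first keyword decides the statement type.
  const statement = query.trim().split(/\s+/)[0]?.toUpperCase() ?? "";
  if (!opts.allowedStatements.includes(statement)) {
    throw new Error(`Statement type not allowed: ${statement}`);
  }
  for (const pattern of DANGEROUS_PATTERNS) {
    if (pattern.test(query)) {
      throw new Error(`Query matches a blocked pattern: ${pattern}`);
    }
  }
}
```

With `allowedStatements: ["SELECT"]`, a call such as `validateQuery("SELECT * FROM users; DROP TABLE users", opts)` throws on the stacked-statement pattern, while a plain `SELECT` passes.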

Configuration Options:

const db = await SqlDatabase.fromDataSourceParams({
  appDataSource: dataSource,
  allowedStatements: ["SELECT"],           // Restrict to read-only
  enableSqlValidation: true,               // Enable security validation
  maxQueryLength: 5000,                    // Custom query length limit
});

🔧 Usage Examples

Secure parameterized queries (recommended):

// ✅ Safe - uses parameter binding
const result = await db.run(
  "SELECT * FROM users WHERE age > ? AND name = ?", 
  [18, "John"]
);

Traditional string queries (with validation):

// ⚠️ Validated but less secure
const result = await db.run("SELECT * FROM users WHERE age > 18");
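
To illustrate why the parameterized form is the recommended one, here is a small sketch that makes no database call at all. The variable names are hypothetical, and the `?` placeholder follows the example above; the placeholder syntax actually accepted depends on the underlying driver.

```typescript
// Hypothetical illustration: no database is involved, only string handling.
const userInput = "John' OR 1=1 --";

// Interpolation: the input becomes part of the SQL text and changes its structure.
const interpolated = `SELECT * FROM users WHERE name = '${userInput}'`;

// Parameter binding: the SQL text is fixed and the input travels separately as data.
const parameterized = {
  sql: "SELECT * FROM users WHERE name = ?",
  params: [userInput],
};

console.log(interpolated);
console.log(parameterized.sql, parameterized.params);
```

In the interpolated string the `OR 1=1` fragment ends up inside the statement itself, whereas in the parameterized form the statement text never contains user input.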

🔄 Backward Compatibility

All existing functionality remains unchanged. Security features are enabled by default but can be configured or disabled for compatibility with legacy applications.

I would consider this non-breaking; however, it is technically breaking because the security defaults are stricter.

🛡️ Security Impact

This addresses potential SQL injection vulnerabilities while maintaining flexibility for legitimate use cases through configurable security settings.

…on and injection protection

This commit introduces comprehensive security improvements to the SqlDatabase class:

Security Features:
- Add SQL injection pattern detection with 20+ dangerous patterns
- Implement configurable allowed statements (default will become SELECT only in a future major version)
- Add parameterized query support with safer parameter binding
- Enable SQL validation by default with opt-out capability
- Enforce maximum query length limits (default: 10000 chars)
- Block multiple statement execution and stacked queries
- Detect comment-based and encoding-based injection attempts

Breaking Changes (Future):
- In next major version, default allowedStatements will be ["SELECT"] only
- Applications requiring other SQL operations must explicitly configure allowedStatements

Configuration Options:
- allowedStatements: Array of permitted SQL statement types
- enableSqlValidation: Toggle SQL validation (default: true)
- maxQueryLength: Maximum query length in characters (default: 10000)

Documentation:
- Enhanced security notices and best practices in JSDoc
- Added comprehensive examples for secure usage patterns
- Detailed security configuration guidance

Testing:
- Added 400+ lines of comprehensive security test coverage
- Unit tests for all injection patterns and edge cases
- Parameterized query validation tests
- Configuration validation tests

This addresses potential SQL injection vulnerabilities while maintaining
backward compatibility through configurable security settings.
@dosubot dosubot bot added the size:XL This PR changes 500-999 lines, ignoring generated files. label Jun 17, 2025

@dosubot dosubot bot added the auto:improvement Medium size change to existing code to handle new use-cases label Jun 17, 2025
Contributor

@hntrl hntrl left a comment

On the surface I think it's a good change to codify our security best practices rather than saying "just do this," but I wonder if there's a better form factor than baking this into the sql abstraction directly. That way we can reuse it and we're not forcing people to use our opinionated way of interfacing with their data.

Something to the tune of:

// hypothetical
const validator = new SqlValidator({
  allowedStatements: ["SELECT"],
  enableSqlValidation: true,
  maxQueryLength: 10000
});

const db = await SqlDatabase.fromDataSourceParams({
  appDataSource: datasource
});

await model.pipe(validator).pipe(db).invoke(...);

@christian-bromann thoughts?

@christian-bromann
Contributor Author

I can definitely see the appeal of extracting the validation logic into a separate SqlValidator class—especially for use cases where modularity or reuse across different adapters is valuable. From a design perspective, that separation could make things more composable and clearer for advanced users who want full control.

That said, from my point of view, there's also something reassuring about having strong validation and protection built directly into the SqlDatabase implementation. A few reasons why I personally lean in that direction:

  • Security-first defaults: I’ve found that when it comes to security, it’s generally helpful to have safe behavior enabled by default—especially in libraries like LangChain where users might not expect to configure every detail themselves. Requiring an extra step to get that protection might increase the risk of someone unintentionally skipping it.
  • Lower cognitive overhead for most users: For many use cases, it seems beneficial if users don’t have to learn about or manually wire in an additional primitive like a validator class. If the goal is to make it easy to do the secure thing, having it integrated into SqlDatabase helps reduce friction.
  • Still configurable: I really like that this PR keeps things flexible through the allowedStatements and enableSqlValidation options. So while there are strong defaults, users who need something different still have an escape hatch.

Of course, happy to follow whatever direction the team feels is best here—whether that's evolving this into a separate component or keeping the validation built-in.

@hntrl
Contributor

hntrl commented Jun 19, 2025

Maybe there's a 'best of both worlds' approach here, where we have a default validator instance inside of SqlDatabase. That way someone can break it out if they want to, and they can still bring some kind of validation along if they choose to have their own way of issuing queries:

class SqlDatabase {
  // the default
  protected validator: SqlValidator | null = new SqlValidator({
    allowedStatements: ["SELECT"],
    enableSqlValidation: true,
    maxQueryLength: 10000
  });
}

// 1. I can either pipe to SqlDatabase that has the default validator
model.pipe(new SqlDatabase(...));

// 2. Use a custom validator in SqlDatabase
model.pipe(new SqlDatabase({
  validator: new SqlValidator(...),
  ...
}));

// 3. Or use SqlValidator before using my own database call
model.pipe(new SqlValidator(...)).pipe(dbRunnable);

I could also see it being useful in traces to have a distinct validation step.

Looking into this some more, I'm gathering that piping straight into SqlDatabase isn't possible today (it isn't a runnable), but it might make for an interesting iteration on our SQL tooling (this code was last updated 2 years ago!). If we hoist SqlValidator into its own class as a runnable, then it's possible to retrofit SqlDatabase to use validation defaults, and it also lets us use SQL validation in a chain.
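
As a rough sketch of that idea without depending on the Runnable interface, the validator can be a standalone step that returns the query unchanged when it passes, so it can sit in front of any query runner. Everything here is hypothetical: `SqlValidator`, `withValidation`, and `runQuery` are illustrative stand-ins, not existing LangChain APIs.

```typescript
// Hypothetical sketch: these names are stand-ins, not existing LangChain APIs.
interface SqlValidatorOptions {
  allowedStatements: string[];
  maxQueryLength: number;
}

class SqlValidator {
  private readonly options: SqlValidatorOptions;

  constructor(options: SqlValidatorOptions) {
    this.options = options;
  }

  // Resolves with the query unchanged when it passes, so the step can be chained.
  async invoke(query: string): Promise<string> {
    if (query.length > this.options.maxQueryLength) {
      throw new Error("Query too long");
    }
    const statement = query.trim().split(/\s+/)[0]?.toUpperCase() ?? "";
    if (!this.options.allowedStatements.includes(statement)) {
      throw new Error(`Statement type not allowed: ${statement}`);
    }
    return query;
  }
}

type QueryRunner = (query: string) => Promise<string>;

// Wrap any query runner so that validation always happens first.
function withValidation(validator: SqlValidator, run: QueryRunner): QueryRunner {
  return async (query) => run(await validator.invoke(query));
}

// Stand-in for a real database call.
const runQuery: QueryRunner = async (query) => `executed: ${query}`;

const safeRun = withValidation(
  new SqlValidator({ allowedStatements: ["SELECT"], maxQueryLength: 10000 }),
  runQuery
);
```

Because the validator is a separate object, the same instance could serve as a default inside SqlDatabase or be placed in front of a custom database call, and it would show up as its own step if the chain is traced.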
