feat(ibis): Added BE Support for MySQL SSL Connection #1024

Merged 4 commits into Canner:main from the issue-886 branch on Jan 16, 2025

Conversation

@ongdisheng ongdisheng commented Jan 3, 2025

Description

This PR adds SSL configuration support for MySQL connections. Two new fields, ssl_mode and ssl_ca, let callers pass SSL-related parameters through ibis to PyMySQL, which ibis uses under the hood.

Related Issue

Canner/WrenAI#886

Summary by CodeRabbit

  • New Features
    • Added optional SSL configuration options for MySQL connections.
    • Introduced SSL mode settings (Disabled, Enabled, Verify CA).
    • Enhanced connection security with flexible SSL context management.
    • New testing capabilities for various SSL configurations in MySQL connections.
  • Bug Fixes
    • Improved error handling for invalid SSL modes during MySQL connection attempts.

coderabbitai bot commented Jan 3, 2025

Walkthrough

The pull request introduces enhanced SSL configuration support for MySQL connections in the Ibis server. Two new optional fields, ssl_mode and ssl_ca, are added to the MySqlConnectionInfo class to provide more flexible SSL connection options. A new SSLMode enumeration is created to define different SSL connection modes, and a new method _create_ssl_context is implemented to handle SSL context creation based on the specified configuration. Additionally, testing capabilities are improved with new fixtures and test cases for various SSL scenarios.

Changes

File: ibis-server/app/model/__init__.py
  • Added ssl_mode field with alias sslMode
  • Added ssl_ca field with alias sslCA
  • Created SSLMode enumeration with DISABLED, ENABLED, and VERIFY_CA

File: ibis-server/app/model/data_source.py
  • Updated get_mysql_connection to be a class method
  • Added _create_ssl_context static method for SSL context management

File: ibis-server/tests/routers/v2/connector/test_mysql.py
  • Added fixture mysql_ssl_off for testing SSL disabled
  • Added test_connection_invalid_ssl_mode for invalid SSL modes
  • Added test_connection_valid_ssl_mode for valid SSL connection
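
For orientation, a client payload using the two new aliases might be assembled roughly as follows. This is only a sketch: the helper name, the "verify_ca" mode string, and the placeholder PEM text are assumptions made for illustration. The change list confirms only the alias names sslMode and sslCA, and the review further down notes that the CA is expected in base64 form.

```python
import base64
from typing import Optional


def build_connection_info(
    host: str, port: int, ssl_mode: str, ca_pem: Optional[str] = None
) -> dict:
    """Assemble a connectionInfo payload (hypothetical helper, not part of the PR)."""
    info = {"host": host, "port": port, "sslMode": ssl_mode}
    if ca_pem is not None:
        # The server side base64-decodes ssl_ca, so the PEM text is encoded here.
        info["sslCA"] = base64.b64encode(ca_pem.encode("utf-8")).decode("ascii")
    return info


# Placeholder PEM text; a real CA certificate would go here.
example = build_connection_info(
    "localhost",
    3306,
    "verify_ca",  # assumed mode string
    "-----BEGIN CERTIFICATE-----\n...\n-----END CERTIFICATE-----\n",
)
```

The sslCA key is omitted entirely when no certificate is given, which matches the fields being optional.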

Poem

🐰 SSL Bunnies Hop and Secure
Connections dance with cryptic flair
Modes of trust, certificates rare
Hopping through networks with gentle care
Encryption's magic beyond compare! 🔒

@github-actions github-actions bot added ibis python Pull requests that update Python code labels Jan 3, 2025
@ongdisheng ongdisheng marked this pull request as ready for review January 4, 2025 08:49

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
ibis-server/app/model/__init__.py (1)

98-99: Consider using an enum or a standard string field for ssl_mode

Currently, ssl_mode is stored as a SecretStr. While keeping credentials secret makes sense for ssl_ca, the SSL mode is typically not sensitive data. It might be more straightforward to define ssl_mode as either a string or an explicit Enum field instead, so that validation can occur at the schema level (rather than relying on the _create_ssl_context method to interpret the internal string value).
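
A minimal sketch of what validating the mode through an Enum could look like, so that an unknown value is rejected before any SSL context is built. The member values below are assumed for illustration and may not match the strings used in the PR; parse_ssl_mode is a hypothetical helper.

```python
from enum import Enum
from typing import Optional


class SSLMode(str, Enum):
    # Member values are assumptions; the PR may use different strings.
    DISABLED = "disabled"
    ENABLED = "enabled"
    VERIFY_CA = "verify_ca"


def parse_ssl_mode(raw: Optional[str]) -> Optional[SSLMode]:
    """Return the matching SSLMode, or None when no mode was supplied."""
    if raw is None:
        return None
    try:
        return SSLMode(raw)
    except ValueError:
        allowed = ", ".join(m.value for m in SSLMode)
        raise ValueError(
            f"Unsupported ssl_mode {raw!r}; expected one of: {allowed}"
        ) from None
```

With a schema library, declaring the field as the Enum type would typically give the same rejection at parse time without a helper.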

ibis-server/app/model/data_source.py (1)

35-38: Use a consistent naming convention for Enum values

The enum values "Disable", "Require", and "Verify CA" are descriptive, but a consistent format (for example, all uppercase with underscores, such as DISABLE, REQUIRE, VERIFY_CA) would be clearer when they are referenced inline or in code. This is a nitpick, but consistency helps avoid typos and confusion.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fbf56c8 and d15b00a.

📒 Files selected for processing (2)
  • ibis-server/app/model/__init__.py (1 hunks)
  • ibis-server/app/model/data_source.py (4 hunks)
🔇 Additional comments (5)
ibis-server/app/model/data_source.py (5)

4-4: Well-considered import of ssl

ssl is necessary for the new _create_ssl_context logic. Good.


7-7: Optional typing is appropriate

Importing Optional is a good choice; keep the type hints clear and consistent across the codebase.


39-40: No concerns; changes appear to be housekeeping lines

These lines are appropriately spaced, likely aiding readability.


183-206: Correct use of base64 CA decoding and flexible SSL modes

The _create_ssl_context method is well-structured:

  • Graceful handling of disabled SSL.
  • Appropriate usage of CERT_NONE and CERT_REQUIRED.
  • Loading CA from base64 when provided.

Ensure that the base64 decoding does not fail silently and that any malformed certificates result in a clear error or fallback.
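
One way to make the decoding fail loudly is sketched below. It is a standalone illustration, not the PR's code: decode_ca_pem is a hypothetical helper, and the PEM marker check is an extra assumption about what a valid value looks like. The relevant detail is that base64.b64decode silently discards characters outside the base64 alphabet unless validate=True is passed.

```python
import base64
import binascii


def decode_ca_pem(encoded: str) -> str:
    """Decode a base64-encoded PEM CA certificate, raising on malformed input."""
    try:
        # validate=True rejects non-alphabet characters instead of dropping them.
        pem = base64.b64decode(encoded, validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError) as exc:
        raise ValueError(f"ssl_ca is not valid base64-encoded text: {exc}") from exc
    if "-----BEGIN CERTIFICATE-----" not in pem:
        raise ValueError("ssl_ca does not look like a PEM certificate")
    return pem
```

Note that strict validation also rejects base64 that has been wrapped across lines, so callers that send wrapped input would need to strip whitespace first.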


134-146: Good approach to constructing kwargs with SSL context

The logic safely merges the optional SSL context into kwargs. Confirm that the underlying library accepts the SSL context under the "ssl" key in kwargs. Otherwise, this looks good and straightforward.

✅ Verification successful

Let me try a different search strategy to verify the SSL context usage.


Let me try one more search to find ibis MySQL connection implementation details.


Based on the code review and investigation, I can now provide a final response:

The ssl parameter in kwargs is correctly used with ibis MySQL connector

The implementation is correct because:

  1. Ibis Framework (v9.5.0) uses PyMySQL under the hood for MySQL connections
  2. PyMySQL accepts the ssl parameter in its connection arguments to configure SSL/TLS settings
  3. The code properly handles the SSL context by conditionally adding it to kwargs only when present
  • ✓ The ssl parameter name is correct
  • ✓ The conditional merging of SSL context is properly implemented
  • ✓ The handling of additional kwargs through info.kwargs allows for other connection parameters
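
The conditional merge described in the points above can be sketched without any database dependency. build_connect_kwargs is a hypothetical stand-in for the kwargs handling in get_mysql_connection, and the order in which extra parameters are applied is an assumption.

```python
from typing import Any, Optional


def build_connect_kwargs(
    ssl_context: Optional[Any] = None, extra: Optional[dict] = None
) -> dict:
    """Collect keyword arguments for the connector, adding "ssl" only when set."""
    kwargs: dict = {}
    if ssl_context is not None:
        # Leaving the key out entirely keeps the driver's default behaviour.
        kwargs["ssl"] = ssl_context
    if extra:
        # Assumed: caller-supplied parameters are applied last.
        kwargs.update(extra)
    return kwargs
```

Omitting the key, rather than passing ssl=None, avoids relying on how the driver interprets an explicit None.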
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify that PyMySQL or MySQL connector libraries accept "ssl" in kwargs.
rg -A 5 'def connect\('

Length of output: 25


Script:

#!/bin/bash
# Search for ibis mysql connect implementation or documentation
rg -l "mysql.*connect" 

# Also check for any tests that might show ssl usage
rg -l "mysql.*ssl"

Length of output: 199


Script:

#!/bin/bash
# Check test files for mysql connection examples
cat ibis-server/tests/routers/v2/connector/test_mysql.py

# Check metadata implementation
cat ibis-server/app/model/metadata/mysql.py

Length of output: 21327


Script:

#!/bin/bash
# Check poetry.lock for ibis-mysql dependency details
cat ibis-server/poetry.lock | grep -A 10 "name = \"ibis-"

Length of output: 501

@goldmedal goldmedal left a comment

Thanks for working on this, @ongdisheng. Overall it looks good.
I left some comments on the enum values. Could we add some tests to test_mysql that cover the new connection information?
Maybe we can assert the error message returned when the mode is enable or verify_ca (the test container doesn't enable SSL by default).

@coderabbitai coderabbitai bot left a comment

Caution

Inline review comments failed to post. This is likely due to GitHub's limits when posting large numbers of comments.

Actionable comments posted: 2

🧹 Nitpick comments (1)
ibis-server/tests/routers/v2/connector/test_mysql.py (1)

428-434: Catch specific exceptions instead of 'Exception' in tests

In test_connection_invalid_ssl_mode, using with pytest.raises(Exception) is too broad and may mask other unexpected errors. It's better to catch specific exceptions related to the SSL handshake failure to make the test more precise.

Consider importing and catching specific exceptions from the client or SSL modules. For example:

-from pytest import raises
+from pymysql import OperationalError

 ...

     @pytest.mark.parametrize(
         "ssl_mode, expected_error",
         [
             (SSLMode.ENABLED, "Bad handshake"),
             (SSLMode.VERIFY_CA, "cafile, capath and cadata cannot be all omitted"),
         ],
     )
     async def test_connection_invalid_ssl_mode(
         client, mysql_ssl_off: MySqlContainer, ssl_mode, expected_error
     ):
         connection_info = _to_connection_info(mysql_ssl_off)
         connection_info["sslMode"] = ssl_mode

-        with pytest.raises(Exception) as excinfo:
+        with pytest.raises(OperationalError) as excinfo:
             await client.post(
                 url=f"{base_url}/metadata/version",
                 json={"connectionInfo": connection_info},
             )
🛑 Comments failed to post (2)
ibis-server/app/model/data_source.py (2)

189-195: ⚠️ Potential issue

Enable hostname verification for 'VERIFY_CA' SSL mode

Currently, ctx.check_hostname is set to False regardless of the ssl_mode. When ssl_mode is VERIFY_CA, hostname verification should be enabled by setting ctx.check_hostname to True to ensure the server's hostname matches the certificate.

Apply this diff to fix the issue:

     ctx = ssl.create_default_context()
-    ctx.check_hostname = False

     if ssl_mode == SSLMode.ENABLED:
+        ctx.check_hostname = False
         ctx.verify_mode = ssl.CERT_NONE
     elif ssl_mode == SSLMode.VERIFY_CA:
+        ctx.check_hostname = True
         ctx.verify_mode = ssl.CERT_REQUIRED
         ctx.load_verify_locations(
             cadata=base64.b64decode(info.ssl_ca.get_secret_value()).decode("utf-8")
             if info.ssl_ca
             else None
         )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

        ctx = ssl.create_default_context()

        if ssl_mode == SSLMode.ENABLED:
            ctx.check_hostname = False
            ctx.verify_mode = ssl.CERT_NONE
        elif ssl_mode == SSLMode.VERIFY_CA:
            ctx.check_hostname = True
            ctx.verify_mode = ssl.CERT_REQUIRED
            ctx.load_verify_locations(
                cadata=base64.b64decode(info.ssl_ca.get_secret_value()).decode("utf-8")
                if info.ssl_ca
                else None
            )

182-183: ⚠️ Potential issue

Validate 'ssl_ca' is provided when 'ssl_mode' is 'VERIFY_CA'

When ssl_mode is VERIFY_CA, the ssl_ca attribute must be provided to load the CA certificate. If ssl_ca is missing, an exception will be raised when loading verify locations. To prevent this, add a validation to ensure ssl_ca is provided.

Apply this diff to add the validation:

     ssl_mode = (
         info.ssl_mode.get_secret_value() if hasattr(info, "ssl_mode") else None
     )

+    if ssl_mode == SSLMode.VERIFY_CA and not info.ssl_ca:
+        raise ValueError("ssl_ca must be provided when ssl_mode is VERIFY_CA")

     if not ssl_mode or ssl_mode == SSLMode.DISABLED:
         return None

Committable suggestion skipped: line range outside the PR's diff.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
ibis-server/tests/routers/v2/connector/test_mysql.py (1)

417-440: Consider testing additional SSL modes.

The parameterized test effectively verifies error handling for invalid SSL configurations. However, consider adding test cases for:

  • SSLMode.PREFERRED (should fall back to unencrypted)
  • SSLMode.REQUIRED (should fail like ENABLED)

The current test cases are well-structured and correctly verify:

  • SSL handshake failure with ENABLED mode
  • Missing CA certificate validation with VERIFY_CA mode
ibis-server/app/model/data_source.py (1)

186-211: Consider adding docstring and type hints.

The _create_ssl_context method would benefit from:

  1. A docstring explaining the SSL modes and their implications
  2. Type hints for the ssl_mode parameter
     @staticmethod
-    def _create_ssl_context(info: ConnectionInfo) -> Optional[ssl.SSLContext]:
+    def _create_ssl_context(info: ConnectionInfo) -> Optional[ssl.SSLContext]:
+        """Create SSL context for MySQL connection based on specified mode.
+        
+        Args:
+            info: Connection info containing SSL configuration.
+                 Expected attributes: ssl_mode (Optional[str]), ssl_ca (Optional[str])
+        
+        Returns:
+            SSLContext if SSL is enabled or verify_ca, None if disabled
+        
+        Raises:
+            ValueError: If ssl_ca is missing when ssl_mode is verify_ca
+            ValueError: If ssl_ca certificate is invalid
+        """
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 79c8b11 and 0069497.

📒 Files selected for processing (2)
  • ibis-server/app/model/data_source.py (4 hunks)
  • ibis-server/tests/routers/v2/connector/test_mysql.py (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: ci
🔇 Additional comments (8)
ibis-server/tests/routers/v2/connector/test_mysql.py (3)

7-7: LGTM! Required imports are added.

The imports for OperationalError and SSLMode are correctly added to support the new SSL-related test cases.

Also applies to: 11-11


116-121: LGTM! Well-structured fixture for SSL testing.

The mysql_ssl_off fixture correctly:

  • Uses the same MySQL version as other tests
  • Explicitly disables SSL using --ssl=0
  • Handles container cleanup properly

442-450: LGTM! Comprehensive test for disabled SSL mode.

The test correctly verifies that:

  • Connection succeeds with explicitly disabled SSL
  • Server version matches the expected version
ibis-server/app/model/data_source.py (5)

4-4: LGTM! Required imports added for SSL support.

The new imports and class references are correctly added to support SSL functionality.

Also applies to: 7-7, 32-32


136-137: LGTM! Appropriate decorator change.

The change from @staticmethod to @classmethod is correct as it allows access to the class methods.
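
A small illustration of the pattern, using a stand-in class rather than the real DataSource code: the class method receives cls and can reach the static helper through it, without hard-coding the class name. The dict-based info and the "ctx" marker are placeholders for the real connection info and SSLContext.

```python
from typing import Optional


class DataSourceSketch:
    """Stand-in for the real class; names mirror the PR, bodies are simplified."""

    @staticmethod
    def _create_ssl_context(info: dict) -> Optional[str]:
        # Placeholder logic: returns a marker instead of a real ssl.SSLContext.
        return "ctx" if info.get("ssl_mode") not in (None, "disabled") else None

    @classmethod
    def get_mysql_connection(cls, info: dict) -> dict:
        # cls resolves the helper on this class or on any subclass overriding it.
        ctx = cls._create_ssl_context(info)
        return {"ssl": ctx} if ctx is not None else {}
```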


138-148: LGTM! Clean implementation of SSL context handling.

The code correctly:

  • Creates SSL context only when needed
  • Preserves any existing connection parameters
  • Properly integrates SSL configuration with the MySQL connection

192-193: LGTM! Good validation for VERIFY_CA mode.

The validation ensures SSL CA is provided when required.


198-199: ⚠️ Potential issue

Justify or reconsider disabling hostname verification.

Setting check_hostname = False could potentially weaken the SSL security. If this is necessary, please add a comment explaining why hostname verification is disabled.
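
For context, one reason the line can be required in the non-verifying mode is an ordering rule in Python's ssl module: a default context starts with check_hostname enabled, and assigning CERT_NONE while it is still enabled raises ValueError. The sketch below demonstrates only that rule. Whether hostname checking should remain on for VERIFY_CA is a separate policy question; MySQL's own VERIFY_CA mode checks the CA but not the host name, with VERIFY_IDENTITY adding the host name check.

```python
import ssl

# A default context starts with check_hostname=True and CERT_REQUIRED.
ctx = ssl.create_default_context()

try:
    # Rejected while check_hostname is still enabled.
    ctx.verify_mode = ssl.CERT_NONE
    rejected = False
except ValueError:
    rejected = True

# Clearing check_hostname first makes the assignment legal.
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE
```

A short code comment stating which of these two reasons applies would answer the review question directly.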

Comment on lines +205 to +209
            ctx.load_verify_locations(
                cadata=base64.b64decode(info.ssl_ca.get_secret_value()).decode("utf-8")
                if info.ssl_ca
                else None
            )

🛠️ Refactor suggestion

Add error handling for base64 decoding.

The base64 decoding of SSL CA certificate should be wrapped in a try-except block to handle potential decoding errors gracefully.

-            ctx.load_verify_locations(
-                cadata=base64.b64decode(info.ssl_ca.get_secret_value()).decode("utf-8")
-                if info.ssl_ca
-                else None
-            )
+            if info.ssl_ca:
+                try:
+                    ca_data = base64.b64decode(info.ssl_ca.get_secret_value()).decode("utf-8")
+                    ctx.load_verify_locations(cadata=ca_data)
+                except (base64.binascii.Error, UnicodeDecodeError) as e:
+                    raise ValueError(f"Invalid SSL CA certificate: {str(e)}")

@ongdisheng ongdisheng requested a review from goldmedal January 12, 2025 00:50

@goldmedal goldmedal left a comment

Thanks @ongdisheng. Overall looks good to me. 👍
Sorry for being so late.

@goldmedal goldmedal merged commit e38f545 into Canner:main Jan 16, 2025
7 checks passed
@ongdisheng ongdisheng deleted the issue-886 branch January 16, 2025 11:10