Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(ibis): Add Oracle connector #1067

Open
wants to merge 14 commits into
base: main
Choose a base branch
from

Conversation

douenergy
Copy link
Contributor

@douenergy douenergy commented Feb 21, 2025

Description

  • Introduce Oracle datasource connector.
  • The usage is the same as other datasource connector.

Summary by CodeRabbit

  • New Features
    • Introduced Oracle database integration to enable connection configuration, query execution, and metadata retrieval for Oracle systems.
  • Tests
    • Added comprehensive tests to validate Oracle connectivity, query handling, and error scenarios.
  • Documentation
    • Updated guides with clear instructions and examples to extend connection capabilities.
  • Chores
    • Refined dependency management and tooling enhancements to support Oracle functionalities.

Copy link

coderabbitai bot commented Feb 21, 2025

Walkthrough

This pull request adds Oracle database support across multiple modules. It introduces new data transfer objects and connection information classes for Oracle in the model, updates the data source enums and methods, and integrates Oracle metadata handling. Additionally, the changes augment the JSON serialization logic, update the dependency configuration with Oracle-specific libraries, and introduce a comprehensive test suite for Oracle integration. Documentation and build targets have also been updated to accommodate these features.

Changes

File(s) Change Summary
ibis-server/app/model/init.py Added QueryOracleDTO (extending QueryDTO) and new OracleConnectionInfo class; updated ConnectionInfo union to include OracleConnectionInfo.
ibis-server/app/model/data_source.py Introduced Oracle support by adding enum entries for oracle, new static method get_oracle_connection, and necessary Oracle class imports.
ibis-server/app/util.py Added a conditional check in the default function to convert LOB objects (with a read attribute) into strings during JSON serialization.
ibis-server/pyproject.toml Added the oracle extra for ibis-framework, a new dependency oracledb (version 2.5.1), and an Oracle marker in pytest configuration for test categorization.
ibis-server/tests/routers/v2/connector/test_oracle.py Added a comprehensive test suite for Oracle integration, including fixtures for setting up Oracle containers, tests for query execution, error handling, dry runs, parameter validations, metadata retrieval, and helper functions for constructing connection info and URLs.
ibis-server/docs/how-to-add-data-source.md Updated the documentation with a new code snippet showing how to include PostgresConnectionInfo in the ConnectionInfo union, demonstrating extension for new data sources.
ibis-server/justfile Added a new target test-verbose MARKER to enable verbose pytest runs with disabled output capturing.
ibis-server/app/model/metadata/factory.py Integrated Oracle metadata support by importing OracleMetadata and updating the mapping to associate DataSource.oracle with it.
ibis-server/app/model/metadata/oracle.py Introduced the OracleMetadata class with methods for retrieving table lists, constraints, and database version, as well as utility functions for formatting and type transformation relevant to Oracle data.

Sequence Diagram(s)

sequenceDiagram
  participant C as Client
  participant QE as Query Endpoint (/query)
  participant DSE as DataSourceExtension
  participant OCon as Oracle Connector (ibis.oracle.connect)
  
  C->>QE: Send QueryOracleDTO with OracleConnectionInfo
  QE->>DSE: Invoke get_oracle_connection(info)
  DSE->>OCon: Establish connection using provided Oracle credentials
  OCon-->>DSE: Return Oracle DB connection
  DSE-->>QE: Send established connection
  QE-->>C: Return query results
Loading
sequenceDiagram
  participant M as OracleMetadata
  participant DB as Oracle Database
  
  M->>DB: Execute SQL query for table list
  DB-->>M: Return list of tables and columns
  M->>DB: Execute SQL query for constraints
  DB-->>M: Return foreign key constraints
  M->>DB: Query for Oracle version
  DB-->>M: Return version information
Loading

Poem

Oh, what a delight to see,
Oracle joins our family tree!
With queries and metadata in tow,
Our code now has a new glow.
Hopping along with updates so fine,
This rabbit sings in ASCII shine!
🐰✨


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a7fd85d and 30611a1.

⛔ Files ignored due to path filters (1)
  • ibis-server/poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (2)
  • ibis-server/app/util.py (1 hunks)
  • ibis-server/pyproject.toml (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • ibis-server/app/util.py
  • ibis-server/pyproject.toml
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: ci

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@github-actions github-actions bot added ibis dependencies Pull requests that update a dependency file python Pull requests that update Python code labels Feb 21, 2025
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (4)
ibis-server/app/util.py (1)

50-52: Consider adding type hints for LOB objects.

While the implementation correctly handles LOB objects, consider adding type hints or documentation to clarify which specific LOB types (BLOB, CLOB, NCLOB) are supported.

ibis-server/tests/routers/v2/connector/test_oracle.py (1)

72-99: Consider adding cleanup for test data.

While the container cleanup is handled via request.addfinalizer(oracle.stop), consider adding explicit cleanup for the test tables to ensure a clean state between test runs.

 def oracle(request) -> OracleDbContainer:
     oracle = OracleDbContainer(
         "gvenzl/oracle-free:23.6-slim-faststart", oracle_password="Oracle123"
     )

     oracle.start()

     host = oracle.get_container_host_ip()
     port = oracle.get_exposed_port(1521)
     connection_url = (
         f"oracle+oracledb://SYSTEM:Oracle123@{host}:{port}/?service_name=FREEPDB1"
     )
     engine = sqlalchemy.create_engine(connection_url, echo=True)

     with engine.begin() as conn:
+        # Drop existing tables if they exist
+        conn.execute(text("DROP TABLE IF EXISTS orders"))
+        conn.execute(text("DROP TABLE IF EXISTS customer"))
+
         pd.read_parquet(file_path("resource/tpch/data/orders.parquet")).to_sql(
             "orders", engine, index=False
         )
         pd.read_parquet(file_path("resource/tpch/data/customer.parquet")).to_sql(
             "customer", engine, index=False
         )
         # Add table and column comments
         conn.execute(text("COMMENT ON TABLE orders IS 'This is a table comment'"))
         conn.execute(text("COMMENT ON COLUMN orders.o_comment IS 'This is a comment'"))

     request.addfinalizer(oracle.stop)
     return oracle
ibis-server/app/model/__init__.py (1)

138-144: Consider adding Oracle-specific connection parameters.

While the basic connection parameters are covered, Oracle connections often require additional parameters like:

  • service_name vs SID
  • SSL/TLS configuration
  • Connection mode (SYSDBA, SYSOPER)
 class OracleConnectionInfo(BaseModel):
     host: SecretStr = Field(examples=["localhost"])
     port: SecretStr = Field(examples=[1521])
     database: SecretStr
     user: SecretStr
     password: SecretStr | None = None
+    service_name: bool = Field(default=True, description="Use service_name instead of SID")
+    mode: str = Field(default="DEFAULT", examples=["DEFAULT", "SYSDBA", "SYSOPER"])
+    ssl_enabled: bool = Field(default=False)
+    ssl_ca: SecretStr | None = None
ibis-server/pyproject.toml (1)

60-67: Refine Oracle Test Marker Label

The marker "oracle: mark a test as a oracle test" contains an extra space and is slightly inconsistent with other markers. For consistency and readability, consider updating it to remove the extra space. For example:

-  "oracle: mark a test  as a oracle test",
+  "oracle: mark a test as a oracle test",
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 90cd416 and 97a470c.

📒 Files selected for processing (5)
  • ibis-server/app/model/__init__.py (2 hunks)
  • ibis-server/app/model/data_source.py (5 hunks)
  • ibis-server/app/util.py (2 hunks)
  • ibis-server/pyproject.toml (3 hunks)
  • ibis-server/tests/routers/v2/connector/test_oracle.py (1 hunks)
🔇 Additional comments (5)
ibis-server/app/util.py (1)

16-18: LGTM: LOB object handling in DataFrame columns.

The implementation correctly handles Oracle LOB objects by converting them to strings when the column type is object.

ibis-server/tests/routers/v2/connector/test_oracle.py (1)

16-64: LGTM: Comprehensive test manifest.

The manifest includes a good variety of Oracle data types including NUMBER, VARCHAR2, DATE, TIMESTAMP, and BLOB, providing good coverage for testing.

ibis-server/app/model/__init__.py (1)

43-44: LGTM: Oracle DTO follows established patterns.

The QueryOracleDTO class correctly extends QueryDTO and follows the same pattern as other database-specific DTOs.

ibis-server/pyproject.toml (2)

13-22: Oracle Extra Added in ibis-framework Dependency

The extras list for ibis-framework now includes "oracle" to enable Oracle-specific functionality. This is consistent with the addition of the Oracle connector.


36-37: New oracledb Dependency Introduced

The dependency oracledb = "2.5.1" has been added to support Oracle connectivity. Please verify that version 2.5.1 meets the project’s compatibility and security requirements.

@goldmedal
Copy link
Contributor

@douenergy You need to add OracleConnectionInfo to

We will verify the DTO when got a request

@router.post("/{data_source}/query", dependencies=[Depends(verify_query_dto)])

That's why the connection info way can't work.

@github-actions github-actions bot added the documentation Improvements or additions to documentation label Feb 21, 2025
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
ibis-server/justfile (1)

33-34: LGTM! Consider adding a comment to document the target.

The new test-verbose target is a valuable addition, especially for debugging complex database integrations like Oracle. The implementation follows the existing patterns and provides useful debugging capabilities with the -s and -v flags.

Consider adding a comment above the target to document its purpose, similar to the comment above the test target:

+# run the pytest tests for the given marker with increased verbosity
 test-verbose MARKER:
     poetry run pytest -s -v -m '{{ MARKER }}'
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 97a470c and caf7545.

📒 Files selected for processing (4)
  • ibis-server/app/model/__init__.py (3 hunks)
  • ibis-server/docs/how-to-add-data-source.md (1 hunks)
  • ibis-server/justfile (1 hunks)
  • ibis-server/tests/routers/v2/connector/test_oracle.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • ibis-server/app/model/init.py
  • ibis-server/tests/routers/v2/connector/test_oracle.py

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
ibis-server/tests/routers/v2/connector/test_oracle.py (1)

100-224: 🛠️ Refactor suggestion

Expand test coverage for comprehensive validation.

The current test suite needs additional test cases for:

  • Different Oracle data types (especially LOB types)
  • Error conditions (invalid queries, connection issues)
  • Edge cases (null values, large datasets)
  • Connection pooling and timeout scenarios
  • Transaction handling
  • Concurrent query execution
🧹 Nitpick comments (3)
ibis-server/tests/routers/v2/connector/test_oracle.py (3)

14-17: Consider moving Oracle connection constants to a configuration file.

These constants could be moved to a dedicated configuration file to improve maintainability and reusability across different test files.


19-72: Add more Oracle-specific data types to the test manifest.

Consider adding tests for additional Oracle-specific data types:

  • CLOB for large text data
  • NVARCHAR2 for Unicode data
  • RAW for binary data
  • FLOAT for floating-point numbers
  • INTERVAL types for time intervals

82-84: Consider making the Oracle container version configurable.

The Oracle container version is hardcoded. Consider:

  1. Moving it to a configuration file
  2. Adding version validation
  3. Documenting the minimum required version
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between caf7545 and 11a5731.

📒 Files selected for processing (1)
  • ibis-server/tests/routers/v2/connector/test_oracle.py (1 hunks)
🔇 Additional comments (1)
ibis-server/tests/routers/v2/connector/test_oracle.py (1)

231-231: Use original port to avoid potential conflicts.

Replace oracle.get_exposed_port(oracle.port) with the original port to prevent potential port conflicts.

-        "port": oracle.get_exposed_port(oracle.port),
+        "port": oracle.port,

Comment on lines +87 to +92
pd.read_parquet(file_path("resource/tpch/data/orders.parquet")).to_sql(
"orders", engine, index=False
)
pd.read_parquet(file_path("resource/tpch/data/customer.parquet")).to_sql(
"customer", engine, index=False
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add error handling for data loading.

The data loading from parquet files lacks error handling. Consider:

  1. Validating file existence
  2. Adding error handling for file read/write operations
  3. Verifying data integrity after loading

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
ibis-server/tests/routers/v2/connector/test_oracle.py (2)

20-73: Consider adding multi-model manifest or more complex schemas for thorough coverage.

The manifest structure is well-defined and covers key data types (number, date, timestamp, blob). However, you might extend coverage by including multiple models or more complex schemas (e.g., joined tables) to ensure comprehensive end-to-end testing of Oracle-specific functionality.


227-244: Expand negative test scenarios for broader error coverage.

Currently, this test covers a single invalid table error (ORA-00942). For more robust testing of Oracle error handling, you might consider additional negative scenarios such as invalid syntax or permission issues, ensuring correct error codes and messages are returned.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 11a5731 and 13dedb1.

📒 Files selected for processing (1)
  • ibis-server/tests/routers/v2/connector/test_oracle.py (1 hunks)
🔇 Additional comments (3)
ibis-server/tests/routers/v2/connector/test_oracle.py (3)

81-99: Validate container lifecycle and resource utilization.

The oracle fixture correctly provisions an Oracle container, but ensure you handle potential concurrency or resource usage issues if tests run simultaneously in CI. Fixture scope is module, which should minimize overhead, but consider adding logs or health checks to confirm the container is ready before executing queries if you encounter flakiness in concurrent test environments.


101-139: Great test coverage for basic query functionality.

This test effectively validates the main query scenario and checks both data integrity and data types. The approach looks solid, as it includes various column types and ensures correctness for each.


345-355: Confirm key naming consistency with OracleConnectionInfo.

Make sure that "database" here matches the attribute naming in your OracleConnectionInfo class. Past feedback indicated that the key name must match exactly to avoid deserialization issues in the router.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
ibis-server/app/model/metadata/oracle.py (2)

19-20: Add error handling for failed connections.

Currently, the constructor does not account for a scenario where establishing the connection might fail. Consider adding error handling or retries during DataSource.oracle.get_connection(connection_info) to improve resilience and provide clear error messages to the caller.


49-51: Avoid hard-coding schema filtering.

Filtering tables strictly by t.owner = 'SYSTEM' may exclude other relevant schemas or limit user customization. Consider making the schema owner dynamic or configurable, especially if this connector is intended to handle multiple schemas in various environments.

ibis-server/tests/routers/v2/connector/test_oracle.py (1)

84-97: Add fallback or error handling for data ingestion.

When creating tables and loading data with to_sql, consider handling scenarios such as the table already existing or having insufficient privileges. At scale, error handling ensures transactions won't fail silently and clarifies the root cause if loading data into Oracle fails.

Do you want assistance adding more robust error handling for the data loading process?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 13dedb1 and baa1f15.

📒 Files selected for processing (3)
  • ibis-server/app/model/metadata/factory.py (2 hunks)
  • ibis-server/app/model/metadata/oracle.py (1 hunks)
  • ibis-server/tests/routers/v2/connector/test_oracle.py (1 hunks)
🔇 Additional comments (4)
ibis-server/app/model/metadata/oracle.py (1)

177-183: Verify necessary privileges for querying Oracle version.

SELECT version FROM v$instance may require higher privileges in some Oracle deployments (e.g., the user may not have access to v$instance). Verify that your default user has sufficient rights or consider using a more accessible view, like PRODUCT_COMPONENT_VERSION or v$version, for retrieving the database version.

ibis-server/app/model/metadata/factory.py (1)

14-25: Ensure coverage for the new Oracle mapping.

Including DataSource.oracle: OracleMetadata is essential. However, confirm that your tests thoroughly exercise this new mapping. If not already present, add integration tests to verify that Oracle metadata retrieval works end-to-end.

ibis-server/tests/routers/v2/connector/test_oracle.py (2)

101-139: Add more test coverage for corner cases.

The test_query function checks a successful scenario, but additional coverage (like empty result sets, invalid columns, or large data returns) could help catch edge-case issues in production. This is similar to a past recommendation about increasing test scope.


389-403: Consider using the original port.

Approach aligns with previous feedback: you’re calling oracle.get_exposed_port(oracle.port) to avoid conflicts. Some users prefer a static Oracle port to minimize variability. Decide on a consistent port strategy and document it for contributors referencing your tests.

@douenergy douenergy requested a review from goldmedal February 24, 2025 01:43
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (3)
ibis-server/tests/routers/v2/connector/test_oracle.py (3)

16-18: Consider using environment variables for credentials.

Hard-coded credentials in test files can be problematic for security and flexibility. Consider using environment variables with defaults.

-oracle_password = "Oracle123"
-oracle_user = "SYSTEM"
-oracle_database = "FREEPDB1"
+oracle_password = os.getenv("ORACLE_TEST_PASSWORD", "Oracle123")
+oracle_user = os.getenv("ORACLE_TEST_USER", "SYSTEM")
+oracle_database = os.getenv("ORACLE_TEST_DATABASE", "FREEPDB1")

30-73: Consider adding more Oracle-specific data types.

The manifest covers basic types but could be enhanced with more Oracle-specific types for comprehensive testing:

  • CLOB for large text
  • NVARCHAR2/NCLOB for Unicode
  • RAW for binary data
  • INTERVAL types

Example additions:

             "columns": [
                 // ... existing columns ...
+                {
+                    "name": "clob_column",
+                    "expression": "TO_CLOB('large text')",
+                    "type": "clob"
+                },
+                {
+                    "name": "nvarchar_column",
+                    "expression": "N'Unicode text'",
+                    "type": "nvarchar2"
+                },
+                {
+                    "name": "interval_column",
+                    "expression": "INTERVAL '1' DAY",
+                    "type": "interval day to second"
+                }
             ],

393-408: Add input validation to helper functions.

The helper functions could benefit from input validation to ensure robust testing:

 def _to_connection_info(oracle: OracleDbContainer):
+    if not oracle or not isinstance(oracle, OracleDbContainer):
+        raise ValueError("Invalid Oracle container instance")
+    
+    host = oracle.get_container_host_ip()
+    port = oracle.get_exposed_port(oracle.port)
+    
+    if not host or not port:
+        raise ValueError("Failed to get Oracle container connection details")
+    
     return {
-        "host": oracle.get_container_host_ip(),
-        "port": oracle.get_exposed_port(oracle.port),
+        "host": host,
+        "port": port,
         "user": f"{oracle_user}",
         "password": f"{oracle_password}",
         "database": f"{oracle_database}",
     }

 def _to_connection_url(oracle: OracleDbContainer):
+    if not oracle or not isinstance(oracle, OracleDbContainer):
+        raise ValueError("Invalid Oracle container instance")
+    
     info = _to_connection_info(oracle)
+    for key in ["user", "password", "host", "port", "database"]:
+        if not info.get(key):
+            raise ValueError(f"Missing required connection parameter: {key}")
+    
     return f"oracle://{info['user']}:{info['password']}@{info['host']}:{info['port']}/{info['database']}"
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fe2a2a1 and 747e197.

⛔ Files ignored due to path filters (1)
  • ibis-server/poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (1)
  • ibis-server/tests/routers/v2/connector/test_oracle.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: ci

Comment on lines 105 to 391
result = response.json()
assert result["detail"][0] is not None
assert result["detail"][0]["type"] == "missing"
assert result["detail"][0]["loc"] == ["body", "parameters"]
assert result["detail"][0]["msg"] == "Field required"


async def test_validate_rule_column_is_valid_without_parameters(
client, manifest_str, oracle: OracleDbContainer
):
connection_info = _to_connection_info(oracle)
response = await client.post(
url=f"{base_url}/validate/column_is_valid",
json={"connectionInfo": connection_info, "manifestStr": manifest_str},
)
assert response.status_code == 422
result = response.json()
assert result["detail"][0] is not None
assert result["detail"][0]["type"] == "missing"
assert result["detail"][0]["loc"] == ["body", "parameters"]
assert result["detail"][0]["msg"] == "Field required"


async def test_validate_rule_column_is_valid_without_one_parameter(
client, manifest_str, oracle: OracleDbContainer
):
connection_info = _to_connection_info(oracle)
response = await client.post(
url=f"{base_url}/validate/column_is_valid",
json={
"connectionInfo": connection_info,
"manifestStr": manifest_str,
"parameters": {"modelName": "Orders"},
},
)
assert response.status_code == 422
assert response.text == "Missing required parameter: `columnName`"

response = await client.post(
url=f"{base_url}/validate/column_is_valid",
json={
"connectionInfo": connection_info,
"manifestStr": manifest_str,
"parameters": {"columnName": "orderkey"},
},
)
assert response.status_code == 422
assert response.text == "Missing required parameter: `modelName`"


async def test_metadata_list_tables(client, oracle: OracleDbContainer):
connection_info = _to_connection_info(oracle)
response = await client.post(
url=f"{base_url}/metadata/tables",
json={"connectionInfo": connection_info},
)
assert response.status_code == 200

result = response.json()
# Use case-insensitive filtering since Oracle returns "SYSTEM.ORDERS"
result = next(filter(lambda x: x["name"].lower() == "system.orders", result))

assert result["name"].lower() == "system.orders"
assert result["primaryKey"] is not None
assert result["description"] == "This is a table comment"
assert result["properties"] == {
"catalog": "",
"schema": "SYSTEM",
"table": "ORDERS",
"path": None,
}
assert len(result["columns"]) == 9
assert result["columns"][8] == {
"name": "O_COMMENT",
"nestedColumns": None,
"type": "TEXT",
"notNull": False,
"description": "This is a comment",
"properties": None,
}


async def test_metadata_list_constraints(client, oracle: OracleDbContainer):
connection_info = _to_connection_info(oracle)
response = await client.post(
url=f"{base_url}/metadata/constraints",
json={"connectionInfo": connection_info},
)
assert response.status_code == 200


async def test_metadata_db_version(client, oracle: OracleDbContainer):
connection_info = _to_connection_info(oracle)
response = await client.post(
url=f"{base_url}/metadata/version",
json={"connectionInfo": connection_info},
)
assert response.status_code == 200
assert "23.0" in response.text

Copy link

@coderabbitai coderabbitai bot Feb 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add test cases for LOB handling and connection resilience.

Current test coverage could be enhanced with:

  1. Tests for large LOB data handling
  2. Connection timeout and retry scenarios
  3. Tests for concurrent connections

Example test case:

async def test_query_with_large_lob(client, manifest_str, oracle: OracleDbContainer):
    connection_info = _to_connection_info(oracle)
    # Generate large CLOB content (> 1MB)
    large_text = "x" * (1024 * 1024 * 2)  # 2MB
    
    # Insert large LOB data
    engine = sqlalchemy.create_engine(oracle.get_connection_url())
    with engine.begin() as conn:
        conn.execute(text("CREATE TABLE test_lob (id NUMBER, content CLOB)"))
        conn.execute(text("INSERT INTO test_lob VALUES (1, :content)"), {"content": large_text})
    
    response = await client.post(
        url=f"{base_url}/query",
        json={
            "connectionInfo": connection_info,
            "manifestStr": manifest_str,
            "sql": "SELECT * FROM test_lob",
        },
    )
    assert response.status_code == 200
    result = response.json()
    assert len(result["data"]) == 1
    assert len(result["data"][0][1]) == len(large_text)

Example connection resilience test:

async def test_query_with_connection_timeout(client, manifest_str, oracle: OracleDbContainer):
    connection_info = _to_connection_info(oracle)
    connection_info["connect_timeout"] = 1  # 1 second timeout
    
    # Simulate network delay using iptables
    oracle.exec(["iptables", "-A", "OUTPUT", "-p", "tcp", "--sport", "1521", "-j", "DROP"])
    
    response = await client.post(
        url=f"{base_url}/query",
        json={
            "connectionInfo": connection_info,
            "manifestStr": manifest_str,
            "sql": "SELECT 1 FROM dual",
        },
    )
    assert response.status_code == 500
    assert "connection timeout" in response.text.lower()
    
    # Cleanup iptables rule
    oracle.exec(["iptables", "-D", "OUTPUT", "-p", "tcp", "--sport", "1521", "-j", "DROP"])

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's interesting. @douenergy, should we care about the LOB type? maybe add a test for it?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
ibis-server/tests/routers/v2/connector/test_local_file.py (1)

334-334: LGTM! Good improvement to test reliability.

The change from direct array access to filtering for a specific table makes the test more reliable and deterministic. This prevents potential failures if the order of tables in the result changes.

Consider adding a comment explaining why we specifically filter for the "type-test-csv" table:

+    # Filter for type-test-csv as it contains our test type mappings
     columns = next(filter(lambda x: x["name"] == "type-test-csv", result))["columns"]
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 747e197 and 2bad36e.

📒 Files selected for processing (1)
  • ibis-server/tests/routers/v2/connector/test_local_file.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: ci

Copy link
Contributor

@goldmedal goldmedal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @douenergy. Overall looks good to me but I think we could enhance the tests.

Comment on lines 105 to 391
result = response.json()
assert result["detail"][0] is not None
assert result["detail"][0]["type"] == "missing"
assert result["detail"][0]["loc"] == ["body", "parameters"]
assert result["detail"][0]["msg"] == "Field required"


async def test_validate_rule_column_is_valid_without_parameters(
client, manifest_str, oracle: OracleDbContainer
):
connection_info = _to_connection_info(oracle)
response = await client.post(
url=f"{base_url}/validate/column_is_valid",
json={"connectionInfo": connection_info, "manifestStr": manifest_str},
)
assert response.status_code == 422
result = response.json()
assert result["detail"][0] is not None
assert result["detail"][0]["type"] == "missing"
assert result["detail"][0]["loc"] == ["body", "parameters"]
assert result["detail"][0]["msg"] == "Field required"


async def test_validate_rule_column_is_valid_without_one_parameter(
client, manifest_str, oracle: OracleDbContainer
):
connection_info = _to_connection_info(oracle)
response = await client.post(
url=f"{base_url}/validate/column_is_valid",
json={
"connectionInfo": connection_info,
"manifestStr": manifest_str,
"parameters": {"modelName": "Orders"},
},
)
assert response.status_code == 422
assert response.text == "Missing required parameter: `columnName`"

response = await client.post(
url=f"{base_url}/validate/column_is_valid",
json={
"connectionInfo": connection_info,
"manifestStr": manifest_str,
"parameters": {"columnName": "orderkey"},
},
)
assert response.status_code == 422
assert response.text == "Missing required parameter: `modelName`"


async def test_metadata_list_tables(client, oracle: OracleDbContainer):
connection_info = _to_connection_info(oracle)
response = await client.post(
url=f"{base_url}/metadata/tables",
json={"connectionInfo": connection_info},
)
assert response.status_code == 200

result = response.json()
# Use case-insensitive filtering since Oracle returns "SYSTEM.ORDERS"
result = next(filter(lambda x: x["name"].lower() == "system.orders", result))

assert result["name"].lower() == "system.orders"
assert result["primaryKey"] is not None
assert result["description"] == "This is a table comment"
assert result["properties"] == {
"catalog": "",
"schema": "SYSTEM",
"table": "ORDERS",
"path": None,
}
assert len(result["columns"]) == 9
assert result["columns"][8] == {
"name": "O_COMMENT",
"nestedColumns": None,
"type": "TEXT",
"notNull": False,
"description": "This is a comment",
"properties": None,
}


async def test_metadata_list_constraints(client, oracle: OracleDbContainer):
connection_info = _to_connection_info(oracle)
response = await client.post(
url=f"{base_url}/metadata/constraints",
json={"connectionInfo": connection_info},
)
assert response.status_code == 200


async def test_metadata_db_version(client, oracle: OracleDbContainer):
connection_info = _to_connection_info(oracle)
response = await client.post(
url=f"{base_url}/metadata/version",
json={"connectionInfo": connection_info},
)
assert response.status_code == 200
assert "23.0" in response.text

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's interesting. @douenergy, should we care about the LOB type? maybe add a test for it?

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
ibis-server/app/model/metadata/oracle.py (1)

167-171: Rename “constraintedTable” for clarity.

“constraintedTable” is non-standard and may cause confusion. For improved readability, replace with “constrainedTable” or “referencedTable” to align with common naming conventions.

Apply this diff to rename the property:

-                    constraintedTable=self._format_compact_table_name(
+                    referencedTable=self._format_compact_table_name(
ibis-server/tests/routers/v2/connector/test_oracle.py (1)

389-390: Fix minor typographical error.

There's a comment stating “oracale has a lot of constraints …” at line 389. Replace “oracale” with “oracle” for accuracy.

-    # oracale has a lot of constraints on the system tables
+    # oracle has a lot of constraints on the system tables
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2bad36e and a7fd85d.

📒 Files selected for processing (3)
  • ibis-server/app/model/metadata/oracle.py (1 hunks)
  • ibis-server/app/util.py (1 hunks)
  • ibis-server/tests/routers/v2/connector/test_oracle.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • ibis-server/app/util.py
🔇 Additional comments (2)
ibis-server/app/model/metadata/oracle.py (1)

21-52: Question on hardcoded schema “SYSTEM”.

The SQL query limits retrieval to the SYSTEM schema. If you intend to support other schemas, consider parameterizing the schema name instead of hardcoding "t.owner = 'SYSTEM'". Otherwise, confirm that using the SYSTEM schema for storing user tables is intentional.

ibis-server/tests/routers/v2/connector/test_oracle.py (1)

120-121:

✅ Verification successful

Confirm Oracle compatibility with “LIMIT 1”.

The SQL query uses LIMIT 1, which is not traditionally supported in Oracle. Ensure that a translation layer applies the appropriate Oracle syntax (e.g. FETCH FIRST 1 ROWS ONLY), or confirm that this syntax works as intended.


🏁 Script executed:

#!/bin/bash
# Description: Check if there's a query rewriting mechanism or references to Oracle
# syntax adjustments for "LIMIT".
rg -A 5 'LIMIT 1'

Length of output: 53201


Oracle Query Compatibility Verified: Query Rewriting in Place

After checking the test outputs and the query rewriting logic in ibis-server/app/model/validator.py, it appears that our SQL rewriter correctly transforms queries using LIMIT 1 into the appropriate Oracle-compliant syntax. The Oracle connector tests in ibis-server/tests/routers/v2/connector/test_oracle.py confirm that these queries are handled as expected.

No changes to the current query syntax are necessary as long as you continue to rely on the rewriter for compatibility. Please ensure that any new queries using LIMIT are similarly covered by tests.

@douenergy douenergy requested a review from goldmedal February 27, 2025 11:43
@douenergy
Copy link
Contributor Author

Thanks @goldmedal Jax for reviewing! I addressed most of the comments.

But I don't think we need to explicitly add a LOB test, since we already define blob_column in the MDL, which is already a LOB column.

LOB is actually a logical type in Oracle, and BLOB is the physical type. You can see this in the Oracle documentation:
https://docs.oracle.com/en/database/oracle/oracle-database/23/adlob/LOB-classifications.html

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dependencies Pull requests that update a dependency file documentation Improvements or additions to documentation ibis python Pull requests that update Python code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants