-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore(ibis): format float with scientific notation #1023
Conversation
WalkthroughThe pull request introduces modifications to the Changes
Sequence DiagramsequenceDiagram
participant Util as Utility Module
participant DataFrame as Pandas DataFrame
participant JSON as JSON Converter
Util->>DataFrame: Apply float formatting
DataFrame-->>Util: Mapped DataFrame
Util->>JSON: Convert to dictionary
JSON-->>Util: Formatted JSON object
Poem
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
ibis-server/app/util.py
(1 hunks)ibis-server/tests/routers/v2/connector/test_postgres.py
(1 hunks)
🔇 Additional comments (1)
ibis-server/app/util.py (1)
32-34
: Confirm the float format consistency with test requirements.Here, using
f"{x:.9f}"
enforces a non-scientific decimal notation with exactly nine decimal digits. However, thetest_format_floating
intest_postgres.py
checks for scientific notation (e.g.,"1.23E-7"
). Verify whether the desired format is strictly decimal or scientific notation. If scientific notation is desired for large or small floats, replace"{x:.9f}"
with a more suitable specifier like"{x:.9g}"
or a custom formatting function that selectively uses scientific notation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (3)
ibis-server/tests/routers/v2/connector/test_postgres.py (3)
218-253
: Improve SQL query readability with consistent formatting.The SQL query could be more readable with consistent indentation and grouping. Consider adding comments to separate the test case groups.
"sql": """ SELECT + -- Scientific notation cases 0.0123e-5 AS case_scientific_original, 1.23e+4 AS case_scientific_positive, -4.56e-3 AS case_scientific_negative, 7.89e0 AS case_scientific_zero_exponent, 0e0 AS case_scientific_zero, + -- Decimal cases 123.456 AS case_decimal_positive, -123.456 AS case_decimal_negative, 0.0000123 AS case_decimal_small, 123.0000 AS case_decimal_trailing_zeros, 0.0 AS case_decimal_zero, + -- Integer cases 0 AS case_integer_zero, 0e-9 AS case_integer_zero_scientific, -1 AS case_integer_negative, 9999999999 AS case_integer_large, + -- Float boundary cases 1.7976931348623157E+308 AS case_float_max, 2.2250738585072014E-308 AS case_float_min, -1.7976931348623157E+308 AS case_float_min_negative, + -- Mixed arithmetic cases 1.23e4 + 4.56 AS case_mixed_addition, -1.23e-4 - 123.45 AS case_mixed_subtraction, 0.0123e-5 * 1000 AS case_mixed_multiplication, 123.45 / 1.23e2 AS case_mixed_division, + -- Special cases CAST('NaN' AS FLOAT) AS case_special_nan, CAST('Infinity' AS FLOAT) AS case_special_infinity, CAST('-Infinity' AS FLOAT) AS case_special_negative_infinity, NULL AS case_special_null, + -- Type casting cases CAST(123.456 AS FLOAT) AS case_cast_float, CAST(1.23e4 AS DECIMAL(10,5)) AS case_cast_decimal """,🧰 Tools
🪛 GitHub Actions: ibis CI
[warning] File needs to be reformatted using Ruff formatter. Code style does not match the required format.
256-289
: Enhance test assertions for better error reporting.While the test covers various floating-point scenarios, the assertions could be more specific and easier to debug:
- Add validation for response column types (dtypes)
- Validate column names
- Split assertions for each test case
assert response.status_code == 200 result = response.json() + + # Validate response structure + assert "columns" in result, "Response should contain 'columns'" + assert "dtypes" in result, "Response should contain 'dtypes'" + assert "data" in result, "Response should contain 'data'" + + # Validate column names match test cases + expected_columns = [ + "case_scientific_original", + "case_scientific_positive", + # ... add all expected column names + ] + assert result["columns"] == expected_columns + + # Validate data types + for column in expected_columns: + assert column in result["dtypes"], f"Missing dtype for column {column}" + assert result["dtypes"][column] in ["object", "float64"], f"Unexpected dtype for column {column}" - assert result["data"] == [ - [ - "1.23E-7", - "1.23E+4", + # Validate individual test cases for easier debugging + data = result["data"][0] + assert data[0] == "1.23E-7", "Scientific notation (original) failed" + assert data[1] == "1.23E+4", "Scientific notation (positive) failed"🧰 Tools
🪛 GitHub Actions: ibis CI
[warning] File needs to be reformatted using Ruff formatter. Code style does not match the required format.
211-211
: Add docstring to describe test purpose and cases.Add a docstring to document the test's purpose and the different categories of test cases it covers.
async def test_format_floating(client, manifest_str, postgres): + """ + Test PostgreSQL connector's handling of various floating-point number formats. + + Test cases include: + - Scientific notation (positive, negative, zero exponent) + - Decimal numbers (positive, negative, small values, trailing zeros) + - Integer representations + - Float boundaries (max, min, negative) + - Mixed arithmetic operations + - Special values (NaN, Infinity) + - Type casting + """🧰 Tools
🪛 GitHub Actions: ibis CI
[warning] File needs to be reformatted using Ruff formatter. Code style does not match the required format.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
ibis-server/app/util.py
(1 hunks)ibis-server/tests/routers/v2/connector/test_postgres.py
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- ibis-server/app/util.py
🧰 Additional context used
🪛 GitHub Actions: ibis CI
ibis-server/tests/routers/v2/connector/test_postgres.py
[warning] File needs to be reformatted using Ruff formatter. Code style does not match the required format.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
ibis-server/tests/routers/v2/connector/test_postgres.py (2)
219-253
: Consider adding edge cases for comprehensive testing.The test cases cover a wide range of scenarios. Consider adding these edge cases to strengthen the test suite:
- Numbers close to underflow (e.g.,
1.18e-308
)- Numbers with many significant digits (e.g.,
1.234567890123456789
)- Numbers that might trigger rounding behavior (e.g.,
9.9999999999999999
)SELECT 0.0123e-5 AS case_scientific_original, 1.23e+4 AS case_scientific_positive, -4.56e-3 AS case_scientific_negative, 7.89e0 AS case_scientific_zero_exponent, 0e0 AS case_scientific_zero, + 1.18e-308 AS case_edge_near_underflow, + 1.234567890123456789 AS case_edge_many_digits, + 9.9999999999999999 AS case_edge_rounding,
218-253
: Improve SQL query readability with comments.Consider adding SQL comments to explain the expected behavior and any specific formatting rules for each test case. This will make it easier for other developers to understand the test's intentions.
"sql": """ SELECT + -- Test scientific notation formatting + -- Expected: Maintains E notation for very small numbers 0.0123e-5 AS case_scientific_original, + -- Expected: Maintains E notation for large numbers 1.23e+4 AS case_scientific_positive, + -- Expected: Converts to decimal for small negative numbers -4.56e-3 AS case_scientific_negative,
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
ibis-server/tests/routers/v2/connector/test_postgres.py
(1 hunks)
🔇 Additional comments (2)
ibis-server/tests/routers/v2/connector/test_postgres.py (2)
211-211
: Well-structured test organization!The test function is well-organized with clear sections for different floating-point scenarios, making it easy to understand and maintain.
211-290
: Address formatting issues.The code needs to be reformatted using the Ruff formatter as mentioned in previous reviews.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (3)
ibis-server/tests/routers/v3/connector/postgres/test_query.py (1)
Line range hint
1-1
: Code formatting requiredThe pipeline indicates a formatting issue. Please reformat the code to adhere to your project's style guidelines (e.g., using Black, isort, or the configured linting tool).
#!/bin/bash # Example: using Black to reformat this file. black ibis-server/tests/routers/v3/connector/postgres/test_query.pyibis-server/tests/routers/v2/connector/test_postgres.py (2)
211-286
: Enhance test organization with grouped assertions.The test cases are well-structured in the SQL query, but the assertions could be better organized to match this structure. Consider grouping related assertions and adding comments to match the SQL sections.
assert response.status_code == 200 result = response.json() + # Scientific notation cases assert result["data"][0] == "1.23E-7" assert result["data"][1] == "1.23E+4" assert result["data"][2] == "-0.00456" assert result["data"][3] == "7.89" assert result["data"][4] == "0" + + # Decimal cases assert result["data"][5] == "123.456" assert result["data"][6] == "-123.456" assert result["data"][7] == "0.0000123" assert result["data"][8] == "123" assert result["data"][9] == "0" + + # Integer cases assert result["data"][10] == 0 assert result["data"][11] == "0" assert result["data"][12] == -1 assert result["data"][13] == 9999999999 + + # Float boundary cases assert result["data"][14] == "1.7976931348623157E+308" assert result["data"][15] == "2.2250738585072014E-308" assert result["data"][16] == "-1.7976931348623157E+308" + + # Mixed arithmetic cases assert result["data"][17] == "12304.56" assert result["data"][18] == "-123.450123" assert result["data"][19] == "0.000123" assert result["data"][20] == "1.0036585365853659" + + # Special cases assert result["data"][21] == "nan" assert result["data"][22] == "inf" assert result["data"][23] == "-inf" assert result["data"][24] is None + + # Type casting cases assert result["data"][25] == "123.456001" assert result["data"][26] == "12300.00000"
219-253
: Consider adding more edge cases and type verification.While the test cases are comprehensive, consider adding:
- Edge cases for very small numbers (e.g.,
1e-323
) near the denormal range- Cases with repeated decimals (e.g.,
1/3
)- Type verification using
isinstance()
for numeric valuesSELECT 0.0123e-5 AS case_scientific_original, 1.23e+4 AS case_scientific_positive, -4.56e-3 AS case_scientific_negative, 7.89e0 AS case_scientific_zero_exponent, 0e0 AS case_scientific_zero, + 1e-323 AS case_scientific_denormal, 123.456 AS case_decimal_positive, -123.456 AS case_decimal_negative, 0.0000123 AS case_decimal_small, 123.0000 AS case_decimal_trailing_zeros, 0.0 AS case_decimal_zero, + CAST(1.0/3.0 AS FLOAT) AS case_decimal_repeating, 0 AS case_integer_zero, 0e-9 AS case_integer_zero_scientific, -1 AS case_integer_negative, 9999999999 AS case_integer_large,And add type verification:
assert result["data"][13] == 9999999999 + assert isinstance(result["data"][13], int) assert result["data"][14] == "1.7976931348623157E+308" + assert isinstance(result["data"][14], str)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
ibis-server/tests/routers/v2/connector/test_bigquery.py
(1 hunks)ibis-server/tests/routers/v2/connector/test_canner.py
(1 hunks)ibis-server/tests/routers/v2/connector/test_postgres.py
(1 hunks)ibis-server/tests/routers/v2/connector/test_trino.py
(1 hunks)ibis-server/tests/routers/v3/connector/postgres/test_query.py
(1 hunks)
🧰 Additional context used
🪛 GitHub Actions: ibis CI
ibis-server/tests/routers/v2/connector/test_canner.py
[warning] File needs reformatting. Code formatting does not meet the required standards.
🔇 Additional comments (6)
ibis-server/tests/routers/v3/connector/postgres/test_query.py (1)
118-118
: Changed expected total price type from float to stringThis assertion now treats
o_totalprice
as a string ("172799.49"
) instead of a float. Ensure that the underlying_to_json_obj
conversion and any consuming code expects and handles the string type consistently across all connectors.ibis-server/tests/routers/v2/connector/test_canner.py (1)
97-97
: Changed expected total price type from float to stringThe test now expects
"172799.49"
instead of the float value172799.49
. Verify that all references and validations fortotalprice
are updated accordingly throughout the code.🧰 Tools
🪛 GitHub Actions: ibis CI
[warning] File needs reformatting. Code formatting does not meet the required standards.
ibis-server/tests/routers/v2/connector/test_bigquery.py (1)
93-93
: Changed expected total price type from float to stringThis test expects
"172799.49"
instead of a float value. Confirm that your BigQuery connector and downstream logic handle string representations of float values correctly.ibis-server/tests/routers/v2/connector/test_trino.py (1)
114-114
: Changed expected total price type from float to stringSimilar to other connectors, Trino’s test now anticipates
"172799.49"
. Ensure that the type change is consistent with the new float-to-string logic and that calling code or consumers are aware of this change.ibis-server/tests/routers/v2/connector/test_postgres.py (2)
Line range hint
287-309
: LGTM! Error message assertion is more precise.The modified error message assertion improves test reliability by matching the exact PostgreSQL error message format.
211-286
: Effectively addresses float formatting concerns.The comprehensive test cases for floating-point formatting effectively address the concerns raised in previous reviews about float formatting consistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
ibis-server/tests/routers/v2/connector/test_postgres.py (2)
259-285
: Consider grouping assertions for better maintainability.The individual assertions, while thorough, could be organized better. Consider grouping related assertions and using helper methods to improve readability and maintenance.
- assert result["data"][0][0] == "1.23E-7" - assert result["data"][0][1] == "1.23E+4" - # ... more assertions ... + data = result["data"][0] + + # Scientific notation cases + scientific_cases = { + 0: "1.23E-7", # case_scientific_original + 1: "1.23E+4", # case_scientific_positive + 2: "-0.00456", # case_scientific_negative + 3: "7.89", # case_scientific_zero_exponent + 4: "0" # case_scientific_zero + } + for idx, expected in scientific_cases.items(): + assert data[idx] == expected, f"Scientific notation case {idx} failed" + + # Similar grouping for other categories...
218-253
: Add descriptive comments for test case categories.The SQL query is well-structured but could benefit from category headers to improve readability.
"sql": """ SELECT + -- Scientific notation cases 0.0123e-5 AS case_scientific_original, 1.23e+4 AS case_scientific_positive, -4.56e-3 AS case_scientific_negative, 7.89e0 AS case_scientific_zero_exponent, 0e0 AS case_scientific_zero, + -- Decimal number cases 123.456 AS case_decimal_positive, -123.456 AS case_decimal_negative, # ... rest of the cases with similar category headers
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
ibis-server/tests/routers/v2/connector/test_postgres.py
(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: ci
🔇 Additional comments (2)
ibis-server/tests/routers/v2/connector/test_postgres.py (2)
211-286
: Comprehensive test coverage for floating-point formatting!The test cases thoroughly cover various scenarios including scientific notation, decimal numbers, integers, float boundaries, mixed arithmetic, and special values. This ensures robust handling of floating-point representations.
246-249
: Verify PostgreSQL version compatibility for special float values.The test includes special float values (NaN, Infinity). Ensure these are consistently handled across different PostgreSQL versions.
✅ Verification successful
Special float values are safely handled across PostgreSQL versions
These CAST operations for special float values (NaN, Infinity) use standard SQL syntax and have been reliably supported in PostgreSQL since before 2013, with continuous maintenance and improvements. The test implementation is version-agnostic and uses well-established features.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Check PostgreSQL version support for special float values # Test: Check PostgreSQL version and float handling documentation gh api graphql -f query=' { repository(owner: "postgres", name: "postgres") { object(expression: "master") { ... on Commit { history(path: "src/backend/utils/adt/float.c") { nodes { messageHeadline committedDate } } } } } } 'Length of output: 11491
Script:
#!/bin/bash # Check test configuration and related test cases rg -A 5 "postgresql" ibis-server/tests/routers/v2/connector/test_postgres.py rg -A 5 "FLOAT" ibis-server/tests/routers/v2/connector/test_postgres.pyLength of output: 523
Description
Summary by CodeRabbit
New Features
Bug Fixes
Tests
totalprice
in multiple test cases, changing from float to string representation.