Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Address incompatibility of constant folding for ipaddress #12065

Closed

Conversation

yuandagits
Copy link
Contributor

Summary:
Presto may perform constant folding on queries before sending the fragment to velox workers. However, when the workers receive the fragments, the fragments may contain types which had a different implementation than how velox implemented the type. This incompatibility results in incorrect results.

For example, this PR fixes the type incompatibility between Java coordinator and C++ worker for ipaddress types.

  • Java coordinator, ipaddress is represented as a slice of 16 bytes which if represented as a number, would be big endian.
  • C++ worker, ipaddress is represented as an int128_t, in little endian form.

The discrepancy between these two can be see with on native engine, the result set will be ::ffff:1.2.3.4 represented in reverse byte order

SELECT 
  CAST(ip AS ipaddress) as casted_ip 
FROM 
  (
    VALUES 
      ('::ffff:1.2.3.4')
  ) AS t (ip)

To address this issue, we can reverse the byte order of the ipaddress type sent from and to Java.

Note:

  • This issue is not exclusive to ipaddrss, and other custom types in velox which have different underlying type/implementation than Java may suffer from this issue as well.

  • We can likely enhance the fuzzer to help catch cases like this at diff time once custom fuzzer inputs are landed (feat(fuzzer): Support custom input generator in VectorFuzzer #11466)

Differential Revision: D68039050

Summary:
Presto may perform constant folding on queries before sending the fragment to velox workers. However, when the workers receive the fragments, the fragments may contain types which had a different implementation than how velox implemented the type. This incompatibility results in incorrect results.

For example, this PR fixes the type incompatibility between Java coordinator and C++ worker for `ipaddress` types. 
 - Java coordinator, ipaddress is represented as a slice of 16 bytes which if represented as a number, would be big endian. 
- C++ worker, ipaddress is represented as an int128_t, in little endian form.

The discrepancy between these two can be see with on native engine, the result set will be `::ffff:1.2.3.4` represented in reverse byte order
```
SELECT 
  CAST(ip AS ipaddress) as casted_ip 
FROM 
  (
    VALUES 
      ('::ffff:1.2.3.4')
  ) AS t (ip)
```

To address this issue, we can reverse the byte order of the ipaddress type sent from and to Java.

**Note**: 

- This issue is not exclusive to ipaddrss, and other custom types in velox which have different underlying type/implementation than Java may suffer from this issue as well.

- We can likely enhance the fuzzer to help catch cases like this at diff time once custom fuzzer inputs are landed (facebookincubator#11466)

Differential Revision: D68039050
@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jan 11, 2025
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D68039050

Copy link

netlify bot commented Jan 11, 2025

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit d1963b7
🔍 Latest deploy log https://app.netlify.com/sites/meta-velox/deploys/678305d672baa50008951093

@yuandagits
Copy link
Contributor Author

cc @mohsaka as well (not sure why your name doesn't show up when I try to add you as a reviewer ... )

@yuandagits yuandagits requested a review from Yuhta January 12, 2025 00:02
@yuandagits
Copy link
Contributor Author

Created issue for IPPrefix here: #12070

@yuandagits
Copy link
Contributor Author

will open a new diff to avoid rebase hell

@yuandagits yuandagits closed this Jan 16, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. fb-exported
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants