feat: add find_bigdecimals (#273)
Please be sure to look over the pull request guidelines here:
https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md#submit-pr.

# Please go through the following checklist
- [ ] The PR title and commit messages adhere to guidelines here:
https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md.
In particular `!` is used if and only if at least one breaking change
has been introduced.
- [ ] I have run the ci check script with `source
scripts/run_ci_checks.sh`.

# Rationale for this change

<!--
Why are you proposing this change? If this is already explained clearly
in the linked issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.

 Example:
 Add `NestedLoopJoinExec`.
 Closes #345.

Since we added `HashJoinExec` in #323 it has been possible to do
provable inner joins. However, performance is not satisfactory in some
cases. Hence we need to fix the problem by implementing
`NestedLoopJoinExec` and speeding up the code
 for `HashJoinExec`.
-->

# What changes are included in this PR?

<!--
There is no need to duplicate the description in the ticket here but it
is sometimes worth providing a summary of the individual changes in this
PR.

Example:
- Add `NestedLoopJoinExec`.
- Speed up `HashJoinExec`.
- Route joins to `NestedLoopJoinExec` if the outer input is sufficiently
small.
-->

# Are these changes tested?
<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?

Example:
Yes.
-->
iajoiner authored Oct 17, 2024
2 parents 6feddb8 + 69eca14 commit b47481c
Showing 5 changed files with 101 additions and 0 deletions.
1 change: 1 addition & 0 deletions Cargo.toml
@@ -53,6 +53,7 @@ rayon = { version = "1.5" }
serde = { version = "1", default-features = false }
serde_json = { version = "1", default-features = false, features = ["alloc"] }
snafu = { version = "0.8.4", default-features = false }
sqlparser = { version = "0.45.0", default-features = false }
tiny-keccak = { version = "2.0.2", features = [ "keccak" ] }
tracing = { version = "0.1.36", default-features = false }
tracing-opentelemetry = { version = "0.22.0" }
1 change: 1 addition & 0 deletions crates/proof-of-sql/Cargo.toml
@@ -45,6 +45,7 @@ rayon = { workspace = true, optional = true }
serde = { workspace = true, features = ["serde_derive"] }
serde_json = { workspace = true }
snafu = { workspace = true }
sqlparser = { workspace = true }
tiny-keccak = { workspace = true }
tracing = { workspace = true, features = ["attributes"] }
zerocopy = { workspace = true }
2 changes: 2 additions & 0 deletions crates/proof-of-sql/src/lib.rs
@@ -8,6 +8,8 @@ extern crate alloc;
pub mod base;
pub mod proof_primitive;
pub mod sql;
/// Utilities for working with the library
pub mod utils;

#[cfg(test)]
mod tests;
3 changes: 3 additions & 0 deletions crates/proof-of-sql/src/utils/mod.rs
@@ -0,0 +1,3 @@
//! This module contains utilities for working with the library
/// Parse DDLs and find bigdecimal columns
pub mod parse;
94 changes: 94 additions & 0 deletions crates/proof-of-sql/src/utils/parse.rs
@@ -0,0 +1,94 @@
use crate::base::map::IndexMap;
use alloc::{
    string::{String, ToString},
    vec::Vec,
};
use sqlparser::{
    ast::{DataType, ExactNumberInfo, Statement},
    dialect::GenericDialect,
    parser::Parser,
};

/// Parse a DDL file and return a map of table names to bigdecimal columns
///
/// # Panics
/// Panics if there is an error parsing the SQL
#[must_use]
pub fn find_bigdecimals(queries: &str) -> IndexMap<String, Vec<(String, u8, i8)>> {
    let dialect = GenericDialect {};
    let ast = Parser::parse_sql(&dialect, queries).expect("Failed to parse SQL");
    // Find all `CREATE TABLE` statements
    ast.iter()
        .filter_map(|statement| match statement {
            Statement::CreateTable { name, columns, .. } => {
                // Find all `DECIMAL` columns where precision > 38
                // Find the table name
                // Add the table name and column name to the map
                let str_name = name.to_string();
                let big_decimal_specs: Vec<(String, u8, i8)> = columns
                    .iter()
                    .filter_map(|column_def| match column_def.data_type {
                        DataType::Decimal(ExactNumberInfo::PrecisionAndScale(precision, scale))
                            if precision > 38 =>
                        {
                            Some((column_def.name.to_string(), precision as u8, scale as i8))
                        }
                        _ => None,
                    })
                    .collect();
                Some((str_name, big_decimal_specs))
            }
            _ => None,
        })
        .collect::<IndexMap<String, Vec<_>>>()
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_find_bigdecimals() {
        let sql = "CREATE TABLE IF NOT EXISTS ETHEREUM.BLOCKS(
            BLOCK_NUMBER BIGINT NOT NULL,
            TIME_STAMP TIMESTAMP,
            BLOCK_HASH VARCHAR,
            MINER VARCHAR,
            REWARD DECIMAL(78, 0),
            SIZE_ INT,
            GAS_USED INT,
            GAS_LIMIT INT,
            BASE_FEE_PER_GAS DECIMAL(78, 0),
            TRANSACTION_COUNT INT,
            PARENT_HASH VARCHAR,
            PRIMARY KEY(BLOCK_NUMBER)
        );
        CREATE TABLE IF NOT EXISTS ETHEREUM.BLOCK_DETAILS(
            BLOCK_NUMBER BIGINT NOT NULL,
            TIME_STAMP TIMESTAMP,
            SHA3_UNCLES VARCHAR,
            STATE_ROOT VARCHAR,
            TRANSACTIONS_ROOT VARCHAR,
            RECEIPTS_ROOT VARCHAR,
            UNCLES_COUNT INT,
            VERSION VARCHAR,
            LOGS_BLOOM VARCHAR,
            NONCE VARCHAR,
            PRIMARY KEY(BLOCK_NUMBER)
        );";
        let bigdecimals = find_bigdecimals(sql);
        assert_eq!(
            bigdecimals.get("ETHEREUM.BLOCKS").unwrap(),
            &[
                ("REWARD".to_string(), 78, 0),
                ("BASE_FEE_PER_GAS".to_string(), 78, 0)
            ]
        );
        let empty_vec: Vec<(String, u8, i8)> = vec![];
        assert_eq!(
            bigdecimals.get("ETHEREUM.BLOCK_DETAILS").unwrap(),
            &empty_vec
        );
    }
}
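
Reviewer note: `find_bigdecimals` narrows the parsed precision and scale with `as u8` and `as i8`, which wrap silently when a value does not fit. The following is a minimal std-only sketch of the same `precision > 38` filter with checked conversions. It assumes the parsed values arrive as `u64`, as the casts in the diff suggest. `DecimalColumn`, `MAX_SMALL_PRECISION`, and `big_decimal_specs` are hypothetical names used only for illustration and are not part of this commit or of `sqlparser`.

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for a parsed `DECIMAL(p, s)` column.
/// This is NOT a sqlparser type; it only models the fields the filter needs.
#[derive(Debug, Clone, PartialEq)]
pub struct DecimalColumn {
    pub name: String,
    pub precision: u64,
    pub scale: u64,
}

/// Precisions above this bound count as "big", mirroring the
/// `precision > 38` guard in `find_bigdecimals`.
pub const MAX_SMALL_PRECISION: u64 = 38;

/// Keep only big-decimal columns and narrow their precision and scale
/// with checked conversions instead of `as` casts.
///
/// Assumption of this sketch: a column whose precision or scale does not
/// fit the target type is skipped. Returning an error instead would be an
/// equally reasonable design.
pub fn big_decimal_specs(columns: &[DecimalColumn]) -> Vec<(String, u8, i8)> {
    columns
        .iter()
        .filter(|column| column.precision > MAX_SMALL_PRECISION)
        .filter_map(|column| {
            let precision = u8::try_from(column.precision).ok()?;
            let scale = i8::try_from(column.scale).ok()?;
            Some((column.name.clone(), precision, scale))
        })
        .collect()
}
```

With this shape, a column declared with a precision of 300 is dropped rather than reported with a wrapped value, whereas `300u64 as u8` evaluates to 44. Whether such input should be skipped, rejected, or cause a panic is a design decision for the PR author.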
