Skip to content

Commit

Permalink
refactor!: proof_of_sql_parser::intermediate_ast::Identifier with `…
Browse files Browse the repository at this point in the history
…sqlparser::ast::Ident` in the proof-of-sql crate (#382)

Please be sure to look over the pull request guidelines here:
https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md#submit-pr.

# Please go through the following checklist
- [x] The PR title and commit messages adhere to guidelines here:
https://github.com/spaceandtimelabs/sxt-proof-of-sql/blob/main/CONTRIBUTING.md.
In particular `!` is used if and only if at least one breaking change
has been introduced.
- [x] I have run the ci check script with `source
scripts/run_ci_checks.sh`.

# Rationale for this change
This PR addresses the need to replace the
`proof_of_sql_parser::Identifier` with the `sqlparser::ast::Ident` in
the `proof-of-sql` crate as part of a larger transition toward
integrating the `sqlparser` .

This change is a subtask of issue #235, with the main goal of
streamlining the repository by switching to the `sqlparser` crate and
gradually replacing intermediary constructs like
`proof_of_sql_parser::intermediate_ast` with `sqlparser::ast`.
<!--
Why are you proposing this change? If this is already explained clearly
in the linked issue then this section is not needed.
Explaining clearly why changes are proposed helps reviewers understand
your changes and offer better suggestions for fixes.

 Example:
 Add `NestedLoopJoinExec`.
 Closes #345.

Since we added `HashJoinExec` in #323 it has been possible to do
provable inner joins. However performance is not satisfactory in some
cases. Hence we need to fix the problem by implement
`NestedLoopJoinExec` and speed up the code
 for `HashJoinExec`.
-->

# What changes are included in this PR?
- All instances of `proof_of_sql_parser::Identifier` have been replaced
with `sqlparser::ast::Ident`
- A few of them required an identifier (e.g. Expression::Column, etc..),
which is dependent on the Identifier and will be migrated at the
refactoring of Exprs.
- Every usage of `Identifier` has been updated to maintain the original
functionality, ensuring no changes to the logic or behavior.
- The breaking change here is that `Ident` doesn't support `Copy` trait
so we have needed the clones in the places where values are moved
- Deleted the test
`we_cannot_convert_a_record_batch_if_it_has_repeated_column_names`
because the `sqlparser` now differentiates between uppercase and
lowercase identifiers. Case normalization is no longer applied and
`sqlparser` treats `a` and `A` as distinct identifiers.

- Examples are updated to align with `sqlparser`'s case-sensitive
behavior.

<!--
There is no need to duplicate the description in the ticket here but it
is sometimes worth providing a summary of the individual changes in this
PR.

Example:
- Add `NestedLoopJoinExec`.
- Speed up `HashJoinExec`.
- Route joins to `NestedLoopJoinExec` if the outer input is sufficiently
small.
-->

# Are these changes tested?

Yes
<!--
We typically require tests for all PRs in order to:
1. Prevent the code from being accidentally broken by subsequent changes
2. Serve as another way to document the expected behavior of the code

If tests are not included in your PR, please explain why (for example,
are they covered by existing tests)?

Example:
Yes.
-->

Part of #235
  • Loading branch information
iajoiner authored Dec 16, 2024
2 parents a9ba6e0 + 1807f71 commit be3853a
Show file tree
Hide file tree
Showing 107 changed files with 1,207 additions and 1,288 deletions.
2 changes: 1 addition & 1 deletion crates/proof-of-sql/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -50,7 +50,7 @@ serde = { workspace = true, features = ["serde_derive"] }
serde_json = { workspace = true }
sha2 = { workspace = true }
snafu = { workspace = true }
sqlparser = { workspace = true }
sqlparser = { workspace = true, features = ["serde"] }
sysinfo = {workspace = true }
tiny-keccak = { workspace = true }
tracing = { workspace = true, features = ["attributes"] }
Expand Down
21 changes: 10 additions & 11 deletions crates/proof-of-sql/benches/scaffold/benchmark_accessor.rs
Original file line number Diff line number Diff line change
Expand Up @@ -6,15 +6,14 @@ use proof_of_sql::base::{
SchemaAccessor, TableRef,
},
};
use proof_of_sql_parser::Identifier;

use sqlparser::ast::Ident;
#[derive(Default)]
pub struct BenchmarkAccessor<'a, C: Commitment> {
columns: IndexMap<ColumnRef, Column<'a, C::Scalar>>,
lengths: IndexMap<TableRef, usize>,
commitments: IndexMap<ColumnRef, C>,
column_types: IndexMap<(TableRef, Identifier), ColumnType>,
table_schemas: IndexMap<TableRef, Vec<(Identifier, ColumnType)>>,
column_types: IndexMap<(TableRef, Ident), ColumnType>,
table_schemas: IndexMap<TableRef, Vec<(Ident, ColumnType)>>,
}

impl<'a, C: Commitment> BenchmarkAccessor<'a, C> {
Expand All @@ -24,14 +23,14 @@ impl<'a, C: Commitment> BenchmarkAccessor<'a, C> {
pub fn insert_table(
&mut self,
table_ref: TableRef,
columns: &[(Identifier, Column<'a, C::Scalar>)],
columns: &[(Ident, Column<'a, C::Scalar>)],
setup: &C::PublicSetup<'_>,
) {
self.table_schemas.insert(
table_ref,
columns
.iter()
.map(|(id, col)| (*id, col.column_type()))
.map(|(id, col)| (id.clone(), col.column_type()))
.collect(),
);

Expand All @@ -45,15 +44,15 @@ impl<'a, C: Commitment> BenchmarkAccessor<'a, C> {
let mut length = None;
for (column, commitment) in columns.iter().zip(commitments) {
self.columns.insert(
ColumnRef::new(table_ref, column.0, column.1.column_type()),
ColumnRef::new(table_ref, column.0.clone(), column.1.column_type()),
column.1,
);
self.commitments.insert(
ColumnRef::new(table_ref, column.0, column.1.column_type()),
ColumnRef::new(table_ref, column.0.clone(), column.1.column_type()),
commitment,
);
self.column_types
.insert((table_ref, column.0), column.1.column_type());
.insert((table_ref, column.0.clone()), column.1.column_type());

if let Some(len) = length {
assert!(len == column.1.len());
Expand Down Expand Up @@ -93,13 +92,13 @@ impl<C: Commitment> CommitmentAccessor<C> for BenchmarkAccessor<'_, C> {
}
}
impl<C: Commitment> SchemaAccessor for BenchmarkAccessor<'_, C> {
fn lookup_column(&self, table_ref: TableRef, column_id: Identifier) -> Option<ColumnType> {
fn lookup_column(&self, table_ref: TableRef, column_id: Ident) -> Option<ColumnType> {
self.column_types.get(&(table_ref, column_id)).copied()
}
/// # Panics
///
/// Will panic if the table reference does not exist in the table schemas map.
fn lookup_schema(&self, table_ref: TableRef) -> Vec<(Identifier, ColumnType)> {
fn lookup_schema(&self, table_ref: TableRef) -> Vec<(Ident, ColumnType)> {
self.table_schemas.get(&table_ref).unwrap().clone()
}
}
3 changes: 1 addition & 2 deletions crates/proof-of-sql/benches/scaffold/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -33,8 +33,7 @@ fn scaffold<'a, CP: CommitmentEvaluationProof>(
&generate_random_columns(alloc, rng, columns, size),
prover_setup,
);
let query =
QueryExpr::try_new(query.parse().unwrap(), "bench".parse().unwrap(), accessor).unwrap();
let query = QueryExpr::try_new(query.parse().unwrap(), "bench".into(), accessor).unwrap();
let result = VerifiableQueryResult::new(query.proof_expr(), accessor, prover_setup);
(query, result)
}
Expand Down
6 changes: 3 additions & 3 deletions crates/proof-of-sql/benches/scaffold/random_util.rs
Original file line number Diff line number Diff line change
Expand Up @@ -3,8 +3,8 @@ use proof_of_sql::base::{
database::{Column, ColumnType},
scalar::Scalar,
};
use proof_of_sql_parser::Identifier;
use rand::Rng;
use sqlparser::ast::Ident;

pub type OptionalRandBound = Option<fn(usize) -> i64>;
/// # Panics
Expand All @@ -18,12 +18,12 @@ pub fn generate_random_columns<'a, S: Scalar>(
rng: &mut impl Rng,
columns: &[(&str, ColumnType, OptionalRandBound)],
num_rows: usize,
) -> Vec<(Identifier, Column<'a, S>)> {
) -> Vec<(Ident, Column<'a, S>)> {
columns
.iter()
.map(|(id, ty, bound)| {
(
id.parse().unwrap(),
Ident::new(*id),
match (ty, bound) {
(ColumnType::Boolean, _) => {
Column::Boolean(alloc.alloc_slice_fill_with(num_rows, |_| rng.gen()))
Expand Down
3 changes: 1 addition & 2 deletions crates/proof-of-sql/examples/albums/main.rs
Original file line number Diff line number Diff line change
Expand Up @@ -38,8 +38,7 @@ fn prove_and_verify_query(
// Parse the query:
println!("Parsing the query: {sql}...");
let now = Instant::now();
let query_plan =
QueryExpr::try_new(sql.parse().unwrap(), "albums".parse().unwrap(), accessor).unwrap();
let query_plan = QueryExpr::try_new(sql.parse().unwrap(), "albums".into(), accessor).unwrap();
println!("Done in {} ms.", now.elapsed().as_secs_f64() * 1000.);

// Generate the proof and result:
Expand Down
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
Year,Price
year,price
1990,96
1991,100
1992,269
Expand Down
3 changes: 1 addition & 2 deletions crates/proof-of-sql/examples/avocado-prices/main.rs
Original file line number Diff line number Diff line change
Expand Up @@ -42,8 +42,7 @@ fn prove_and_verify_query(
// Parse the query:
println!("Parsing the query: {sql}...");
let now = Instant::now();
let query_plan =
QueryExpr::try_new(sql.parse().unwrap(), "avocado".parse().unwrap(), accessor).unwrap();
let query_plan = QueryExpr::try_new(sql.parse().unwrap(), "avocado".into(), accessor).unwrap();
println!("Done in {} ms.", now.elapsed().as_secs_f64() * 1000.);

// Generate the proof and result:
Expand Down
3 changes: 1 addition & 2 deletions crates/proof-of-sql/examples/books/main.rs
Original file line number Diff line number Diff line change
Expand Up @@ -38,8 +38,7 @@ fn prove_and_verify_query(
// Parse the query:
println!("Parsing the query: {sql}...");
let now = Instant::now();
let query_plan =
QueryExpr::try_new(sql.parse().unwrap(), "books".parse().unwrap(), accessor).unwrap();
let query_plan = QueryExpr::try_new(sql.parse().unwrap(), "books".into(), accessor).unwrap();
println!("Done in {} ms.", now.elapsed().as_secs_f64() * 1000.);

// Generate the proof and result:
Expand Down
2 changes: 1 addition & 1 deletion crates/proof-of-sql/examples/brands/brands.csv
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
Name,Country,Founded,Revenue
name,country,founded,revenue
Apple,United States,1976,365.82
Samsung,South Korea,1938,200.73
Microsoft,United States,1975,198.27
Expand Down
3 changes: 1 addition & 2 deletions crates/proof-of-sql/examples/brands/main.rs
Original file line number Diff line number Diff line change
Expand Up @@ -38,8 +38,7 @@ fn prove_and_verify_query(
// Parse the query:
println!("Parsing the query: {sql}...");
let now = Instant::now();
let query_plan =
QueryExpr::try_new(sql.parse().unwrap(), "brands".parse().unwrap(), accessor).unwrap();
let query_plan = QueryExpr::try_new(sql.parse().unwrap(), "brands".into(), accessor).unwrap();
println!("Done in {} ms.", now.elapsed().as_secs_f64() * 1000.);

// Generate the proof and result:
Expand Down
2 changes: 1 addition & 1 deletion crates/proof-of-sql/examples/census/census-income.csv
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
Id,Geography,Id2,Households_Estimate_Total
id,geography,id2,households_estimate_total
0400000US01,Alabama,1,1837292
0400000US02,Alaska,2,250875
0400000US04,Arizona,4,2381501
Expand Down
3 changes: 1 addition & 2 deletions crates/proof-of-sql/examples/census/main.rs
Original file line number Diff line number Diff line change
Expand Up @@ -45,8 +45,7 @@ fn prove_and_verify_query(
// Parse the query:
println!("Parsing the query: {sql}...");
let now = Instant::now();
let query_plan =
QueryExpr::try_new(sql.parse().unwrap(), "census".parse().unwrap(), accessor).unwrap();
let query_plan = QueryExpr::try_new(sql.parse().unwrap(), "census".into(), accessor).unwrap();
println!("Done in {} ms.", now.elapsed().as_secs_f64() * 1000.);

// Generate the proof and result:
Expand Down
2 changes: 1 addition & 1 deletion crates/proof-of-sql/examples/countries/countries_gdp.csv
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
Country,Continent,GDP,GDPP
country,continent,gdp,gdpp
UnitedStates,NorthAmerica,21137,63543
China,Asia,14342,10261
Japan,Asia,5081,40293
Expand Down
2 changes: 1 addition & 1 deletion crates/proof-of-sql/examples/countries/main.rs
Original file line number Diff line number Diff line change
Expand Up @@ -39,7 +39,7 @@ fn prove_and_verify_query(
println!("Parsing the query: {sql}...");
let now = Instant::now();
let query_plan =
QueryExpr::try_new(sql.parse().unwrap(), "countries".parse().unwrap(), accessor).unwrap();
QueryExpr::try_new(sql.parse().unwrap(), "countries".into(), accessor).unwrap();
println!("Done in {} ms.", now.elapsed().as_secs_f64() * 1000.);

// Generate the proof and result:
Expand Down
2 changes: 1 addition & 1 deletion crates/proof-of-sql/examples/dinosaurs/main.rs
Original file line number Diff line number Diff line change
Expand Up @@ -39,7 +39,7 @@ fn prove_and_verify_query(
println!("Parsing the query: {sql}...");
let now = Instant::now();
let query_plan =
QueryExpr::try_new(sql.parse().unwrap(), "dinosaurs".parse().unwrap(), accessor).unwrap();
QueryExpr::try_new(sql.parse().unwrap(), "dinosaurs".into(), accessor).unwrap();
println!("Done in {} ms.", now.elapsed().as_secs_f64() * 1000.);

// Generate the proof and result:
Expand Down
2 changes: 1 addition & 1 deletion crates/proof-of-sql/examples/dog_breeds/dog_breeds.csv
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
Name,Origin,Size,Lifespan
name,origin,size,lifespan
Labrador Retriever,Canada,Large,12
German Shepherd,Germany,Large,11
Chihuahua,Mexico,Small,14
Expand Down
8 changes: 2 additions & 6 deletions crates/proof-of-sql/examples/dog_breeds/main.rs
Original file line number Diff line number Diff line change
Expand Up @@ -35,12 +35,8 @@ fn prove_and_verify_query(
// Parse the query:
println!("Parsing the query: {sql}...");
let now = Instant::now();
let query_plan = QueryExpr::try_new(
sql.parse().unwrap(),
"dog_breeds".parse().unwrap(),
accessor,
)
.unwrap();
let query_plan =
QueryExpr::try_new(sql.parse().unwrap(), "dog_breeds".into(), accessor).unwrap();
println!("Done in {} ms.", now.elapsed().as_secs_f64() * 1000.);

// Generate the proof and result:
Expand Down
2 changes: 1 addition & 1 deletion crates/proof-of-sql/examples/hello_world/main.rs
Original file line number Diff line number Diff line change
Expand Up @@ -64,7 +64,7 @@ fn main() {
let timer = start_timer("Parsing Query");
let query = QueryExpr::try_new(
"SELECT b FROM table WHERE a = 2".parse().unwrap(),
"sxt".parse().unwrap(),
"sxt".into(),
&accessor,
)
.unwrap();
Expand Down
3 changes: 1 addition & 2 deletions crates/proof-of-sql/examples/plastics/main.rs
Original file line number Diff line number Diff line change
Expand Up @@ -38,8 +38,7 @@ fn prove_and_verify_query(
// Parse the query:
println!("Parsing the query: {sql}...");
let now = Instant::now();
let query_plan =
QueryExpr::try_new(sql.parse().unwrap(), "plastics".parse().unwrap(), accessor).unwrap();
let query_plan = QueryExpr::try_new(sql.parse().unwrap(), "plastics".into(), accessor).unwrap();
println!("Done in {} ms.", now.elapsed().as_secs_f64() * 1000.);

// Generate the proof and result:
Expand Down
2 changes: 1 addition & 1 deletion crates/proof-of-sql/examples/plastics/plastics.csv
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
Name,Code,Density,Biodegradable
name,code,density,biodegradable
Polyethylene Terephthalate (PET),1,1.38,FALSE
High-Density Polyethylene (HDPE),2,0.97,FALSE
Polyvinyl Chloride (PVC),3,1.40,FALSE
Expand Down
4 changes: 2 additions & 2 deletions crates/proof-of-sql/examples/posql_db/commit_accessor.rs
Original file line number Diff line number Diff line change
Expand Up @@ -55,7 +55,7 @@ impl<C: Commitment> SchemaAccessor for CommitAccessor<C> {
fn lookup_column(
&self,
table_ref: proof_of_sql::base::database::TableRef,
column_id: proof_of_sql_parser::Identifier,
column_id: sqlparser::ast::Ident,
) -> Option<proof_of_sql::base::database::ColumnType> {
self.inner.lookup_column(table_ref, column_id)
}
Expand All @@ -64,7 +64,7 @@ impl<C: Commitment> SchemaAccessor for CommitAccessor<C> {
&self,
table_ref: proof_of_sql::base::database::TableRef,
) -> Vec<(
proof_of_sql_parser::Identifier,
sqlparser::ast::Ident,
proof_of_sql::base::database::ColumnType,
)> {
self.inner.lookup_schema(table_ref)
Expand Down
4 changes: 2 additions & 2 deletions crates/proof-of-sql/examples/posql_db/csv_accessor.rs
Original file line number Diff line number Diff line change
Expand Up @@ -96,15 +96,15 @@ impl SchemaAccessor for CsvDataAccessor {
fn lookup_column(
&self,
table_ref: TableRef,
column_id: proof_of_sql_parser::Identifier,
column_id: sqlparser::ast::Ident,
) -> Option<proof_of_sql::base::database::ColumnType> {
self.inner.lookup_column(table_ref, column_id)
}
fn lookup_schema(
&self,
table_ref: TableRef,
) -> Vec<(
proof_of_sql_parser::Identifier,
sqlparser::ast::Ident,
proof_of_sql::base::database::ColumnType,
)> {
self.inner.lookup_schema(table_ref)
Expand Down
10 changes: 4 additions & 6 deletions crates/proof-of-sql/examples/posql_db/main.rs
Original file line number Diff line number Diff line change
Expand Up @@ -203,7 +203,7 @@ fn main() {
commit_accessor
.lookup_schema(table_name)
.iter()
.map(|(i, t)| Field::new(i.as_str(), t.into(), false))
.map(|(i, t)| Field::new(i.value.as_str(), t.into(), false))
.collect::<Vec<_>>(),
);
let append_batch =
Expand Down Expand Up @@ -233,15 +233,14 @@ fn main() {
commit_accessor
.lookup_schema(table)
.iter()
.map(|(i, t)| Field::new(i.as_str(), t.into(), false))
.map(|(i, t)| Field::new(i.value.as_str(), t.into(), false))
.collect::<Vec<_>>(),
);
csv_accessor
.load_table(table, schema)
.expect("Failed to load table");
}
let query =
QueryExpr::try_new(query, "example".parse().unwrap(), &commit_accessor).unwrap();
let query = QueryExpr::try_new(query, "example".into(), &commit_accessor).unwrap();
let timer = start_timer("Generating Proof");
let proof = VerifiableQueryResult::<DynamicDoryEvaluationProof>::new(
query.proof_expr(),
Expand All @@ -265,8 +264,7 @@ fn main() {
.load_commit(table_name)
.expect("Failed to load commit");
}
let query =
QueryExpr::try_new(query, "example".parse().unwrap(), &commit_accessor).unwrap();
let query = QueryExpr::try_new(query, "example".into(), &commit_accessor).unwrap();
let result: VerifiableQueryResult<DynamicDoryEvaluationProof> =
postcard::from_bytes(&fs::read(file).expect("Failed to read proof"))
.expect("Failed to deserialize proof");
Expand Down
13 changes: 6 additions & 7 deletions crates/proof-of-sql/examples/posql_db/record_batch_accessor.rs
Original file line number Diff line number Diff line change
Expand Up @@ -8,8 +8,7 @@ use proof_of_sql::base::{
},
scalar::Scalar,
};
use proof_of_sql_parser::Identifier;

use sqlparser::ast::Ident;
#[derive(Default)]
/// An implementation of a data accessor that uses a record batch as the underlying data source.
///
Expand All @@ -31,7 +30,7 @@ impl<S: Scalar> DataAccessor<S> for RecordBatchAccessor {
.get(&column.table_ref())
.expect("Table not found.");
let arrow_column = table
.column_by_name(column.column_id().as_str())
.column_by_name(column.column_id().value.as_str())
.expect("Column not found.");
let result = arrow_column
.to_column(&self.alloc, &(0..table.num_rows()), None)
Expand All @@ -58,12 +57,12 @@ impl MetadataAccessor for RecordBatchAccessor {
}
}
impl SchemaAccessor for RecordBatchAccessor {
fn lookup_column(&self, table_ref: TableRef, column_id: Identifier) -> Option<ColumnType> {
fn lookup_column(&self, table_ref: TableRef, column_id: Ident) -> Option<ColumnType> {
self.tables
.get(&table_ref)
.expect("Table not found.")
.schema()
.column_with_name(column_id.as_str())
.column_with_name(column_id.value.as_str())
.map(|(_, f)| {
f.data_type()
.clone()
Expand All @@ -72,7 +71,7 @@ impl SchemaAccessor for RecordBatchAccessor {
})
}

fn lookup_schema(&self, table_ref: TableRef) -> Vec<(Identifier, ColumnType)> {
fn lookup_schema(&self, table_ref: TableRef) -> Vec<(Ident, ColumnType)> {
self.tables
.get(&table_ref)
.expect("Table not found.")
Expand All @@ -81,7 +80,7 @@ impl SchemaAccessor for RecordBatchAccessor {
.iter()
.map(|field| {
(
field.name().parse().expect("Failed to parse field name."),
Ident::new(field.name()),
field
.data_type()
.clone()
Expand Down
8 changes: 2 additions & 6 deletions crates/proof-of-sql/examples/programming_books/main.rs
Original file line number Diff line number Diff line change
Expand Up @@ -35,12 +35,8 @@ fn prove_and_verify_query(
// Parse the query:
println!("Parsing the query: {sql}...");
let now = Instant::now();
let query_plan = QueryExpr::try_new(
sql.parse().unwrap(),
"programming_books".parse().unwrap(),
accessor,
)
.unwrap();
let query_plan =
QueryExpr::try_new(sql.parse().unwrap(), "programming_books".into(), accessor).unwrap();
println!("Done in {} ms.", now.elapsed().as_secs_f64() * 1000.);

// Generate the proof and result:
Expand Down
Loading

0 comments on commit be3853a

Please sign in to comment.