feat!: initial support for timestamp (#12)
# Rationale for this change

Provides initial typing support for a PostgreSQL-like
```TimeStamp``` type. The new ```TimestampTZ``` type is typed over
custom ```TimeUnit``` and ```TimeZone``` types.

TimeStamp is typed this way because Arrow's timestamp type
(```DataType::Timestamp```) has a required TimeUnit field and an optional
TimeZone field. Defining our own TimeStamp with these fields gives us
greater control over the Arrow type if we want it. We could also just
default to seconds and UTC if needed, which would simplify this design a bit.

## Typing Rationale

Arrow's ```DataType::Timestamp``` is typed over a ```TimeUnit``` and
an optional timezone, ```Option<Arc<str>>```. It therefore makes sense for
our application to map this metadata onto our own types:

example:
```rust
// arrow datatype mapping to our new timestamp type
DataType::Timestamp(time_unit, timezone_option) => Ok(ColumnType::TimestampTZ(
    PoSQLTimeUnit::from(time_unit),
    PoSQLTimeZone::try_from(timezone_option)?,
)),
```
If this becomes burdensome, we could just as easily remove the timezone
type, default to UTC, and handle any timezone conversion in
DML and DDL. We will align with PostgreSQL and store all times as UTC by
default.
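
To make the fallible half of the mapping above concrete, here is a minimal sketch of a `TryFrom<Option<Arc<str>>>` conversion that defaults to UTC when Arrow carries no timezone. It is not the code from this PR: `TimeZoneSketch` and `UnsupportedTimeZone` are hypothetical stand-ins for `PoSQLTimeZone` and its error type, and the set of accepted strings (`"UTC"` in any case, or `"+00:00"`) is an assumption made for illustration.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for `PoSQLTimeZone`; only a `UTC` variant is modeled here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TimeZoneSketch {
    UTC,
}

/// Hypothetical error carrying the timezone string that could not be mapped.
#[derive(Debug, PartialEq, Eq)]
pub struct UnsupportedTimeZone(pub String);

impl TryFrom<Option<Arc<str>>> for TimeZoneSketch {
    type Error = UnsupportedTimeZone;

    fn try_from(value: Option<Arc<str>>) -> Result<Self, Self::Error> {
        match value.as_deref() {
            // No timezone on the Arrow side: fall back to UTC.
            None => Ok(TimeZoneSketch::UTC),
            // Assumed spellings of UTC; a real implementation may accept more.
            Some(name) if name.eq_ignore_ascii_case("utc") || name == "+00:00" => {
                Ok(TimeZoneSketch::UTC)
            }
            // Anything else is rejected rather than silently reinterpreted.
            Some(other) => Err(UnsupportedTimeZone(other.to_string())),
        }
    }
}
```

Returning an error for unknown zones, instead of coercing them to UTC, is what lets the caller use `?` in the match arm shown earlier.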

Finally, the ```PoSQLTimeUnit``` type gives us the flexibility to store
times in seconds, milliseconds, microseconds, or nanoseconds when
high precision is needed. This type maps directly to Arrow's ```TimeUnit```,
which we alias as ```ArrowTimeUnit``` in this PR.
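
The one-to-one unit mapping described above can be sketched as a pair of infallible `From` impls. This is an illustration under stated assumptions, not the PR's implementation: `ArrowTimeUnitSketch` is a local stand-in for `arrow::datatypes::TimeUnit` so the example compiles without the arrow crate, `TimeUnitSketch` stands in for `PoSQLTimeUnit`, and `ticks_per_second` is a hypothetical helper added only to show what each precision means.

```rust
/// Local stand-in for `arrow::datatypes::TimeUnit` (same four variants).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ArrowTimeUnitSketch {
    Second,
    Millisecond,
    Microsecond,
    Nanosecond,
}

/// Hypothetical stand-in for `PoSQLTimeUnit`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TimeUnitSketch {
    Second,
    Millisecond,
    Microsecond,
    Nanosecond,
}

impl From<ArrowTimeUnitSketch> for TimeUnitSketch {
    fn from(unit: ArrowTimeUnitSketch) -> Self {
        match unit {
            ArrowTimeUnitSketch::Second => TimeUnitSketch::Second,
            ArrowTimeUnitSketch::Millisecond => TimeUnitSketch::Millisecond,
            ArrowTimeUnitSketch::Microsecond => TimeUnitSketch::Microsecond,
            ArrowTimeUnitSketch::Nanosecond => TimeUnitSketch::Nanosecond,
        }
    }
}

impl From<TimeUnitSketch> for ArrowTimeUnitSketch {
    fn from(unit: TimeUnitSketch) -> Self {
        match unit {
            TimeUnitSketch::Second => ArrowTimeUnitSketch::Second,
            TimeUnitSketch::Millisecond => ArrowTimeUnitSketch::Millisecond,
            TimeUnitSketch::Microsecond => ArrowTimeUnitSketch::Microsecond,
            TimeUnitSketch::Nanosecond => ArrowTimeUnitSketch::Nanosecond,
        }
    }
}

impl TimeUnitSketch {
    /// Hypothetical helper: how many ticks of this unit make up one second.
    pub fn ticks_per_second(self) -> i64 {
        match self {
            TimeUnitSketch::Second => 1,
            TimeUnitSketch::Millisecond => 1_000,
            TimeUnitSketch::Microsecond => 1_000_000,
            TimeUnitSketch::Nanosecond => 1_000_000_000,
        }
    }
}
```

Because every Arrow unit has exactly one counterpart, the conversion can be `From` rather than `TryFrom`, which is why only the timezone side of the mapping needs `?`.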

# What changes are included in this PR?

## Typing updates:

- [x] Column
- [x] OwnedColumn
- [x] CommittableColumn
- [x] ColumnBounds
- [x] Typed TimeZone
- [x] Typed TimeUnit
- [x] ```impl ArrayRefExt for ArrayRef -> to_curve_25519_scalar &
to_column```
- [x] LiteralValue
- [x] owned_and_arrow_conversions
- [x]  ```impl<S: Scalar> FromIterator<i64> for OwnedColumn<S>```
- [x] ```impl<CP: CommitmentEvaluationProof> DataAccessor<CP::Scalar>
for OwnedTableTestAccessor<CP>```
- [x] owned_table_utility
- [x] test accessor_utility
- [x] multi-linear-extension
- [x] Scalar trait bounds
- [x] compute_dory_commitment
- [x] filter_column_by_index
- [x] prover_evaluate
- [x] sum_aggregate_column_by_index_counts
- [x] compare_indexes_by_columns
- [x] impl ProvableQueryResult
- [x] to_owned_table
- [x] trait ProvableResultColumn 
- [x] make_empty_query_result
- [x] record_batch_dataframe_conversion
- [x] impl ToArrow for RecordBatch 

# Are these changes tested?

## Tests:

- [x] TimeUnit Conversions
- [x] ColumnBounds
- [x] ColumnCommitmentMetadata
- [x] arrow_array_to_column_conversion
- [x] column
- [x] owned_table 

# Split:

- lalrpop grammar update and token parsing
- timestamp.now()
- timestamp.current_time()
Dustin-Ray authored Jun 19, 2024
1 parent 22bec93 commit 623df7d
Showing 29 changed files with 1,219 additions and 60 deletions.
1 change: 1 addition & 0 deletions Cargo.toml
Expand Up @@ -29,6 +29,7 @@ bytemuck = {version = "1.14.2" }
byte-slice-cast = { version = "1.2.1" }
clap = { version = "4.5.4" }
criterion = { version = "0.5.1" }
chrono-tz = {version = "0.9.0", features = ["serde"]}
curve25519-dalek = { version = "4", features = ["rand_core"] }
derive_more = { version = "0.99" }
dyn_partial_eq = { version = "0.1.2" }
1 change: 1 addition & 0 deletions crates/proof-of-sql/Cargo.toml
Expand Up @@ -30,6 +30,7 @@ bumpalo = { workspace = true, features = ["collections"] }
bytemuck = { workspace = true }
byte-slice-cast = { workspace = true }
curve25519-dalek = { workspace = true, features = ["serde"] }
chrono-tz = {workspace = true, features = ["serde"]}
derive_more = { workspace = true }
dyn_partial_eq = { workspace = true }
hashbrown = { workspace = true }
54 changes: 51 additions & 3 deletions crates/proof-of-sql/src/base/commitment/column_bounds.rs
Expand Up @@ -207,6 +207,8 @@ pub enum ColumnBounds {
BigInt(Bounds<i64>),
/// The bounds of an Int128 column.
Int128(Bounds<i128>),
/// The bounds of a Timestamp column.
TimestampTZ(Bounds<i64>),
}

impl ColumnBounds {
Expand All @@ -219,6 +221,9 @@ impl ColumnBounds {
CommittableColumn::Int(ints) => ColumnBounds::Int(Bounds::from_iter(*ints)),
CommittableColumn::BigInt(ints) => ColumnBounds::BigInt(Bounds::from_iter(*ints)),
CommittableColumn::Int128(ints) => ColumnBounds::Int128(Bounds::from_iter(*ints)),
CommittableColumn::TimestampTZ(_, _, times) => {
ColumnBounds::TimestampTZ(Bounds::from_iter(*times))
}
CommittableColumn::Boolean(_)
| CommittableColumn::Decimal75(_, _, _)
| CommittableColumn::Scalar(_)
Expand All @@ -241,6 +246,9 @@ impl ColumnBounds {
(ColumnBounds::BigInt(bounds_a), ColumnBounds::BigInt(bounds_b)) => {
Ok(ColumnBounds::BigInt(bounds_a.union(bounds_b)))
}
(ColumnBounds::TimestampTZ(bounds_a), ColumnBounds::TimestampTZ(bounds_b)) => {
Ok(ColumnBounds::TimestampTZ(bounds_a.union(bounds_b)))
}
(ColumnBounds::Int128(bounds_a), ColumnBounds::Int128(bounds_b)) => {
Ok(ColumnBounds::Int128(bounds_a.union(bounds_b)))
}
Expand Down Expand Up @@ -269,7 +277,9 @@ impl ColumnBounds {
(ColumnBounds::Int128(bounds_a), ColumnBounds::Int128(bounds_b)) => {
Ok(ColumnBounds::Int128(bounds_a.difference(bounds_b)))
}

(ColumnBounds::TimestampTZ(bounds_a), ColumnBounds::TimestampTZ(bounds_b)) => {
Ok(ColumnBounds::TimestampTZ(bounds_a.difference(bounds_b)))
}
(_, _) => Err(ColumnBoundsMismatch(Box::new(self), Box::new(other))),
}
}
Expand All @@ -278,7 +288,12 @@ impl ColumnBounds {
#[cfg(test)]
mod tests {
use super::*;
use crate::base::{database::OwnedColumn, math::decimal::Precision, scalar::Curve25519Scalar};
use crate::base::{
database::OwnedColumn,
math::decimal::Precision,
scalar::Curve25519Scalar,
time::timestamp::{PoSQLTimeUnit, PoSQLTimeZone},
};
use itertools::Itertools;

#[test]
Expand Down Expand Up @@ -518,8 +533,19 @@ mod tests {
);
let committable_decimal75_column = CommittableColumn::from(&decimal75_column);
let decimal75_column_bounds = ColumnBounds::from_column(&committable_decimal75_column);

assert_eq!(decimal75_column_bounds, ColumnBounds::NoOrder);

let timestamp_column = OwnedColumn::<Curve25519Scalar>::TimestampTZ(
PoSQLTimeUnit::Second,
PoSQLTimeZone::UTC,
vec![1_i64, 2, 3, 4],
);
let committable_timestamp_column = CommittableColumn::from(&timestamp_column);
let timestamp_column_bounds = ColumnBounds::from_column(&committable_timestamp_column);
assert_eq!(
timestamp_column_bounds,
ColumnBounds::TimestampTZ(Bounds::Sharp(BoundsInner { min: 1, max: 4 }))
);
}

#[test]
Expand Down Expand Up @@ -561,6 +587,14 @@ mod tests {
int128_a.try_union(int128_b).unwrap(),
ColumnBounds::Int128(Bounds::Bounded(BoundsInner { min: 1, max: 6 }))
);

let timestamp_a = ColumnBounds::TimestampTZ(Bounds::Sharp(BoundsInner { min: 1, max: 3 }));
let timestamp_b =
ColumnBounds::TimestampTZ(Bounds::Bounded(BoundsInner { min: 4, max: 6 }));
assert_eq!(
timestamp_a.try_union(timestamp_b).unwrap(),
ColumnBounds::TimestampTZ(Bounds::Bounded(BoundsInner { min: 1, max: 6 }))
);
}

#[test]
Expand All @@ -570,13 +604,15 @@ mod tests {
let int = ColumnBounds::Int(Bounds::Sharp(BoundsInner { min: -10, max: 10 }));
let bigint = ColumnBounds::BigInt(Bounds::Sharp(BoundsInner { min: 1, max: 3 }));
let int128 = ColumnBounds::Int128(Bounds::Sharp(BoundsInner { min: 4, max: 6 }));
let timestamp = ColumnBounds::TimestampTZ(Bounds::Sharp(BoundsInner { min: 4, max: 6 }));

let bounds = [
(no_order, "NoOrder"),
(smallint, "SmallInt"),
(int, "Int"),
(bigint, "BigInt"),
(int128, "Int128"),
(timestamp, "Timestamp"),
];

for ((bound_a, name_a), (bound_b, name_b)) in bounds.iter().tuple_combinations() {
Expand Down Expand Up @@ -618,13 +654,22 @@ mod tests {
int128_a.try_difference(int128_b).unwrap(),
ColumnBounds::Int128(Bounds::Bounded(BoundsInner { min: 1, max: 4 }))
);

let timestamp_a = ColumnBounds::TimestampTZ(Bounds::Sharp(BoundsInner { min: 1, max: 4 }));
let timestamp_b = ColumnBounds::TimestampTZ(Bounds::Sharp(BoundsInner { min: 3, max: 6 }));
assert_eq!(
timestamp_a.try_difference(timestamp_b).unwrap(),
ColumnBounds::TimestampTZ(Bounds::Bounded(BoundsInner { min: 1, max: 4 }))
);
}

#[test]
fn we_cannot_difference_mismatched_column_bounds() {
let no_order = ColumnBounds::NoOrder;
let bigint = ColumnBounds::BigInt(Bounds::Sharp(BoundsInner { min: 1, max: 3 }));
let int128 = ColumnBounds::Int128(Bounds::Sharp(BoundsInner { min: 4, max: 6 }));
let timestamp = ColumnBounds::TimestampTZ(Bounds::Sharp(BoundsInner { min: 4, max: 6 }));
let smallint = ColumnBounds::SmallInt(Bounds::Sharp(BoundsInner { min: 1, max: 3 }));

assert!(no_order.try_difference(bigint).is_err());
assert!(bigint.try_difference(no_order).is_err());
Expand All @@ -634,5 +679,8 @@ mod tests {

assert!(bigint.try_difference(int128).is_err());
assert!(int128.try_difference(bigint).is_err());

assert!(smallint.try_difference(timestamp).is_err());
assert!(timestamp.try_difference(smallint).is_err());
}
}
154 changes: 153 additions & 1 deletion crates/proof-of-sql/src/base/commitment/column_commitment_metadata.rs
Expand Up @@ -40,6 +40,7 @@ impl ColumnCommitmentMetadata {
| (ColumnType::Int, ColumnBounds::Int(_))
| (ColumnType::BigInt, ColumnBounds::BigInt(_))
| (ColumnType::Int128, ColumnBounds::Int128(_))
| (ColumnType::TimestampTZ(_, _), ColumnBounds::TimestampTZ(_))
| (
ColumnType::Boolean
| ColumnType::VarChar
Expand Down Expand Up @@ -72,6 +73,10 @@ impl ColumnCommitmentMetadata {
BoundsInner::try_new(i64::MIN, i64::MAX)
.expect("i64::MIN and i64::MAX are valid bounds for BigInt"),
)),
ColumnType::TimestampTZ(_, _) => ColumnBounds::TimestampTZ(super::Bounds::Bounded(
BoundsInner::try_new(i64::MIN, i64::MAX)
.expect("i64::MIN and i64::MAX are valid bounds for TimeStamp"),
)),
ColumnType::Int128 => ColumnBounds::Int128(super::Bounds::Bounded(
BoundsInner::try_new(i128::MIN, i128::MAX)
.expect("i128::MIN and i128::MAX are valid bounds for Int128"),
Expand Down Expand Up @@ -160,8 +165,11 @@ impl ColumnCommitmentMetadata {
mod tests {
use super::*;
use crate::base::{
commitment::column_bounds::Bounds, database::OwnedColumn, math::decimal::Precision,
commitment::column_bounds::Bounds,
database::OwnedColumn,
math::decimal::Precision,
scalar::Curve25519Scalar,
time::timestamp::{PoSQLTimeUnit, PoSQLTimeZone},
};

#[test]
Expand Down Expand Up @@ -219,6 +227,18 @@ mod tests {
}
);

assert_eq!(
ColumnCommitmentMetadata::try_new(
ColumnType::TimestampTZ(PoSQLTimeUnit::Second, PoSQLTimeZone::UTC),
ColumnBounds::TimestampTZ(Bounds::Empty),
)
.unwrap(),
ColumnCommitmentMetadata {
column_type: ColumnType::TimestampTZ(PoSQLTimeUnit::Second, PoSQLTimeZone::UTC),
bounds: ColumnBounds::TimestampTZ(Bounds::Empty),
}
);

assert_eq!(
ColumnCommitmentMetadata::try_new(
ColumnType::Int128,
Expand Down Expand Up @@ -349,6 +369,26 @@ mod tests {
);
assert_eq!(decimal_metadata.bounds(), &ColumnBounds::NoOrder);

let timestamp_column: OwnedColumn<Curve25519Scalar> =
OwnedColumn::<Curve25519Scalar>::TimestampTZ(
PoSQLTimeUnit::Second,
PoSQLTimeZone::UTC,
[1i64, 2, 3, 4, 5].to_vec(),
);
let committable_timestamp_column = CommittableColumn::from(&timestamp_column);
let timestamp_metadata =
ColumnCommitmentMetadata::from_column(&committable_timestamp_column);
assert_eq!(
timestamp_metadata.column_type(),
&ColumnType::TimestampTZ(PoSQLTimeUnit::Second, PoSQLTimeZone::UTC)
);
if let ColumnBounds::TimestampTZ(Bounds::Sharp(bounds)) = timestamp_metadata.bounds() {
assert_eq!(bounds.min(), &1);
assert_eq!(bounds.max(), &5);
} else {
panic!("Bounds constructed from nonempty TimestampTZ column should be ColumnBounds::TimestampTZ(Bounds::Sharp(_))");
}

let varchar_column = OwnedColumn::<Curve25519Scalar>::VarChar(
["Lorem", "ipsum", "dolor", "sit", "amet"]
.map(String::from)
Expand Down Expand Up @@ -484,6 +524,80 @@ mod tests {
bigint_metadata_a.try_union(bigint_metadata_b).unwrap(),
bigint_metadata_c
);

// Ordered case for TimestampTZ
// Example Unix epoch times
let times = [
1_625_072_400,
1_625_076_000,
1_625_079_600,
1_625_072_400,
1_625_065_000,
];
let timezone = PoSQLTimeZone::UTC;
let timeunit = PoSQLTimeUnit::Second;
let timestamp_column_a = CommittableColumn::TimestampTZ(timeunit, timezone, &times[..2]);
let timestamp_metadata_a = ColumnCommitmentMetadata::from_column(&timestamp_column_a);
let timestamp_column_b = CommittableColumn::TimestampTZ(timeunit, timezone, &times[2..]);
let timestamp_metadata_b = ColumnCommitmentMetadata::from_column(&timestamp_column_b);
let timestamp_column_c = CommittableColumn::TimestampTZ(timeunit, timezone, &times);
let timestamp_metadata_c = ColumnCommitmentMetadata::from_column(&timestamp_column_c);
assert_eq!(
timestamp_metadata_a
.try_union(timestamp_metadata_b)
.unwrap(),
timestamp_metadata_c
);
}

#[test]
fn we_can_difference_timestamp_tz_matching_metadata() {
// Ordered case
let times = [
1_625_072_400,
1_625_076_000,
1_625_079_600,
1_625_072_400,
1_625_065_000,
];
let timezone = PoSQLTimeZone::UTC;
let timeunit = PoSQLTimeUnit::Second;

let timestamp_column_a = CommittableColumn::TimestampTZ(timeunit, timezone, &times[..2]);
let timestamp_metadata_a = ColumnCommitmentMetadata::from_column(&timestamp_column_a);
let timestamp_column_b = CommittableColumn::TimestampTZ(timeunit, timezone, &times);
let timestamp_metadata_b = ColumnCommitmentMetadata::from_column(&timestamp_column_b);

let b_difference_a = timestamp_metadata_b
.try_difference(timestamp_metadata_a)
.unwrap();
assert_eq!(
b_difference_a.column_type,
ColumnType::TimestampTZ(timeunit, timezone)
);
if let ColumnBounds::TimestampTZ(Bounds::Bounded(bounds)) = b_difference_a.bounds {
assert_eq!(bounds.min(), &1_625_065_000);
assert_eq!(bounds.max(), &1_625_079_600);
} else {
panic!("difference of overlapping bounds should be Bounded");
}

let timestamp_column_empty = CommittableColumn::TimestampTZ(timeunit, timezone, &[]);
let timestamp_metadata_empty =
ColumnCommitmentMetadata::from_column(&timestamp_column_empty);

assert_eq!(
timestamp_metadata_b
.try_difference(timestamp_metadata_empty)
.unwrap(),
timestamp_metadata_b
);
assert_eq!(
timestamp_metadata_empty
.try_difference(timestamp_metadata_b)
.unwrap(),
timestamp_metadata_empty
);
}

#[test]
Expand Down Expand Up @@ -741,5 +855,43 @@ mod tests {
assert!(different_decimal75_metadata
.try_union(decimal75_metadata)
.is_err());

let timestamp_tz_metadata_a = ColumnCommitmentMetadata {
column_type: ColumnType::TimestampTZ(PoSQLTimeUnit::Second, PoSQLTimeZone::UTC),
bounds: ColumnBounds::TimestampTZ(Bounds::Empty),
};

let timestamp_tz_metadata_b = ColumnCommitmentMetadata {
column_type: ColumnType::TimestampTZ(PoSQLTimeUnit::Millisecond, PoSQLTimeZone::UTC),
bounds: ColumnBounds::TimestampTZ(Bounds::Empty),
};

// Tests for union operations
assert!(timestamp_tz_metadata_a.try_union(varchar_metadata).is_err());
assert!(varchar_metadata.try_union(timestamp_tz_metadata_a).is_err());

// Tests for difference operations
assert!(timestamp_tz_metadata_a
.try_difference(scalar_metadata)
.is_err());
assert!(scalar_metadata
.try_difference(timestamp_tz_metadata_a)
.is_err());

// Tests for different time units within the same type
assert!(timestamp_tz_metadata_a
.try_union(timestamp_tz_metadata_b)
.is_err());
assert!(timestamp_tz_metadata_b
.try_union(timestamp_tz_metadata_a)
.is_err());

// Difference with different time units
assert!(timestamp_tz_metadata_a
.try_difference(timestamp_tz_metadata_b)
.is_err());
assert!(timestamp_tz_metadata_b
.try_difference(timestamp_tz_metadata_a)
.is_err());
}
}