Skip to content

Commit

Permalink
[db-queries] Decouples CTE usage from Diesel (RegionAllocation) (#5063)
Browse files Browse the repository at this point in the history
Diesel's complex type have been a pain point in the region allocation
CTE in particular:

- It relies on several JOINs, which requires explicitly adding Diesel
macros to make the type system happy
- With ongoing work by @jmpesp , it would require additional `alias!`
calls to allow a table to appear in a `SELECT` clause in multiple spots
- Generally, the disconnect between "what SQL do I want to write" and
"what invocations will make Diesel happy" has been "not really worth it"
in this space.

This PR does the following:
- It relies heavily on
https://docs.diesel.rs/master/diesel/query_builder/struct.SqlQuery.html
, which is Diesel's "you do whatever you want" query type. Although this
is still using Diesel, this usage is actually pretty aligned with other
simpler DB interfaces - other crates (see:
[tokio_postgres](https://docs.rs/tokio-postgres/latest/tokio_postgres/struct.Client.html#method.query))
take arguments like "a String + list of bind parameters", in some form.
- It adds support in `raw_query_builder.rs` for a wrapper around
Diesel's `SqlQuery` object, to make SQL injections less possible and to
track bind parameter counting.
- It fully converts the `RegionAllocation` CTE to use this builder,
interleaved with raw SQL.
- Since a large portion of the CTE was rote "repeated columns", I also
added a function, accessible as
`AllColumnsOf<Table>::with_prefix(&'static str)`, to enumerate all the
columns of a table as strings.
- I also added a simple `EXPLAIN` test to the CTE, to quickly validate
that CockroachDB thinks it's producing valid output.

Here are my thoughts for future improvements:
- [ ] I spent a while trying to make the "query construction" a
compile-time operation. I think that this is possible with nightly
features. I think this is extremely difficult with stable rust. However,
I think this is a great direction for future work, as statically-known
queries would be easier to cache and validate.
- [ ] I'd like to encapsulate more type information about the
constructed query, as an "input/output" object. Right now, we're relying
on existing integration tests for validation, but it seems possible to
send these "example" queries (like the ones I'm using in my `EXPLAIN`
tests) to ask CockroachDB to validate type information for us.
- [ ] I want to make this format as digestible as possible. If there's
anything I can do to make this easier to read, write, and validate, I'm
totally on-board. I have been debating adding macro support for SQL
formatting the raw strings, but I'm on the fence about whether or not
that would make interleaved code harder to parse by humans.
- As a follow-up: I'm auto-formatting the output of these queries in the
EXPECTORATE-d output files
  • Loading branch information
smklein authored Mar 27, 2024
1 parent cf185c5 commit 7b29090
Show file tree
Hide file tree
Showing 16 changed files with 1,198 additions and 876 deletions.
21 changes: 21 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -188,6 +188,7 @@ cfg-if = "1.0"
chrono = { version = "0.4", features = [ "serde" ] }
clap = { version = "4.5", features = ["cargo", "derive", "env", "wrap_help"] }
colored = "2.1"
const_format = "0.2.32"
cookie = "0.18"
criterion = { version = "0.5.1", features = [ "async_tokio" ] }
crossbeam = "0.8"
Expand Down
1 change: 0 additions & 1 deletion nexus/db-model/src/queries/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -4,5 +4,4 @@

//! Subqueries used in CTEs.

pub mod region_allocation;
pub mod virtual_provisioning_collection_update;
195 changes: 0 additions & 195 deletions nexus/db-model/src/queries/region_allocation.rs

This file was deleted.

1 change: 1 addition & 0 deletions nexus/db-queries/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@ base64.workspace = true
bb8.workspace = true
camino.workspace = true
chrono.workspace = true
const_format.workspace = true
cookie.workspace = true
diesel.workspace = true
diesel-dtrace.workspace = true
Expand Down
62 changes: 0 additions & 62 deletions nexus/db-queries/src/db/cast_uuid_as_bytea.rs

This file was deleted.

27 changes: 26 additions & 1 deletion nexus/db-queries/src/db/column_walker.rs
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@

//! CTE utility for iterating over all columns in a table.

use crate::db::raw_query_builder::TrustedStr;
use diesel::prelude::*;
use std::marker::PhantomData;

Expand All @@ -17,14 +18,30 @@ pub(crate) struct ColumnWalker<T> {
remaining: PhantomData<T>,
}

pub type AllColumnsOf<T> = ColumnWalker<<T as diesel::Table>::AllColumns>;

impl<T> ColumnWalker<T> {
pub fn new() -> Self {
pub const fn new() -> Self {
Self { remaining: PhantomData }
}
}

macro_rules! impl_column_walker {
( $len:literal $($column:ident)+ ) => (
#[allow(dead_code)]
impl<$($column: Column),+> ColumnWalker<($($column,)+)> {
pub fn with_prefix(prefix: &'static str) -> TrustedStr {
// This string is derived from:
// - The "table" type, with associated columns, which
// are not controlled by an arbitrary user, and
// - The "prefix" type, which is a "&'static str" (AKA,
// hopefully known at compile-time, and not leaked).
TrustedStr::i_take_responsibility_for_validating_this_string(
[$([prefix, $column::NAME].join("."),)+].join(", ")
)
}
}

impl<$($column: Column),+> IntoIterator for ColumnWalker<($($column,)+)> {
type Item = &'static str;
type IntoIter = std::array::IntoIter<Self::Item, $len>;
Expand Down Expand Up @@ -109,4 +126,12 @@ mod test {
assert_eq!(iter.next(), Some("value"));
assert_eq!(iter.next(), None);
}

#[test]
fn test_all_columns_with_prefix() {
assert_eq!(
AllColumnsOf::<test_table::table>::with_prefix("foo").as_str(),
"foo.id, foo.value, foo.time_deleted"
);
}
}
11 changes: 11 additions & 0 deletions nexus/db-queries/src/db/datastore/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -949,10 +949,21 @@ mod test {
assert_eq!(expected_region_count, dataset_and_regions.len());
let mut disk_datasets = HashSet::new();
let mut disk_zpools = HashSet::new();
let mut regions = HashSet::new();

for (dataset, region) in dataset_and_regions {
// Must be 3 unique datasets
assert!(disk_datasets.insert(dataset.id()));
// All regions should be unique
assert!(regions.insert(region.id()));

// Check there's no cross contamination between returned UUIDs
//
// This is a little goofy, but it catches a bug that has
// happened before. The returned columns share names (like
// "id"), so we need to process them in-order.
assert!(regions.get(&dataset.id()).is_none());
assert!(disk_datasets.get(&region.id()).is_none());

// Dataset must not be eligible for provisioning.
if let Some(kind) =
Expand Down
Loading

0 comments on commit 7b29090

Please sign in to comment.