Use subcondition of list in get_matching_values #102

Merged
merged 105 commits into from
Sep 4, 2024
3f80498
remove support for under() conditions
lmcmicu Jul 24, 2024
2841c75
Use subcondition of list in get_matching_values
jamesaoverton Jul 26, 2024
4a5ede7
small reorg
lmcmicu Jul 26, 2024
abf2a10
move get_table_ddl() function to Valve API
lmcmicu Jul 27, 2024
888c6ad
only create foreign keys in the db for from() constraints on single-v…
lmcmicu Jul 27, 2024
e9417f9
validate foreign keys on individual items from columns with list() da…
lmcmicu Jul 27, 2024
751885d
shorten variable names
lmcmicu Jul 28, 2024
6025da0
always validate constraints
lmcmicu Jul 28, 2024
70bb737
Revert "always validate constraints"
lmcmicu Jul 30, 2024
896cc3e
optimizations
lmcmicu Jul 30, 2024
15e22f4
add get_value_type() to API and toolkit
lmcmicu Jul 30, 2024
9de4aa9
do not rely on database errors tables that have list columns with fro…
lmcmicu Jul 30, 2024
ece7076
do not crash on list types whose reference type has no condition, and…
lmcmicu Jul 31, 2024
8ca60b2
remove foreign key cache
lmcmicu Aug 1, 2024
0b71b08
first pass trial implementation of new caching method for validate_ro…
lmcmicu Aug 1, 2024
30a1dda
consider conflict tables when determining forbidden and allowed value…
lmcmicu Aug 2, 2024
4eb7708
accept a list of 'received values' in when validating foreign and uni…
lmcmicu Aug 2, 2024
159a5b6
check for sql type errors when collecting values for cache
lmcmicu Aug 6, 2024
6ec116b
clean up fkey_in_db()
lmcmicu Aug 6, 2024
7df1ec0
take list values into account when collecting received_values for cac…
lmcmicu Aug 6, 2024
84f4de1
fetch allowed and forbidden values less often
lmcmicu Aug 7, 2024
4be55d5
add tree children to unique cache
lmcmicu Aug 7, 2024
e495616
use vertical caching in unique validation
lmcmicu Aug 7, 2024
a001df7
tweak
lmcmicu Aug 7, 2024
cbf21e4
fix unique caching check
lmcmicu Aug 7, 2024
7303d5d
increase perf_test rows and add it to workflow
lmcmicu Aug 7, 2024
ebc7048
tweaks
lmcmicu Aug 7, 2024
7a957c1
add numeric list test
lmcmicu Aug 8, 2024
85a5486
add missing ontology files
lmcmicu Aug 8, 2024
c7c20d5
change performance and penguin test thresholds
lmcmicu Aug 8, 2024
766a5eb
small tweaks
lmcmicu Aug 9, 2024
8b1badd
remove redundant tree:child-unique error message
lmcmicu Aug 9, 2024
cb62617
remove redundant tree:child-unique error message
lmcmicu Aug 9, 2024
e32fde7
consider also tree-foreign keys in get_rows_to_update()
lmcmicu Aug 9, 2024
53614e3
update test threshold times
lmcmicu Aug 12, 2024
7cb37ac
Merge branch 'remove-under' into use-list-subcondition
lmcmicu Aug 12, 2024
3faafc6
fix typo
lmcmicu Aug 12, 2024
94b2ad0
fix handling of pipes in match() conditions
lmcmicu Aug 12, 2024
1bb7487
update tests
lmcmicu Aug 12, 2024
89c0b83
add proper link to documentation in readme
lmcmicu Aug 12, 2024
662461d
Revert "add proper link to documentation in readme"
lmcmicu Aug 12, 2024
c324722
first documentation steps
lmcmicu Aug 13, 2024
9af7fc2
test
lmcmicu Aug 13, 2024
b19bd58
test table alignment
lmcmicu Aug 13, 2024
120fe73
test table alignment
lmcmicu Aug 13, 2024
bd16268
test table alignment
lmcmicu Aug 13, 2024
8476e28
test table alignment
lmcmicu Aug 13, 2024
3a53030
test table alignment
lmcmicu Aug 13, 2024
594d2f9
add documention on table table
lmcmicu Aug 13, 2024
87f6102
test heading
lmcmicu Aug 14, 2024
631f148
test headings
lmcmicu Aug 14, 2024
f12f2b2
finish draft of table table configuration section of readme
lmcmicu Aug 14, 2024
8ae4f7b
replace backticks in readme
lmcmicu Aug 14, 2024
1e32a0c
start readme doc for table, column, datatype, and rule tables
lmcmicu Aug 14, 2024
c41eff5
mess with table formatting
lmcmicu Aug 14, 2024
acd530d
mess with table formatting
lmcmicu Aug 14, 2024
7b5c787
undo messing with table formatting
lmcmicu Aug 14, 2024
5fbcaac
add doc for column table
lmcmicu Aug 15, 2024
57306de
tweak
lmcmicu Aug 15, 2024
3f8b522
add draft of datatype table doc
lmcmicu Aug 15, 2024
b757f5c
update readme, force name of table table to be 'table'
lmcmicu Aug 18, 2024
6e4845c
formatting tweak
lmcmicu Aug 18, 2024
44ca784
formatting tweak
lmcmicu Aug 18, 2024
b962103
formatting tweak
lmcmicu Aug 18, 2024
8b1085b
tweak
lmcmicu Aug 18, 2024
892bb6a
add required datatypes to readme
lmcmicu Aug 18, 2024
7d48742
add trimmed_line and nonspace to list of required datatypes
lmcmicu Aug 18, 2024
7cbd744
add doc on guess
lmcmicu Aug 19, 2024
b2cc105
add troubleshooting section to readme
lmcmicu Aug 19, 2024
0efe9d6
add link to rule table
lmcmicu Aug 19, 2024
c077bc7
typo fixes and tweaks to readme
lmcmicu Aug 19, 2024
d2f45d5
fix typo
lmcmicu Aug 19, 2024
ca65215
add row_order to text_views, fix broken penguin test
lmcmicu Aug 29, 2024
b7710b2
add notes on path
lmcmicu Aug 29, 2024
8555e71
tweak
lmcmicu Aug 29, 2024
cea8d2c
tweak
lmcmicu Aug 29, 2024
4817043
tweak
lmcmicu Aug 29, 2024
8f30352
tweak
lmcmicu Aug 29, 2024
5e6266a
tweak
lmcmicu Aug 29, 2024
a216451
return an error if asked to save to a default non-tsv path; add commo…
lmcmicu Aug 30, 2024
87db022
allow 'save as' for non-saveable tables
lmcmicu Aug 30, 2024
eb780fb
start on validation section
lmcmicu Aug 30, 2024
dd72d9d
try mermaid flow chart
lmcmicu Aug 30, 2024
a4d0b8c
test mermaid diagram
lmcmicu Aug 30, 2024
749742e
add validation flowchart
lmcmicu Aug 30, 2024
2771fa1
tweak validation flowchart
lmcmicu Aug 30, 2024
3b12cae
tweak validation flowchart
lmcmicu Aug 30, 2024
36487ea
tweak validation flowchart
lmcmicu Aug 30, 2024
c94ada2
add headers for subsections to be done
lmcmicu Aug 30, 2024
5ea7f91
add more validation stuff to readme
lmcmicu Aug 30, 2024
878b759
add section on batch validation
lmcmicu Aug 31, 2024
a42e491
fix typo
lmcmicu Aug 31, 2024
282c4a1
add stuff to readme on editing data
lmcmicu Aug 31, 2024
fa28ee2
complete first draft of design and concepts documentation
lmcmicu Aug 31, 2024
b1b67a6
add table of contents to readme
lmcmicu Sep 2, 2024
c82b253
use better example in readme
lmcmicu Sep 2, 2024
cc1f04a
tweak
lmcmicu Sep 2, 2024
589e45c
tweak
lmcmicu Sep 2, 2024
d25716c
table formatting
lmcmicu Sep 2, 2024
25416f3
table formatting
lmcmicu Sep 2, 2024
c040aa0
table formatting
lmcmicu Sep 2, 2024
68ca403
typo fixes
lmcmicu Sep 2, 2024
4678042
typo fixes
lmcmicu Sep 2, 2024
21ea38f
small readme updates
lmcmicu Sep 3, 2024
a5c3e17
fix typos and tweaks
lmcmicu Sep 4, 2024
2 changes: 1 addition & 1 deletion .github/workflows/valve-tests.yml
@@ -43,4 +43,4 @@ jobs:
 pip3 install -r requirements.txt
 - name: Run tests on both sqlite and on postgresql
 run: |
-make test penguin_test
+make test penguin_test perf_test
29 changes: 22 additions & 7 deletions Makefile
@@ -99,7 +99,7 @@ sqlite_api_test: valve test/src/table.tsv build/valve.db test/insert_update.sh |
 $(word 4,$^) $(word 3,$^) $(word 2,$^)
 scripts/export_messages.py $(word 3,$^) $| $(tables_to_test)
 diff --strip-trailing-cr -q test/expected/messages_after_api_test.tsv test/output/messages.tsv
-echo "select \"history_id\", \"table\", \"row\", \"from\", \"to\", \"summary\", \"user\", \"undone_by\" from history where history_id < 15 order by history_id" | sqlite3 -header -tabs build/valve.db > test/output/history.tsv
+echo "select \"history_id\", \"table\", \"row\", \"from\", \"to\", \"summary\", \"user\", \"undone_by\" from history where history_id < 16 order by history_id" | sqlite3 -header -tabs build/valve.db > test/output/history.tsv
 diff --strip-trailing-cr -q test/expected/history.tsv test/output/history.tsv
 # We drop all of the db tables because the schema for the next test (random test) is different
 # from the schema used for this test.
@@ -114,7 +114,7 @@ pg_api_test: valve test/src/table.tsv test/insert_update.sh | test/output
 $(word 3,$^) $(pg_connect_string) $(word 2,$^)
 scripts/export_messages.py $(pg_connect_string) $| $(tables_to_test)
 diff --strip-trailing-cr -q test/expected/messages_after_api_test.tsv test/output/messages.tsv
-psql $(pg_connect_string) -c "COPY (select \"history_id\", \"table\", \"row\", \"from\", \"to\", \"summary\", \"user\", \"undone_by\" from history where history_id < 15 order by history_id) TO STDOUT WITH NULL AS ''" > test/output/history.tsv
+psql $(pg_connect_string) -c "COPY (select \"history_id\", \"table\", \"row\", \"from\", \"to\", \"summary\", \"user\", \"undone_by\" from history where history_id < 16 order by history_id) TO STDOUT WITH NULL AS ''" > test/output/history.tsv
 tail -n +2 test/expected/history.tsv | diff --strip-trailing-cr -q test/output/history.tsv -
 # We drop all of the db tables because the schema for the next test (random test) is different
 # from the schema used for this test.
@@ -151,7 +151,12 @@ pg_random_test: valve random_test_data | build test/output
 test/penguins/src/data:
 mkdir -p $@

-penguin_test_threshold = 60
+# At last check, the penguin performance test was running on GitHub's runner
+# (Ubuntu 22.04.4 LTS, runner version 2.317.0) in just under 30s. GitHub
+# sometimes changes the runner version, however, thus if we set the threshold
+# too low we might get a failure. The threshold below is about 10s more than the time
+# it takes on my laptop (while plugged).
+penguin_test_threshold = 50
 num_penguin_rows = 100000
 penguin_command_sqlite = ./valve --assume-yes load src/schema/table.tsv --initial-load penguins.db
 penguin_command_pg = ./valve --assume-yes load src/schema/table.tsv $(pg_connect_string)
@@ -200,9 +205,16 @@ $(guess_test_db): valve guess_test_data $(guess_test_dir)/*.tsv | build $(guess_
 rm -f $@
 ./$< --assume-yes load $(guess_test_dir)/table.tsv $@

+# At last check, the performance test was running on GitHub's runner
+# (Ubuntu 22.04.4 LTS, runner version 2.317.0) in just over 20s. GitHub
+# sometimes changes the runner version, however, thus if we set the threshold
+# too low we might get a failure. The threshold below is about 10s more than the time
+# it takes using postgresql on my laptop (while plugged), and about 15s more than it takes
+# using sqlite.
+perf_test_threshold = 45
 perf_test_dir = test/perf_test_data
 perf_test_db = build/valve_perf.db
-num_perf_test_rows = 1000
+num_perf_test_rows = 10000
 perf_test_error_rate = 5

 $(perf_test_dir)/ontology:
@@ -212,19 +224,22 @@ $(perf_test_dir)/ontology:
 perf_test_data: test/generate_random_test_data.py valve confirm_overwrite.sh $(perf_test_dir)/*.tsv | $(perf_test_dir)/ontology
 ./confirm_overwrite.sh $(perf_test_dir)/ontology
 rm -f $(perf_test_dir)/ontology/*.tsv
-./$< $$(date +"%s") $(num_perf_test_rows) $(perf_test_error_rate) $(perf_test_dir)/table.tsv $|
+./$< 0 $(num_perf_test_rows) $(perf_test_error_rate) $(perf_test_dir)/table.tsv $|

 $(perf_test_db): valve perf_test_data $(perf_test_dir)/*.tsv | build $(perf_test_dir)/ontology
 rm -f $@
-time -p ./$< --verbose load $(perf_test_dir)/table.tsv --initial-load $@
+timeout $(perf_test_threshold) time -p ./$< --assume-yes --verbose load $(perf_test_dir)/table.tsv --initial-load $@ || \
+(echo "Performance test (SQLite) took longer than $(perf_test_threshold) seconds." && false)
+

 .PHONY: sqlite_perf_test
 sqlite_perf_test: $(perf_test_db) | test/output
 time -p scripts/export_messages.py $< $| $(tables_to_test)

 .PHONY: pg_perf_test
 pg_perf_test: valve $(perf_test_dir)/ontology | test/output
-time -p ./$< --verbose load $(perf_test_dir)/table.tsv $(pg_connect_string)
+timeout $(perf_test_threshold) time -p ./$< --assume-yes --verbose load $(perf_test_dir)/table.tsv $(pg_connect_string) || \
+(echo "Performance test (PostgreSQL) took longer than $(perf_test_threshold) seconds." && false)
 time -p scripts/export_messages.py $(pg_connect_string) $| $(tables_to_test)

 .PHONY: perf_test
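
The perf_test recipes in this Makefile wrap the load command in `timeout`, so a run that exceeds `perf_test_threshold` fails the build with a message. The same threshold idea can be sketched in Rust. This is an illustration only: `run_with_threshold` is a hypothetical helper, not part of Valve, and unlike `timeout` it checks the elapsed time after the task finishes instead of interrupting it.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical helper: runs `task`, then reports an error if it took longer
/// than `threshold`. It does not interrupt the task the way `timeout` does.
fn run_with_threshold<F: FnOnce()>(
    label: &str,
    threshold: Duration,
    task: F,
) -> Result<Duration, String> {
    let start = Instant::now();
    task();
    let elapsed = start.elapsed();
    if elapsed > threshold {
        Err(format!(
            "{} took longer than {} seconds.",
            label,
            threshold.as_secs_f64()
        ))
    } else {
        Ok(elapsed)
    }
}

fn main() {
    // A task that finishes well inside a 45 second threshold.
    let fast = run_with_threshold("Fast task", Duration::from_secs(45), || {});
    assert!(fast.is_ok());

    // A task that sleeps for longer than its (deliberately tiny) threshold.
    let slow = run_with_threshold("Slow task", Duration::from_millis(5), || {
        thread::sleep(Duration::from_millis(50))
    });
    match slow {
        Ok(elapsed) => println!("finished in {:?}", elapsed),
        Err(message) => println!("{}", message),
    }
}
```

Checking after the fact keeps the sketch dependency-free; killing a long-running process, as the Makefile does, needs a child process or an external tool.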
798 changes: 784 additions & 14 deletions README.md

Large diffs are not rendered by default.

31 changes: 2 additions & 29 deletions src/lib.rs
@@ -1,32 +1,8 @@
//! <!-- Please do not edit README.md directly. To generate a new readme from the crate documentation
//! in src/lib.rs, install cargo-readme using `cargo install cargo-readme` and then run:
//! `cargo readme > README.md` -->
//!
//! # valve.rs
//! # ontodev/valve.rs
//! A lightweight validation engine written in rust.
//!
//! ## API
//! See [valve]
//!
//! ## Command line usage
//! Run:
//! ```
//! valve --help
//! ```
//! to see command line options.
//!
//! ## Logging
//! By default Valve only logs error messages. To also enable warning and information messages,
//! set the environment variable `RUST_LOG` to the minimum logging level desired for ontodev_valve:
//! `debug`, `info`, `warn`, or `error`.
//! For instance:
//! ```
//! export RUST_LOG="ontodev_valve=info"
//! ```
//! For further information see the [Rust Cookbook](https://rust-lang-nursery.github.io/rust-cookbook/development_tools/debugging/config_log.html).
//!
//! ## Python bindings
//! See [valve.py](https://github.com/ontodev/valve.py)

#[macro_use]
extern crate lalrpop_util;
@@ -62,10 +38,7 @@ pub static MOVE_INTERVAL: u32 = 1000;
 pub static PRINTF_RE: &str = r#"^%.*([\w%])$"#;

 /// The size of the datatype validation cache.
-static DT_CACHE_SIZE: usize = 10000;
-
-/// The size of the foreign key validation cache.
-static FKEY_CACHE_SIZE: usize = 10000;
+pub static DT_CACHE_SIZE: usize = 10000;

 // Note that SQL_PARAM must be a 'word' (from the point of view of regular expressions) since in the
 // local_sql_syntax() function below we are matching against it using '\b' which represents a word
2 changes: 1 addition & 1 deletion src/main.rs
@@ -22,7 +22,7 @@ static SAVE_DIR_HELP: &str = "Save tables to DIR instead of to their configured
     long_about = None)]
 struct Cli {
     /// Use this option with caution. When set, Valve will not ask the user for confirmation
-    /// before executing potentially destructive operations.
+    /// before executing potentially destructive operations on the database and/or table files.
     #[arg(long, action = ArgAction::SetTrue)]
     assume_yes: bool,

56 changes: 32 additions & 24 deletions src/tests.rs
@@ -130,6 +130,25 @@ async fn test_insert_2(valve: &Valve) -> Result<()> {
     Ok(())
 }

+async fn test_insert_3(valve: &Valve) -> Result<()> {
+    eprint!("Running test_insert_3() ... ");
+
+    let row = json!({
+        "id": "BFO:0000099",
+        "label": "jafar",
+        "parent": "mar",
+        "source": "COB",
+        "type": "owl:Class",
+    });
+    let (_new_row_num, _new_row) = valve.insert_row("table3", row.as_object().unwrap()).await?;
+
+    // The result of this insertion is that the tree:foreign error message will be resolved
+    // for table3 row 5 column parent: "Value 'jafar' of column parent is not in column label"
+
+    eprintln!("done.");
+    Ok(())
+}
+
 async fn test_dependencies(valve: &Valve) -> Result<()> {
     eprint!("Running test_dependencies() ... ");

@@ -648,55 +667,43 @@ async fn test_modes(valve: &Valve) -> Result<()> {

     let result = valve.insert_row("readonly1", &readonly_row).await;
     match result {
-        Err(e) => assert_eq!(
-            format!("{:?}", e),
-            r#"InputError("Inserting to table 'readonly1' is not allowed")"#,
-        ),
+        Err(e) => assert!(format!("{:?}", e)
+            .starts_with(r#"InputError("Inserting to table 'readonly1' is not allowed")"#)),
         _ => assert!(false, "Expected an error result but got an OK result"),
     };

     let result = valve.insert_row("view1", &view_row).await;
     match result {
-        Err(e) => assert_eq!(
-            format!("{:?}", e),
-            r#"InputError("Inserting to table 'view1' is not allowed")"#,
-        ),
+        Err(e) => assert!(format!("{:?}", e)
+            .starts_with(r#"InputError("Inserting to table 'view1' is not allowed")"#)),
         _ => assert!(false, "Expected an error result but got an OK result"),
     };

     let result = valve.update_row("readonly1", &1, &readonly_row).await;
     match result {
-        Err(e) => assert_eq!(
-            format!("{:?}", e),
-            r#"InputError("Updating table 'readonly1' is not allowed")"#,
-        ),
+        Err(e) => assert!(format!("{:?}", e)
+            .starts_with(r#"InputError("Updating table 'readonly1' is not allowed")"#)),
         _ => assert!(false, "Expected an error result but got an OK result"),
     };

     let result = valve.update_row("view1", &1, &view_row).await;
     match result {
-        Err(e) => assert_eq!(
-            format!("{:?}", e),
-            r#"InputError("Updating table 'view1' is not allowed")"#,
-        ),
+        Err(e) => assert!(format!("{:?}", e)
+            .starts_with(r#"InputError("Updating table 'view1' is not allowed")"#)),
         _ => assert!(false, "Expected an error result but got an OK result"),
     };

     let result = valve.delete_row("readonly1", &1).await;
     match result {
-        Err(e) => assert_eq!(
-            format!("{:?}", e),
-            r#"InputError("Deleting from table 'readonly1' is not allowed")"#,
-        ),
+        Err(e) => assert!(format!("{:?}", e)
+            .starts_with(r#"InputError("Deleting from table 'readonly1' is not allowed")"#)),
         _ => assert!(false, "Expected an error result but got an OK result"),
     };

     let result = valve.delete_row("view1", &1).await;
     match result {
-        Err(e) => assert_eq!(
-            format!("{:?}", e),
-            r#"InputError("Deleting from table 'view1' is not allowed")"#,
-        ),
+        Err(e) => assert!(format!("{:?}", e)
+            .starts_with(r#"InputError("Deleting from table 'view1' is not allowed")"#)),
         _ => assert!(false, "Expected an error result but got an OK result"),
     };

@@ -977,6 +984,7 @@ pub async fn run_api_tests(valve: &Valve) -> Result<()> {
     test_insert_1(&valve).await?;
     test_update_2(&valve).await?;
     test_insert_2(&valve).await?;
+    test_insert_3(&valve).await?;
     test_dependencies(&valve).await?;
     test_undo_redo(&valve).await?;
     test_randomized_api_test_with_undo_redo(&valve).await?;
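
The assertions in test_modes() now check that the Debug rendering of each error starts with the expected text, where previously they required an exact match. A minimal sketch of the difference follows. `DemoError` is a hypothetical stand-in, since Valve's own error type is not shown in this diff, and the trailing "(extra context)" is assumed purely for illustration; the diff does not say what, if anything, follows the expected text in real output.

```rust
// Hypothetical stand-in for an error type; Valve's own error enum is not
// shown in this diff.
#[derive(Debug)]
enum DemoError {
    InputError(String),
}

/// Renders an error the way the tests do: through its Debug representation.
fn render(err: &DemoError) -> String {
    format!("{:?}", err)
}

fn main() {
    let expected = r#"InputError("Inserting to table 'readonly1' is not allowed")"#;

    let exact = render(&DemoError::InputError(
        "Inserting to table 'readonly1' is not allowed".to_string(),
    ));
    // Assumed for illustration: some extra text follows the expected rendering.
    let with_suffix = format!("{} (extra context)", exact);

    // An exact comparison accepts only the first rendering ...
    assert_eq!(exact, expected);
    assert_ne!(with_suffix, expected);

    // ... while a prefix check accepts both.
    assert!(exact.starts_with(expected));
    assert!(with_suffix.starts_with(expected));

    println!("prefix checks passed");
}
```

A prefix check is the looser of the two: it still pins down the error variant and message, but it tolerates anything appended after them.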