Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inclusion test with shortest words #201

Draft
wants to merge 1 commit into
base: devel
Choose a base branch
from
Draft

Conversation

jurajsic
Copy link
Member

@jurajsic jurajsic commented Dec 5, 2024

Trying to replace inclusion test in stabilization algorithm with inclusion test based on shortest words (which should be complete for SAT non-chan-free word equations + regular constraints).

@jurajsic
Copy link
Member Author

jurajsic commented Dec 5, 2024

The test right now is not implemented very nicely, probably could be done more effeciently.

@jurajsic
Copy link
Member Author

jurajsic commented Dec 5, 2024

# of formulae: 103405
--------------------------------------------------
tool                            ✅    ❌       time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ------  ----  ---------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-2e6712d-56780ef  102581   824    9425.04   0.09   0.01   1.37  63078    39503        548   198        78        0
z3-noodler-eafdc5b-56780ef  102582   823    9281.37   0.09   0.01   1.31  63079    39503        548   199        76        0
cvc5-1.2.0                   99841  3564   73861.19   0.74   0.01   6.42  60931    38910          2  3553         9        0
z3-4.13.3                    97952  5453  110832.31   1.13   0.01   7.15  59124    38828        211  4762       480        0
--------------------------------------------------


Benchmark sygus_qgen
# of formulae: 343
--------------------------------------------------
tool                          ✅    ❌    time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  ------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-2e6712d-56780ef   343     0    3.54   0.01   0.01   0.00    343        0          0     0         0        0
z3-noodler-eafdc5b-56780ef   343     0    3.74   0.01   0.01   0.00    343        0          0     0         0        0
cvc5-1.2.0                   343     0   80.99   0.24   0.04   0.93    343        0          0     0         0        0
z3-4.13.3                    343     0    6.02   0.02   0.01   0.02    343        0          0     0         0        0
--------------------------------------------------

Benchmark denghang
# of formulae: 999
--------------------------------------------------
tool                          ✅    ❌    time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  ------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-2e6712d-56780ef   999     0   41.29   0.04   0.01   0.65    186      813          0     0         0        0
z3-noodler-eafdc5b-56780ef   999     0   41.69   0.04   0.01   0.66    186      813          0     0         0        0
cvc5-1.2.0                   982    17  371.55   0.38   0.00   4.64    170      812          0    17         0        0
z3-4.13.3                    883   116  284.84   0.32   0.06   2.27    186      697          0   103        13        0
--------------------------------------------------

Benchmark automatark
# of formulae: 15995
--------------------------------------------------
tool                           ✅    ❌     time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  -----  ----  -------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-2e6712d-56780ef  15990     5   615.64   0.04   0.01   0.44  10477     5513          0     5         0        0
z3-noodler-eafdc5b-56780ef  15990     5   608.40   0.04   0.01   0.43  10477     5513          0     5         0        0
cvc5-1.2.0                  15903    92  2196.84   0.14   0.00   2.77  10460     5443          0    84         8        0
z3-4.13.3                   15876   119  4833.58   0.30   0.01   3.28  10471     5405          0   118         1        0
--------------------------------------------------

Benchmark stringfuzz
# of formulae: 11618
--------------------------------------------------
tool                           ✅    ❌      time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  -----  ----  --------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-2e6712d-56780ef  11617     1    281.20   0.02   0.01   0.17   6355     5262          1     0         0        0
z3-noodler-eafdc5b-56780ef  11617     1    273.81   0.02   0.01   0.17   6355     5262          1     0         0        0
cvc5-1.2.0                  10946   672  30049.07   2.75   0.00  12.34   6184     4762          0   671         1        0
z3-4.13.3                   11150   468  39860.85   3.57   0.01  13.27   6064     5086          0   343       125        0
--------------------------------------------------

Benchmark redos
# of formulae: 3287
--------------------------------------------------
tool                          ✅    ❌      time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  --------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-2e6712d-56780ef  3285     2   1983.12   0.60   0.02   3.16   2445      840          0     2         0        0
z3-noodler-eafdc5b-56780ef  3285     2   1989.25   0.61   0.02   3.17   2445      840          0     2         0        0
cvc5-1.2.0                  1116  2171  27007.04  24.20   0.14  34.28    620      496          0  2171         0        0
z3-4.13.3                    890  2397    477.21   0.54   0.00   5.81     50      840          0  2094       303        0
--------------------------------------------------

Benchmark norn
# of formulae: 1027
--------------------------------------------------
tool                          ✅    ❌    time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  ------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-2e6712d-56780ef  1027     0   11.49   0.01   0.01   0.01    712      315          0     0         0        0
z3-noodler-eafdc5b-56780ef  1027     0   11.36   0.01   0.01   0.01    712      315          0     0         0        0
cvc5-1.2.0                   943    84  247.71   0.26   0.01   2.41    705      238          0    84         0        0
z3-4.13.3                    904   123  218.66   0.24   0.01   4.24    712      192          0   123         0        0
--------------------------------------------------

Benchmark slog
# of formulae: 1976
--------------------------------------------------
tool                          ✅    ❌    time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  ------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-2e6712d-56780ef  1976     0   21.48   0.01   0.01   0.02    808     1168          0     0         0        0
z3-noodler-eafdc5b-56780ef  1976     0   22.02   0.01   0.01   0.02    808     1168          0     0         0        0
cvc5-1.2.0                  1976     0    1.85   0.00   0.00   0.00    808     1168          0     0         0        0
z3-4.13.3                   1954    22  426.94   0.22   0.00   3.09    786     1168          0    22         0        0
--------------------------------------------------

Benchmark slent
# of formulae: 1128
--------------------------------------------------
tool                          ✅    ❌     time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  -------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-2e6712d-56780ef  1123     5   159.45   0.14   0.01   2.14    626      497          0     5         0        0
z3-noodler-eafdc5b-56780ef  1123     5   158.83   0.14   0.01   2.13    626      497          0     5         0        0
cvc5-1.2.0                  1104    24  1002.26   0.91   0.00   6.12    618      486          0    24         0        0
z3-4.13.3                   1057    71   402.18   0.38   0.00   4.12    573      484          0    70         1        0
--------------------------------------------------

Benchmark omark
# of formulae: 17
--------------------------------------------------
tool                          ✅    ❌    time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  ------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-2e6712d-56780ef     9     8    0.13   0.01   0.01   0.01      2        7          0     6         2        0
z3-noodler-eafdc5b-56780ef     9     8    0.10   0.01   0.01   0.01      2        7          0     6         2        0
cvc5-1.2.0                     5    12    1.03   0.21   0.00   0.46      2        3          0    12         0        0
z3-4.13.3                      7    10    6.19   0.88   0.00   2.33      2        5          0     8         2        0
--------------------------------------------------

Benchmark kepler
# of formulae: 587
--------------------------------------------------
tool                          ✅    ❌    time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  ------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-2e6712d-56780ef   584     3   13.76   0.02   0.01   0.06    284      300          0     3         0        0
z3-noodler-eafdc5b-56780ef   584     3   13.61   0.02   0.01   0.06    284      300          0     3         0        0
cvc5-1.2.0                   347   240    0.86   0.00   0.00   0.00     50      297          0   240         0        0
z3-4.13.3                    278   309   21.10   0.08   0.01   0.84    135      143          0   275        34        0
--------------------------------------------------

Benchmark woorpje
# of formulae: 809
--------------------------------------------------
tool                          ✅    ❌    time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  ------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-2e6712d-56780ef   773    36  266.97   0.35   0.01   3.99    611      162          0    15        21        0
z3-noodler-eafdc5b-56780ef   775    34  277.61   0.36   0.01   4.23    613      162          0    15        19        0
cvc5-1.2.0                   755    54   47.07   0.06   0.01   0.25    591      164          0    54         0        0
z3-4.13.3                    782    27  223.57   0.29   0.01   3.58    618      164          0    27         0        0
--------------------------------------------------

Benchmark webapp
# of formulae: 681
--------------------------------------------------
tool                          ✅    ❌    time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  ------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-2e6712d-56780ef   365   316  139.94   0.38   0.02   3.19    137      228        315     0         1        0
z3-noodler-eafdc5b-56780ef   365   316  133.71   0.37   0.01   3.21    137      228        315     0         1        0
cvc5-1.2.0                   588    93  767.89   1.31   0.02   8.73    165      423          0    93         0        0
z3-4.13.3                    458   223  336.97   0.74   0.01   6.87     43      415        121   101         1        0
--------------------------------------------------

Benchmark kaluza
# of formulae: 19432
--------------------------------------------------
tool                           ✅    ❌    time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  -----  ----  ------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-2e6712d-56780ef  19386    46  619.27   0.03   0.01   0.96  16277     3109         20    17         9        0
z3-noodler-eafdc5b-56780ef  19386    46  619.89   0.03   0.01   0.99  16277     3109         20    17         9        0
cvc5-1.2.0                  19426     6  459.94   0.02   0.00   0.42  16297     3129          0     6         0        0
z3-4.13.3                   19139   293  935.22   0.05   0.00   1.24  16022     3117          0   293         0        0
--------------------------------------------------

Benchmark transducer_plus
# of formulae: 91
--------------------------------------------------
tool                          ✅    ❌    time     avg     med     std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  ------  ------  ------  ------  -----  -------  ---------  ----  --------  -------
z3-noodler-2e6712d-56780ef     0    91    0.00  nan     nan     nan         0        0         91     0         0        0
z3-noodler-eafdc5b-56780ef     0    91    0.00  nan     nan     nan         0        0         91     0         0        0
cvc5-1.2.0                    42    49  482.14   11.48    3.11   22.06     41        1          0    49         0        0
z3-4.13.3                      1    90    0.01    0.01    0.01  nan         0        1         90     0         0        0
--------------------------------------------------

Benchmark leetcode
# of formulae: 2652
--------------------------------------------------
tool                          ✅    ❌    time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  ------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-2e6712d-56780ef  2651     1   40.33   0.02   0.01   0.02    867     1784          0     0         1        0
z3-noodler-eafdc5b-56780ef  2651     1   40.25   0.02   0.01   0.02    867     1784          0     0         1        0
cvc5-1.2.0                  2652     0   36.00   0.01   0.00   0.03    867     1785          0     0         0        0
z3-4.13.3                   2652     0   14.06   0.01   0.00   0.07    867     1785          0     0         0        0
--------------------------------------------------

Benchmark str_small_rw
# of formulae: 1880
--------------------------------------------------
tool                          ✅    ❌    time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  ------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-2e6712d-56780ef  1816    64  253.11   0.14   0.01   1.82      4     1812          4    32        28        0
z3-noodler-eafdc5b-56780ef  1816    64  249.31   0.14   0.01   1.74      4     1812          4    32        28        0
cvc5-1.2.0                  1860    20   34.43   0.02   0.00   0.62     19     1841          2    18         0        0
z3-4.13.3                   1821    59  121.11   0.07   0.00   1.68     25     1796          0    59         0        0
--------------------------------------------------

Benchmark pyex
# of formulae: 23845
--------------------------------------------------
tool                           ✅    ❌      time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  -----  ----  --------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-2e6712d-56780ef  23748    97   3901.38   0.16   0.08   1.77  20376     3372          0    81        16        0
z3-noodler-eafdc5b-56780ef  23747    98   3757.47   0.16   0.08   1.52  20375     3372          0    82        16        0
cvc5-1.2.0                  23820    25   6002.12   0.25   0.07   2.51  20400     3420          0    25         0        0
z3-4.13.3                   23018   827  47047.85   2.04   0.08   9.41  19745     3273          0   827         0        0
--------------------------------------------------

Benchmark full_str_int
# of formulae: 16968
--------------------------------------------------
tool                           ✅    ❌      time    avg    med    std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  -----  ----  --------  -----  -----  -----  -----  -------  ---------  ----  --------  -------
z3-noodler-2e6712d-56780ef  16819   149   1072.38   0.06   0.01   1.47   2498    14321        117    32         0        0
z3-noodler-eafdc5b-56780ef  16819   149   1079.62   0.06   0.01   1.48   2498    14321        117    32         0        0
cvc5-1.2.0                  16963     5   5072.37   0.30   0.01   1.39   2521    14442          0     5         0        0
z3-4.13.3                   16739   229  15615.95   0.93   0.00   5.75   2482    14257          0   229         0        0
--------------------------------------------------

Benchmark snia
# of formulae: 70
--------------------------------------------------
tool                          ✅    ❌    time     avg     med     std    sat    unsat    unknown    TO    MO+ERR    other
--------------------------  ----  ----  ------  ------  ------  ------  -----  -------  ---------  ----  --------  -------
z3-noodler-2e6712d-56780ef    70     0    0.56    0.01    0.01    0.00     70        0          0     0         0        0
z3-noodler-eafdc5b-56780ef    70     0    0.70    0.01    0.01    0.01     70        0          0     0         0        0
cvc5-1.2.0                    70     0    0.03    0.00    0.00    0.00     70        0          0     0         0        0
z3-4.13.3                      0    70    0.00  nan     nan     nan         0        0          0    70         0        0
--------------------------------------------------

image

@jurajsic
Copy link
Member Author

jurajsic commented Dec 5, 2024

Seems like it is slower with the inefficient implementation, we could probably make it as fast as normal inclusion test, but it will not help us (on smt benchmarks).

@vhavlena
Copy link
Collaborator

vhavlena commented Dec 5, 2024

The enumeration of shortest words might be expensive. Maybe we could think about some compact representation of shortest words (e.g., by taking the original NFA, remove some states and transitions in order to get only shortest words). Then the inclusion checking should be similarly expensive as the original one (probably even faster).

@jurajsic
Copy link
Member Author

jurajsic commented Dec 5, 2024

Yeah, I was thinking about the compact representation too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants