
deps: klab: use tag kevm-v1.0.0-3efa855-forked #46

Merged: 1 commit into master on Feb 26, 2020

Conversation

d-xo commented Feb 4, 2020

Bumps klab to da172c, which includes the updated evm-semantics.

d-xo requested a review from a team on February 4, 2020
d-xo commented Feb 4, 2020

will merge if CI passes: https://buildbot.dapp.ci/#/buildrequests/1836

d-xo commented Feb 4, 2020

Lots of breakage: https://dapp.ci/k-uniswap/01ef46d5bcd9faf6e9a2.

Lots of [Error] Critical: (null) 😢

d-xo commented Feb 4, 2020

All failing specs are failing with pretty much the same error:

[Error] Critical:  (null)
  while evaluating function sizeWordStackAux * X
  while evaluating function #sizeWordStack
  while evaluating function #sizeByteArray

Where X is slightly different between specs.

d-xo commented Feb 4, 2020

UniswapV2Exchange_nonces_pass_rough was rejected on CI, but is accepted locally.

Both CI and my local repo were using d2fc81. My submodules are up to date, I'm using the nix-shell, and I had run make to build a fresh klab before running klab prove.
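
For anyone reproducing locally, the steps described above look roughly like this as a sketch (the spec name and the make invocation are just the ones mentioned in this thread; adjust to the actual repo layout):

```sh
# Sync submodules, enter the nix-shell, build a fresh klab, run one proof.
git submodule update --init --recursive
nix-shell --run '
  make
  klab prove UniswapV2Exchange_nonces_pass_rough
'
```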

d-xo commented Feb 4, 2020

I ran klab prove-all against this branch on my hetzner box and I got this: https://ipfs.io/ipfs/QmafJG7cWPpaPeyzfo6TgMU9SH59bJP3ZSPBSwJu5DSeMR/

There are more (and different) failures on my hetzner box than on CI 😕

MrChico commented Feb 4, 2020

also able to prove nonces_pass_rough locally (shell.nix, same proof hash)

MrChico commented Feb 5, 2020

also, balanceOf specs that were failing on your hetzner machine are passing locally for me @xwvvvvwx

d-xo commented Feb 5, 2020

token0 is rejected in CI, but token1 is accepted. These specs are basically identical; I don't understand what's going on at all 😕

@asymmetric

@MrChico one thing to try when seeing ambiguous results is to nuke out/. @xwvvvvwx and I have already seen this remove sources of deviation.
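
(For clarity, "nuking out/" just means deleting the proof output directory so nothing stale is reused; a minimal sketch, assuming the default ./out location mentioned later in this thread:)

```sh
# Delete the proof output directory so the next run starts clean.
rm -rf "${KLAB_OUT:-out}"
```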

d-xo commented Feb 5, 2020

@asymmetric iirc that was only an issue with klab fetch, where a failing proof fetched from CI did not overwrite the traces from the same proof that had already succeeded locally.

asymmetric commented Feb 7, 2020

token1 is accepted for me locally, but is rejected on CI. Same proofhash, freshly cloned submodules, freshly nuked ./out.

It also passes on @xwvvvvwx's server, so it's buildbot that's having issues here IMO.


NOTE: the klab report gets rewritten after rebuilds, so the links above are not pointing to the same thing they were pointing at when I wrote the comment.

asymmetric commented Feb 8, 2020

Re-triggered a build with persistDirs=False on buildbot; here are the results.

Buildbot before, after.

d-xo commented Feb 8, 2020

wonder if it's worth trying with nix-shell --pure?
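
Something like the following, presumably (a sketch; --pure drops the caller's environment so only what shell.nix provides is visible, and the command run inside the shell is illustrative):

```sh
# Run the proofs in a pure nix-shell to rule out stray tools or env vars.
nix-shell --pure --run 'make && klab prove-all'
```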

@asymmetric

Yeah, that's probably a good idea, but then we should do it on our local machines as well.

@asymmetric

I've enabled --pure on buildbot.

asymmetric commented Feb 10, 2020

Possible sources of indeterminism and whether they've been ruled out:

  • persistent KLAB_OUT
  • gnu parallel settings
  • something else with prove-all
  • something in K's build system
  • something exclusive to the buildbot machine
  • K OOM
  • hardware
  • z3 random seed
  • something in upstream evm-semantics
  • something in dapphub evm-semantics
  • something in rules.k
  • the error is also in master (impossible to be certain about)
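
To poke at the z3 random seed hypothesis in isolation, one option is to replay a captured SMT query under different seeds (a sketch; capturing a query from the prover is assumed, and query.smt2 is a placeholder):

```sh
# If the verdict (sat/unsat/unknown) flips between seeds, the z3 random seed
# is a plausible source of the indeterminism.
for seed in 1 2 3 4 5; do
  z3 smt.random_seed=$seed sat.random_seed=$seed query.smt2
done
```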

asymmetric commented Feb 10, 2020

Currently investigating if running prove-all without shared state fixes things.


EDIT: It doesn't.

@asymmetric

Current results of debugging inconsistent failures on token0:

  • running prove-all on sanctum gets all token0 specs accepted
  • running klab prove UniswapV2Exchange_token0_pass_rough again after that fails with the null error
  • running klab prove UniswapV2Exchange_token0_pass_rough on a tmp KLAB_OUT gets the spec accepted again

So it seems to me that the results of a prove-all leave KLAB_OUT in a weird state.

Currently trying to run token0 on the CI server outside of buildbot, to rule out weird architecture issues.
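
For reference, the temporary-KLAB_OUT run described above is roughly this (a sketch; the temp path is arbitrary, and a fresh KLAB_OUT may need the specs rebuilt into it first):

```sh
# Run a single proof against a throwaway output directory, leaving the state
# left behind by prove-all untouched.
KLAB_OUT=$(mktemp -d) klab prove UniswapV2Exchange_token0_pass_rough
```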

@asymmetric

Correction to the above: klab prove UniswapV2Exchange_token0_pass_rough on sanctum failed once, and passed once, suggesting that there's something fundamentally indeterministic going on.

I'd be inclined to think it's the z3 tactic, but I don't understand why it would only show up in this branch, since it has been there for a while.

Also, klab prove UniswapV2Exchange_token0_pass_rough consistently fails on buildbot.

asymmetric commented Feb 11, 2020

Currently testing this branch with this change on top.


EDIT: As expected, no change.

@asymmetric

For reference, here's a failure with --log-basic enabled.

asymmetric commented Feb 14, 2020

I just got the following error, but it could be due to my printing out large debug messages.

SPEC ERROR: /home/asymmetric/k-uniswap-v2-debug-java/out/specs/4cc138f9060099b2fa045436cf2b8766d9fdc3bf27118c96879e8b6617ffa414.k Location(12,3,186,38)
==================================
Success execution paths: 0
Failed execution paths: 0
Paths in progress: 1
Longest path: 1 steps
Stats for each phase, time, used memory, implicit main GC time percentage:
Total                 :  353.568 s,      7958 MB, gc: 16.818 %,          post-gc mem:  7957 MB,   exp gc time:  1.142 s
  JVM init            :    0.032 s
  Parsing             :  163.224 s,      1121 MB, gc:  0.464 %
  Rewriter init       :    1.866 s,      1228 MB, gc:  9.271 %
  Execution           :  188.447 s,      7958 MB, gc: 31.061 %,          post-gc mem:  7957 MB,   exp gc time:  1.142 s

Init+Execution time:     190.313 s
  query build time                 :    0.021 s,      #          9
  Z3 Function rule implication time:    0.107 s,      #          9
      unknown       :       8
      unsat (proved):       1

  Time and top-level event counts:
  resolveFunctionAndAnywhere time  :  189.975 s,      #        994
    evaluateFunction time            :  189.967 s,      #        290
      builtin evaluation               :           ,      #         54
      function rule                    :           ,      #         19
      no rule applicable               :           ,      #        216
      other                            :           ,      #          1
    applyAnywhereRules time          :    0.000 s,      #        320
      no anywhere rules                :           ,      #        320
    remaining time & # cached        :    0.007 s,      #        384
  impliesSMT time                  :    0.134 s,      #          9

  Recursive event counts:
  resolveFunctionAndAnywhere time  :           ,      #     282388
    evaluateFunction time            :           ,      #     112496
      builtin evaluation               :           ,      #      54831
      function rule                    :           ,      #      36097
      no rule applicable               :           ,      #        100
      other                            :           ,      #      21380
    applyAnywhereRules time          :           ,      #      12329
      no anywhere rules                :           ,      #      12329
    # cached                         :           ,      #     157563

Max memory : 9102 MB
Function cache size:     0 MB,      177 entries
Formula cache size :     0 MB,        0 entries
toString cache size:  7542 MB,    31099 entries

==================================

[Error] Internal: Uncaught exception thrown of type OutOfMemoryError.
Please rerun your program with the --debug flag to generate a stack trace, and file a bug report at https://github.com/kframework/k/issues (GC overhead limit exceeded)
Proof 1 REJECT: 4cc138f9060099b2fa045436cf2b8766d9fdc3bf27118c96879e8b6617ffa414.k [UniswapV2Exchange_token0_pass_rough]
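
The GC overhead limit error above means the JVM ran out of heap, so this particular rejection may just be a resource limit. A sketch of giving the prover more headroom, assuming it is launched through the standard K wrapper scripts, which read K_OPTS (the sizes are illustrative):

```sh
# Raise the JVM heap (and thread stack, which is also relevant to the
# StackOverflowError discussed in the next comment) before re-running.
export K_OPTS="-Xmx16g -Xss64m"
klab prove UniswapV2Exchange_token0_pass_rough
```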

asymmetric commented Feb 14, 2020

Currently figuring out the cause of the null in the error, which might have something to do with our problem.

The exception is being thrown from this function.


UPDATE: When we end up here the exception is a StackOverflowError.

asymmetric force-pushed the bump-klab branch 3 times, most recently from ce7223e to 0ab92a6, on February 16, 2020
@desaperados

Updated to latest klab, which includes latest evm-semantics from upstream master.

@asymmetric

Yep, which AFAIU happens when the proof has already been accepted in a previous run, or did I misunderstand? Either way, there's something fishy going on - the proof wasn't accepted (I looked in out/accepted) and so it should've been run.

d-xo commented Feb 21, 2020

@asymmetric I think it also happens if the proof was rejected in a previous run and its hash has not changed (in this case because the change to the Unlocked storage block did not actually change the generated K spec).

@asymmetric

Ah I see, it makes sense that it would cache failures too, yes.

@asymmetric

But then it shouldn’t return status 0 and therefore we shouldn’t have a green build.

d-xo commented Feb 22, 2020

But then it shouldn’t return status 0 and therefore we shouldn’t have a green build.

yes I would agree with that

d-xo commented Feb 24, 2020

I think that once dapphub/klab#365 is merged we can bump klab on this branch, and we should hopefully have a green CI run with working reports.

Very happy that this saga seems to be coming to an end. Super impressive work on this one @asymmetric 👏 🎉

@desaperados

I think we could also merge and bump to latest semantics? dapphub/klab#358

@asymmetric

@xwvvvvwx @desaperados How did you debug this? I found it very hard to see what was going on with skim, since so many branches were created that klab debug stopped being usable.

d-xo commented Feb 25, 2020

@asymmetric I stopped the proof run as soon as I saw a few Halt! Terminating branch messages. In that state there were not too many branches and I could use klab debug. I then observed that there were lots of unsimplified memory expressions in the behaviour tree and on the stack. I stepped back from the first appearance of these big unsimplified memory lookups until I found the MSTORE that was messing up memory in the first place (I determined this by looking at the memory view and seeing when it stopped being pretty).
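
In command terms, that workflow is roughly the following (a sketch; the proof hash placeholder is whatever klab reports for the spec):

```sh
# Start the proof, interrupt it (Ctrl-C) once a few "Halt! Terminating branch"
# messages appear, then attach the debugger to the partial run and step
# backwards from the unsimplified memory lookups.
klab prove UniswapV2Exchange_token0_pass_rough
klab debug <proofhash>
```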

Once I found the problematic MSTORE, I saw that one of the packed memory lemmas we added together when we previously worked on skim was no longer being applied. It was left unsimplified as an expression containing #padRightToWidth, and I guessed that perhaps the issue was due to its inclusion in the concrete-rules.

d-xo commented Feb 25, 2020

Also, to be clear: the passing run on this branch at commit d797b47 was pointing to a klab branch, and we should retarget to a commit on klab master before we merge.

I should have done this on a separate test branch. Sorry.

@asymmetric

Ah OK, I instead ended up chasing a red herring about SSTOREs and origStorage that turned out to be fruitless but was nevertheless quite interesting.

d-xo commented Feb 26, 2020

rebased onto master and bumped klab to latest master (d8784cf)

asymmetric merged commit 413e73a into master on Feb 26, 2020
asymmetric deleted the bump-klab branch on February 26, 2020
d-xo commented Feb 26, 2020

exhaustiveness spec for the Factory was rejected: https://dapp.ci/k-uniswap/d3b64406a31937cc2f1b/

d-xo commented Feb 26, 2020

but was accepted locally 🤕

asymmetric commented Feb 26, 2020

Fuck. I only looked at the build status here on GitHub which was green. We need to work on dapphub/klab#368 and/or generally on the reliability of the exit status of klab-prove-all.
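
Until that lands, one stopgap (a sketch, not how CI is currently wired) is to fail the build whenever the prove-all output mentions a rejected proof, instead of trusting the exit status:

```sh
# Keep the prove-all log and fail if any proof was rejected, even when the
# command itself exits 0. "REJECT" matches the "Proof 1 REJECT: ..." lines
# shown earlier in this thread.
klab prove-all | tee prove-all.log
if grep -q "REJECT" prove-all.log; then
  echo "at least one proof was rejected" >&2
  exit 1
fi
```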

d-xo commented Feb 26, 2020

I don't see either of the exhaustiveness specs in the stdio for that run....

d-xo commented Feb 26, 2020

error log: https://ipfs.io/ipfs/QmPQnVqHDgjJ4nf4ygxDMpvBrdEEaLM9ekzxfhW3YY7WbE

wondering if this is just a cache invalidation issue...

@asymmetric

What happens if you try running with --deterministic-functions? The error looks like the ones we had when our semantics were ambiguous.

d-xo commented Feb 26, 2020

What happens if you try running with --deterministic-functions?

Spec is still accepted locally.

d-xo commented Feb 26, 2020

I suspect this is the issue. The proof hash of the exhaustiveness specs does not depend on the klab version, and so an old exhaustiveness spec run with an old klab version is being cached and not rerun.
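
If that's right, one workaround would be to drop the spec's cached verdict so prove-all is forced to rerun it (a sketch; out/accepted is the directory mentioned earlier in this thread, while the rejected/ directory and per-hash layout are assumptions):

```sh
# Remove any cached accept/reject entry for the exhaustiveness spec so the
# next prove-all actually reruns it with the current klab.
proofhash="<exhaustiveness-spec-proof-hash>"   # placeholder
rm -f "${KLAB_OUT:-out}/accepted/$proofhash" "${KLAB_OUT:-out}/rejected/$proofhash"
```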

d-xo commented Feb 27, 2020

dapphub/klab#373

d-xo restored the bump-klab branch on April 20, 2020
d-xo deleted the bump-klab branch on April 20, 2020