Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can't run ACAS Xu example #735

Open
ckessler2 opened this issue Oct 16, 2023 · 15 comments
Open

Can't run ACAS Xu example #735

ckessler2 opened this issue Oct 16, 2023 · 15 comments
Labels
bug Something isn't working upstream

Comments

@ckessler2
Copy link

When running these commands:

vehicle verify
--specification acasXu.vcl
--verifier Marabou
--network acasXu:acasXu_1_7.onnx
--property property3

I get this error:

Verifying properties:
property3 [......................................................] 0/1 queries
Error: Marabou was automatically killed by the OS with signal '4'.
The most common reason for this is an out-of-memory-error.

When running compile and verify separately, compile works but verify gives same error. The files passed are attached, although they should be identical to the originals.
acasXu.zip

Module versions: maraboupy==1.0.0, vehicle-lang==0.11.0

OS is Ubuntu 22.04.3 LTS (jammy) running in WSL 2 on Windows 10 Pro, and I have 16gb of RAM available. I have set WSL to use all 16gb in its config (grep MemTotal /proc/meminfo returns 16gb) and tried WSL 1.

According to time the command runs for 0.186s and there is no visible increase in memory on task manager - compared to 44s and a significant increase when running the mnist-robustness example (which runs fine).

@MatthewDaggitt
Copy link
Collaborator

MatthewDaggitt commented Oct 17, 2023

Hmm so it looks like this is a problem on the Marabou end. But Vehicle's diagnosis is wrong, Marabou isn't running out of memory. Error code 4 is "illegal instruction". So there's something buggy in your build of Marabou.

Can you run

vehicle compile 
  --target MarabouQueries
  --specification acasXu.vcl
  --network acasXu:acasXu_1_7.onnx
  --property property3
  --output queries

and then run

Marabou
  acasXu_1_7.onnx
  queries/property3-query1.txt

and report the output? (I haven't tested that command myself. The first one compiles the queries, and the second one runs Marabou explicitly to capture it's full output).

@MatthewDaggitt MatthewDaggitt transferred this issue from vehicle-lang/tutorial Oct 17, 2023
@MatthewDaggitt
Copy link
Collaborator

I've also transferred this issue to the main Vehicle repo.

@MatthewDaggitt MatthewDaggitt added bug Something isn't working upstream labels Oct 17, 2023
@ckessler2
Copy link
Author

It outputs this:

Network: acasXu_1_7.onnx
Property: queries/property3-query1.txt

Engine::processInputQuery: Input query (before preprocessing): 314 equations, 615 variables
Engine::processInputQuery: Input query (after preprocessing): 614 equations, 861 variables

Input bounds:
        x0: [ -0.3035,  -0.2986]
        x1: [  0.0000,   0.0095]
        x2: [  0.4934,   1.0000]
        x3: [  0.3000,   0.4091]
        x4: [  0.3000,   0.5000]

Illegal instruction (core dumped)

I will try reinstalling marabou in case it changes anything

@wenkokke
Copy link
Collaborator

wenkokke commented Oct 17, 2023

Try compiling Marabou from source. This error implies that the binaries that you are using were compiled for a newer CPU.

Would you be able to post the contents of /proc/cpuinfo?

@ckessler2
Copy link
Author

I did compile Marabou from source, as per these instructions https://neuralnetworkverification.github.io/Marabou/Setup/0_Installation.html

/proc/cpuinfo returns this 12 times:

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 151
model name      : 12th Gen Intel(R) Core(TM) i5-12400F
stepping        : 2
microcode       : 0xffffffff
cpu MHz         : 2496.000
cache size      : 18432 KB
physical id     : 0
siblings        : 12
core id         : 0
cpu cores       : 6
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 21
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology tsc_reliable nonstop_tsc cpuid pni pclmulqdq ssse3 fma cx16 pdcm pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced fsgsbase bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves umip gfni vaes vpclmulqdq rdpid fsrm flush_l1d arch_capabilities
bugs            : spectre_v1 spectre_v2 spec_store_bypass swapgs retbleed eibrs_pbrsb
bogomips        : 4992.00
clflush size    : 64
cache_alignment : 64
address sizes   : 39 bits physical, 48 bits virtual
power management:

@wenkokke
Copy link
Collaborator

wenkokke commented Oct 19, 2023

It appears that the binaries distributed via PyPI require AVX-512, which your CPU does not support. (The full list of CPU extensions that are required but unsupported is AVX512, MODE64, BWI, NOVLX, VLX, and DQI, but I'm no expert, so some of these could be due to aliasing.)

As you're compiling from source, I'm unsure if the same is true for the binaries you've built. You can verify the CPU extensions required by the binary you've built by running elfx86exts.

Either way, this appears to be an issue with the way in which Marabou is built. I can submit an upstream issue, if you can provide me with some information:

  1. Run the following command from examples/acasXu:

    vehicle compile -s acasXu.vcl -t MarabouQueries -n acasXu:acasXu_1_7.onnx -e property3 -o property3

    Please attach the file property3/property3-query1.txt to your response.

  2. Run the following command from examples/acasXu to verify that it results in a similar error message:

    Marabou --input-query property3/property3-query1.txt

    Please include the output and error message in your response.

  3. Run the following command:

    file $(which Marabou)

    If the command outputs "ASCII text", "Bourne-Again shell script", "Python script", or anything that gives you the impression it's a text file, find the path to your Marabou executable with the following command:

    python -c "import maraboupy.MarabouCore; print(maraboupy.MarabouCore.__file__)"

    Otherwise, find the path to your Marabou executable with the following command:

    which Marabou

    Then, either:

    • Attach the Marabou executable to your response.
    • Determine the CPU extensions required by the Marabou executable, and include those in your response.

    To determine the CPU extensions required by the Marabou executable:

    • Install elfx86exts.

      # install rust
      curl https://sh.rustup.rs -sSf | sh
      # install elfx86exts
      cargo install elfx86exts
    • Run the following command, replacing $PATH_TO_MARABOU with the path you found:

      elfx86exts $PATH_TO_MARABOU

      Please include the output in your response.

@wenkokke
Copy link
Collaborator

Follow up:

  • DQI is AVX-512 Doubleword and Quadword Instructions
  • BWI is AVX-512 Byte and Word Instructions
  • VLX is AVX-512 Vector Length eXtensions

So we can pretty confidently say the problem is due to the lack of AVC-512 support.

@ckessler2
Copy link
Author

1: property3-query1.txt

2: It doesn't result in the same error I think, but here is the full output: Step_2_Output.txt

3:

File format and CPU architecture: Elf, X86_64
MODE64 (call)
SSE2 (movdqa)
SSE1 (movaps)
CMOV (cmove)
AVX (vmovsd)
NOVLX (vxorps)
AVX512 (vpmaxsq)
VLX (vpmaxsq)
BMI2 (shlx)
AVX2 (vpxor)
DQI (kmovb)
BWI (kmovd)
Instruction set extensions used: AVX, AVX2, AVX512, BMI2, BWI, CMOV, DQI, MODE64, NOVLX, SSE1, SSE2, VLX
CPU Generation: Unknown

@wenkokke
Copy link
Collaborator

wenkokke commented Oct 20, 2023

That's a succesful run... That's... confusing.

AND the executable you're running uses AVX-512 instructions... which your processor doesn't seem to support?

@ckessler2
Copy link
Author

I got the same 'out-of-memory' error when running the example on the Linux machine in the office, which has all the libraries installed and up to date.

However, Ben can run the example on his laptop, which has a different version of Marabou - "origin/negative-dimensions-fix 8d26b936" retrieved on June 19th 2023. We haven't been able to find that version on the Marabou repository

@wenkokke
Copy link
Collaborator

I got the same 'out-of-memory' error when running the example on the Linux machine in the office, which has all the libraries installed and up to date.

However, Ben can run the example on his laptop, which has a different version of Marabou - "origin/negative-dimensions-fix 8d26b936" retrieved on June 19th 2023. We haven't been able to find that version on the Marabou repository

Do you think that's a genuine out of memory error this time?

@ckessler2
Copy link
Author

No, because it has 29gb available and exits after <1s as before

@wenkokke
Copy link
Collaborator

And in both cases you've built Marabou from source?

@ckessler2
Copy link
Author

After reinstalling everything I have solved the problem - although the output:

Verifying properties:
  property3 [======================================================] 1/1 queries
    result: ✗ - counterexample found
      x: [1799.988667, 1.950928632e-2, 3.09999732192, 980.0, 1017.6036]

is different to the tutorial example:

Verifying properties:
  property3 [======================================================] 1/1 queries
    result: ✗ - counterexample found
      x: [1799.9886669999978, 5.6950779776e-2, 3.09999732192, 980.0, 960.0]

I'm working on recreating the issue so I can find the exact cause

@MatthewDaggitt
Copy link
Collaborator

I wouldn't worry about too much. I really doubt that Marabou (and all the downstream tools it uses) make much effort in to ensuring stability of counter-examples across different platforms. As long as the overall answer is the same, i.e. counterexample or proved, then I think it's fine 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working upstream
Projects
None yet
Development

No branches or pull requests

3 participants