Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sail Crosscheck Testing with ACT Against Spike #481

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

Abdulwadoodd
Copy link

@Abdulwadoodd Abdulwadoodd commented May 20, 2024

This PR should complete this SOW.

PR adds the RISCOF-based CI infrastructure to test sail-riscv against Spike using ACTs. Both Sail and Spike are configured using ISA yaml configuration (provided under ci-tests/riscof/) which are validated by riscv-config using RISCOF

Current infrastructure uses 3.8.10 release of arch-test-suites. The later releases of arch-tests (that add Zfa and Zfh tests) are partially broken for RISCOF - and should be fixed shortly.

CI runs ACTS in RV32I, RV64I, and RV32E configurations. Details are as follows. After successful completion, test reports for each configuration are uploaded to GitHub Workspace.

Kindly review, and make suggestions how the CI should be pulled in to this repo. I'll be happy to make the requested changes.

against spike using ACTs

sail-riscv is tested against spike in CI using RISCOF, for rv32, rv64
and rv32e configurations.

Signed-off-by: Abdul Wadood <[email protected]>
@jrtc27
Copy link
Collaborator

jrtc27 commented May 20, 2024

Most of this looks like it belongs in a centralised location so code can be reused?

@jrtc27
Copy link
Collaborator

jrtc27 commented May 20, 2024

That or each project should put its riscof config in a place that other projects can use. The Sail model shouldn’t have to have a copy of how to build tests for Spike.

@Abdulwadoodd
Copy link
Author

this looks like it it belongs in a centralised location

Not exactly sure what this means but most of the code (python plugins, linkers, model-specific headers or yaml, etc.) here is either produced while setting up RISCOF (configuring Spike as DUT and SAIL as reference model) or added/modified manually as per the requirement by following the RISCOF documentation.

Each project should put its riscof config in a place that other projects can use

What do you mean by "project" here?

@jrtc27
Copy link
Collaborator

jrtc27 commented May 20, 2024

this looks like it it belongs in a centralised location

Not exactly sure what this means but most of the code (python plugins, linkers, model-specific headers or yaml, etc.) here is either produced while setting up RISCOF (configuring Spike as DUT and SAIL as reference model) or added/modified manually as per the requirement by following the RISCOF documentation.

I don't really care about the background, I care about the end result, which does not look like good software engineering to me.

Each project should put its riscof config in a place that other projects can use

What do you mean by "project" here?

The Sail model, the Spike simulator, etc.

@@ -0,0 +1,23 @@
# !bin/bash
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

space in between '#' and '!' here


# workaround to avoid clang format error in sail-riscv CI.
# *clang suggest style that is syntactically incorrect*
# Insert the following macro at run-time.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not just use clang-format-ignore, rather than work around it like this?

@@ -0,0 +1,179 @@
import os
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My point is, the files in this directory, especially this one, have absolutely nothing to do with the Sail model. They should not be copied to every project/repository/whatever that needs to use Spike, because that quickly becomes an unmaintainable mess where everybody has a differently-outdated version, and you end up with O(n^2) of these files rather than O(n) due to the cross-product between reference and DUT. There should be some way to only have one copy of "how do I build and run tests for platform X", and the most obvious (to me) way to achieve that is to put that code in platform X's repository at a stable published location.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, but my understanding is the current RISCOF setup essentially expects the user to bring along a certain amount of boilerplate for both the thing you are testing and the thing you are testing against. I think we need to be careful that we review this PR as an attempt to improve our lack of testing by integrating RISCOF rather than look at larger issues in the RISC-V ecosystem that may not be easily actionable by the PR author.

@Alasdair
Copy link
Collaborator

If we want to run ACT cross-checks against spike in our CI, which I think is a good idea, then at least some of the infrastructure for running that needs to be in this repository. While we could pull some of that in from a third-party repository, that isn't necessarily a good idea as it means the CI scripts/config would no-longer be versioned together with the model which creates a bunch of issues.

Overall, I think it would be better if the ci-tests directory was moved into the existing tests folder as a subdirectory and renamed (the existing tests are already CI tests!), maybe as act_cross_check or similar. I'd also change the formatting of cSim to just csim, I don't see any reason for the weird capitalisation of the the 'S'.

self.work_dir = work_dir
self.objdump_cmd = 'riscv64-unknown-elf-objdump -D {0} > {1};'
# self.archtest_env = archtest_env.replace("riscv-arch-test", "../my-repo/riscv-arch-test")
self.compile_cmd = 'riscv64-unknown-elf-gcc -march={0} \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I said it years ago for the previous attempt to add something like this to the Sail model and I'll say it again:

  1. Do not hard-code GCC/binutils. Clang/LLVM should be just as well supported, and is arguably a better experience for the end user because they don't need to install a special cross-compiler, as every Clang build can compile for every architecture by default.
  2. The differences between compiling for platforms X and Y are usually minimal, typically some linker script tweaks and providing model_test.h. That should be reflected in the software implementation by having common library code provided by RISCOF to deal with compilers etc, rather than duplicating all that for every single project. Each project should just provide the project-specific compilation information, as well as how to run the test on it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

100% agreed. These riscof generated files are awfully non-generic and the whole design of the framework makes no sense to me. Unfortunately there is no alternative so I guess we will have to include these files in the repository until there is a better solution. Maybe in some hidden .github directory?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's been almost 3 years since I reviewed #97 and made the same comments. I don't see why we should accept poor code here just because people responsible for the RISC-V testing frameworks don't action feedback. All this tells me is that the only way to get it done is to push back hard and force them to, otherwise I have no faith that it will ever get fixed and instead just be declared as done and working despite the poor design.

To quote a popular adage, "poor planning on your part does not necessitate an emergency on mine".

-I ' + archtest_env

# workaround to avoid clang format error in sail-riscv CI.
# *clang suggest style that is syntactically incorrect*
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know what it was complaining about, but you can suppress clang-format changes when needed. Please elaborate. But this is not the right approach.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've run into this bug too. clang-format will sometimes reformat assembly into invalid code.

In this case it changes the first two lines to

#define RVMODEL_DATA_SECTION                                                   \
  .pushsection.tohost, "aw", @progbits;                                        \

which I guess is wrong. Anyway I agree, just use // clang-format off and // clang-format on.

@jrtc27
Copy link
Collaborator

jrtc27 commented May 20, 2024

If we want to run ACT cross-checks against spike in our CI, which I think is a good idea, then at least some of the infrastructure for running that needs to be in this repository. While we could pull some of that in from a third-party repository, that isn't necessarily a good idea as it means the CI scripts/config would no-longer be versioned together with the model which creates a bunch of issues.

Overall, I think it would be better if the ci-tests directory was moved into the existing tests folder as a subdirectory and renamed (the existing tests are already CI tests!), maybe as act_cross_check or similar. I'd also change the formatting of cSim to just csim, I don't see any reason for the weird capitalisation of the the 'S'.

The Sail side should be, but the Spike side shouldn't, otherwise those files won't be versioned with Spike, creating the same kinds of issues you talk about.

Copy link
Collaborator

@jrtc27 jrtc27 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NAK. Nothing's changed since #97 3 years ago it seems.

Copy link
Collaborator

@Timmmm Timmmm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quite a proliferation of bash scripts here. I really think we should avoid that - they tend to be extremely fragile and difficult to maintain. IMO we should use Python.

-I ' + archtest_env

# workaround to avoid clang format error in sail-riscv CI.
# *clang suggest style that is syntactically incorrect*
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've run into this bug too. clang-format will sometimes reformat assembly into invalid code.

In this case it changes the first two lines to

#define RVMODEL_DATA_SECTION                                                   \
  .pushsection.tohost, "aw", @progbits;                                        \

which I guess is wrong. Anyway I agree, just use // clang-format off and // clang-format on.

self.isa = ' ' # 'rv' + self.xlen
ilp32 = 'ilp32e ' if "E" in ispec["ISA"] else 'ilp32 '
self.compile_cmd = self.compile_cmd+' -mabi='+('lp64 ' if 64 in ispec['supported_xlen'] else ilp32)
# if "I" in ispec["ISA"]:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lots of commented out code around here.

--env=riscv-arch-test/riscv-test-suite/env \
--no-browser --work-dir "$1"

if grep -rniq "$1"/report.html -e '>0failed<'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sort of thing is very fragile in my experience. It's going to say it succeeded if it can't find "failed" which could be for any number of reasons.

Surely riscof has a more robust output method?

@@ -10,8 +10,8 @@ function test_build () {
fi
}

test_build make ARCH=RV32 ocaml_emulator/riscv_ocaml_sim_RV32
test_build make ARCH=RV64 ocaml_emulator/riscv_ocaml_sim_RV64
test_build make -j"$(nproc 2> /dev/null || sysctl -n hw.ncpu)" ARCH=RV32 ocaml_emulator/riscv_ocaml_sim_RV32
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs a comment at least. Why not just nproc?

billmcspadden-riscv pushed a commit that referenced this pull request May 24, 2024
That made sense when mostly people used riscv-tools, but now they get
tools from all sorts of places and most of them are suitable for the
debug tests.

Also document RISCV_TESTS_DEBUG_GCC and RISCV_TESTS_DEBUG_GDB
environment variables in the README.

The github workflows that rely on these tests don't use the Makefile,
but instead invoke gdbserver.py directly, so they're not affected by
this change.

Fixes #481
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants