Skip to content

intrinsic-test: Final code cleanup for the arm and common module #1884

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

madhav-madhusoodanan
Copy link
Contributor

@madhav-madhusoodanan madhav-madhusoodanan commented Jul 24, 2025

Summary

  1. Changed from IntrinsicType::target (String) to IntrinsicType::metadata (HashMap<String, String>) for better support for differing architectures
  2. Added Constraint::Set(Vec<i64>) for support for distinct constant argument values (which may be of types similar to the IMM type)

Context

This PR is part 3 of the changes that were originally made in the x86 extension PR #1814 for intrinsic-test, and is intended to accomodate the changes made in PRs #1862 and #1863.

cc: @Amanieu @folkertdev

@madhav-madhusoodanan
Copy link
Contributor Author

I'm coming across a weird compilation issue on the CI:

clang++: warning: argument unused during compilation: '-fuse-ld=lld' [-Wunused-command-line-argument]
mod_3.cpp:138:43: error: use of undeclared identifier 'vld1q_f16'

Copy link
Contributor

@folkertdev folkertdev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This'll need some work, and may benefit from splitting it up even further.

I'm not convinced about metadata being a hashmap. What do you actually need to store in there? (I'm really trying to avoid generalizing this codebase before there is a concrete need).


re

clang++: warning: argument unused during compilation: '-fuse-ld=lld' [-Wunused-command-line-argument]
mod_3.cpp:138:43: error: use of undeclared identifier 'vld1q_f16'

the first part you can ignore, the second part is due to an expect that this PR introduces, and it is emitted so often that CI gives up.

Comment on lines +1 to +12
use crate::arm::intrinsic::ArmIntrinsicType;
use crate::common::argument::Argument;

impl Argument<ArmIntrinsicType> {
pub fn type_and_name_from_c(arg: &str) -> (&str, &str) {
let split_index = arg
.rfind([' ', '*'])
.expect("Couldn't split type and argname");

(arg[..split_index + 1].trim_end(), &arg[split_index + 1..])
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this really need its own file?

.add_extra_flags(vec!["-ffp-contract=off", "-Wno-narrowing"]);
.add_extra_flags(vec!["-ffp-contract=off", "-Wno-narrowing", "-mfpu=neon"]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was debugging the cause of the issue error: use of undeclared identifier 'vld1q_f16' on the CI tool.
This specific change was only experimental.

Comment on lines -71 to +73
let cpp_compiler = compile::build_cpp_compilation(&self.cli_options).unwrap();
let cpp_compiler_wrapped = compile::build_cpp_compilation(&self.cli_options);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we really should just panic if there is no compiler configured. What are we supposed to do in that scenario? We should not fail silently.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it could be

let Some(cpp_compiler) = compile::build_cpp_compilation(&self.cli_options) else { 
    panic!("no C compiler configured");
}

I guess.

Copy link
Contributor Author

@madhav-madhusoodanan madhav-madhusoodanan Jul 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a CLI option to only generate the C and Rust test files (without building or linking).
This is useful if I only need to get the test files without needing to setup the compilation or linking toolchains.

The relevant flag is --generate-only.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, ok, can you add a comment mentioning that flag?

(I think we should clean that up with a custom enum Mode { Compile(PathBuf), GenerateOnly } at some point)

Comment on lines 79 to 82
let mut file = File::create(&c_filename).unwrap();
let mut file = fs::File::create(&c_filename).unwrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's just keep this as File, you can use use std::fs::{self, File} to import file and the fs module.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.
I made this change, since I thought importing {self, File} would look awkward.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's a bit weird at first but we do this all the time e.g. in the compiler.

Comment on lines 91 to 92
let ty = ArmIntrinsicType::from_c(type_name)
.unwrap_or_else(|_| panic!("Failed to parse argument '{arg}'"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so, this line fails a bunch in CI, so the logic here is incorrect for this target.

Comment on lines 11 to 16
pub fn to_range(&self) -> Range<i64> {
pub fn to_vector(&self) -> Vec<i64> {
match self {
Constraint::Equal(eq) => *eq..*eq + 1,
Constraint::Range(range) => range.clone(),
Constraint::Equal(eq) => vec![*eq],
Constraint::Range(range) => range.clone().collect::<Vec<i64>>(),
Constraint::Set(values) => values.clone(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can use this instead to avoid clones/allocations:

impl Constraint {
    /// Iterate over the values of this constraint.
    fn iter<'a>(&'a self) -> impl Iterator<Item = i64> + 'a {
        match self {
            Constraint::Equal(i) => std::slice::Iter::default().copied().chain(*i..*i + 1),
            Constraint::Range(range) => std::slice::Iter::default().copied().chain(range.clone()),
            Constraint::Set(items) => items.iter().copied().chain(std::ops::Range::default()),
        }
    }
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also not super happy with the name "constraint" (probably made sense at the time, but now?) but we can resolve that later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, this actually is present to support IMM type values, since Rust uses const generics to pass the value.

Example:

#[stable(feature = "simd_x86", since = "1.27.0")]
pub fn _mm256_blend_ps<const IMM8: i32>(a: __m256, b: __m256) -> __m256

Comment on lines -124 to +125
pub target: String,
pub metadata: HashMap<String, String>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this a hashmap? Couldn't this be a struct instead? what values do you actually need to store in here for x86?

Copy link
Contributor Author

@madhav-madhusoodanan madhav-madhusoodanan Jul 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

X86 has a “TYPE” and “ETYPE” values for each argument of an intrinsic.

The “TYPE” tag is the type of the argument/return value as it appears in the C definition, while “ETYPE” gives information about how the argument will be used (eg: when using vector arguments, whether the operation would be U64 or FP16 based)

Both the information would be required, but it doesn’t make sense to create explicit struct members for it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, I didn't see what type this was in. Hmm, shouldn't the target-specific intrinsic type hold such information?

I'll say I am a bit surprised by the target being a part of IntrinsicType, that doesn't seem right.

@@ -5,13 +5,15 @@ use std::ops::Range;
pub enum Constraint {
Equal(i64),
Range(Range<i64>),
Set(Vec<i64>),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add an example here of a set that is not a range (but that will actually come up)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Example X86 intrinsic: https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#text=_mm_i32gather_pd&ig_expand=3720

The function definition is:

__m128d _mm_i32gather_pd (double const* base_addr, __m128i vindex, const int scale)

and scale can only take the values 1, 2, 4, and 8.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an x86 related change though, would it be recommended to keep it in the PR that will add x86 to the behavioural testing test suite?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine, can you change this type to something like:

/// Describes the values to test for a const generic parameter.
#[derive(Debug, PartialEq, Clone, Deserialize)]
pub enum Constraint {
    /// Test a single value.
    Equal(i64),
    /// Test a range of values, e.g. `0..16`.
    Range(Range<i64>),
    /// Test some specific values, e.g. `vec![1, 2, 4, 8]`.
    Set(Vec<i64>),
}

We might make Equal just a singleton set at some point too, maybe.

@madhav-madhusoodanan madhav-madhusoodanan force-pushed the intrinsic-test-intrinsictype-metadata-migration branch 3 times, most recently from e1a7243 to 58f200e Compare July 25, 2025 11:15
@madhav-madhusoodanan
Copy link
Contributor Author

madhav-madhusoodanan commented Jul 25, 2025

I was debugging this issue: mod_3.cpp:138:43: error: use of undeclared identifier 'vld1q_f16'.

The command that was being run was:
clang++ -march=armv8.6-a+crypto+crc+dotprod+fp16 -O2 -ffp-contract=off -Wno-narrowing -mfpu=neon --target=armv7-unknown-linux-gnueabihf mod_3.cpp -c -o mod_3.o.

I suspected that armv8.6-a and --target=armv7-unknown-linux-gnueabihf might be clashing (armv8-a vs armv7), so i removed the -march tag itself.

This led me to the following error (showing a sample, there were more intrinsics):

mod_3.cpp:861:43: error: use of undeclared identifier 'vld1_f16'
        float16x4_t a = cast<float16x4_t>(vld1_f16(&a_vals[i]));
                                          ^
mod_3.cpp:862:43: error: use of undeclared identifier 'vld1_f16'
        float16x4_t b = cast<float16x4_t>(vld1_f16(&b_vals[i]));

The intrinsic vld1_f16 is present on v7, A64 and A32 architectures, so not like we can add it (and the others) to the exclusion list:

  {
    "SIMD_ISA": "Neon",
    "name": "vld1_f16",
    "arguments": [
      "float16_t const * ptr"
    ],
    "return_type": {
      "value": "float16x4_t"
    },
    "Arguments_Preparation": {
      "ptr": {
        "register": "Xn"
      }
    },
    "Architectures": [
      "v7",
      "A32",
      "A64"
    ],
    "instructions": [
      [
        "LD1"
      ]
    ]
  },

@folkertdev
Copy link
Contributor

Can you isolate the issue into a godbolt.org snippet? I vaguely remember there is some full-fp16 flag though I can't find much about it now.

@madhav-madhusoodanan
Copy link
Contributor Author

madhav-madhusoodanan commented Jul 25, 2025

@folkertdev Here is a minimal reproduction: https://godbolt.org/z/e1r3x9xfo

@folkertdev
Copy link
Contributor

using -fpu=neon-fp16 helps a bit.

also I don't really see why these changes cause this issue? was this code not generated before?

Copy link
Contributor

@folkertdev folkertdev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think at this point it's probably best to make separate PRs for

  • the changes to Constraint (incorporating my comments)
  • the changes to crates/intrinsic-test/src/common/argument.rs and related

I think that is the core here, any by having just those changes there should be fewer weird CI problems etc.

@madhav-madhusoodanan madhav-madhusoodanan force-pushed the intrinsic-test-intrinsictype-metadata-migration branch from 49c517a to 22a763e Compare July 26, 2025 13:06
@madhav-madhusoodanan
Copy link
Contributor Author

Yeah you're right.
I'm looking at the minimal set of changes that could waive the undeclared identifier issue away.

@madhav-madhusoodanan madhav-madhusoodanan force-pushed the intrinsic-test-intrinsictype-metadata-migration branch from 22a763e to 1f39d69 Compare July 26, 2025 14:34
@madhav-madhusoodanan madhav-madhusoodanan marked this pull request as ready for review July 26, 2025 16:44
@rustbot
Copy link
Collaborator

rustbot commented Jul 26, 2025

r? @Amanieu

rustbot has assigned @Amanieu.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@madhav-madhusoodanan
Copy link
Contributor Author

Finally fixed that weird CI issue.
Had to commit from scratch

Copy link
Contributor

@folkertdev folkertdev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make separate PRs for those two changes I mentioned (and maybe also the --generate-only changes) first? Assuming the commits are atomic that should be straightforward with some giet cherry-pick ing

Comment on lines -124 to +125
pub target: String,
pub metadata: HashMap<String, String>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, I didn't see what type this was in. Hmm, shouldn't the target-specific intrinsic type hold such information?

I'll say I am a bit surprised by the target being a part of IntrinsicType, that doesn't seem right.

@madhav-madhusoodanan
Copy link
Contributor Author

Sure, let me do it right away.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants