Long vector test plan #421
Conversation
Looks good so far.
We try to format these specs to 80 columns to make them easier to review in source form.
Are the tables of operations a list of all the things that'll need tests added, or is that still being determined?
- For tests where we only add new test cases we do not need to update HLK test GUIDs.
- If we want to add a new HLK requirement, then we will probably want to create new tests which consume that requirement. Individual test cases cannot have individual HLK requirements.
- The existing HLK requirements we have look pretty dated, e.g. "Device.Graphics.WDDM27.AdapterRender.D3D12.DXILCore.ShaderModel65.CoreRequirement". Should we add a requirement for SM 6.9? We're requiring Long Vector support for SM 6.9.
3. Do the work in two 'phases'? Add the tests, then do the HLK side.
I think we should build up a plan that involves multiple deliveries to the IHVs. If we can identify a set of high-priority tests (e.g. those needed by a particular demo) we could try to get those out first to support IHVs building drivers for the preview release.
I was thinking about this and if IHVs are OK with it I think the easiest way is to just share stand-alone test binaries/collateral. Can you help me sync up with the right people to figure out what works for IHVs? What have we done in the past?
Yup, we can sort that out, but I'm more interested in what and when we'll deliver these rather than the mechanism used to do it.
Actually, given that the DXC repo is public we should be able to just share build/run instructions for the tests once checked in?
Let's discuss this further offline. I'll make a note to ask you.
Will update the formatting. The intrinsics in the table are all of the explicitly listed intrinsics in the HLSL long vector spec. I think some operators are missing from this table; I'll look to get everything added so we have an exhaustive list.
'ExecTests' source code in the DXC repo. There is a script in the WinTools repo which generates and annotates the HLK tests.

There are three test categories we are concerned with:
Do we need special tests for groupshared?
Are there any other types of storage that might be interesting?
Probably. I was thinking about this, and I need to learn/explore more. I suspect I'm actually missing details on test case granularity: does testing across data types matter? Do we care to verify the output of operations for correctness?
We should probably be testing for correctness, yes. Sizes, especially around interesting alignment cases, are probably worth covering. It might also be worth building cases that could conceivably generate overruns and checking for them. For example, most GPUs will operate on at least 32 bits at once, so if you have 16-bit values, what happens with an odd number of elements? Could a write accidentally overwrite the next variable if it assumes alignment?
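To make the overrun concern concrete, here's a small host-side C++ sketch (the struct and function names are hypothetical, not from the test plan): a canary value placed immediately after a 5-element vector of 16-bit values gets clobbered by a write that assumes 32-bit granularity.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical illustration: a device that operates at 32-bit granularity
// rounds a 5-element vector of 16-bit values (80 bits) up to 3 dwords
// (96 bits), touching the 16-bit slot just past the end of the vector.
struct Layout {
    uint16_t vec[5];   // odd element count of a 16-bit type (10 bytes)
    uint16_t sentinel; // canary placed immediately after the vector
};

// Returns the sentinel value after a write that assumes dword granularity.
inline uint16_t sentinelAfterDwordWrite() {
    Layout l{};
    l.sentinel = 0xBEEF;
    // 3 dwords = 12 bytes, exactly the size of Layout: the third dword
    // spills into the sentinel.
    const uint32_t results[3] = {0x00020001, 0x00040003, 0x00000005};
    std::memcpy(&l, results, sizeof(results));
    return l.sentinel;
}
```

A canary-based test along these lines would catch exactly this class of bug: the sentinel no longer holds its original value after the write.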
And so the size of a type, and the element count, gets interesting. I expect you could come up with some interesting boundary cases and reuse that across all the types without having to think hard about each combination individually.
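As a sketch of that idea (the function name and the specific counts are illustrative assumptions, not from the plan): one small helper can produce boundary element counts for a given element width, and the same set can then be reused across every type.

```cpp
#include <cstddef>
#include <vector>

// Illustrative sketch: boundary element counts for a given element width,
// clustered around 32-bit word boundaries where padding/overrun bugs hide.
std::vector<std::size_t> boundaryCounts(std::size_t elementBits) {
    // Elements that fit in one 32-bit word (at least 1 for wide types).
    const std::size_t perDword = elementBits < 32 ? 32 / elementBits : 1;
    // Counts landing exactly on and just over a dword boundary, plus an
    // odd count straddling two boundaries and small/large extremes.
    return {1, perDword, perDword + 1, 2 * perDword + 1, 128, 1024};
}
```

For 16-bit elements this yields counts like 1, 2, 3, 5, ..., so the interesting odd-count cases fall out automatically without hand-picking per type.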
Looks good! Just trying to add detail and answer questions where I can.
Looking good - there are still some open issues.
I'm also keen to see a prioritization of which tests will be implemented first.
These operations are good candidates for high-priority tests I think:
- Initializing a vector with another.
- Multiplying all components of a vector by a scalar value.
- Adding a scalar value to all components of a vector.
- Clamping all components of a vector to the range [c, t].
- Component-wise minimum of two vectors.
- Component-wise maximum of two vectors.
- Component-wise multiply of two vectors.
- Component-wise add of two vectors.
- Subscript access, vec[i] = c.
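A host-side reference for a couple of the component-wise operations above could look like this C++ sketch (the function names are illustrative; the idea is that GPU results would be compared against outputs like these):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Illustrative reference implementations for component-wise verification.
// Assumes both inputs are the same length.
std::vector<float> cwMul(const std::vector<float>& a,
                         const std::vector<float>& b) {
    std::vector<float> r(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) r[i] = a[i] * b[i];
    return r;
}

std::vector<float> cwClamp(const std::vector<float>& a, float lo, float hi) {
    std::vector<float> r(a.size());
    for (std::size_t i = 0; i < a.size(); ++i)
        r[i] = std::clamp(a[i], lo, hi);
    return r;
}
```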
vectors across buffer types. __TODO: Add details about which types of buffers to test and why.__
TODO: Add details about which types of buffers to test and why
This comment is here to remind us that there's a TODO. If we want to complete the PR without doing the TODO, then we should get a follow-up issue filed.
Does this include vector load/stores as well?
I'd prefer to fill in the details before completing this PR.
| atan2 | Atan | CreateFDiv, CreateFAdd, CreateFSub, CreateFCmpOLT, CreateFCmpOEQ, CreateFCmpOGE, CreateFCmpOLT, CreateAnd, CreateSelect |
I'm guessing that these "CreateXXX" function names came from the C++ code. The LLVM instructions are things like `fdiv`, `fadd`, `fcmp`, `and`, `select`, etc.

I also note that if I try `atan2` in godbolt it doesn't actually generate `fsub`. https://godbolt.org/z/7xqz6448P
Yes, they're the C++ helper functions. I was intending to distill this down further on a subsequent iteration. I wanted to sanity check that they do what I would expect (always emit the LLVM instruction in the name).
I'll have to peel a little more in TranslateAtan2. For this 'inventory' I went through the various Translate* functions in HLSLOperationLower and noted the possible operations and instructions. For TranslateAtan2 it looks like FSub is always used, so it must be optimized or simplified out along the way?
Asked for some help. The FSub is optimized out to normalize to FAdd. If you pass '-O0' instead you'll see that the FSub is there.
But this makes me wonder about the right testing approach to ensure coverage. Given that the goal is to test that DXIL Ops and LLVM Instructions are functional with vectors, maybe we want to disable optimizations for the test cases?
Commits updated: 4568301 to 518753e.