fix algorithm method binding #405

Open
wants to merge 3 commits into
base: master
koubaa
Contributor

@koubaa koubaa commented Dec 9, 2024

This fixes two issues in the Python binding for the mgr.algorithm method:

  1. The binding used the runtime dtype of spec_consts to decide how to interpret push_consts, so the push constants were read with the wrong element type whenever the two dtypes differed.
  2. There weren't enough overloads to dispatch the different combinations of numpy.array and Python lists. Before this change, if either argument was a Python list, pybind11 chose the overload in which both arguments are Python lists.

As part of the change, I refactored how the arguments are dispatched. The result isn't perfectly generic, but it is more concise and less error-prone than before.

((double*)specInfo.ptr) +
specInfo.size);
if (spec_consts.dtype().is(py::dtype::of<std::float_t>())) {
std::vector<float> pushconstsvec((float*)pushInfo.ptr,
Contributor Author

@koubaa koubaa Dec 9, 2024

See, for example, here. We use spec_consts.dtype().is(py::dtype::of<std::float_t>()) to choose std::vector<float> as the type for pushconstsvec. In this case, we should be checking push_consts.dtype().is(py::dtype::of<std::float_t>()) instead.

@axsaucedo
Member

Thanks for the contribution. Before I review, could you provide some examples for 1. and 2. where you saw these limitations? I want to make sure I understand the issues you were seeing, and to work out which tests we should add to avoid regressions.

@koubaa
Contributor Author

koubaa commented Dec 10, 2024

1

I show an example of a problem in the code in my comment on main.cpp.

Suppose you have the following python code:

import numpy as np

# mgr, tensor1, spirv and workgroup are assumed to be set up already.
CASE = 1  # pick 1, 2 or 3
spec_data = [1, 2, 3]
push_data = [5, 6, 7]
if CASE == 1:
    spec_type, push_type = (np.uint32, np.float64)
elif CASE == 2:
    spec_type, push_type = (np.uint64, np.float32)
elif CASE == 3:
    spec_type, push_type = (np.float32, np.int64)
algo = mgr.algorithm(
    [tensor1], spirv, workgroup,
    np.array(spec_data, dtype=spec_type),
    np.array(push_data, dtype=push_type)
)

The declaration of the generic algorithm call is:

template<typename S = float, typename P = float>
    std::shared_ptr<Algorithm> algorithm(
      const std::vector<std::shared_ptr<Memory>>& memObjects,
      const std::vector<uint32_t>& spirv,
      const Workgroup& workgroup,
      const std::vector<S>& specializationConstants,
      const std::vector<P>& pushConstants)

With the old binding:
In case 1, S and P would both be uint32_t
In case 2, S and P would both be uint64_t
In case 3, S and P would both be float
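The three cases can be reproduced with a small pure-Python model. This is only a sketch: `old_binding_types` and `fixed_binding_types` are hypothetical helpers that mimic which element types end up as S and P, and dtypes are represented as plain strings rather than real NumPy or pybind11 objects.

```python
# Hypothetical model of how the binding picks the C++ template arguments
# S (specialization constants) and P (push constants) from two dtype names.


def old_binding_types(spec_dtype: str, push_dtype: str) -> tuple:
    """Old behaviour: only the spec_consts dtype is inspected."""
    return (spec_dtype, spec_dtype)


def fixed_binding_types(spec_dtype: str, push_dtype: str) -> tuple:
    """Intended behaviour: each array contributes its own dtype."""
    return (spec_dtype, push_dtype)


# (spec dtype, push dtype) for the three cases in the Python snippet.
CASES = {
    1: ("uint32", "float64"),
    2: ("uint64", "float32"),
    3: ("float32", "int64"),
}

for case, (spec, push) in CASES.items():
    print(case, old_binding_types(spec, push), fixed_binding_types(spec, push))
```

In every case the old model returns the spec dtype twice, which is exactly the mismatch described above.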

The root cause is that the np.array dtype of push_consts was never read; the dtype of spec_consts was always used to interpret both arrays. This causes real memory errors. For case 2, for example, the old code would effectively do this:

 std::vector<uint64_t> pushConstsVec(
                      (uint64_t*)pushInfo.ptr,
                      ((uint64_t*)pushInfo.ptr) + pushInfo.size);

This reads out of bounds: pushInfo actually holds 32-bit floats, which are narrower than the 64-bit type the pointer is cast to, so the copy runs past the end of the buffer.
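The size of the overrun in case 2 can be worked out with the standard `struct` module. This is a sketch of the arithmetic only, not of the binding itself; `bytes_read` is a hypothetical helper, and it assumes pushInfo.size is the element count (three here).

```python
import struct


def bytes_read(element_count: int, assumed_format: str) -> int:
    """Bytes touched when element_count items are read as assumed_format."""
    return element_count * struct.calcsize(assumed_format)


# Case 2: three float32 push constants, laid out as they sit in the buffer.
push_buffer = struct.pack("<3f", 5.0, 6.0, 7.0)

actual = len(push_buffer)        # bytes the buffer really owns
assumed = bytes_read(3, "<Q")    # bytes read through a uint64_t* cast
overrun = assumed - actual       # bytes read past the end of the buffer

print(actual, assumed, overrun)
```

With three elements, the cast reads twice as many bytes as the buffer holds.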

2

Only these two pybind11 overloads were defined:

A:

(kp::Manager& self,
const std::vector<std::shared_ptr<kp::Memory>>& tensors,
const py::bytes& spirv,
const kp::Workgroup& workgroup,
const py::array& spec_consts,
const py::array& push_consts)

B:

(kp::Manager& self,
const std::vector<std::shared_ptr<kp::Memory>>& tensors,
const py::bytes& spirv,
const kp::Workgroup& workgroup,
const std::vector<float>& spec_consts,
const std::vector<float>& push_consts)

I didn't look into the pybind11 overload set resolution rules in detail, but I observed at runtime when I did this in python:

mgr.algorithm(
    tensors,
    spirvVec,
    workgroup,
    np.array([1, 2, 3], dtype=np.uint32),
    []
)

It ended up picking overload B. I extended the overload set to allow any mix of vector<float> and py::array.
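One plausible explanation is pybind11's documented two-pass resolution: each overload is first tried without implicit conversions, and only if none matches are they tried again with conversions allowed. The toy model below is a deliberately simplified sketch of that rule, not pybind11's implementation. `FakeArray`, `load_array`, `load_float_vector`, and `resolve` are all hypothetical stand-ins, and the loader rules (an array parameter accepts only array objects; a float vector accepts any sequence but needs conversion for non-float elements) are assumptions.

```python
class FakeArray:
    """Stand-in for numpy.ndarray holding non-float scalars."""

    def __init__(self, values):
        self.values = list(values)


def load_array(obj, convert):
    # Assumed rule: an array parameter only accepts array objects,
    # whether or not conversions are allowed.
    return isinstance(obj, FakeArray)


def load_float_vector(obj, convert):
    # Assumed rule: any sequence is accepted, but each element must already
    # be a float unless implicit conversion is allowed.
    items = obj.values if isinstance(obj, FakeArray) else obj
    if not isinstance(items, list):
        return False
    return all(isinstance(x, float) or convert for x in items)


# Overloads in registration order: A takes two arrays, B two float vectors.
OVERLOADS = [
    ("A", (load_array, load_array)),
    ("B", (load_float_vector, load_float_vector)),
]


def resolve(spec, push):
    """Two passes: first without conversions, then with them."""
    for convert in (False, True):
        for name, (load_spec, load_push) in OVERLOADS:
            if load_spec(spec, convert) and load_push(push, convert):
                return name
    return None


print(resolve(FakeArray([1, 2, 3]), []))               # array + empty list
print(resolve(FakeArray([1, 2, 3]), FakeArray([5])))   # array + array
```

Under these assumptions, the array-plus-list call fails overload A in both passes, because the list is not an array, and only matches B once conversions are allowed.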

Test recommendation

I think we should have tests that exercise all 4 of these overloads, as well as tests that mix dtypes between spec and push consts to fully cover this.
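As a starting point, the test matrix could be enumerated like this. It is a sketch only: the container labels and the dtype list are assumptions taken from the examples in this thread, not a statement of what the binding supports, and the helper names are hypothetical.

```python
from itertools import product

# Argument kinds for (spec_consts, push_consts); two kinds give 4 overloads.
CONTAINERS = ("list", "ndarray")

# Assumed dtype names, taken from the cases discussed in this thread.
DTYPES = ("uint32", "uint64", "int64", "float32", "float64")


def overload_cases():
    """Every (spec container, push container) combination."""
    return list(product(CONTAINERS, repeat=2))


def mixed_dtype_cases():
    """Every (spec dtype, push dtype) pair where the two dtypes differ."""
    return [(s, p) for s, p in product(DTYPES, repeat=2) if s != p]


for spec_kind, push_kind in overload_cases():
    print("overload:", spec_kind, push_kind)

print(len(mixed_dtype_cases()), "mixed-dtype pairs")
```

Each tuple could then feed a parametrized test that builds the two arguments and checks the constants that reach the algorithm.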

@koubaa
Contributor Author

koubaa commented Dec 10, 2024

The original overload resolution fix wasn't bulletproof here. I found an issue just now where it was picking the wrong constructor. After reading the overload resolution rules for pybind11, I fixed it.

We have to use py::list rather than vector<float> when expecting a Python list, and cast it to vector<float> explicitly. We can also add the noconvert attribute to the py::array argument to forbid implicit conversions.
