fix algorithm method binding #405

koubaa · 2024-12-09T20:17:25Z

This fixes two issues in the Python binding for the mgr.algorithm method:

1. The binding used the runtime dtype of spec_consts to decide how to interpret push_consts, so the dtype of push_consts was never consulted.
2. There weren't enough overloads defined to dispatch the different permutations of numpy.array and Python lists. Prior to this change, if either argument was a Python list, pybind11 would choose the overload where both arguments were Python lists.

As part of the change, I refactored how the arguments are dispatched. It isn't perfectly generic, but it is more concise and less error-prone than before.

Signed-off-by: koubaa <[email protected]>

koubaa · 2024-12-09T20:18:45Z

python/src/main.cpp

-                                                  ((double*)specInfo.ptr) +
-                                                    specInfo.size);
-                if (spec_consts.dtype().is(py::dtype::of<std::float_t>())) {
-                    std::vector<float> pushconstsvec((float*)pushInfo.ptr,


See, for example, here. We use spec_consts.dtype().is(py::dtype::of<std::float_t>()) to choose std::vector<float> as the type of pushconstsvec. In this case, we should be checking push_consts.dtype().is(py::dtype::of<std::float_t>()) instead.

axsaucedo · 2024-12-10T06:19:50Z

Thanks for the contribution. Before I review, could you provide some examples for 1. and 2. where you saw these limitations and issues? I'd like to make sure I understand exactly which problems you were seeing, and to work out which tests we should add to avoid regressions.

koubaa · 2024-12-10T14:41:50Z

1

I show an example of a problem in the code in my comment on main.cpp.

Suppose you have the following python code:

import numpy as np

CASE = 2  # pick 1, 2, or 3

spec_data = [1, 2, 3]
push_data = [5, 6, 7]
if CASE == 1:
    spec_type, push_type = (np.uint32, np.float64)
elif CASE == 2:
    spec_type, push_type = (np.uint64, np.float32)
elif CASE == 3:
    spec_type, push_type = (np.float32, np.int64)
# mgr, tensor1, spirv, and workgroup are assumed to be set up already
algo = mgr.algorithm(
    [tensor1], spirv, workgroup,
    np.array(spec_data, dtype=spec_type),
    np.array(push_data, dtype=push_type)
)

The declaration of the generic algorithm call is:

template<typename S = float, typename P = float>
    std::shared_ptr<Algorithm> algorithm(
      const std::vector<std::shared_ptr<Memory>>& memObjects,
      const std::vector<uint32_t>& spirv,
      const Workgroup& workgroup,
      const std::vector<S>& specializationConstants,
      const std::vector<P>& pushConstants)

In case 1, S and P would both be uint32_t
In case 2, S and P would both be uint64_t
In case 3, S and P would both be float

Fundamentally, this happens because the np.array dtype of push_consts was never read; the dtype of spec_consts was always used to define both vectors. This causes memory errors because of the pointer arithmetic involved. For example, in case 2 the old code would do this:

 std::vector<uint64_t> pushConstsVec(
                      (uint64_t*)pushInfo.ptr,
                      ((uint64_t*)pushInfo.ptr) + pushInfo.size);

This leads to out-of-bounds memory access: pushInfo actually holds 32-bit floats, so reading pushInfo.size elements as 64-bit values reads twice as many bytes as the buffer contains.
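To put numbers on case 2, here is a minimal pure-Python sketch of the byte arithmetic. It uses the standard struct module as a stand-in for the numpy buffer, so the variable names are illustrative and not part of the binding.

```python
import struct

# Case 2: the push constants are three float32 values.
push_bytes = struct.pack("<3f", 5.0, 6.0, 7.0)
count = 3  # element count, analogous to pushInfo.size

buffer_size = len(push_bytes)               # bytes the buffer really owns
old_read = count * struct.calcsize("<Q")    # bytes read when treated as uint64_t
fixed_read = count * struct.calcsize("<f")  # bytes read when treated as float
overread = old_read - buffer_size           # bytes read past the end of the buffer

# Even the in-bounds part is misinterpreted: the first "uint64" is really
# the bit patterns of 5.0f and 6.0f glued together.
first_wrong = struct.unpack_from("<Q", push_bytes, 0)[0]

print(buffer_size, old_read, fixed_read, overread)
```

Under these assumptions the old code reads 24 bytes from a 12-byte buffer, while selecting the element type from the push constants' own dtype reads exactly 12.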

2

Only these two pybind11 overloads were defined:

A:

(kp::Manager& self,
const std::vector<std::shared_ptr<kp::Memory>>& tensors,
const py::bytes& spirv,
const kp::Workgroup& workgroup,
const py::array& spec_consts,
const py::array& push_consts)

B:

(kp::Manager& self,
const std::vector<std::shared_ptr<kp::Memory>>& tensors,
const py::bytes& spirv,
const kp::Workgroup& workgroup,
const std::vector<float>& spec_consts,
const std::vector<float>& push_consts)

I didn't look into the pybind11 overload set resolution rules in detail, but I observed at runtime when I did this in python:

mgr.algorithm(
    tensors,
    spirvVec,
    workgroup,
    np.array([1, 2, 3], dtype=np.uint32),
    []
)

It ended up picking overload B. I extended the overload set to allow a mix of vector<float> and py::array.

Test recommendation

I think we should have tests that exercise all 4 of these overloads, as well as tests that mix dtypes between spec and push consts to fully cover this.
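As a rough illustration of that coverage, the sketch below enumerates a possible test matrix using only the standard library. The container labels, the dtype selection, and the build_cases helper are hypothetical and only meant to show the shape of the matrix; a real test would turn each tuple into an actual mgr.algorithm call.

```python
from itertools import product

CONTAINERS = ("list", "ndarray")           # how each argument is passed
DTYPES = ("uint32", "float32", "float64")  # hypothetical dtype selection


def build_cases():
    """Return (spec_kind, spec_dtype, push_kind, push_dtype) tuples."""
    cases = []
    for spec_kind, push_kind in product(CONTAINERS, repeat=2):
        # Python lists carry no dtype, so they contribute a single variant.
        spec_dtypes = DTYPES if spec_kind == "ndarray" else (None,)
        push_dtypes = DTYPES if push_kind == "ndarray" else (None,)
        for spec_dtype, push_dtype in product(spec_dtypes, push_dtypes):
            cases.append((spec_kind, spec_dtype, push_kind, push_dtype))
    return cases


cases = build_cases()
print(len(cases))
```

The matrix covers all four container combinations and, for the array/array combination, every pairing of dtypes, including the mixed ones that triggered the first bug.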

koubaa · 2024-12-10T16:50:00Z

The original overload resolution fix wasn't bulletproof here. I found an issue just now where it was picking the wrong constructor. After reading the overload resolution rules for pybind11, I fixed it.

We have to use py::list rather than vector<float> when expecting a Python list, and cast it to vector<float> explicitly. We can also add the noconvert attribute to the py::array argument to forbid implicit conversions.
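The idea of matching each argument on its exact kind and converting explicitly can be sketched in pure Python. This is only an analogy and not the pybind11 code: array.array stands in for a numpy array, and normalize_consts and algorithm_args are hypothetical helpers.

```python
from array import array


def normalize_consts(arg):
    """Return (typecode, values) without guessing across container kinds."""
    if isinstance(arg, list):
        # Plain lists are converted explicitly to floats, like the
        # explicit cast to vector<float> described above.
        return ("f", [float(x) for x in arg])
    if isinstance(arg, array):
        # Typed arrays keep their own element type; nothing is converted.
        return (arg.typecode, arg.tolist())
    raise TypeError(f"unsupported constants container: {type(arg).__name__}")


def algorithm_args(spec_consts, push_consts):
    # Each argument is resolved independently, so a list and an array can mix.
    return normalize_consts(spec_consts), normalize_consts(push_consts)


print(algorithm_args(array("I", [1, 2, 3]), []))
```

Because each argument is inspected on its own, the array keeps its unsigned integer type even when the other argument is an empty list, which is the mixed case that previously fell through to the list/list overload.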

Signed-off-by: koubaa <[email protected]>

fix algorithm method binding

59d1426

Signed-off-by: koubaa <[email protected]>

Merge branch 'mine' into fix-constant-binding

977f07f

noconvert pyarray arguments

8e0289b

Signed-off-by: koubaa <[email protected]>
