
Ranges: tuple<wide,...> is not that simple

Denis Yaroshevskiy edited this page May 20, 2021 · 1 revision

The original idea behind zip_iterator was that it would behave like a regular iterator, except that it loads and stores tuple<wide...>. This extends naturally to nested iterators: zip_iterator<zip_iterator<unaligned_ptr_iterator, unaligned_ptr_iterator>, unaligned_ptr_iterator> operates on tuple<tuple<wide, wide>, wide>.
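To make the nesting concrete, here is a minimal sketch in plain C++17 that does not use eve itself. The names wide_sketch, zip_sketch and zip are hypothetical stand-ins for wide, zip_iterator and its factory, and a raw pointer plays the role of unaligned_ptr_iterator. It only illustrates how loading from a nested zip yields a nested tuple of packs.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <tuple>
#include <type_traits>

// Hypothetical stand-in for eve::wide: a fixed-size pack of N lanes.
template <typename T, std::size_t N = 4>
struct wide_sketch { std::array<T, N> lanes; };

// A raw pointer stands in for unaligned_ptr_iterator: load reads N elements.
template <typename T>
wide_sketch<T> load(const T* p) {
  wide_sketch<T> r{};
  for (std::size_t i = 0; i != r.lanes.size(); ++i) r.lanes[i] = p[i];
  return r;
}

// Hypothetical stand-in for zip_iterator: just a tuple of iterators.
template <typename... Is>
struct zip_sketch { std::tuple<Is...> components; };

template <typename... Is>
zip_sketch<Is...> zip(Is... its) { return {std::make_tuple(its...)}; }

// Loading a zip loads every component, recursing into nested zips.
template <typename... Is>
auto load(const zip_sketch<Is...>& z) {
  return std::apply(
      [](const auto&... it) { return std::make_tuple(load(it)...); },
      z.components);
}

inline void demo_nested_load() {
  const float xs[4] = {1, 2, 3, 4};
  const float ys[4] = {5, 6, 7, 8};
  const int   zs[4] = {9, 10, 11, 12};

  auto v = load(zip(zip(xs, ys), zs));

  // The loaded value mirrors the nesting of the iterators.
  using expected = std::tuple<std::tuple<wide_sketch<float>, wide_sketch<float>>,
                              wide_sketch<int>>;
  static_assert(std::is_same<decltype(v), expected>::value, "nested tuple of packs");

  assert(std::get<1>(std::get<0>(v)).lanes[3] == 8.0f);
  assert(std::get<1>(v).lanes[0] == 9);
}
```

The point of the sketch is the type of v: nothing in it is a single pack any more, which is where the trouble described next comes from.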

However, this does not quite work out.

NOTE: wide_value_type_t<I> is decltype(load(I{})) for readable iterators.

Consider reduce.

The interface of reduce is reduce(I f, I l, Plus plus), where plus operates on wide_value_type_t<I>.

However, in its final stage reduce needs to call eve::reduce(wide_value_type_t<I>, plus), which is not defined for a tuple. (I tend to think it should return tuple<value_type...>, with one reduced value for every wide in the wide_value_type_t.)
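A rough sketch of that "tuple of reduced values" idea, again in plain C++17 rather than eve: wide_sketch and reduce_lanes are hypothetical names, and the sketch assumes the same generic operation is meaningful on the scalars of every component. That sidesteps the harder question of how a plus defined on the whole tuple would be split per component.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <tuple>

// Hypothetical stand-in for eve::wide.
template <typename T, std::size_t N = 4>
struct wide_sketch { std::array<T, N> lanes; };

// Horizontal reduction of one pack: folds all lanes into a single scalar.
template <typename T, std::size_t N, typename Op>
T reduce_lanes(const wide_sketch<T, N>& w, Op op) {
  T acc = w.lanes[0];
  for (std::size_t i = 1; i != N; ++i) acc = op(acc, w.lanes[i]);
  return acc;
}

// Tuple case: reduce every component independently and return a tuple of
// scalars. Nested tuples are handled by the recursive call.
template <typename... Ws, typename Op>
auto reduce_lanes(const std::tuple<Ws...>& t, Op op) {
  return std::apply(
      [&](const auto&... w) { return std::make_tuple(reduce_lanes(w, op)...); },
      t);
}

inline void demo_reduce() {
  wide_sketch<int>    a{{1, 2, 3, 4}};
  wide_sketch<double> b{{0.5, 0.5, 0.5, 0.5}};
  auto plus = [](auto x, auto y) { return x + y; };

  auto r = reduce_lanes(std::make_tuple(a, b), plus);  // std::tuple<int, double>

  assert(std::get<0>(r) == 10);
  assert(std::get<1>(r) == 2.0);
}
```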

This is not unique to reduce: many algorithms, such as reverse and sort, need to shuffle.

What to do

I think that eve should operate not only on wide/logical but also on some custom type built on top of tuple<wide/logical>, where all of the components have the same cardinality. tuple<tuple...> should also be OK. The result should be a tuple of the results for the individual components.

This does not make sense for every operation: all, any and first_true are examples. I don't know what those should do. For most operations, though, the component-wise result is well defined.

How to do this

As a first approximation, let's have a wide_tuple type that hooks into all eve operations using tagged_dispatch. Under the hood it would use kumi::map. When the result of an individual operation is a simd_value, it should be wrapped back into a wide_tuple.
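The wrapping behaviour can be sketched without eve or kumi. In the hypothetical code below, wide_tuple_sketch plays the role of wide_tuple, a hand-written index_sequence expansion (map2_impl) stands in for kumi::map, and a plain overload of add stands in for hooking into tagged_dispatch. It is an illustration of the shape of the idea, not of how eve dispatches.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <tuple>
#include <utility>

// Hypothetical stand-in for eve::wide.
template <typename T, std::size_t N = 4>
struct wide_sketch { std::array<T, N> lanes; };

// An ordinary lane-wise operation on a single pack.
template <typename T, std::size_t N>
wide_sketch<T, N> add(const wide_sketch<T, N>& a, const wide_sketch<T, N>& b) {
  wide_sketch<T, N> r{};
  for (std::size_t i = 0; i != N; ++i) r.lanes[i] = a.lanes[i] + b.lanes[i];
  return r;
}

// Hypothetical wide_tuple: a tuple of packs (or of nested wide_tuples).
template <typename... Ws>
struct wide_tuple_sketch { std::tuple<Ws...> components; };

// Stand-in for a tuple map: apply op to matching components of a and b,
// then wrap the results back into a wide_tuple_sketch.
template <typename Op, typename... Ws, std::size_t... I>
wide_tuple_sketch<Ws...> map2_impl(Op op,
                                   const wide_tuple_sketch<Ws...>& a,
                                   const wide_tuple_sketch<Ws...>& b,
                                   std::index_sequence<I...>) {
  return {std::make_tuple(
      op(std::get<I>(a.components), std::get<I>(b.components))...)};
}

// The same operation name, now accepted for the tuple type as well.
template <typename... Ws>
wide_tuple_sketch<Ws...> add(const wide_tuple_sketch<Ws...>& a,
                             const wide_tuple_sketch<Ws...>& b) {
  auto op = [](const auto& x, const auto& y) { return add(x, y); };
  return map2_impl(op, a, b, std::index_sequence_for<Ws...>{});
}

inline void demo_wide_tuple_add() {
  using wt = wide_tuple_sketch<wide_sketch<int>, wide_sketch<double>>;
  wt a{std::make_tuple(wide_sketch<int>{{1, 2, 3, 4}},
                       wide_sketch<double>{{0.5, 0.5, 0.5, 0.5}})};
  wt s = add(a, a);  // result is again a wide_tuple_sketch

  assert(std::get<0>(s.components).lanes[3] == 8);
  assert(std::get<1>(s.components).lanes[0] == 1.0);
}
```

Here both operands are required to have exactly the same component types, which is the simplest possible answer for two-operand operations and probably too strict in practice.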

We need some sort of opt-out mechanism for some eve operations, though. I don't know what it should be.

There is also an open question about two-operand operations.

Alternatively, we can start by enabling this only in a few individual operations. We do not need that many: all shuffles and reduce come to mind. equal, operators and so on are not unreasonable either.

What about load/store

It would be nice for load/store to support wide_tuple. This is not as simple as forwarding for tagged dispatch, because our pointer is a tuple of individual components. It is doable, though.
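The pairing problem for store can be sketched as follows, assuming C++17 and hypothetical names throughout (wide_sketch, store, store_impl). Raw pointers stand in for the component iterators and std::tuple stands in for both wide_tuple and the zipped pointer. The value and the pointer are walked in lockstep by index instead of forwarding a single argument.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <tuple>
#include <utility>

// Hypothetical stand-in for eve::wide.
template <typename T, std::size_t N = 4>
struct wide_sketch { std::array<T, N> lanes; };

// Store of a single pack through a raw pointer.
template <typename T, std::size_t N>
void store(const wide_sketch<T, N>& w, T* p) {
  for (std::size_t i = 0; i != N; ++i) p[i] = w.lanes[i];
}

// Declared early so the helper below can recurse into nested tuples.
template <typename... Ws, typename... Ps>
void store(const std::tuple<Ws...>& v, const std::tuple<Ps...>& ptrs);

// Pair the I-th value component with the I-th pointer component.
template <typename... Ws, typename... Ps, std::size_t... I>
void store_impl(const std::tuple<Ws...>& v, const std::tuple<Ps...>& ptrs,
                std::index_sequence<I...>) {
  (store(std::get<I>(v), std::get<I>(ptrs)), ...);
}

template <typename... Ws, typename... Ps>
void store(const std::tuple<Ws...>& v, const std::tuple<Ps...>& ptrs) {
  static_assert(sizeof...(Ws) == sizeof...(Ps),
                "value and pointer must have the same shape");
  store_impl(v, ptrs, std::index_sequence_for<Ws...>{});
}

inline void demo_store() {
  wide_sketch<float> wx{{1, 2, 3, 4}};
  wide_sketch<int>   wy{{5, 6, 7, 8}};
  wide_sketch<int>   wz{{9, 10, 11, 12}};
  float x[4] = {};
  int   y[4] = {};
  int   z[4] = {};

  // Nested value stored through an identically nested tuple of pointers.
  store(std::make_tuple(std::make_tuple(wx, wy), wz),
        std::make_tuple(std::make_tuple(x, y), z));

  assert(x[0] == 1.0f);
  assert(y[3] == 8);
  assert(z[2] == 11);
}
```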